Explore integrating with Cloudflare's Domain Intelligence API #143

Open
rviscomi opened this issue Sep 26, 2022 · 1 comment

Comments

@rviscomi
Member

https://api.cloudflare.com/#domain-intelligence-properties

This API could help with categorizing websites by type, e.g., Travel, Technology, News, etc. It's been on our wish list for a long time and would unlock new kinds of analysis.

We need to look into the requirements and limitations. Is it free? Can we get enough quota for our crawl rate? Is it only supported for websites that use Cloudflare? Is our use case aligned with the TOS?

We'll also need to assess how it would be integrated with the crawl and how the data would be exposed. cc @pmeenan

@pmeenan
Member

pmeenan commented Sep 27, 2022

It looks like the domain intelligence data covers ~100k domains (or at least the rank info does). We wouldn't want to call the API directly as part of the crawl.

The Cloudflare API sets a maximum of 1,200 requests in a five-minute period.
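To put that limit in perspective, here is a rough back-of-the-envelope sketch of how long a full pass over a domain list would take at the maximum rate. The function names, the `domains_per_request` parameter, and any domain count passed in are hypothetical; the only figure taken from this thread is the 1,200 requests per five minutes.

```python
import math

# Limit quoted above: 1,200 requests per five-minute window.
MAX_REQUESTS = 1200
WINDOW_SECONDS = 5 * 60


def requests_per_day(max_requests: int = MAX_REQUESTS,
                     window_seconds: int = WINDOW_SECONDS) -> int:
    """Upper bound on requests in 24 hours if every window is fully used."""
    windows_per_day = (24 * 60 * 60) // window_seconds
    return max_requests * windows_per_day


def days_to_cover(domain_count: int, domains_per_request: int = 1) -> float:
    """Days needed to look up `domain_count` domains at the maximum rate.

    `domains_per_request` is an assumption: it defaults to one domain per
    call, and would only be higher if the API allowed batched lookups.
    """
    requests_needed = math.ceil(domain_count / domains_per_request)
    return requests_needed / requests_per_day()
```

At one domain per request, the limit works out to 4 requests per second, or 345,600 lookups per day, so the time to cover a crawl scales directly with the number of domains in it.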

I can ping the team to see if they would be interested in offering the raw dataset to HA to merge with the crawl data. That feels like it would basically be dumping and exposing their full database every month, though, and I'm not sure that's fair to their IP (happy to ask though).
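If a raw dataset were made available, the merge itself could be a simple hostname join. The following is a minimal sketch under stated assumptions: the dataset is assumed to be a mapping from bare domain to a list of category names, the crawl data is assumed to be a list of page URLs, and stripping a leading `www.` is an illustrative normalization choice. None of these shapes come from Cloudflare or the crawl pipeline.

```python
from urllib.parse import urlsplit


def hostname(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    # Assumption: the category dataset is keyed by the bare domain.
    if host.startswith("www."):
        host = host[len("www."):]
    return host


def merge_categories(crawl_urls, categories_by_domain):
    """Attach categories to each crawled URL; unknown domains get []."""
    merged = []
    for url in crawl_urls:
        categories = categories_by_domain.get(hostname(url), [])
        merged.append({"url": url, "categories": categories})
    return merged
```

Domains missing from the dataset simply get an empty category list, which matters if coverage is limited to the ~100k domains mentioned above.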
