Tag Archive

OpenAI Scales Up Crawling & Bots For The Holidays

Published on 2025/11/28 By admin

OpenAI is reportedly scaling up its crawling infrastructure for the holiday shopping season. The folks at Merj noticed OpenAI adding a lot of new IP ranges for its bots and crawlers.

Cloudflare To Block AI Crawlers By Default & Pay Per Crawl Model

Published on 2025/07/01 By admin

Cloudflare, a company that powers about 20% of the web, has announced it is now blocking AI crawlers by default. It is also introducing a new model, named Pay Per Crawl, that lets AI services pay content creators for permission to crawl their content.

Bing Crawl Consumption Not Showing Return On Investment

Published on 2018/05/31 By admin

Joost de Valk, the founder of Yoast, posted some interesting data on Twitter yesterday about crawlers: how much of the Yoast site they consume, how active they are, and whether there is any return on investment. The big one is that Bing crawled ~84…

Is Your Local Pricing Strategy Blocking Search Engine Spiders?

Published on 2013/02/12 By admin

eMarketer’s recent report on global e-commerce growth showed that online sales worldwide exceeded $1 trillion in 2012. The report further projects that global e-commerce will grow by an additional 19% in 2013, with the Asia-Pacific region surpassing North America in online sales. This reemphasizes the…

SpiderDuck: The Realtime Twitter Spider

Published on 2011/11/18 By admin

Twitter has unveiled its URL fetcher, which it has named SpiderDuck…

Google: Remove The Robots.txt File Completely

Published on 2011/01/06 By admin

Believe it or not, I am not a huge fan of placing robots.txt files on sites unless you specifically want to block content and sections from Google or other search engines. It has always felt redundant to tell a search engine it can crawl your site when it will do so anyway unless you tell it not to.
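
To make that point concrete, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It only shows how that one parser treats an empty file, an explicit allow-all file, and a file that blocks a single section; it is not a statement of how Google or any other search engine implements its own parser. The helper name `can_crawl`, the sample rules, and the `example.com` URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper: it parses the rules from a string instead of
    downloading them, so it can be run without network access.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Three illustrative robots.txt bodies (assumptions, not taken from a real site).
EMPTY = ""                                            # no rules at all
ALLOW_ALL = "User-agent: *\nDisallow:\n"              # explicit "crawl everything"
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"  # block one section

if __name__ == "__main__":
    page = "https://example.com/blog/post.html"
    private = "https://example.com/private/report.html"
    for name, rules in [("empty", EMPTY), ("allow-all", ALLOW_ALL), ("block-private", BLOCK_PRIVATE)]:
        print(name, can_crawl(rules, "Googlebot", page), can_crawl(rules, "Googlebot", private))
```

With this parser, the empty file and the allow-all file give the same answer for every URL, which is the redundancy described above. Only the third file changes what a crawler may fetch.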