Global cloud platform company Cloudflare has accused Perplexity AI of using “stealth, undeclared crawlers” to intentionally ignore websites’ no-crawl directives.
In other words, according to the new report, Perplexity is sidestepping the directives that websites use to keep AI web crawlers away from their content.
For AI critics, this news might serve as another indicator that AI tools are built on the backs of unauthorized data. It’s a claim that, oddly enough, OpenAI CEO Sam Altman himself has aligned with, in a 2024 legal filing that argued AI requires copyrighted material for training.
What’s Perplexity AI Up To?
According to the Cloudflare report, Perplexity is responding to websites’ network blocks by, apparently, “obscur[ing] their crawling identity in an attempt to circumvent the website’s preferences.”
The AI company is reportedly modifying its user agent, changing its source ASNs (the autonomous system numbers that identify the network its traffic comes from), and even ignoring or not fetching “robots.txt” files, all in an attempt to disguise unauthorized activity.
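For readers unfamiliar with robots.txt, the short Python sketch below shows how a crawler that honors the file might check it before fetching a page. The robots.txt contents, the example.com URL, the browser-style user-agent string, and the may_fetch helper are illustrative assumptions, not details from the Cloudflare report; the parser itself comes from Python's standard library.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration; not taken from any real site.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical browser-style user agent, for illustration only.
BROWSER_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/article"
    for agent in ("PerplexityBot", "Perplexity-User", BROWSER_UA):
        print(agent, "->", may_fetch(agent, page))
```

Under these hypothetical rules, the two declared crawler names are refused and any other name is allowed. The check only works if the crawler chooses to run it and reports its real name, which is why skipping the file or changing the user agent defeats it.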
Cloudflare is having none of it. According to its report, the company has de-listed Perplexity as a verified bot and has added heuristics to its managed rules to block even stealth crawling.
How Cloudflare Ran Tests
The issue was first brought to Cloudflare’s attention by its own customers, who issued multiple complaints about unauthorized Perplexity crawling, according to the report.
“We received complaints from customers who had both disallowed Perplexity crawling activity in their robots.txt files and also created WAF rules to specifically block both of Perplexity’s declared crawlers: PerplexityBot and Perplexity-User. These customers told us that Perplexity was still able to access their content even when they saw its bots successfully blocked. We confirmed that Perplexity’s crawlers were in fact being blocked on the specific pages in question, and then performed several targeted tests to confirm what exact behavior we could observe.” -Cloudflare
That purported behavior? A user agent that identified itself as “Google Chrome on macOS” rather than as PerplexityBot or Perplexity-User, the only two crawlers Perplexity has declared.
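To see why a changed user agent matters, consider a block rule that matches on the declared crawler names. The Python sketch below is a simplified stand-in, not Cloudflare's actual WAF rule syntax, and both user-agent strings and the blocked_by_name_rule helper are made-up examples.

```python
# Crawler names Perplexity declares, per the Cloudflare quote above.
DECLARED_CRAWLERS = ("PerplexityBot", "Perplexity-User")

# Hypothetical user-agent strings, for illustration only.
DECLARED_UA = "Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://example.com/bot)"
GENERIC_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def blocked_by_name_rule(user_agent: str) -> bool:
    """Return True if the user agent contains a declared crawler name."""
    ua = user_agent.lower()
    return any(name.lower() in ua for name in DECLARED_CRAWLERS)


if __name__ == "__main__":
    print("declared crawler blocked:", blocked_by_name_rule(DECLARED_UA))
    print("generic browser blocked:", blocked_by_name_rule(GENERIC_UA))
```

A request that presents itself as an ordinary Chrome browser passes a name-based rule like this one, so catching it requires signals other than the user agent.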
For its part, Perplexity appears to disagree with Cloudflare’s findings. In a statement to The Verge, a Perplexity spokesperson called the report a “publicity stunt,” and said that “there are a lot of misunderstandings in the blog post.”
Perplexity’s Partnerships and Deals
Perplexity has been having a busy few months, even prior to weathering these new accusations of unscrupulous web crawling activity.
First, in early June, Bloomberg reported a potential deal between Samsung and the AI startup that would see the tech giant invest in Perplexity while integrating its product into future Samsung devices.
Then, later that month, reports surfaced of a potential acquisition: Tech heavyweight Apple was internally discussing the benefits of buying Perplexity outright.
Perplexity’s AI tools are used by 15 million people and the company reached a $1 billion valuation last year. Now, it appears to be working towards closing lucrative deals with deep-pocketed companies like Apple and Samsung. Let’s hope it isn’t willing to flout websites’ no-crawl directives to come out ahead.