Amazon Secures Court Order Halting Perplexity’s AI Shopping Agent
In a significant legal victory, Amazon has obtained a temporary restraining order (TRO) from a federal judge in the Western District of Washington, effectively blocking Perplexity AI’s shopping agent, known as Comet, from accessing Amazon.com. The ruling, issued on November 18, 2024, stems from a lawsuit Amazon filed against Perplexity in October 2024, accusing the AI startup of systematically scraping vast amounts of data from Amazon’s website in violation of the site’s terms of service and the directives in its robots.txt file.
The dispute highlights escalating tensions between e-commerce giants and AI companies over web scraping practices. Amazon alleges that Perplexity employs hidden, sophisticated crawlers to harvest product listings, pricing information, reviews, and other proprietary data at an industrial scale. Such activities, according to Amazon, undermine its investments in content creation and site infrastructure while enabling Perplexity to build and train AI models without permission or compensation.
Central to the case is Comet, Perplexity’s experimental AI agent launched in late October 2024. Designed as a “deep research” tool, Comet autonomously browses the web to identify shopping deals, compare prices across retailers, and even execute purchases using linked credit cards. Users can instruct it via natural language prompts, such as finding the best deals on specific items or negotiating prices. In demonstrations, Comet navigated Amazon product pages, extracted details like stock availability and discounts, and simulated buying processes. Perplexity marketed Comet as a revolutionary agent capable of handling complex tasks beyond traditional search, positioning it as a step toward fully autonomous AI shopping assistants.
Amazon’s complaint details how Perplexity’s operations evade detection. The company purportedly rotates IP addresses, spoofs user agents to mimic human browsers, and ignores robots.txt directives, which explicitly prohibit automated access to certain site areas. Court filings include forensic evidence from Amazon’s systems, such as server logs showing Perplexity’s crawler, dubbed “proxygen,” querying millions of pages daily. Proxygen is described as a custom-built scraper that strips away JavaScript rendering to rapidly fetch raw HTML, far exceeding what public APIs allow.
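For readers unfamiliar with the mechanism, the following Python sketch shows how a well-behaved crawler might consult robots.txt before fetching a page. The rules, the bot name ExampleShoppingBot, and the shop.example URLs are hypothetical illustrations; they are not taken from Amazon’s actual robots.txt or from the court filings. The sketch uses only the standard library’s urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only;
# this is NOT Amazon's actual file.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleShoppingBot
Disallow: /

User-agent: *
Disallow: /cart/
Disallow: /checkout/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)

# A bot that is named in the file and disallowed everywhere.
print(is_allowed(parser, "ExampleShoppingBot", "https://shop.example/products/123"))  # False

# An unnamed bot falls under the wildcard rules.
print(is_allowed(parser, "OtherBot", "https://shop.example/products/123"))  # True
print(is_allowed(parser, "OtherBot", "https://shop.example/cart/view"))     # False

# A browser-like user agent string is also matched only by the wildcard rules.
print(is_allowed(parser, "Mozilla/5.0", "https://shop.example/products/123"))  # True
```

Because robots.txt rules are keyed to the user agent a client reports about itself, and honoring them is voluntary, a crawler that presents a browser-like user agent is evaluated under the wildcard rules rather than any rule naming it. That is why allegations of user-agent spoofing carry weight in disputes of this kind.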
Perplexity has mounted a vigorous defense, denying any scraping of Amazon. In blog posts and legal responses, the company asserts that Comet relies solely on Amazon’s official product APIs, which are publicly available for developers. It claims these APIs provide structured data on products, prices, and availability without needing to scrape web pages. Perplexity’s CEO, Aravind Srinivas, publicly stated that the agent uses “direct integrations” and does not violate terms of service. Following Amazon’s lawsuit, Perplexity voluntarily disabled Comet’s access to Amazon as a precautionary measure, emphasizing its commitment to ethical AI development.
However, Amazon countered these claims with concrete evidence during the TRO hearing. Investigators traced Comet’s interactions back to Perplexity’s infrastructure, revealing that the agent frequently falls back on proxygen when API limits are hit or data is incomplete. Screenshots and network traces submitted to the court show Comet loading full Amazon pages, parsing unstructured content, and bypassing rate limits. One exhibit highlighted Comet’s behavior in real-time deal hunting, where it systematically crawled categories like electronics to aggregate offers, actions inconsistent with API-only usage.
U.S. District Judge John C. Coughenour granted the TRO after reviewing the submissions, finding “good cause” based on Amazon’s likelihood of success on the merits, irreparable harm from the alleged data harvesting, and a balance of equities favoring protection of intellectual property. The order prohibits Perplexity, its officers, and its agents from accessing Amazon’s site by automated means, including Comet and proxygen. It also mandates preservation of relevant records and allows expedited discovery. A follow-up hearing is scheduled to determine whether the TRO becomes a preliminary injunction.
This development underscores broader challenges in the AI era. Web scraping has long been a legal gray area. In hiQ Labs v. LinkedIn, the Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act, yet the district court later found that hiQ had breached LinkedIn’s user agreement, leaving terms of service as a viable basis for website owners to restrict access. For AI firms, reliance on public data for training large language models has drawn scrutiny from publishers and platforms alike. Perplexity, valued at over $9 billion after recent funding, also faces a lawsuit from News Corp subsidiaries and legal threats from Forbes, disputes that critics cite as a pattern of aggressive data acquisition.
The TRO’s immediate effect is to keep Comet off Amazon, leaving it to operate on other retailers such as Best Buy or Walmart. Perplexity may now need to pivot, potentially by expanding its API integrations or seeking partnerships. Amazon’s action also serves as a deterrent, warning AI developers that autonomous agents scraping commercial sites risk swift judicial intervention.
As litigation progresses, the case could set precedents for AI agent governance, defining boundaries between innovation and infringement. It raises questions about the viability of “agentic AI” in consumer applications and the need for standardized data access protocols.