Columbia University Unveils Comprehensive Tracker for AI-Media Deals and Litigation
In a significant development for the intersection of artificial intelligence and media industries, Columbia University’s Knight First Amendment Institute has launched an interactive online tracker dedicated to monitoring deals between media companies and AI developers, as well as related lawsuits. Announced on October 10, 2024, the “AI Media Deals and Litigation Tracker” serves as a centralized resource to document the evolving landscape of commercial partnerships and legal battles surrounding AI’s use of journalistic content.
The tracker emerges amid growing tensions between traditional media outlets and AI companies, particularly over the unlicensed use of copyrighted material to train large language models. Media organizations argue that such practices undermine their business models, while AI firms contend that transformative use falls under fair use doctrines. This tool aims to provide transparency, enabling researchers, journalists, policymakers, and industry stakeholders to navigate the complex web of agreements and disputes.
Key Features of the Tracker
The platform, accessible via the Knight Institute’s website, presents data in an intuitive, searchable format. Users can filter entries by category (deals or litigation), date, parties involved, and key terms such as licensing, revenue sharing, or attribution requirements. Each entry includes a summary and links to original announcements, court filings, and relevant press coverage, ensuring a verifiable trail of information.
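To make the filtering described above concrete, here is a minimal Python sketch of how such entries might be modeled and queried. It is not the tracker's actual schema or code: the `Entry` fields, the `filter_entries` function, and the sample records (with placeholder party names and dates) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Entry:
    """One hypothetical tracker record (field names are assumptions)."""
    category: str                      # "deal" or "litigation"
    announced: date                    # date of announcement or filing
    parties: tuple                     # names of the parties involved
    terms: frozenset = frozenset()     # key terms, e.g. "licensing"


def filter_entries(
    entries: Iterable[Entry],
    category: Optional[str] = None,
    party: Optional[str] = None,
    term: Optional[str] = None,
    since: Optional[date] = None,
) -> List[Entry]:
    """Return entries matching every filter that is not None."""
    results = []
    for entry in entries:
        if category is not None and entry.category != category:
            continue
        if party is not None and party not in entry.parties:
            continue
        if term is not None and term not in entry.terms:
            continue
        if since is not None and entry.announced < since:
            continue
        results.append(entry)
    return results


# Placeholder data only; these are not real tracker entries.
SAMPLE = [
    Entry("deal", date(2024, 1, 15), ("Publisher A", "AI Lab X"),
          frozenset({"licensing", "attribution"})),
    Entry("litigation", date(2024, 3, 2), ("Publisher B", "AI Lab X")),
    Entry("deal", date(2024, 6, 30), ("Publisher C", "AI Lab Y"),
          frozenset({"revenue sharing"})),
]

if __name__ == "__main__":
    # Deals involving the placeholder party "AI Lab X"
    for match in filter_entries(SAMPLE, category="deal", party="AI Lab X"):
        print(match.announced.isoformat(), ", ".join(match.parties))
```

Filters left as `None` are ignored, so the same function covers a single-criterion search and a combined one.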
As of its launch, the tracker catalogs over 20 deals and a comparable number of lawsuits. It is designed to be dynamic, with plans for regular updates as new developments arise. The Institute emphasizes that the resource is non-exhaustive but strives for comprehensiveness by focusing on publicly available information from major players.
Spotlight on Media-AI Deals
The deals section highlights a surge in partnerships since early 2024, reflecting media companies’ strategic pivot toward monetizing their content archives. Notable examples include:
- News Corp and OpenAI: A multiyear agreement granting OpenAI access to content from The Wall Street Journal, New York Post, and other properties. In return, News Corp receives revenue sharing and integration of its journalism into ChatGPT.
- Reddit and Google: A $60 million annual licensing deal providing Google with real-time access to Reddit’s user-generated content for AI training and display in search results.
- Apple and Condé Nast: Apple licensed content from Vogue, Wired, and other titles to enhance its AI-powered features, such as image generation in Apple Intelligence.
- Amazon and Associated Press: Focused on licensing AP’s editorial content for training Amazon’s AI models, with provisions for attribution.
Other partnerships feature Microsoft with The Atlantic, Perplexity AI with Time magazine, and a consortium deal involving Axel Springer (publisher of Politico and Business Insider) with OpenAI. Many agreements emphasize “attribution guarantees,” where AI outputs credit original sources, alongside revenue-sharing models that could range from flat fees to usage-based royalties.
The tracker reveals patterns: legacy publishers like Gannett and The New York Times are actively negotiating, while some opt for collective bargaining through organizations like the News/Media Alliance. Deal values remain opaque, but public disclosures suggest figures in the tens of millions of dollars annually for top-tier content providers.
Litigation Landscape
Parallel to these collaborations, the tracker documents a wave of lawsuits accusing AI companies of systematic copyright infringement. High-profile cases include:
- The New York Times v. OpenAI and Microsoft: Filed in December 2023, alleging millions of articles were ingested without permission. The suit seeks damages and an injunction against future use.
- Associated Press v. OpenAI and Anthropic: AP claims unauthorized scraping of its wire stories, highlighting tensions even among those pursuing deals.
- Authors Guild and others v. OpenAI: Representing writers like John Grisham and George R.R. Martin, this class action focuses on book content but underscores broader media concerns.
Additional suits involve Thomson Reuters, Chicago Tribune, and international plaintiffs like Le Monde. Defendants argue fair use, transformative purpose, and the public benefit of AI innovation. To date, no case has reached a final judgment, and many remain in the discovery phase. Settlements, such as those hinted at in smaller disputes, often include retroactive licensing.
The tracker notes jurisdictional nuances, with U.S. federal courts dominant, alongside Canadian and U.K. filings. It also tracks related regulatory scrutiny, such as FTC inquiries into market dominance.
Broader Implications and Institute’s Role
Curated by Knight Institute Digital Fellow Justin Hendrix and Research Director Ramya Krishnan, the tracker underscores the Institute’s mission to protect free expression in the digital age. By mapping these dynamics, it illuminates risks to journalistic independence—such as AI firms influencing coverage through financial ties—and opportunities for sustainable revenue.
As AI models proliferate, the tool could inform policy debates on copyright reform, AI transparency mandates, and opt-out mechanisms for content creators. It also aids smaller outlets excluded from big deals, fostering equitable access to bargaining power.
Columbia’s initiative arrives at a pivotal moment. With generative AI reshaping information ecosystems, transparent tracking is essential for accountability. The Knight First Amendment Institute invites contributions to refine the database, positioning the tracker as a collaborative public good.
This resource not only chronicles transactions and trials but also signals a maturing industry dialogue. Media executives, AI ethicists, and legal scholars alike stand to benefit from its rigorous, real-time insights.