Anthropic's Controversial Book Scanning for AI Training
Recent court documents reveal that Anthropic destroyed millions of print books to train its AI model, Claude. According to those documents, a court found the operation to be fair use. The company hired Tom Turvey, formerly of Google Books, to lead the acquisition and digitization of the books, aiming to replicate Google's successful book digitization strategy.
The process involved purchasing used books in bulk, removing their bindings, and scanning the pages into digital files, after which the physical copies were discarded. Judge William Alsup deemed this method transformative and ruled that it qualified as fair use because Anthropic had legally purchased the books and did not distribute the digital copies.
The decision highlights the AI industry's demand for high-quality text data, which is crucial for training large language models like Claude. These models rely on vast amounts of well-edited text to improve their capabilities, making the acquisition of such data a competitive necessity. Despite the legality of the process, Anthropic's earlier use of pirated materials had initially complicated its legal standing.