Doubling down on its accusations of AI scaremongering, Microsoft is hitting back at two concurrent lawsuits over its involvement in AI, and large language models specifically, dismissing them as either “doomsday hyperbole” or “doomsday futurology.”
The latter phrasing is used in a published motion to dismiss (pdf warning) the case the New York Times has brought against Microsoft and OpenAI, in which the NYT seeks to hold the two defendants “responsible for the billions of dollars they owe for the unlawful copying and use of The Times's uniquely valuable works.”
In that original claim the Times states that while it had been working “for months” to come to an agreement with the defendants over terms for its contribution to the training of OpenAI's large language models, it was now pursuing compensation through the courts instead.
The Times also claims that OpenAI itself has given particular weight to NYT articles in training its models, contending that “by OpenAI’s own admission high-quality content, including content from The Times, was more important and valuable for training the GPT models as compared to content taken from other, lower-quality sources.”
Microsoft, however, is likening the NYT's copyright claim to the entertainment industry's attempt to halt the rise of the VCR, which Hollywood also branded a tool of copyright infringement. “The Court ultimately rejected the alarmism and voted for technological innovation and consumer choice,” reads the filing.
The motion to dismiss then goes on to describe LLMs and the transformer, the machine learning architecture that has driven the recent explosion in more capable chatbots, and claims that using such datasets to train LLMs “does not supplant the market for the works, it teaches the models language. It is the essence of what copyright law considers transformative—and fair—use.”
The main thrust of Microsoft's motion, however, isn't about the training per se. It aims to address the NYT's allegation that the public's use of LLMs, and particularly GPT-based products, is causing harm to the publication “and thus poses a mortal threat to 'independent journalism' at 'a cost to society [that] will be enormous.'”
Microsoft claims The Times used unrealistic prompts to coax the LLMs into outputting text that matched NYT articles, arguing that these prompts reflect neither how normal people nor a reasonable person would actually use these tools.
“Nowhere does The Times allege that anyone other than its legal team would actually do any of this,” it states, “and certainly not on a scale that merits the doomsday futurology it pushes before this Court and has boosted to its readers.”
This could end up being a key point in the case. The 'reasonable person' is a well-established standard in law, and could be used to decide whether the NYT's claim that it could be damaged by LLMs should be measured by what a reasonable person might do with the technology. Likewise, maybe Microsoft and OpenAI ought to be held to what a reasonable person might expect when it comes to privacy in a digital, always-online age.
Because the second case (pdf warning) is a class action which apparently “devotes 198 pages to doomsday hyperbole about AI as a threat to civilization,” and sees 13 plaintiffs demanding the courts enjoin the defendants (OpenAI and Microsoft) from “their ongoing violations of the privacy and property rights of millions” and require them to “take immediate action to implement proper safeguards and regulations for the Products, their users, and all of society.”
Among other things, the original filing states that the scraping of the plaintiffs' digital footprints means the defendants can “misappropriate our skill sets and encourage our own professional obsolescence. This would obliterate privacy as we know it.”
To which Microsoft retorts (pdf warning): “Plaintiffs do not plead any facts plausibly showing they have been affected by any of the supposed 'scraping,' 'intercepting,' and 'eavesdropping' they allege. Nowhere do they say what of their private information Microsoft ever improperly collected or used; nor do they identify any harm they individually suffered from anything that Microsoft allegedly did. Plaintiffs cannot state a claim based on the hypothetical experiences of others. This deficiency alone requires dismissal of all of Plaintiffs' claims.”
OpenAI has taken a similar stance in its own motions to dismiss, and the plaintiffs in the privacy case have responded explicitly, as noted by The Register.
“OpenAI gave no notice to the world that, for years, it was secretly harvesting from the internet everything ever created and shared online, anywhere, by hundreds of millions of Americans.
“That, for a decade plus, every consumer's use of the internet thus operated as a gratuitous donation to OpenAI: of our insights, talents, artwork, personally identifiable information, copyrighted works, photographs of our families and children, and all other expressions of our personhood—for products that stand to concentrate the country's wealth in even fewer corporate behemoths, displace jobs at scale, and risk the future of mission-critical industries like art, music, and journalism, while creating dangerous new industries like the high-speed spawning of child pornography.”
There are seemingly as many AI-focused lawsuits as there are claims against former presidents right now, and while we might view these claims and counterclaims as merely forests' worth of paper filled with legalese and big-tech infighting, they will have an effect on all our lives. For good or ill. They will set pivotal precedents that will inevitably shape the development of both AI and the internet.
And possibly, if we're sticking with the hyperbole theme, humanity itself.