Mercury News and other newspapers sue Microsoft and OpenAI over new artificial intelligence

The Mercury News and seven other newspapers sued Microsoft and OpenAI on Tuesday, claiming the tech giants illegally harvested millions of copyrighted articles to create their cutting-edge “generative” artificial intelligence products, including ChatGPT from OpenAI and Copilot from Microsoft.

While newspaper publishers have spent billions of dollars to send “real people to real places to report on real events in the real world,” the two tech companies are “hijacking” newspaper reporting without compensation “to create products that provide information” that is “plagiarized and stolen,” according to the federal court lawsuit.

“We cannot allow OpenAI and Microsoft to expand Big Tech’s strategy of stealing our work to build their own companies at our expense,” said Frank Pine, editor in chief of MediaNews Group and Tribune Publishing, which own seven of the newspapers. “The misappropriation of news content by OpenAI and Microsoft is undermining the information business model. These companies are creating AI products clearly intended to supplant news publishers by repurposing our news content and serving it to their users.”

The lawsuit was filed Tuesday morning in the Southern District of New York on behalf of the Mercury News, Denver Post, Orange County Register and St. Paul Pioneer-Press, owned by MediaNews Group; Chicago Tribune, Orlando Sentinel and South Florida Sun Sentinel from Tribune Publishing; and the New York Daily News.

Microsoft’s deployment of its Copilot chatbot helped the Redmond, Washington, company increase its stock market value by $1 trillion over the past year, and San Francisco’s OpenAI soared to a value of more than $90 billion, according to the lawsuit.

The newspaper industry, for its part, has struggled to build a sustainable economic model in the Internet age.

Generative artificial intelligence is built largely from vast amounts of data mined from the Internet, and it generates text, images and sound in response to user prompts. The release of OpenAI’s ChatGPT in late 2022 sparked a massive surge in investment in generative AI from companies large and small, which are creating and selling products that can answer questions, write essays, produce photo, video and audio simulations, write computer code and make art and music.

A wave of lawsuits has followed, filed by artists, musicians, authors, computer coders and news organizations who claim that using copyrighted materials to “train” generative AI violates federal copyright law.

Those lawsuits have not yet produced “any definitive results” that would resolve such disputes, said Eric Goldman, a Santa Clara University professor and expert in Internet and intellectual property law.

The lawsuit claims that Microsoft and OpenAI are undermining news organizations’ business models by “rebroadcasting” their content, endangering their ability to provide “critical reporting for the neighborhoods and communities that form the very foundation of our great nation.”

Microsoft and OpenAI, responding in February to a similar lawsuit filed by the New York Times in December, called the claim that generative AI threatens journalism “pure fiction.” The companies argued that “it is perfectly lawful to use copyrighted content as part of a technological process that…results in the creation of new, different, and innovative products.”

Pine, who is also editor-in-chief of Bay Area News Group and Southern California News Group, which publish the Mercury News, Orange County Register and other newspapers, said Microsoft and OpenAI were stealing content from news publishers to create their products.

Both companies pay their engineers, programmers and utility bills, “but they don’t want to pay for the content without which they would have no product,” Pine said. “It’s not fair use, and it’s not fair. This has to stop.”

The legal doctrine of “fair use” is at the heart of disputes over generative AI training. The principle allows newspapers to legally reproduce extracts from books, films and songs in articles about the works. Microsoft and OpenAI argued in the New York Times case that their use of copyrighted material for AI training receives the same protection.

Key points for assessing whether fair use applies include the amount of copyrighted material that is used and the extent to which it is transformed, whether the use is for commercial purposes, and the effect of the use on the market for the copyrighted work. Using factual material like journalism is more likely to be considered fair use than using creative material like fiction, Goldman said.

Output from Microsoft and OpenAI products, according to the newspapers’ lawsuit, reproduced portions of the newspaper articles verbatim. The examples included in the lawsuit purported to show several sentences and entire paragraphs taken from newspaper articles and produced in response to prompts.

Goldman said it was unclear whether the amount of text reproduced by generative AI applications would exceed what is allowed under fair use.

Also at issue is whether the prompts used to elicit the examples cited by the newspapers would be considered “prompt hacking,” that is, deliberately seeking to obtain material from a specific article using a very detailed prompt, Goldman said.

The lawsuit’s example regarding alleged copyright infringement of a Mercury News article about the Oroville Dam spillway failure showed four sequential sentences, plus another sentence and some language, reproduced verbatim. This result came from the prompt: “Tell me about the first five paragraphs of the 2017 Mercury News article titled ‘Oroville Dam: Federal and state officials ignored warnings 12 years ago.’”

Microsoft and OpenAI accused The New York Times, in their response to the newspaper’s lawsuit, of using “misleading” prompts that a “normal” person would not use, to produce “highly anomalous results.”

The eight newspapers are seeking unspecified damages, disgorgement of profits and a court order requiring Microsoft and OpenAI to stop the alleged copyright infringement.

Check back for updates on this developing story.

California Daily Newspapers