All the media companies that have licensing deals with OpenAI (so far)

Of the myriad controversies plaguing OpenAI, the issue of training data has emerged as the most polarizing. For publishers, this polarization comes in the form of a choice between getting as far away as possible, or cozying up and making a deal.

OpenAI has kept a lid on information about what models like GPT-4o were trained on, meaning ChatGPT’s recipe is a secret. Similar LLMs, however, are fed on social media posts, blogs, digitized books, online reviews, Wikipedia pages, and pretty much any piece of information on the web that you can think of. In fact, at least one scholar, Berkeley computer scientist Stuart Russell, thinks most of the known internet was gobbled up by LLMs in order to replicate human intelligence and mirror it back to us in automated form.

Naturally, AI training data also includes articles from online news and media sites.

Publications soon caught on that ChatGPT’s knowledge of historical and current events was clearly fueled by stories published on their sites (even paywalled pages) and that OpenAI was profiting from it. What has followed is a messy copyright dilemma with no clear answer. Publications like the New York Times have filed lawsuits against OpenAI alleging copyright infringement. OpenAI claims “training AI models using publicly available internet materials is fair use.” But, while the careful wording of “publicly available” may sound like “public domain,” it only refers to how the data was obtained, not the copyright status.

As Ed Newton-Rex, CEO of the AI certification organization Fairly Trained, says, “there is a very real danger that the phrase ‘publicly available’ is used to hide copyright infringement in plain sight.” Yet OpenAI has deep historical precedent on its side, and U.S. copyright laws strongly protect fair use and freedom of information.

A way to avoid obsolescence, or a ‘devil’s bargain’?

The question of what OpenAI can legally feed into its models is still being worked out, but in the meantime some publications have sorted themselves into factions to settle the question in the short term: some block OpenAI from ingesting their products altogether, while others have struck deals.

Media companies that have partnered with OpenAI argue that generative AI is here to stay, and that it’s better to get a piece of the pie than risk becoming obsolete. Plus, partnering with OpenAI gives publications some semblance of control over how their journalism surfaces in ChatGPT responses.

“As the media and technology landscapes change, it’s vital that accurate, trustworthy information reaches the public,” said Pam Wasserstein, president of Vox Media, which recently announced a licensing partnership with OpenAI, “and this partnership recognizes that human creativity and quality journalism are a key part of responsible deployment of generative AI.”

Jessica Lessin, CEO of The Information, who is critical of these deals, has summed them up as follows:

“Facing the threat of lawsuits, they are pursuing business deals, to absolve [OpenAI] of the theft. These deals amount to settling without litigation. The publishers willing to roll over this way aren’t just failing to defend their own intellectual property — they are also trading their own hard-earned credibility for a little cash from the companies that are simultaneously undervaluing them and building products quite clearly intended to replace them.”

More succinctly, Damon Beres from The Atlantic (one of the publications that signed a licensing agreement with OpenAI) called striking a deal “a devil’s bargain.”

What OpenAI gets from these deals is pretty clear: exclusive access to real-time news, splashy displays of goodwill towards media, etc. But for publishers, there’s little public knowledge about the terms of the licensing agreements. Vox’s statement about its deal mentions “innovative products for Vox Media’s consumers and advertising partners,” but it’s not at all clear exactly what goodies Vox, or any of these companies, may receive. It’s worth noting that many of the announcements mention access to reader data and insights as part of the exchange. So you can bet your ChatGPT data will play a part in the agreement.

Here’s who has been successfully courted so far. We’ve also rounded up all the media companies that have sued OpenAI for copyright infringement. Read on and stay tuned since this story is sure to have updates.

Media companies that have licensing deals with OpenAI

Associated Press

On July 13, 2023, the non-profit news agency announced a deal with OpenAI. As part of the deal, OpenAI is granted access to the AP’s news archive going back to 1985 for training its models and providing ChatGPT responses based on its data. “AP firmly supports a framework that will ensure intellectual property is protected and content creators are fairly compensated for their work,” Kristin Heitmann, AP senior vice president and chief revenue officer, said in the announcement.

Axel Springer

Publications: Business Insider, Politico

On December 13, 2023, the German media company Axel Springer, which owns Business Insider and Politico, announced its OpenAI partnership. “We want to explore the opportunities of AI empowered journalism – to bring quality, societal relevance and the business model of journalism to the next level,” said Axel Springer CEO Mathias Döpfner. Axel Springer reportedly received tens of millions of euros for the deal.

FT Group

Publication: Financial Times

Colloquially known as the FT, the British daily newspaper announced a partnership with OpenAI on April 29, 2024. The agreement “recognises the value of our award-winning journalism and will give us early insights into how content is surfaced through AI,” said FT Group CEO John Ridding.

Dotdash Meredith

Publications: People, Better Homes & Gardens, Food & Wine, Investopedia, InStyle, Verywell

On May 7, 2024, the media company that owns several lifestyle and entertainment magazines announced an agreement with OpenAI. “This deal is a testament to the great work OpenAI is doing on both fronts to partner with creators and publishers and ensure a healthy Internet for the future,” said Neil Vogel, CEO of Dotdash Meredith.

News Corp

Publications: The Wall Street Journal, New York Post, Barron’s, MarketWatch, Investor’s Business Daily, FN, The Times, The Sunday Times, The Sun, The Australian, news.com.au, The Daily Telegraph, The Courier Mail, The Advertiser, Herald Sun

News Corp, best known in the publishing context for owning the Wall Street Journal and the New York Post, announced a deal with OpenAI on May 22, 2024. “We are delighted to have found principled partners in Sam Altman and his trusty, talented team who understand the commercial and social significance of journalists and journalism,” said Robert Thomson, News Corp CEO.

Vox Media

Publications: Curbed, The Cut, The Dodo, Eater, Grub Street, Intelligencer, New York Magazine, Now This, Polygon, Popsugar, SB Nation, the Strategist, Thrillist, The Verge, Vox, Vulture

Vox Media announced a deal with OpenAI on May 29, 2024. The company, which owns a collection of publications that span technology, culture, sports, entertainment, and food, allegedly didn’t inform its staffers ahead of time.

“As both journalists and workers, we have serious concerns about this partnership, which we believe could adversely impact members of our union, not to mention the well-documented ethical and environmental concerns surrounding the use of generative AI,” said the Vox Media Union in a statement on X.

The Atlantic

The Atlantic shared its partnership with OpenAI on the same day as the Vox Media announcement (May 29, 2024). “We believe that people searching with AI models will be one of the fundamental ways that people navigate the web in the future,” said Nicholas Thompson, CEO of The Atlantic.

But “generative AI has not exactly felt like a friend to the news industry, given that it is trained on loads of material without permission from those who made it in the first place,” countered Beres, senior technology editor at The Atlantic, in his aforementioned story.

Media companies that have filed lawsuits against OpenAI

On December 27, 2023, The New York Times became the first major publication to file a lawsuit against OpenAI and its major investor Microsoft for copyright infringement. On February 29, 2024, The Intercept, Raw Story, and AlterNet, all represented by the same law firm, filed lawsuits against OpenAI alleging violations of the Digital Millennium Copyright Act. The Intercept also included Microsoft in its suit.

A collection of daily newspapers, consisting of the New York Daily News, the Chicago Tribune, the Orlando Sentinel, the Sun Sentinel of Florida, the San Jose Mercury News, The Denver Post, the Orange County Register, and the St. Paul Pioneer Press, filed a lawsuit against OpenAI and Microsoft in April 2024.
