In July 2023, the Associated Press announced that it had reached an agreement with OpenAI to license its archive of news stories, which goes back to 1985. The AP will gain access to OpenAI's technology and product expertise as part of the deal, for which financial details were not disclosed. The AP deal "wasn't about display of content," said Tom Rubin, OpenAI’s head of IP and Content.
In December 2023, Axel Springer, the holding company for brands like Politico, Business Insider, Bild, and Welt, announced a similar multi-year content deal with OpenAI for archived materials. Axel Springer content has a “favorable position” within ChatGPT's search results.
In March 2024, Le Monde signed a multi-year agreement with OpenAI, the first of its kind between OpenAI and a major French media organization. Under the agreement, Le Monde will be able to draw on OpenAI technologies to develop projects and functionalities using AI, and OpenAI will be able to use the breadth of Le Monde's “corpus.” References to Le Monde articles will systematically be accompanied by a logo, hyperlinks, and the titles of the articles used as references.
Getty Images (NYSE: GETY) reported in its June 30, 2024 10-Q that it is a plaintiff in several lawsuits against Stability AI arising out of Stability's alleged unauthorized reproduction of approximately 12 million images from Getty's websites. Nonetheless, Getty reported that 2.6% of its revenue for the first half of 2024 could be attributed to "music licensing, digital asset management, distribution services, print sales, and data licensing."
Getty lists data licensing under "other" revenue, which showed a year-over-year increase of 86.5% (i.e., $3.2 million) attributed primarily to data licensing.
On its Q3 2024 earnings call, Getty CEO Craig Peters stated that "in Q2, we announced some small data licensing with an existing partner. We’ve done a little bit of data licensing in Q3, again, with an existing long-standing partner of ours. Those are deals where we have a belief that we will do deals that are aligned to the interest of the business and the interest of our creators over the long haul. And that’s not every deal that’s out there. We have passed on a large number of deals that we don’t think align to the long-term interest of this company and to our creators."
HarperCollins, one of the biggest publishers in the world, made a deal with an “artificial intelligence technology company” and is giving authors the option to opt in to the agreement or pass.
Author Daniel Kibblesmith, who wrote the children’s book Santa’s Husband and published it with HarperCollins, was one of the first authors to break the news. He provided screenshots, seemingly from his agent, which indicated that the publisher was seeking permission to include his book in a deal that it is making with an unnamed "large tech company". According to the screenshots, the publisher is paying $2,500 per title.
This story was first broken by 404 Media and can be found here: https://www.404media.co/harpercollins-ai-deal/
It has been widely reported that Reddit (NYSE: RDDT) has a $60M per year access deal for ongoing and historical data with Google and Cision, among others.
Its February 22, 2024 S-1 filing indicates that Reddit believes that, given the value of its data in sentiment analysis and trend identification, it is well suited to tap into what it estimates to be a $1.0 trillion AI market, listing data licensing second only to advertising in its market opportunity analysis.
The same filing indicated that, as of January 2024, Reddit had entered into certain data licensing arrangements with an aggregate contract value of $203.0 million, with a minimum of $66.4 million of revenue to be recognized during the 2024 fiscal year.
Reddit's Q1 8-K further underscored its bullish position on data licensing by including a separate callout for data licensing updates.
In its October 30th 10-K, Reddit reported that the aggregate amount of its long-term contracts, consisting primarily of data licensing contracts with terms exceeding one year, is $294.8 million. Reddit cautioned that it is still early days, stating that it is "in the early stages of [its] data licensing efforts."
In July 2023, Shutterstock (NYSE: SSTK) announced a six-year agreement to provide OpenAI with high-quality training data, alongside existing collaborations with NVIDIA, Meta, LG and others to develop foundational generative AI tools and standards for creators across 3D, images, and text. Initial data licensing deals ranged from $25 million to $50 million each, but have since been expanded.
Shutterstock also established a contributor fund to compensate its contributors directly if their IP is used in the development of generative AI models through licensing of data from Shutterstock’s library.
In March 2024, The Verge reported that Perplexity AI, a generative AI-based search engine, licenses data from Yelp (NYSE: YELP), although it is not clear whether the licensing deal differs from the pre-generative-AI deals that Yelp has previously entered into to distribute restaurant and service business reviews.
In its 2023 10-K, Yelp did not report data licensing as a separate revenue item, but its other types of revenue increased from $21M in 2020 to $47M in 2023, which some suggest could reflect a generative AI bump of roughly $25M.
Reuters (NYSE: TRI) reported in its May 2024 6-K that revenue increases of 15% between 2023 and 2024 were “driven primarily by generative AI-related content licensing revenue that was largely transactional.”
Photobucket CEO Ted Leonard revealed that he is in talks with multiple technology companies to license Photobucket’s 13 billion photos and videos to train generative AI models, at rates of between 5 cents and $1 per photo and more than $1 per video. Photobucket’s content library is orders of magnitude larger than Shutterstock's.
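For a sense of scale, the quoted per-photo rates can be multiplied out against the library size. The sketch below is only a back-of-envelope illustration: it assumes, hypothetically, that all 13 billion items are priced as photos and that the entire library is licensed once, neither of which is stated in the source. The function name `value_range_usd` is made up for this example.

```python
# Back-of-envelope ceiling for a per-item licensing deal.
# ASSUMPTIONS (not from the source): every item is priced as a photo,
# and the whole library is licensed exactly once.

LIBRARY_ITEMS = 13_000_000_000   # photos and videos, as quoted in the text
LOW_RATE_CENTS = 5               # low end of the quoted per-photo rate
HIGH_RATE_CENTS = 100            # high end of the quoted per-photo rate ($1)


def value_range_usd(items: int, low_cents: int, high_cents: int) -> tuple[int, int]:
    """Return (low, high) deal value in whole US dollars."""
    return items * low_cents // 100, items * high_cents // 100


low_usd, high_usd = value_range_usd(LIBRARY_ITEMS, LOW_RATE_CENTS, HIGH_RATE_CENTS)
print(f"${low_usd / 1e6:,.0f}M to ${high_usd / 1e9:,.0f}B")
```

Under those assumptions the range runs from $650 million to $13 billion. This is a theoretical ceiling on the quoted rates, not a forecast of any actual deal.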
Planet Labs (NYSE: PL), "the leading provider of daily data and insights about Earth," stated in its September 2024 10-Q that its primary revenue stream comes from selling licenses to its data and analytics via fixed-price and usage-based contracts. Data licensing subscriptions and minimum-commitment usage-based contracts provide a large recurring revenue base for the business with a low incremental cost to serve each additional customer. To that end, the company has developed advanced data processing capabilities that enable it to produce "AI-ready" data sets.
Freepik struck agreements with two large tech companies to license the majority of its archive of 200 million images at 2 to 4 cents per image, suggesting at least two $6M licensing deals for its content.
There are five more similar deals in the pipeline, said CEO Joaquin Cuenca Abela, declining to identify buyers.
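The $6M figure can be read as the midpoint of the quoted rate range applied to the archive. The sketch below works through that arithmetic. It assumes, for simplicity, that the full 200 million images are licensed (the text says only "the majority"), so the results are upper bounds. The helper name `deal_value_usd` is hypothetical.

```python
# Per-deal value implied by a flat per-image rate.
# ASSUMPTION (not from the source): the full archive is licensed,
# whereas the text says only "the majority" of it.

ARCHIVE_IMAGES = 200_000_000   # archive size quoted in the text
RATE_CENTS = (2, 4)            # per-image rate range quoted in the text


def deal_value_usd(images: int, cents_per_image: int) -> float:
    """Return the deal value in US dollars for a flat per-image rate."""
    return images * cents_per_image / 100


low = deal_value_usd(ARCHIVE_IMAGES, RATE_CENTS[0])
high = deal_value_usd(ARCHIVE_IMAGES, RATE_CENTS[1])
midpoint = (low + high) / 2

print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M per deal, midpoint ${midpoint / 1e6:.0f}M")
```

On these assumptions each deal falls between $4M and $8M, with $6M at the midpoint of the range.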
NVIDIA (NASDAQ: NVDA) reported in its Q4 2023 earnings call that it had seen a significant boost in data center revenue, partly due to licensing agreements for its AI training data through the NVIDIA NeMo™ Retriever, contributing to a record $47.5 billion in data center revenue for fiscal 2023.
Notably, the same statement was not reflected in subsequent reports.
During an April 2024 shareholder call, Alphabet (NASDAQ: GOOG) highlighted its engagement with Hollywood studios to license video content for AI-generated video technology, marking substantial new investments in its AI capabilities.
In May 2024, News Corp (NASDAQ: NWSA), owner of Fox News and NY Post, among others, agreed to let OpenAI use content from its publications in a deal potentially worth over $250 million over the course of five years.
Warner Brothers (NASDAQ: WBD) indicated that it is open to licensing specific programs, but not its entire library.
C3.ai's (NYSE: AI) Q3 2024 earnings call disclosed several new data licensing deals, contributing to an 18% year-over-year revenue growth and highlighting the expanding market for enterprise AI applications.
IBM's (NYSE: IBM) 2Q 2024 Earnings reveal a multi-year licensing agreement with several AI firms to supply enterprise data, significantly impacting their AI segment revenue.
X, formerly known as Twitter, offers Firehose access, providing full, real-time access to all public tweets. Firehose access is reported to cost $42,000 per month or $2.5 million per year. This service is aimed at companies needing comprehensive and up-to-date data for sentiment analysis, trend tracking, and other AI-driven applications. This level of access is particularly valuable for AI training purposes, as it includes a constant stream of data that can be used to train and refine machine learning models.
Stack Overflow has partnered strategically with Google to integrate Google Gemini into its platform and, in turn, to provide Google with Stack Overflow’s extensive developer knowledge base. The collaboration aims to enhance AI-powered capabilities for developers by combining that knowledge base with Google's advanced AI technologies.
SoundHound AI (NASDAQ: SOUN), a voice-based conversational AI company, recognized $3.6 million in licensing revenues related to a non-recurring voice data licensing agreement with a US-based semiconductor customer, according to its '23-'24 annual report.
Tempus AI (NASDAQ: TEM), a precision medicine company, reported 40% year-over-year data licensing revenue growth in its August 2024 8-K, which accounted for a portion of its 32.5% year-over-year revenue growth. Accrued data licensing fees accounted for $3.7 million of 2024 revenue as of June 2024.