An anonymous reader shared this report from the Washington Post:
For years, tech companies like OpenAI have freely used news stories to build data sets that teach their machines how to recognize and respond fluently to human queries about the world. But as the quest to develop cutting-edge AI models has grown increasingly frenzied, newspaper publishers and other data owners are demanding a share of the potentially massive market for generative AI, which is projected to reach $1.3 trillion by 2032, according to Bloomberg Intelligence.
Since August, at least 535 news organizations, including the New York Times, Reuters and The Washington Post, have installed a blocker that prevents their content from being collected and used to train ChatGPT. Now, discussions are focused on paying publishers so the chatbot can surface links to individual news stories in its responses, a development that would benefit the newspapers in two ways: by providing direct payment and by potentially increasing traffic to their websites. In July, OpenAI cut a deal to license content from the Associated Press as training data for its AI models. The current talks also have addressed that idea, according to two people familiar with the talks who spoke on the condition of anonymity to discuss sensitive matters, but have concentrated more on showing stories in ChatGPT responses.
Other sources of useful data are also looking for leverage. Reddit, the popular social message board, has met with top generative AI companies about being paid for its data, according to a person familiar with the matter, speaking on the condition of anonymity to discuss private negotiations. If a deal can’t be reached, Reddit is considering blocking search crawlers from Google and Bing, which would prevent the forum from being discovered in searches and reduce the number of visitors to the site. But the company believes the trade-off would be worth it, the person said, adding: “Reddit can survive without search.”
“The moves mark a growing sense of urgency and uncertainty about who profits from online information,” the article argues. “With generative AI poised to transform how users interact with the internet, many publishers and other companies see fair payment for their data as an existential issue.”
They also cite James Grimmelmann, a professor of digital and information law at Cornell University, who suggests OpenAI’s decision to negotiate “may reflect a desire to strike deals before courts have a chance to weigh in on whether tech companies have a clear legal obligation to license, and pay for, content.”