Ex-OpenAI researcher says OpenAI mass-violated copyright law

Remmelt

This is a linkpost for https://suchir.net/fair_use.html

This was covered in the New York Times, but I recommend reading the thoughtful analysis in Suchir Balaji’s own blog post:

While generative models rarely produce outputs that are substantially similar to any of their training inputs, the process of training a generative model involves making copies of copyrighted data. If these copies are unauthorized, this could potentially be considered copyright infringement, depending on whether or not the specific use of the model qualifies as “fair use”. Because fair use is determined on a case-by-case basis, no broad statement can be made about when generative AI qualifies for fair use. Instead, I’ll provide a specific analysis for ChatGPT’s use of its training data, but the same basic template will also apply for many other generative AI products.

Effective Altruism Forum