OpenAI has publicly responded to a copyright lawsuit by The New York Times, calling the case “without merit” and saying it still hoped for a partnership with the media outlet.

In a blog post, OpenAI said the Times “is not telling the full story.” It took particular issue with claims that its ChatGPT AI tool reproduced Times stories verbatim, arguing that the Times had manipulated prompts to include regurgitated excerpts of articles. “Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts,” OpenAI said.

OpenAI claims it’s attempted to reduce regurgitation from its large language models and that the Times refused to share examples of this reproduction before filing the lawsuit. It said the verbatim examples “appear to be from year-old articles that have proliferated on multiple third-party websites.” The company did admit that it took down a ChatGPT feature, called Browse, that unintentionally reproduced content.

  • TWeaK@lemm.ee
    10 months ago

    I have had to go through a high profile copyright claim for my work where this was the exact premise. We were developing a game and were using copyrighted images as placeholders while we worked on the game internally. We presented the game to the company as a pitch, and they tried to sue us for using their assets.

    That’s interesting, if only because the judgement flies in the face of the actual legislation. I guess some judges don’t really understand it much better than your average layman (there was always a huge amount of confusion over what “transformative” meant in terms of copyright infringement, for a similar example).

    I can only rationalise that your test version could be considered as “research”, thus giving you some fair use exemption. The placeholder graphics were only used as an internal placeholder, and thus there was never any intent to infringe on copyright.

    ChatGPT is inherently different, as you can specifically instruct it to infringe on copyright. “Write a story like Harry Potter” or “write an article in the style of the New York Times” is basically giving that instruction, and if what it outputs is significantly similar (or indeed identical) then it is quite reasonable to assume copyright has been infringed.

    A key difference here is that, while it is “in private” between the user and ChatGPT, those are still two different parties. When you wrote your temporary code, that was just internal between workers of your employer - the material is only shared to one party, your employer, which encompasses multiple people (who are each employed or contracted by a single entity). ChatGPT works with two parties, OpenAI and the user, thus everything ChatGPT produces is published - even if it is only published to an individual user, that user is still a separate party to the copyright infringer.

    I mean, it kinda does? technically? Because if you fail to enforce your copyright then you can’t claim copyright later on.

    If a person robs a bank, but is not caught, are they not still a bank robber?

    While calling someone who hasn’t been convicted of a crime a criminal might open you up to liability, and as such in practice a professional journalist will avoid such concrete labels as a matter of professional integrity, that does not mean such a statement is false. Indeed, it is entirely possible for me to call someone a bank robber and prove that this was a valid statement in a defamation lawsuit, even if they were exonerated in criminal court. Crimes have to be proven beyond reasonable doubt, i.e. greater than 99% certain, while civil court works on the balance of probabilities, i.e. which argument is more than 50% likely to be true.

    I can say that it is more than 50% likely that copyright infringement has occurred even if no criminal copyright infringement is proven.

    That isn’t pulled from my ass, that’s just the nuance of how law works. And that’s before we delve into the topic of which judge you had, what legal training they undertook and how much vodka was in the “glass of water” on their bench, or even which way the wind blew that day.


    According to the Federal legislation, it does not matter whether the copying was for commercial or non-commercial purposes; the only thing that matters is the copying itself. Your judge got it wrong, and you were very lucky in that regard - in particular that your case was not appealed further to a higher, more competent court.

    Commerciality should only be factored into a question of fair use, per the legislation, which a lower court judge cannot overrule. If your case were used as case law in another trial, there’s a good chance it would be disregarded.