• FaceDeer@kbin.social
    9 upvotes · 1 year ago

    Make note, folks who have been demanding only “ethical” AI training. You’re demanding a world in which only Getty Images and other such existing incumbent copyright-holding corporations have decent AIs.

    • FlumPHP@programming.devOP
      5 upvotes · 1 year ago

      Getty is benefitting from having historically paid creators for the rights to their creations. The horror.

      VCs have burned oodles of cash on startups. They could do the same to fund artists and photographers to create training images. A company could earn the goodwill of the community by starting with public domain and CC images. People who support AI image generation could sign over their own photos.

      There are options that aren’t as easy and carry more risk than unethically scraping the web. But companies are willing to be unethical until the law catches up, in hopes of cementing their foothold. See Uber and Airbnb for examples.

      • FaceDeer@kbin.social
        6 upvotes · 1 year ago

        Getty Images is infamous for adding public domain images to their archives and then sending threatening demands for payment to anyone they subsequently spot using them. They’re a giant corporation like any other; all they’re interested in is cash flow.

        Note that I put “ethical” in quotes because ethics are a subjective matter that can’t be proven one way or another. “Scraping the web” is IMO no different from regular old reading the web, which is what it’s for. If you don’t want your images to be seen then don’t put them online in the first place.

        • FlumPHP@programming.devOP
          2 upvotes · 1 year ago

          “If you don’t want your images to be seen then don’t put them online in the first place.”

          I don’t think anyone is objecting to the things they put online being seen. They’re objecting to companies creating derivatives for commercial purposes.

            • Still@programming.dev
              0 upvotes · edited · 1 year ago

              I think there is a wording issue going on here. People object to their posts being used in ways they weren’t expecting; in this case, people post things for others to see, not for use in AI datasets.

              Whether the AI is open source or not doesn’t affect whether the training data is being used with or without permission.

              • FaceDeer@kbin.social
                1 upvote · 1 year ago

                If explicit permission for specifically AI training is required then AI is basically impossible, because nobody gave that permission.

                I don’t think such permission should be required, though, either legally or ethically. When you put something up for public viewing you don’t get to retroactively go “but not like that” when something you didn’t expect looks at it. The permission you gave inherently involves flexibility.

    • Willie@kbin.social
      3 upvotes · 1 year ago

      Make note about what? This is a good thing. They went to the effort of acquiring the rights for the training data, and people might have even been paid for the original work.

      It’s not like this stops someone from paying artists or photographers to make art or photos for their training data, or from creating some sort of group where contributors actively give the rights to their own artwork or photos for a model, like some sort of open source project. People love that kind of stuff! You’re just acting like this is some awful thing, when it’s completely fine, and the way it should be.

      • lemonflavoured@kbin.social
        1 upvote · 1 year ago

        To me the obvious answer would be to pay people a small amount per photo for pictures of various things and then use that as training data.

        • lps2@lemmy.ml
          0 upvotes · 1 year ago

          That’s expensive, and companies would rather not pay while the law is unclear on using copyrighted images in a training set.

          • lemonflavoured@kbin.social
            1 upvote · edited · 1 year ago

            The thing is that for medium to large companies it’s probably less expensive to pay people a nominal fee for pictures than it would be to risk being sued by, say, Disney, Nintendo, WWE or Games Workshop (to use some famously litigious companies).

            • lps2@lemmy.ml
              1 upvote · 1 year ago

              I hope that’s the direction we head in; that way artists are appropriately compensated for their work. We’ll see entire libraries/brokers pop up that grant LLM makers access to work for a fee.