• thingsiplay@beehaw.org

    Using another LLM to fine-tune your model, and then claiming you spent a fraction of the money of those who created the model it's based on, is just ironic. They used the Google model and fine-tuned it. It's like someone builds a car for millions of dollars, I take that car and make some changes for much less money, and then I claim that I built a car for 50 dollars. This is the level of logic we are dealing with.

    • jarfil@beehaw.org

      The other way around. They started with Alibaba's Qwen, then fine-tuned it to match the thinking process behind 1000 hand-picked queries on Google's Gemini 2.0.

      That $50 price tag is kind of silly, but it's like taking an old car and copying the mpg, seats, and paint job from a new car. It's still an old car underneath, only it looks and behaves like a new one in some aspects.

      I think it’s interesting that old models could be “upgraded” for such a low price. It points to something many have been suspecting for some time: LLMs are actually “too large”, they don’t need all that size to show some of the more interesting behaviors.

  • melp@beehaw.org

    Much like everything else for 2025… this is just getting dumber and dumber.