  • The problem I’ve got is that you all have a god of the gaps: the conversation I was having 3 years ago was different from the one 2 years ago, which was different from the one 1 year ago.

    And I guess the problem I have with you is that you seem to think you can get results on 16GB that are competitive with models running on a Blackwell 6000 with 96GB, while ignoring the fact that the vast majority of people in the world are running GPUs with 4 to 8 GB of VRAM, if they have access to GPUs at all.

    That’s the gap. Most people don’t have the kind of money you think they do, and even those who do will never achieve the same results as with cloud models, because if a state-of-the-art optimization makes models 10 times smaller, cloud models will simply grow 10 times bigger with that same advantage. It’s pretty simple (rough numbers sketched below).
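
    Just to put numbers on the VRAM gap, a back-of-the-envelope sketch. All assumptions here are mine and purely illustrative: weights dominate memory, 4-bit quantization (~0.5 bytes per parameter), and ~10% overhead for activations and KV cache.

    # Rough VRAM estimate for holding an LLM's weights locally.
    # Assumptions (illustrative, not from this thread): 4-bit
    # quantization, i.e. 0.5 bytes/parameter, plus ~10% overhead.
    def vram_gib(params_billion, bits_per_weight=4, overhead=0.10):
        weight_bytes = params_billion * 1e9 * bits_per_weight / 8
        return weight_bytes * (1 + overhead) / 2**30

    for size in (7, 14, 32, 70):
        print(f"{size:>3}B @ 4-bit: ~{vram_gib(size):.0f} GiB")

    Under those assumptions a 7B model fits in about 4 GiB, while a 70B model needs about 36 GiB: already past a 16 GB card, let alone the 4 to 8 GB most people actually have.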


  • 2026’s average gaming PC has massive amounts of memory and compute, apparently

    Any model that can run on 16GB or less is not going to be anywhere close to a cloud-based model in real-world tasks. It just cannot be. There are people out there running Qwen on a Mac Studio with 96GB, and it still falls short of cloud-based models in both performance and speed.

    lol there are plenty of open source models in the top 100 with multiple SOTA models released in the last few months alone

    The top 100 of what, exactly? Many blended benchmark results are notoriously biased, and LLMs “cheat” on benchmarks at every single opportunity, so outside of real-world tasks and speed it is still hard to tell which models are actually better than others.

    But regardless, the main point of the gap is resources. Even if the average gaming computer really were enough to run meaningful models, the vast majority of the world wouldn’t have access to one, even more so in this day and age, when in most parts of the world a whole monthly salary couldn’t buy a single RAM stick.



  • I would need a citation for that “2x-5x faster” with the same quality, because that hasn’t been my experience at all. Most of my colleagues treat LLMs as “a better Google”, and agentic coding in production has been scaled back to the point where it helps only with the least critical paths. And we aren’t particularly AI-skeptic at all.

    Also, I feel like progress has stalled over the past couple of years; e.g. the latest version of Opus doesn’t seem to give me any noticeable advantage over the previous one. Are they getting better on paper? I suppose they are, but I couldn’t care less about that if they don’t give me better results.

    The thing is, writing code was never the issue; engineering it is. If a machine helps me write code 10 times faster, that saves me maybe a couple of hours, which isn’t really meaningful (quick sketch below). On the other hand, it increases my workload by forcing me to thoroughly check the work of less experienced devs who rely on these tools, just to make sure there aren’t errors that could cause serious harm.
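
    To make that concrete, an Amdahl’s-law-style back-of-the-envelope. The 20% coding share is my own assumed number, purely for illustration:

    # Overall speedup when only the code-writing fraction of the job
    # gets faster. The 0.2 coding share is an assumed, illustrative
    # figure, not a measurement from anywhere.
    def overall_speedup(coding_fraction, coding_speedup):
        return 1 / ((1 - coding_fraction) + coding_fraction / coding_speedup)

    print(overall_speedup(0.2, 10))  # ~1.22x overall, nowhere near 10x

    So even a 10x faster typist only gets about a 1.2x faster engineer under that assumption, because everything else (design, review, debugging, coordination) stays the same speed.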

    I guess what I’m trying to say is that AI is giving inexperienced people confidence they shouldn’t have in the first place, and that’s not a good thing.