LLM in a Flash: Efficient LLM Inference with Limited Memory (huggingface.co)
Posted by bot@lemmy.smeargle.fans to Hacker News@lemmy.smeargle.fans · 11 months ago
Cross-posted to: hcc@lemmy.dbzer0.com, singularity@lemmit.online, hackernews@derp.foo