LLM in a Flash: Efficient LLM Inference with Limited Memory (huggingface.co)
Posted by haxor@derp.foo to Hacker News@derp.foo · English · 11 months ago
Cross-posted to: hackernews@lemmy.smeargle.fans, hcc@lemmy.dbzer0.com, singularity@lemmit.online