projectmoon@lemm.ee to Open Source@lemmy.ml • "How to run LLaMA (and other LLMs) on Android" · 26 days ago
It's enough to run quantized versions of the distilled R1 model based on Qwen and Llama 3. Don't know how fast it'll run, though.
projectmoon@lemm.ee to Hardware@lemmy.world • "Million GPU clusters, gigawatts of power – the scale of AI defies logic" · 2 months ago
I just need one more. I have two, but one is old and has very little VRAM 🫤