It was; it's just improved massively, and this specific library/fork is very obscure, heh. Good luck!
It's also very poorly documented, so feel free to poke me if you run into something. I can even make and upload a more optimal quantization for you, since I've already set that up for myself anyway.
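If it helps, the "make and upload a quantization" flow usually looks roughly like this with llama.cpp-style tooling. This is just a sketch: the model filenames, repo name, and binary path are placeholders, not the actual fork discussed here.

```python
import subprocess
from huggingface_hub import HfApi

# Assumption: a llama.cpp-style workflow; all paths/names below are placeholders.
# Re-quantize the full-precision GGUF down to a smaller Q4_K_M file.
subprocess.run(
    ["./llama-quantize", "model-f16.gguf", "model-Q4_K_M.gguf", "Q4_K_M"],
    check=True,
)

# Upload the result so others can grab it (hypothetical repo id).
HfApi().upload_file(
    path_or_fileobj="model-Q4_K_M.gguf",
    path_in_repo="model-Q4_K_M.gguf",
    repo_id="your-username/model-GGUF",
)
```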
Pretty sure this sort of hybrid approach wasn't widely available the last time I was testing local LLMs (8-12 months ago).
Will need to do a test run. Cheers!
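For reference, the "hybrid" part is usually a CPU/GPU layer split: you offload as many layers as fit in VRAM and run the rest on the CPU. A rough sketch of what that looks like in a llama-cpp-python-style runtime (placeholder model path and values, not the obscure fork's own API):

```python
from llama_cpp import Llama

# Assumption: llama-cpp-python as the backend; model path is a placeholder.
# n_gpu_layers controls the hybrid split: layers that fit go to the GPU,
# the remainder runs on the CPU.
llm = Llama(model_path="model-Q4_K_M.gguf", n_gpu_layers=20, n_ctx=4096)

out = llm("Q: What does hybrid CPU/GPU inference mean? A:", max_tokens=64)
print(out["choices"][0]["text"])
```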