

I also have a 5060 Ti with 16GB of VRAM. I tend to use GPT-OSS:20B or Qwen3:14B with a context of ~30k, plus a custom system prompt in Open WebUI for the style of response I like. That takes up about 14GB of my 16GB of VRAM.
But yeah, it is slower and not as “smart” as the cloud-based models. Still, I think the inconvenience of the speed and having to fact-check/test code is worth the privacy and environmental trade-offs.
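If anyone wants to poke at a setup like this outside the Open WebUI interface, here's a rough sketch of calling a locally served model over HTTP. It assumes the models are served through Ollama on its default port; the model tag, context size, and system prompt are just illustrative placeholders matching what I described above:

```python
import requests

# Assumes Ollama is running locally on its default port (11434).
# Model tag, context window, and system prompt are example values only.
OLLAMA_URL = "http://localhost:11434/api/chat"

payload = {
    "model": "gpt-oss:20b",  # or "qwen3:14b"
    "messages": [
        {"role": "system", "content": "Answer concisely in my preferred style."},
        {"role": "user", "content": "Summarize the trade-offs of running LLMs locally."},
    ],
    "options": {
        "num_ctx": 30720,  # ~30k context; this is what eats most of the 16GB VRAM
    },
    "stream": False,
}

resp = requests.post(OLLAMA_URL, json=payload, timeout=300)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```

The big VRAM cost is the KV cache for that ~30k context, so dropping num_ctx is the first knob to turn if the model stops fitting on the card.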
Exactly. Once they can correlate a couple of things, they can search for even more info until all your accounts are revealed.