

Hello! I recently deployed GPUStack, a self-hosted GPU resource manager.
It helps you deploy AI models across clusters of GPUs, regardless of network or device. Got a Mac? It can toss a model on there and route it into an interface. Got a VM on a server somewhere? Same. How about your home PC, with that beefy gaming GPU? No problem. GPUStack is great at scaling what you have on hand, without having to deploy a bunch of independent instances of ollama, llama.cpp, etc.
I use it to route the LLMs it's already serving into Open WebUI, another self-hosted interface for AI interactions, via the OpenAI-compatible API that both GPUStack and Open WebUI support!
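
For reference, here's a rough sketch of how you can talk to a model GPUStack is serving using the standard OpenAI Python client. The base URL path, API key, and model name are placeholders for whatever your own deployment exposes (check your GPUStack UI or docs for the exact endpoint); Open WebUI connects in essentially the same way, by pointing its OpenAI API connection settings at that same base URL and key.

```python
from openai import OpenAI

# Point the standard OpenAI client at GPUStack's OpenAI-compatible endpoint.
# The host, path, API key, and model name below are placeholders for my setup;
# the exact base URL path can differ depending on your GPUStack version.
client = OpenAI(
    base_url="http://gpustack.local/v1-openai",  # assumed endpoint; verify in your GPUStack instance
    api_key="YOUR_GPUSTACK_API_KEY",
)

response = client.chat.completions.create(
    model="llama-3.1-8b-instruct",  # any model you've deployed through GPUStack
    messages=[{"role": "user", "content": "Say hello from my GPU cluster!"}],
)

print(response.choices[0].message.content)
```

The nice part is that anything speaking the OpenAI API, not just Open WebUI, can consume models this way without caring which box in the cluster is actually running them.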