Those of us with NVIDIA GPUs, particularly cards with enough VRAM, have been able to run large language models locally for quite a while. I wrote a guide