Running large language models (LLMs) typically requires expensive, high-performance hardware with substantial memory and GPU resources.