Running Gemma 4 Locally on an Intel Iris Xe GPU
A few months ago, Google announced Gemma 4, its latest family of open models designed to run on a wide range of hardware while supporting a variety of generation tasks. The models that immediately caught my attention were the 2B and 4B variants. I wanted to see how well they would perform on hardware I already had running in my home lab.
I’ve been running a home lab for several years. It hosts services such as Plex, Pi-hole, and various personal projects, but I had never used it for AI workloads. My goal was to understand how difficult it would be to run a genuinely useful language model locally. Longer term, I’d like to integrate a local LLM into some of my personal applications, such as a budgeting app I maintain, rather than routing requests to providers like OpenAI, Anthropic, or Google.