Inside the Runtime

From memory and compute pipelines to context management and assistant workflows, we design the full AI stack, and here we share our progress to drive the future of local intelligence together.

Follow new releases, engineering breakthroughs, and examples of Local AI in action — all built to run closer to where your product lives.

OpenInfer Joins Forces with Intel® and Microsoft to Accelerate the Future of Collaboration in Physical AI

Today, we’re excited to share a big step forward for OpenInfer: we’ve officially joined the Intel® Partner Alliance and Microsoft’s Pegasus Program. These are two of the most influential innovation...

Boosting Local Inference with Speculative Decoding

In our recent posts, we’ve explored how CPUs deliver impressive results for local LLM inference, even rivaling GPUs, especially when LLMs push against the hardware’s memory-bandwidth limits. These bandwidth...
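The post covers the technique in depth; as a rough illustration of the core draft-and-verify idea, here is a minimal, hypothetical sketch in Python. The toy draft_probs and target_probs tables stand in for real models and are our own naming for this sketch, not OpenInfer’s implementation or API.

```python
import random

# Toy speculative decoding over a tiny vocabulary. In practice the draft
# model is a small, fast LLM and the target model is the large one whose
# output distribution must be preserved; here both are fixed probability
# tables so the sketch runs standalone.
VOCAB = ["the", "cat", "sat", "on", "mat"]

def draft_probs(context):   # hypothetical cheap draft model
    return [0.4, 0.2, 0.2, 0.1, 0.1]

def target_probs(context):  # hypothetical expensive target model
    return [0.3, 0.3, 0.2, 0.1, 0.1]

def sample(probs):
    return random.choices(range(len(VOCAB)), weights=probs, k=1)[0]

def speculative_step(context, k=4):
    """Draft k tokens cheaply, then accept/reject them so the output is
    distributed exactly as if sampled from the target model alone."""
    # 1. Draft k candidate tokens autoregressively with the cheap model.
    drafted, ctx = [], list(context)
    for _ in range(k):
        q = draft_probs(ctx)
        tok = sample(q)
        drafted.append((tok, q))
        ctx.append(tok)

    # 2. Verify: a real target model scores all k drafted positions in one
    #    batched forward pass, amortizing its memory traffic over several
    #    tokens instead of one pass per token.
    accepted = []
    for tok, q in drafted:
        p = target_probs(context + accepted)
        if random.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)  # kept with probability min(1, p/q)
        else:
            # Rejected: resample from the residual distribution
            # max(0, p - q), renormalized, then stop this round.
            residual = [max(0.0, pi - qi) for pi, qi in zip(p, q)]
            total = sum(residual)
            weights = [r / total for r in residual] if total > 0 else p
            accepted.append(sample(weights))
            break
    return accepted

random.seed(0)
tokens = []
for _ in range(3):
    tokens += speculative_step(tokens)
print(" ".join(VOCAB[t] for t in tokens))
```

Because the target model verifies a whole batch of drafted tokens in a single pass, each accepted token amortizes one sweep of the large model’s weights through memory, which is exactly where bandwidth-bound hardware stands to gain.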

Ready to Get Started?

OpenInfer is now available! Sign up today to gain access and experience these performance gains for yourself.