Our vision is to bring AI to every surface seamlessly
About Us
At OpenInfer, we are a team of experts in hardware optimization and enterprise-grade software development, dedicated to redefining AI inference.
With deep expertise in GPU architecture, compiler optimization, and system efficiency, we build high-performance inference solutions that empower developers to run AI models efficiently on any hardware. Our mission is to unlock scalable, cost-effective, and private AI inference for businesses, enabling seamless integration across client and cloud environments.
Meet our team
-
Behnam
Dr. Bastani has 25+ years of experience in AI on constrained compute platforms. He was previously Senior / Director of Engineering for AI & ML at Roblox and Meta and has shipped AI engines at scale at Meta, Google and Roblox.
-
Reza
Reza Nourai brings 20+ years of experience in GPU and large-scale memory architectures. He has led major industry breakthroughs at Meta, Microsoft, Roblox and Magic Leap.
-
Vitali
Vitali Lovich brings 20+ years of expertise in system optimization and distributed computing at Google, Apple, Meta, and Cloudflare. His proven track record in driving complex system programs and delivering industry-defining features positions him as a critical force in building a state-of-the-art kernel runtime at OpenInfer.
-
Sam
Formerly the Technical Marketing Director at Twitch, Sam is a software engineer with a background in agency work, known for being a trusted technical partner to stakeholders. Sam enjoys pushing systems beyond their intended limits and approaches problem-solving by first understanding the 'why' behind every challenge.
-
Steven
Steven is a seasoned architect and engineer with over 30 years of experience, including AI development since 2011. His career spans leading roles at Meta, Ericsson, and Treasury Intelligence Solutions, as well as pioneering AI projects in predicting complex behaviors.
-
Chris
Chris is a C++ performance engineer specializing in low-level optimizations. With a relentless focus on efficiency, they have developed one of the fastest known JSON parsers/serializers and contributed a key optimization to digit-counting algorithms.
-
Onkar
Onkar is a performance optimization specialist with experience designing and developing runtime systems and libraries for state-of-the-art heterogeneous compute and memory architectures, enabling efficiency in demanding applications. His industry experience at NetApp, HPE Labs, IBM Research and several national labs also includes high-performance computing, backend development and systems research.
-
Nadav
Nadav brings over a decade of experience in developing high-performance engines for 3D and on-device machine learning applications at Google and Roblox. He has a passion for delivering exceptional products to both end users and developers.