Senior On-Device AI Inference Performance Engineer
San Mateo, CA
Position Overview
We are looking for highly skilled engineers with a focus on C/C++, low-level systems, and performance and power optimization to join our team full-time. In this role, you will apply your expertise in performance optimization to help build a scalable, efficient inference engine that powers the future of on-device AI use cases, such as real-time agents and assistants. You will work with seasoned engineers to enhance our end-to-end inference stack, power management, and overall system efficiency.
Key Responsibilities
- Innovate on the inference optimization pipeline through algorithmic and system optimization.
- Own end-to-end system characterization across a range of hardware.
- Own the design and implementation of inference optimizations for AI workloads, targeting peak efficiency on diverse hardware.
- Engage in performance benchmarking, profiling, and troubleshooting to improve execution across various hardware.
- Work on cross-functional teams to design, implement, and test new features.
Qualifications
- 5+ years of hands-on experience in C/C++ development with a focus on performance optimization.
- Strong understanding of low-level systems and efficiency in the context of high-performance computing.
- Familiarity with GPU-based computing and CUDA or similar GPU programming environments.
- Solid knowledge of system design and performance optimization techniques.
- Experience with open-source contributions and community-driven projects is a plus.
What You'll Gain
- Opportunity to work alongside industry experts in AI optimization, high-performance computing, and hardware acceleration.
- Hands-on experience with cutting-edge technologies at the intersection of AI and hardware acceleration.
- Exposure to open-source development and collaboration with a vibrant community.
Benefits We Offer
At OpenInfer, we offer comprehensive benefits, including:
- Medical, Dental, and Vision benefits for you and your family
- Flexible Paid Time Off, 10 days
- Parental Leave
- 401(k) Plan with company matching
- Snacks and coffee to keep you energized
These benefits are further detailed in OpenInfer policies and are subject to change at any time, consistent with the terms of any applicable compensation or benefits plans.
How to Apply
Please send your resume and a cover letter to [email protected]. Include any relevant projects, open-source contributions, or case studies that showcase your expertise in performance optimization and low-level system design.