Gravity IT Resources

AI Inference Researcher

To Apply for this Job Click Here

Position: AI Inference Researcher

Location: DMV preferred, but open to remote

Our client is looking for a performance-obsessed AI researcher to join their growing team.

Your mission is to turn whitepapers, profiling traces, and raw intuition into working optimizations that make real workloads faster. You’ll work across compiler frameworks, GPU kernels, graph IRs, memory strategies, and quantization techniques — with the freedom to test hypotheses quickly and the responsibility to ship production-worthy gains.

This is a research role with an emphasis on delivering production-quality code. You’ll sit at the boundary between compiler experimentation and systems engineering, helping make Ignite one of the fastest inference engines in the world.

Responsibilities-

– Profile and optimize transformer-based inference pipelines

– Research and experiment with graph-level and kernel-level optimizations

– Design experiments, measure speedups, and continuously shave off latency across varying batch sizes, sequence lengths, and architectures

– Collaborate closely with the engineering teams to ensure results translate into real-world improvements

– Stay ahead of the optimization literature — and bring in ideas before they’re stale

Qualifications-

– Strong programming experience in C++ and CUDA

– Hands-on experience with performance profiling and tuning

– An understanding of GPU architecture, or an interest in learning it

– Comfort working with compiler toolchains and intermediate representations

– Curiosity and intensity — you enjoy tuning workloads at breakneck speed, not just shipping the baseline

Nice to have: experience with quantization-aware training, kernel autotuning, or specialized LLM serving runtimes

What The Client Offers-

– Competitive salary with meaningful equity

– A chance to help define a new category of AI infrastructure

– Greenfield architecture — build the product you’ve always wanted to use

– High trust and autonomy, with deep impact on platform direction

– Remote-first culture with the option to collaborate in person as the team scales
