GPUs. THE PERSON: You are a senior systems engineer with deep LLM domain knowledge who enjoys working close to the... inference systems (e.g., FasterTransformer), with demonstrated performance tuning. GPU Kernel Development: Proven experience...
your career. THE ROLE: As a senior member of the LLM inference framework team, you will be responsible for building.... This role sits at the intersection of inference engines, distributed systems, and GPU runtime and kernel backends. THE PERSON...
, and enabling RL training and SOTA LLM and Multimodal inference at scale across multi-GPU and multi-node systems... engineer with strong technical and analytical expertise in GPGPU C++, Triton, TileLang or DSL development within Linux...
and tune large-scale training and inference models for optimal performance on AMD hardware. GPU Kernel Development: Design..., and enabling training and inference at scale across multi-GPU and multi-node systems. You will collaborate across internal GPU...
NVIDIA is the platform for every new AI-powered application. We seek a Principal Software Engineer - AI Inference... to advance open-source LLM serving. This role involves contributing to upstream inference engines like vLLM and SGLang...
of deep learning models, algorithms and frameworks, such as PyTorch, XLA, etc. Understanding of LLM inference optimizations.... Today, we are increasingly known as “the AI computing company”. We are looking for an AI & Deep Learning Compiler Engineer. NVIDIA is hiring...
-LLM team and help shape the next generation of edge AI for automotive and robotics. We build the software stack... development for critical transformer components such as attention, GEMM, and MoE. Benchmark, profile, and optimize inference...
focuses on optimizing generative AI models such as large language models (LLM) and diffusion models for maximal inference... architecture search, and streamlined deployment strategies with open-sourced inference frameworks. Seeking a Senior Deep Learning...
, and Inference Engines such as TRT-LLM, vLLM, SGLang. Rapid prototyping and development with Python, C++, CUDA or related DSLs... engineer to bring advanced communication technologies into AI stacks, including PyTorch, TRT-LLM, vLLM, SGLang, JAX...
from development through runtime. As a Senior Principal Machine Learning Engineer, you will drive research on cutting-edge areas... understanding of attention mechanisms and related knowledge is a plus. Demonstrated expertise with modern LLM inference engines (e.g...
from development through runtime. As a Principal Machine Learning Inference Engineer, you will serve as a technical authority... of our AI platform - ML inference. Beyond individual contribution, you will lead complex technical projects, mentor senior engineers...