Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including...+ experience in AI model training optimization. - Strong software engineering skills, including proficiency in Python, C...
Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including... in AI model training optimization. - Strong software engineering skills, including proficiency in Python, C++, and CUDA...
into smaller, more efficient models, enabling scalable training, optimization, and deployment. Responsibilities may include..., but are not limited to, distillation frameworks, model acceleration, hardware-efficient inference, and their applications...
and latest trends in inference and training optimization. Hands-on experience in mapping model architecture to low-level software... model architecture, especially SoTA models, distributed inference and deployment at scale is crucial. KEY RESPONSIBILITIES...
to support model training, fine-tuning, evaluation, and deployment — employing Spark, Kafka, Flink, and other distributed..., and performance optimization for large-scale serving systems. Hands-on experience with LLM orchestration frameworks, model routing...
to support model training, fine-tuning, evaluation, and deployment — employing Spark, Kafka, Flink, and other distributed... orchestration frameworks, model routing, and multi-model inference. Proficiency in Python, Java, C++, or Go, with an emphasis...
scientific venues in AI/ML fields. Develop and enhance GPU-accelerated pipelines for (customized) model training and inference... with inference optimization, performance trade-offs, and scalable integration will be an asset. We are looking for a candidate who...
paradigms - Deploy and optimize text/multimodal LLMs, including inference acceleration, model alignment during training... with distributed databases or distributed data processing frameworks is a plus - Experience with GPU inference optimization, LLM/VLM...
trends in inference and training. Experience in mapping model architecture to low-level software, hardware and understanding... performance and optimization team across various frameworks and model architectures. This is a highly visible role with large...