across hyperscale accelerator clusters. This role sits at the intersection of distributed systems, kernel-level performance engineering... strategy, kernel optimization, and end-to-end system performance. This is a hands-on role with direct impact on training...
intersection of distributed systems, kernel-level performance engineering, and large-scale model training. The ideal candidate can..., distributed training frameworks, or performance-critical kernels. Prior experience collaborating directly with hardware vendors...