Platform team is responsible for deploying and maintaining common, cloud-based software and data infrastructure used across the... and platform engineers. A key focus of this role is to improve the reliability, security, and cost-effectiveness of our software...
As a Senior Cluster Site Reliability Engineer (SRE), you will help scale our research compute cluster to meet our growing needs... will provide a world-class HPC platform for researchers to focus on cutting-edge machine learning problems at scale...