Senior Site Reliability Engineer - AI Infrastructure
Kerry Consulting
Job Description
Our client operates large-scale GPU cloud platforms across Asia-Pacific. As part of their expansion, they are looking for experienced platform engineers to build and scale their next-generation data center operations. This role offers direct impact in a well-funded technology company working at the forefront of sustainable AI infrastructure.
Role
You'll drive the technical foundation for MLOps capabilities and platform infrastructure supporting cutting-edge NVIDIA GPU clusters. This position demands expertise in designing and operating Kubernetes environments for high-performance computing, implementing Infrastructure-as-Code frameworks, and building world-class observability platforms. You'll collaborate directly with founders and engineering leadership to establish DevOps standards, enhance CI/CD pipelines, and integrate enterprise-grade monitoring across distributed systems. The role requires ownership of incident response, active participation...