Senior Site Reliability Engineer - AI Infrastructure
Kerry Consulting
Job Description
Our client operates large-scale GPU cloud platforms across Asia-Pacific. As part of their expansion, they are looking for experienced platform engineers to build and scale their next-generation data center operations. This role offers direct impact in a well-funded technology company working at the forefront of sustainable AI infrastructure.
Role
You’ll drive the technical foundation for MLOps capabilities and platform infrastructure supporting cutting‑edge NVIDIA GPU clusters. This position demands expertise in designing and operating Kubernetes environments for high‑performance computing, implementing Infrastructure‑as‑Code frameworks, and building world‑class observability platforms. You’ll collaborate directly with founders and engineering leadership to establish DevOps standards, enhance CI/CD pipelines, and integrate enterprise‑grade monitoring across distributed systems. The role requires ownership of incident response, active participation in on‑call rot...