ML Research Platform Engineer (Distributed Training & HPC)

QNT Partners
Job Type: Full-time
Category: Business and Financial Operations
Posted: June 09, 2026
Location: Singapore, Singapore

Job Description

Location: Singapore, Hong Kong or Shanghai


About the role

We are looking for a platform engineer to build the infrastructure that powers our next-generation machine learning research. Think: large-scale experimentation, distributed training, and reproducibility.


This is not an applied ML role. You will not be fine-tuning LLMs or building agents. Instead, you will build the systems that enable researchers to train models at scale.


What you will own

  • Distributed training pipelines for GPU-accelerated workloads (PyTorch, JAX)
  • Experiment management and model versioning
  • Resource scheduling on on-premise HPC clusters and cloud (Slurm, Kubernetes)
  • Observability and debugging for complex training jobs
  • Data lineage

Job Overview

Job Type: Full-time
Location: Singapore, Singapore
Posted: June 09, 2026
Deadline: July 19, 2026