Site Reliability Engineer - AI Application
ByteDance
Full-time
Singapore, Singapore
Software Architecture & Engineering
Posted:
March 03, 2026
Job Description
Responsibilities
The Applied Machine Learning (AML) - Enterprise team provides machine learning platform products on VolcanoEngine. These include a cloud resource scheduling system that intelligently orchestrates tasks and jobs to minimise the cost of each experiment and maximise resource utilisation; rich modelling tools, including customised machine learning tasks and a web IDE; and multi-framework, high-performance model inference services.
- Ensure the reliability and normal operation of multiple core systems related to the Viking team's big data and online services, with a focus on system capacity planning and stability assurance;
- Enhance system visibility by monitoring the availability and performance metrics of system components, helping development teams locate faults quickly, and in particular ensuring stability in critical paths such as AI search/vector databases;
- Improve the reliability, scalability, and performance optimization of...
Job Overview
Deadline:
April 12, 2026