Operations Lead

ByteBridge
Full-time | Johor Bahru, Johor | Other-General
Posted: March 03, 2026
Location: Johor Bahru, Johor, Malaysia

Job Description

Role Responsibilities
Act as the Single Point of Contact (SPOC) and technical owner for GPU cluster operations.
Coordinate across GPU hardware, networking, data center facilities, and external vendors to ensure stable operations.
Plan, manage, and execute change management, system upgrades, and maintenance windows.
Interface directly with customers and with vendors' technical support teams to resolve operational and service issues.
Own SLA management and incident tracking, and conduct post-incident reviews and Root Cause Analysis (RCA).
Requirements
Proven experience in AI / HPC cluster operations and support.
Solid understanding of end-to-end architecture, including GPU, InfiniBand (IB), and system-level integration.
Strong communication, coordination, and execution skills, with the ability to drive issues to resolution across multiple stakeholders.
Willingness to work onsite at the client's data center.

Job Overview

Job Type: Full-time
Location: Johor Bahru, Malaysia
Posted: March 03, 2026
Deadline: April 12, 2026