Posted: June 08, 2026
Location: Remote, Colombia

Job Description

Senior HPC Network Engineer

We are seeking a Senior HPC Network Engineer to support advanced AI, research, and Kubernetes-based GPU infrastructure for a major global technology client. The role focuses on architecting, operating, and optimizing high-performance network fabrics for large-scale LLM and distributed AI workloads, including InfiniBand/RDMA, high-speed Ethernet, Kubernetes networking, host-side GPU networking, SmartNIC/DPU technologies, and deep network observability. The ideal candidate has strong hands‑on experience with InfiniBand NDR/HDR and next‑generation fabrics, RDMA/RoCE, NVIDIA/Mellanox networking, NCCL/MSCCL communication patterns, Linux host networking, PCIe/GPU/NIC topology, and Kubernetes networking for GPU clusters.

Responsibilities

  • Architect, operate, and troubleshoot high-performance InfiniBand/RDMA and Ethernet fabrics for large-scale GPU clusters and distributed AI/LLM workloads
  • Design and evaluate cluster netwo...

Apply for this Job

Submit your application for the Senior HPC Network Engineer - AI Infrastructure (Colombia) position at EPAM Systems.

Job Overview

Job Type: Full-time
Location: Remote, Colombia
Posted: June 08, 2026
Deadline: July 18, 2026