Posted:
June 06, 2026
Location:
Shanghai, China
Job Description
Join NVIDIA, a leader in advancing computer graphics, PC gaming, and accelerated computing for over 25 years. As an LLM Inference Software Engineer, you will be at the forefront of innovative AI technology, working on the ground-breaking TRTLLM project. This role offers you the exceptional opportunity to accelerate LLM inference using GPU technology, influencing everything from single PCs to clusters with thousands of powerful GPUs. Be part of a team that values creativity, cooperation, and the pursuit of excellence.
What you'll be doing:
+ Develop and optimize software solutions to accelerate LLM inference using GPU technology.
+ Collaborate closely with a world-class team of engineers to implement and refine GPU-based algorithms.
+ Analyze and determine the most effective methods to improve performance, ensuring seamless execution across diverse computing environments.
+ Engage in both individual and team projects, contributing to NVIDIA's mission ...
Job Overview
Job Type:
Full-time
Location:
Shanghai, China
Posted:
June 06, 2026
Deadline:
June 11, 2026