AI Inference Engineer Intern - Model Pruning

quadric.io, Inc
Full-time, Burlingame, CA
Posted: June 15, 2026
Location: Burlingame, CA, United States

Job Description

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Note: Our preference is for this internship to be based out of our Burlingame, California office. Candidates should be based in the Bay Area or able to relocate for the internship period and available to work on site.

Responsibilities:
Model pruning: Prune the model to speed up inference, with retraining to maintain accuracy.

Requirements

+ MS student in CS or related fields...

Job Overview

Job Type: Full-time
Location: Burlingame, CA, United States
Posted: June 15, 2026
Deadline: June 20, 2026