Software Engineer for AI Model Evaluation
Confidential
Full-time
Minneapolis, MN
Technology
Posted:
June 11, 2026
Location:
Minneapolis, MN, United States
Job Description
This role focuses on advancing the evaluation and development of cutting-edge coding agents. You will operate at the intersection of AI research, software engineering, and model evaluation, designing the benchmarks, methodologies, and data systems that shape how next-generation coding models are measured and improved.
Key Responsibilities
- Design and own evaluation frameworks for coding agents, including benchmark specifications, scoring methodologies, rubrics, and quality standards.
- Lead end-to-end research initiatives aimed at measuring and enhancing coding model performance across various software engineering tasks.
- Develop high-quality datasets, golden examples, and evaluation protocols that facilitate reliable assessment of frontier coding systems.
- Analyze model behavior and failure modes, identifying systematic weaknesses and translating findings into actionable improvements for training and evaluation.
- Build tool...
Job Overview
Job Type:
Full-time
Location:
Minneapolis, MN, United States
Posted:
June 11, 2026
Deadline:
July 21, 2026