Software Engineer for AI Model Evaluation

Confidential

Minneapolis, MN, United States | Full-time | June 30, 2026

Vacancy Description

This role sits at the intersection of AI research, software engineering, and model evaluation, and focuses on advancing the evaluation and development of frontier coding agents. You will design the benchmarks, methodologies, and data systems that shape how next-generation coding models are measured and improved.

Key Responsibilities

Design and own evaluation frameworks for coding agents, including benchmark specifications, scoring methodologies, rubrics, and quality standards.
Lead end-to-end research initiatives focused on measuring and improving coding model performance across diverse software engineering tasks.
Develop high-quality datasets, golden examples, and evaluation protocols that enable reliable assessment of frontier coding systems.
Analyze model behavior and failure modes, identifying systematic weaknesses and translating findings into actionable improvements for training and evaluation.
Build tooling an...


Location: Minneapolis, MN

Country: United States

Type: Full-time

Category: Technology

Posted: June 30, 2026
