Agent Quality / Evals Engineer 1754

Softgic S.

Work from home, Colombia | Full-time | Posted June 29, 2026

Vacancy Description

Owns the eval harness and quality gate from the beginning. This role replaces the old late‑stage “Evals Specialist” model with a standing owner for measurable agent quality.

Key Responsibilities

Build and maintain the MVP eval harness: golden tasks, exception tasks, scorecard metrics, and regression packs.
Wire evals into CI so quality regressions fail builds and releases.
Define and maintain release‑gate thresholds with Product and the Tech Lead.
Lay the path for later adversarial and drift‑testing expansion without overbuilding MVP scope.

Requirements

Experience evaluating ML, LLM, or non‑deterministic systems.
Strong test and benchmark design capability.
Comfort working with noisy metrics, thresholds, and probabilistic behavior.
Good scripting and automation skills.
Uses AI to generate candidate eval cases and failure hypotheses, but never confuses gene...

Ready to Apply?

Apply now

Submit your application for Agent Quality / Evals Engineer 1754 at Softgic S.

Location Work from home

Country Colombia

Type Full-time

Category Quality Engineering

Posted June 29, 2026