DevOps Engineer - AI Model Evaluator

Obsidian
📍 Helsinki, Uusimaa, Finland 💼 Full-time 🕒 Posted July 01, 2026

Job Description

About the Role

  • Mercor is partnering with a leading AI research lab to support a Frontier Code Agents project.
  • Contributors help evaluate and improve frontier AI coding models through structured technical assessments.
  • The work focuses on realistic infrastructure engineering workflows and model evaluation.
  • Spots are limited and are filling quickly on a first-come, first-served basis.

What You'll Do

  • Use frontier AI coding agents to complete and evaluate complex infrastructure engineering tasks.
  • Review model-generated implementations involving cloud platforms, Kubernetes, CI/CD systems, observability, and infrastructure automation.
  • Identify bugs, edge cases, reliability issues, and failure modes.
  • Compare outputs from multiple frontier models and assess their strengths and weaknesses.
  • Apply professional engineering judgment to realistic infrastructure engineering scenarios.
<...

Ready to Apply?

Submit your application today and join our talented team at Obsidian.


Job Details

  • Location: Helsinki, Uusimaa
  • Job Type: Full-time
  • Category: Software Development, Software Architecture & Engineering
  • Posted Date: July 01, 2026
  • Application Deadline: August 10, 2026