AI Quality Associate
A highly specialized role for evaluating agent behaviors and building the evaluation frameworks needed for high-quality LLM outputs.
Atlanta; Boston; Charlotte; Chicago; Dallas; Los Angeles; New York; San Francisco
The AI Quality Associate focuses on validating the quality of AI-generated outputs, agent behaviors, and AI-assisted workflows. In this role you will build benchmark scenarios, define scoring rubrics, evaluate business usefulness, and identify failure patterns that conventional pass-or-fail software testing would not catch.
What You'll Do:
AI Output Evaluation
- Design and execute structured evaluations for AI-enabled features and workflows
- Assess outputs for groundedness, instruction adherence, consistency, usefulness, tone, control compliance, and risk
- Identify hallucinations, unsupported assertions, missing logic, and unsafe recommendations
Benchmark & Rubric Development
- Build and maintain golden datasets, benchmark prompts, comparison sets, and scorecards
- Develop rubrics that allow quality to be measured consistently across releases and changes
Workflow & Model Change Validation
- Compare performance across prompt versions, workflow revisions, tools, and models
- Support release decisions with evidence on quality regression or improvement
Business & Domain Partnership
- Work closely with Finance SMEs, product managers, and engineers to determine what acceptable looks like in real business contexts
- Help define human-review thresholds and escalation patterns for higher-risk use cases
Production Feedback
- Analyze reviewer feedback, override patterns, and live quality signals to improve evaluation coverage over time
You Have:
- 4+ years of experience in QA, analytics, business process validation, AI evaluation, operations, or similar roles
- Strong writing, analysis, and pattern-recognition skills
- Experience evaluating outputs against nuanced criteria rather than only binary correctness
- Experience with finance, accounting, FP&A, transaction services, or business process design preferred
- Ability to work with structured rubrics, scenario libraries, and evidence-based reviews – you are thoughtful, precise, and highly discerning
- Ability to spot subtle output problems others miss
- Comfort with ambiguity but disciplined in scoring and documentation
- Comfort collaborating across Engineering and business teams, with a focus on trust, usefulness, and business reality
The annual salary for this role ranges from $110,000 to $145,000, plus benefits and bonus.