AI Quality Associate

company: Accordion
location: New York, NY
remote: Hybrid
comp: $110,000 to $145,000

# editor_note · curation

A highly specialized role for evaluating agent behaviors and building the evaluation frameworks needed for high-quality LLM outputs.

source_signals.yml · what justified inclusion

✓ signal_1: validating the quality of AI-generated outputs

✓ signal_2: agent behaviors

✓ signal_3: and AI-assisted workflows

✓ signal_4: building benchmark scenarios

✓ signal_5: defining scoring rubrics

✓ signal_6: evaluating business usefulness

tags · archetype + ai-native surface

#AI Evaluation #Benchmark Design #Prompt Engineering #Hallucination Detection #Golden Datasets

description · source excerpt

AI Quality Associate

Atlanta; Boston; Charlotte; Chicago; Dallas; Los Angeles; New York; San Francisco

The AI Quality Associate focuses on validating the quality of AI-generated outputs, agent behaviors, and AI-assisted workflows. This impactful role will be critical in building benchmark scenarios, defining scoring rubrics, evaluating business usefulness, and identifying failure patterns that conventional pass or fail software testing would not catch.

What You'll Do:

AI Output Evaluation
- Design and execute structured evaluations for AI-enabled features and workflows
- Assess outputs for groundedness, instruction adherence, consistency, usefulness, tone, control compliance, and risk
- Identify hallucinations, unsupported assertions, missing logic, and unsafe recommendations

Benchmark & Rubric Development
- Build and maintain golden datasets, benchmark prompts, comparison sets, and scorecards
- Develop rubrics that allow quality to be measured consistently across releases and changes

Workflow & Model Change Validation
- Compare performance across prompt versions, workflow revisions, tools, and models
- Support release decisions with evidence on quality regression or improvement

Business & Domain Partnership
- Work closely with Finance SMEs, product managers, and engineers to determine what acceptable looks like in real business contexts
- Help define human-review thresholds and escalation patterns for higher-risk use cases

Production Feedback
- Analyze reviewer feedback, override patterns, and live quality signals to improve evaluation coverage over time

You Have:

- 4+ years of experience in QA, analytics, business process validation, AI evaluation, operations, or similar roles
- Strong writing, analysis, and pattern-recognition skills
- Experience evaluating outputs against nuanced criteria rather than only binary correctness
- Experience with finance, accounting, FP&A, transaction services, or business process design preferred
- Ability to work with structured rubrics, scenario libraries, and evidence-based reviews – you are thoughtful, precise, and highly discerning
- Ability to spot subtle output problems others miss
- Comfort with ambiguity but disciplined in scoring and documentation
- Comfort collaborating across Engineering and business teams, with a focus on trust, usefulness, and business reality

The annual salary for this role ranges from $110,000 to $145,000, plus benefits and bonus.