Now in beta — Join early access

AI Grading that cites its work.

AutoGrader combines large language models with human oversight to deliver transparent, evidence-anchored assessment at scale — every score backed by a quote from the student's own submission.

app.autograder.ai / assignments / edl-hw2
AutoGrader human-in-the-loop review interface showing student submission, score breakdown and evidence
Total Score
49 / 50
↑ Evidence cited for every point
New policy rule added
IF partial answer → deduct 0.5 pts
Confidence: 0.92 · 14 instances
Used in active courses at Carnegie Mellon University
87%
Reduction in time spent grading per assignment
94%
Agreement rate with expert human graders
100%
Of scores backed by cited evidence from submission
6+
Submission formats supported: PDF, Jupyter, code & more

Built for the way real courses work

AutoGrader handles the full complexity of higher-education assessment — from OCR to policy learning — so instructors can focus on teaching.

🔍

Evidence-Anchored Scoring

Every point awarded or deducted is backed by an exact quoted passage from the student's submission — no black-box decisions, full auditability.

🧠

Policy Learning

When TAs override AI scores, the system learns generalizable IF/THEN grading rules that automatically apply to all future submissions in that course.
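A learned rule of this kind can be pictured as a small record plus a thresholded application step. The sketch below is illustrative only; `PolicyRule`, `apply_rules`, and the 0.85 threshold are hypothetical names, not AutoGrader's actual API.

```python
from dataclasses import dataclass

@dataclass
class PolicyRule:
    condition: str      # e.g. "partial answer"
    action_pts: float   # point adjustment, e.g. -0.5
    confidence: float   # learned confidence in [0, 1]

def apply_rules(score, findings, rules, threshold=0.85):
    """Adjust a score with every rule that matches a finding and clears the threshold."""
    for rule in rules:
        if rule.confidence >= threshold and rule.condition in findings:
            score += rule.action_pts
    return score

# The rule from the screenshot above, expressed in this sketch:
rules = [PolicyRule("partial answer", -0.5, 0.92)]
print(apply_rules(10.0, {"partial answer"}, rules))  # 9.5
```

Because low-confidence rules are filtered out before they ever touch a score, a single noisy override cannot silently change grading behavior course-wide.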

👁️

Human-in-the-Loop Review

A three-panel workstation gives TAs simultaneous access to the reference solution, student submission, and an interactive AI grading console.

📄

Multi-Format Ingestion

Accepts PDFs, Python files, Jupyter notebooks, Canvas quiz exports, HTML, and Markdown — reflecting the reality of today's mixed-format coursework.

AI Rubric Builder

Upload a reference solution and answer a few questions — AutoGrader generates a complete, calibrated rubric through a clarifying dialogue workflow.

📊

Bulk Processing & Analytics

Process entire class sections with configurable concurrency. Export grade reports as Excel, CSV, PDF with histograms, or JSON for your LMS.
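Configurable concurrency of this sort can be sketched with a standard worker pool. The example below is a minimal illustration under assumed names; `grade` and `grade_section` are hypothetical stand-ins, not AutoGrader's real interface.

```python
from concurrent.futures import ThreadPoolExecutor

def grade(submission_id):
    # Stand-in for the real multi-stage grading pipeline.
    return {"id": submission_id, "score": 49}

def grade_section(submission_ids, concurrency=4):
    """Grade a whole section with a configurable worker pool."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(grade, submission_ids))

results = grade_section(["hw2-001", "hw2-002"])
```

Tuning the `concurrency` knob trades throughput against API rate limits, which is why a bulk grader would expose it rather than hard-code it.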

Three-stage pipeline, zero opacity

A rigorous multi-stage architecture ensures that every decision is traceable, every score is defensible, and every override makes the system smarter.

1

Extraction & Parsing

Dual OCR (Mistral AI + PyMuPDF) produces rich markdown with block-level source maps — page, block, and line references for every passage.
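A block-level source map can be imagined as a list of records tying each extracted block back to its page, block, and line positions, so any quoted evidence can be traced to its origin. This is a hedged sketch; the field names and `locate` helper are assumptions, not the pipeline's real schema.

```python
# Hypothetical source-map entries: each extracted markdown block keeps
# a pointer back to its location in the original submission.
source_map = [
    {"page": 1, "block": 2, "lines": (4, 7),
     "text": "We apply gradient descent with a fixed learning rate."},
]

def locate(quote, source_map):
    """Return the (page, block) reference for the block containing a quote."""
    for entry in source_map:
        if quote in entry["text"]:
            return entry["page"], entry["block"]
    return None

print(locate("gradient descent", source_map))  # (1, 2)
```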

2

LLM Judge Scoring

An LLM grades each rubric category with cited evidence (max 250 chars per quote), explicit reasoning, and calibrated confidence scores.
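One way to picture the judge's per-category output, including the 250-character evidence cap, is a validated record like the following sketch. The function name and fields are illustrative assumptions, not AutoGrader's schema.

```python
MAX_QUOTE = 250  # evidence quotes are capped at 250 characters

def make_score(category, points, evidence, reasoning, confidence):
    """Build one rubric-category score record, enforcing the quote cap."""
    if len(evidence) > MAX_QUOTE:
        raise ValueError("evidence quote exceeds 250 characters")
    return {
        "category": category,
        "points": points,
        "evidence": evidence,    # exact quote from the submission
        "reasoning": reasoning,  # explicit justification for the score
        "confidence": confidence,
    }
```

Enforcing the cap at record-construction time keeps every citation short enough to display inline next to the score it supports.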

3

Policy Calibration

Grading rules stored from prior TA overrides are retrieved and applied, but only when their confidence exceeds the threshold, to keep output aligned with instructor expectations.

4

Human Review & Memory Update

TAs review, edit, and approve. Any override triggers the Memory Updater, which extracts a generalizable rule for future use.

Live Pipeline
📥
Extractor
Dual OCR · block-level source maps
Done
⚖️
Judge
Rubric adjudication · evidence citation
Done
🎯
Calibrator
Applying 3 policy rules…
Running
👤
TA Review
Human-in-the-loop workstation
Waiting
PDF · Jupyter · Python · Canvas Quiz · HTML · Markdown

Your grading style,
codified automatically

Every TA override becomes a reusable rule. The system accumulates institutional knowledge, progressively aligning with your pedagogical intent — no extra work required.

Override rate over time
↓ 68% fewer overrides
after 3 grading cycles with policy learning enabled
TA overrides per 100 submissions by cycle →
Cycle 1 – Cycle 6
Policy Memory — EDL 201
12
Active grading rules
accumulated this semester
61
Total rule applications
across all assignments
0.91
Average rule confidence
score across active rules
↓68%
Fewer TA overrides after
3 grading cycles
Rule caps: 50 per assignment · 200 per course · Auto-enforced

Everything a TA needs, in one view

The grading workstation puts reference solutions, student work, and AI analysis side-by-side — so reviewers spend time judging, not searching.

📋

Three-Panel Layout

Reference solution, student submission, and grading console shown simultaneously with lazy page rendering for performance.

💬

Conversational AI Console

Ask the grader follow-up questions, request score adjustments in natural language, and explore evidence trails interactively.

🔴

Real-Time Progress via WebSocket

Live pipeline status updates keep TAs informed as each stage completes, with adjacent submission prefetching for zero-lag navigation.

🔍

Plagiarism & Similarity Analysis

Integrated analysis surfaces potential similarity flags alongside grade breakdowns, keeping the review holistic.

AutoGrader — Chat with Grader
AutoGrader conversational AI grading interface
Total Score: 50/50 — AI grader responds with cited reasoning

Grading has never felt this transparent

Hear from instructors and teaching assistants who've used AutoGrader in real courses.

★★★★★

"AutoGrader has completely changed how I think about grading at scale. The evidence-anchored scores mean students never question why they lost a point — it's right there in their own words. We went from three days of TA grading to a single afternoon review session."

SB
Saksham Bhutani
Teaching Assistant · ECE Department · CMU
★★★★★

"I was skeptical that an AI could capture the nuance our rubrics require, but after the first homework the policy learning had already picked up on our notation conventions. By week four I was spending more time on teaching than grading — which is exactly how it should be."

KR
Kiruthika Raja
Teaching Assistant · ECE Department · CMU
🎓  Free pilot for courses with 50+ students

Ready to reclaim
your grading time?

See AutoGrader in action with your own rubric and sample submissions. Our team will walk you through a live demo tailored to your course format.