Case Study · EPA 608 · March 2026

AI Answered Confidently.
AI Was Wrong.

We asked three AI agents the same three EPA 608 regulatory questions under different conditions. Here's what happened.

2/3 Questions wrong
HIGH Confidence
0 Web searches used

Condition A: training data only


The Experiment

Three AI agents — identical model, identical questions — with one variable: what knowledge they could access. Each was asked three EPA 608 regulatory questions with verifiable, numerical answers drawn directly from the Code of Federal Regulations.

🧠
Condition A
Memory Only
  • Web search disabled
  • No skillbook
  • Training data only
🌐
Condition B
Web Search
  • Web search enabled
  • No skillbook
  • Live eCFR access
✦
Condition C
Skillbook
  • Web search disabled
  • EPA 608 Skillbook
  • Verified knowledge graph

The Results Matrix

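Every cell is a real answer from a real AI agent. ❌ = wrong. ✓ = correct. Confidence is self-reported.

Q1 · Charge Limit
  • 🧠 A (Memory): "5 lbs" ❌ · HIGH
  • 🌐 B (Web Search): "15 lbs" ✓ · HIGH
  • ✦ C (Skillbook): "15 lbs" verbatim §82.156(e) ✓ · HIGH
Q2 · MVAC Vacuum
  • 🧠 A (Memory): "4 in Hg vacuum (≈102 mm)" ❌ · MEDIUM
  • 🌐 B (Web Search): "102 mm of mercury vacuum" ✓ · HIGH
  • ✦ C (Skillbook): "102 mm of mercury vacuum" ✓ · HIGH
Q3 · Leak Rates
  • 🧠 A (Memory): "20% / 30% / 10%" ✓ · LOW ⚠
  • 🌐 B (Web Search): "20% / 30% / 10%" ✓ · HIGH
  • ✦ C (Skillbook): "20% / 30% / 10%" + stale-data flag ✓ · HIGH
Searches
  • 🧠 A (Memory): 0
  • 🌐 B (Web Search): 3
  • ✦ C (Skillbook): 0
Est. tokens
  • 🧠 A (Memory): ~900
  • 🌐 B (Web Search): ~5,500
  • ✦ C (Skillbook): ~2,200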
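
The matrix can also be tallied mechanically. Below is a minimal Python sketch that encodes the grades and self-reported confidence levels from the matrix and pulls out the answers that were both wrong and HIGH confidence. The `Answer` class and the helper names are hypothetical, introduced only for illustration; they are not part of the experiment's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    condition: str   # "A" (memory), "B" (web search), or "C" (skillbook)
    question: str    # "Q1", "Q2", or "Q3"
    correct: bool    # grade against the CFR text, as shown in the matrix
    confidence: str  # self-reported: "HIGH", "MEDIUM", or "LOW"


# Grades and confidence levels transcribed from the results matrix.
RESULTS = [
    Answer("A", "Q1", False, "HIGH"),
    Answer("A", "Q2", False, "MEDIUM"),
    Answer("A", "Q3", True, "LOW"),
    Answer("B", "Q1", True, "HIGH"),
    Answer("B", "Q2", True, "HIGH"),
    Answer("B", "Q3", True, "HIGH"),
    Answer("C", "Q1", True, "HIGH"),
    Answer("C", "Q2", True, "HIGH"),
    Answer("C", "Q3", True, "HIGH"),
]


def wrong_count(results, condition):
    """Number of incorrect answers given under one condition."""
    return sum(1 for r in results if r.condition == condition and not r.correct)


def confident_errors(results):
    """Answers that were wrong but self-reported as HIGH confidence."""
    return [r for r in results if r.confidence == "HIGH" and not r.correct]


if __name__ == "__main__":
    for condition in "ABC":
        print(condition, "wrong answers:", wrong_count(RESULTS, condition))
    for r in confident_errors(RESULTS):
        print("Confident error:", r.condition, r.question)
```

The second helper is the one that matters for the case study: it isolates the answers a reader would be most likely to trust and least likely to check.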

5 lbs. High confidence. 3× off.

The model didn't waffle. It gave a specific number with complete confidence. It was wrong by a factor of three.

🧠 Condition A — Memory
5 lbs
HIGH confidence

No hesitation. No hedge. Just the wrong number, stated as fact.

🌐 Condition B — Web Search
15 lbs ✓
HIGH confidence

Found eCFR verbatim. Correct after 1 web search.

✦ Condition C — Skillbook
15 lbs ✓
HIGH confidence

Verbatim § 82.156(e) cited. Zero searches needed.

"System-dependent equipment may not be used with appliances with a full charge of more than 15 pounds of refrigerant, unless the system-dependent equipment is permanently attached to the appliance as a pump-out unit." — 40 CFR § 82.156(e)

The model didn't hedge. It committed to a number that was wrong by 3×. A student who memorized this answer would miss that question on the EPA 608 exam.

Wrong number. Wrong units. Confident explanation.

The model gave the wrong value in the wrong unit system — then constructed a rationalization that made them seem equivalent. They aren't.

❌ What the model said
4 in Hg
inches of mercury vacuum
✅ What the CFR says
102 mm
millimeters of mercury vacuum
4 inches Hg ≈ 101.6 mm Hg — not the regulatory standard, and not the same unit system the CFR uses
"…4 inches of mercury vacuum (≈ 102 mm Hg absolute)…" — Condition A model response (paraphrased)

The model guessed the wrong units, guessed a wrong number, then rationalized them as equivalent. They're not. The CFR says 102 mm. That's the answer. Full stop.
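
The conversion behind that comparison is short enough to check by hand. A minimal Python sketch, assuming only the standard definition of 25.4 mm per inch; the function and variable names are illustrative:

```python
MM_PER_INCH = 25.4  # exact, by definition of the international inch


def inhg_to_mmhg(inches_hg: float) -> float:
    """Convert a mercury-column reading from inches to millimeters."""
    return inches_hg * MM_PER_INCH


model_answer_mm = inhg_to_mmhg(4.0)  # the Condition A answer, converted
cfr_value_mm = 102.0                 # the value quoted from the CFR

if __name__ == "__main__":
    print(f"Model answer: {model_answer_mm:.1f} mm Hg")
    print(f"CFR value:    {cfr_value_mm:.1f} mm Hg")
    print(f"Difference:   {cfr_value_mm - model_answer_mm:.1f} mm Hg")
```

The two figures are close, which is why the rationalization sounds plausible. An exam question asking for the regulatory value with exact units still expects 102 mm.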

The right answer was on the internet.
So was the wrong one.

In Condition A, the model got Q3 right, but only with LOW confidence, and it couldn't confirm which rule was in force. The regulatory landscape has two sets of numbers. One is stale.

  • 2016: EPA revises §608; rule finalized
  • Jan 1, 2019: New thresholds take effect
  • Today: §82.157(c)(2) in force

Old thresholds (§ 82.156(i), pre-2019)
  • Commercial refrigeration: 35%
  • Industrial process: 35%
  • Comfort cooling: 15%

Current thresholds (§ 82.157(c)(2), 2019+)
  • Commercial refrigeration: 20%
  • Industrial process: 30%
  • Comfort cooling: 10%

The model knew to be uncertain here — it flagged the old/new rule ambiguity. But without a verified source, it couldn't confirm which applied. The skillbook cites the current rule verbatim and flags the stale data trap explicitly so agents never guess.
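
One way to picture the stale-data trap is as a lookup keyed on date. The sketch below is an illustration only, not how the skillbook is implemented: it uses the two sets of thresholds and the January 1, 2019 effective date given above, and the function name and dictionary layout are assumptions.

```python
from datetime import date

# Effective date of the current thresholds, per the timeline above.
NEW_RULE_EFFECTIVE = date(2019, 1, 1)

# Annual leak rate thresholds in percent, keyed by appliance type.
OLD_THRESHOLDS = {
    "commercial refrigeration": 35,
    "industrial process": 35,
    "comfort cooling": 15,
}
CURRENT_THRESHOLDS = {
    "commercial refrigeration": 20,
    "industrial process": 30,
    "comfort cooling": 10,
}


def leak_rate_threshold(appliance_type: str, on_date: date) -> tuple[int, str]:
    """Return (threshold percent, citation) for the rule in force on on_date."""
    if on_date >= NEW_RULE_EFFECTIVE:
        return CURRENT_THRESHOLDS[appliance_type], "§ 82.157(c)(2)"
    return OLD_THRESHOLDS[appliance_type], "§ 82.156(i)"


if __name__ == "__main__":
    for when in (date(2018, 6, 1), date(2026, 3, 1)):
        pct, cite = leak_rate_threshold("comfort cooling", when)
        print(f"{when}: comfort cooling threshold {pct}% ({cite})")
```

A source that returns a number without the date and citation attached leaves the reader unable to tell the two rows apart. That is the ambiguity the Condition A model ran into.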

Accuracy costs tokens.
Except when it doesn't.

Web search got to the right answer, but at roughly 6× the token cost of memory alone (~5,500 vs. ~900 tokens). The skillbook matched that accuracy at roughly 2.5× the memory-only cost (~2,200 tokens), with stronger sourcing and zero searches needed.

✦ Skillbook: 2.5× cheaper than web search, same accuracy, stronger sourcing
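
The multipliers follow directly from the estimated token counts in the results matrix. A minimal Python sketch of the arithmetic, with illustrative names; the token figures are the approximate estimates reported above, so the ratios are approximate too:

```python
# Estimated tokens per condition, from the results matrix (approximate).
EST_TOKENS = {"memory": 900, "web_search": 5500, "skillbook": 2200}


def cost_ratio(numerator: str, denominator: str) -> float:
    """How many times more tokens one condition used than another."""
    return EST_TOKENS[numerator] / EST_TOKENS[denominator]


if __name__ == "__main__":
    print(f"Web search vs. memory:    {cost_ratio('web_search', 'memory'):.1f}x")
    print(f"Skillbook vs. memory:     {cost_ratio('skillbook', 'memory'):.1f}x")
    print(f"Web search vs. skillbook: {cost_ratio('web_search', 'skillbook'):.1f}x")
```

Note that the first two ratios use memory-only as the baseline, while the third compares web search against the skillbook directly.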

Try It Yourself

Here are the exact prompts used in each condition. Copy them into any AI assistant and compare results.

Condition A: No web search · No skillbook
You are an EPA 608 certification expert. Answer using only your training knowledge — no web search, no external tools.

Q1: Under 40 CFR Part 82, what is the maximum refrigerant charge size (in pounds) for which system-dependent recovery equipment may be used during servicing — unless the equipment is a permanently installed pump-out unit?

Q2: Under 40 CFR § 82.156(c), what is the specific vacuum level (with exact units) required when disposing of MVAC-like appliances as an alternative to subpart B procedures?

Q3: Under 40 CFR § 82.157(c)(2), what are the specific annual leak rate thresholds for: (a) commercial refrigeration, (b) industrial process refrigeration, (c) comfort cooling?

For each: give your answer, state your confidence (HIGH/MEDIUM/LOW), and explain your reasoning briefly.

Condition B: Web search enabled · No skillbook
You are an EPA 608 certification expert. You MAY use web search to find current regulatory information.

Q1: Under 40 CFR Part 82, what is the maximum refrigerant charge size (in pounds) for which system-dependent recovery equipment may be used during servicing — unless the equipment is a permanently installed pump-out unit?

Q2: Under 40 CFR § 82.156(c), what is the specific vacuum level (with exact units) required when disposing of MVAC-like appliances as an alternative to subpart B procedures?

Q3: Under 40 CFR § 82.157(c)(2), what are the specific annual leak rate thresholds for: (a) commercial refrigeration, (b) industrial process refrigeration, (c) comfort cooling?

For each: give your answer, state your confidence (HIGH/MEDIUM/LOW), and explain your reasoning briefly.

Condition C: No web search · EPA 608 Skillbook
You are an EPA 608 certification expert. Use the EPA 608 Skillbook at https://skillbooks.ai/books/epa-608/SKILL.md — read SKILL.md first, then fetch the pages you need. Cite the skillbook page and quote the source verbatim. Do not use web search.

Q1: Under 40 CFR Part 82, what is the maximum refrigerant charge size (in pounds) for which system-dependent recovery equipment may be used during servicing — unless the equipment is a permanently installed pump-out unit?

Q2: Under 40 CFR § 82.156(c), what is the specific vacuum level (with exact units) required when disposing of MVAC-like appliances as an alternative to subpart B procedures?

Q3: Under 40 CFR § 82.157(c)(2), what are the specific annual leak rate thresholds for: (a) commercial refrigeration, (b) industrial process refrigeration, (c) comfort cooling?

For each: give your answer, state your confidence (HIGH/MEDIUM/LOW), cite the skillbook page, and quote the source verbatim.

What This Means

Three questions. Three conditions. Four lessons.

📅
Training data has an expiry date

Regulations change. The EPA revised its §608 rules in 2016, and the new leak rate thresholds took effect on January 1, 2019. Old materials still circulate. A model trained on pre-2019 data carries pre-2019 answers, and may not know it.

⚠️
High confidence ≠ accuracy

The model was most confident on its worst answer. "5 lbs — HIGH confidence" is more dangerous than "I'm not sure." Confidence scores don't validate facts.

🌐
Web search works — but it's expensive

At roughly 6× the token cost of memory alone, web search found the right answers by hitting eCFR directly. But it depends on finding the current rule, not an old one. And it costs every time.

✦
Skillbooks are the better trade-off

Pre-verified, structured, chain of custody from regulation to content. 2.5× cheaper than web search. More reliable than memory. Explicitly flags known data traps.

The EPA 608 Skillbook

Agent-native knowledge graph built from authoritative CFR source documents. Available now for any AI agent that can fetch a URL.

Open the EPA 608 Skillbook →