When 'False Positives' Still Find Real Problems

Quick Summary

  • If AI flags 10 issues and 7 are real—that's 7 issues your team didn't catch
  • The goal isn't perfection. The goal is net risk reduction.
  • AI isn't making decisions—it's surfacing things worth looking at, faster and earlier
  • The real mistake isn't reviewing false positives. It's missing the ones that matter.

Here's an honest truth about AI in construction: it doesn't need to be perfect to be useful. If AI flags 10 issues and 7 of them are real—that's 7 issues your team didn't catch.

Yes—some flags will be false positives. So are plenty of RFIs generated by humans. We don't throw out RFIs because some are bad. We review them quickly and move on. AI works the same way.

The goal isn't perfection. The goal is net risk reduction. AI isn't making decisions. It's surfacing things worth looking at—faster, earlier, and across thousands of pages humans don't have time to cross-check.

When the Tool Pays for Itself

If one flagged issue prevents a plan check resubmittal, a change order, or a week-long delay—the tool already paid for itself.

The real mistake isn't reviewing a few false positives. It's missing the ones that matter.
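
The break-even argument above can be sketched as simple arithmetic. The Python below is a minimal illustration, not a pricing model: the function name `net_value` and every figure in the usage example (review minutes, hourly rate, avoided cost, tool cost) are hypothetical placeholders to be replaced with your own project numbers.

```python
def net_value(flags, real_issues, minutes_per_flag, hourly_rate,
              avoided_cost_per_issue, tool_cost):
    """Net value of reviewing one batch of AI flags.

    All inputs are assumptions supplied by the caller; nothing here
    comes from measured project data.
    """
    if not 0 <= real_issues <= flags:
        raise ValueError("real_issues must be between 0 and flags")
    # Every flag costs review time, whether it turns out real or not.
    review_cost = flags * (minutes_per_flag / 60) * hourly_rate
    # Only real issues avoid downstream cost (resubmittal, change order, delay).
    avoided_cost = real_issues * avoided_cost_per_issue
    return avoided_cost - review_cost - tool_cost


# Hypothetical batch: 10 flags, 7 real, 6 minutes of review per flag.
example = net_value(flags=10, real_issues=7, minutes_per_flag=6,
                    hourly_rate=150, avoided_cost_per_issue=1000,
                    tool_cost=100)
```

With placeholder inputs like these, the three false positives add review time but the batch still comes out ahead. More generally, a single real issue covers the cost whenever its avoided cost exceeds the review cost plus the tool cost.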

The Investigation Trigger Effect

“The AI was wrong about the deck. But it prompted us to look at the joist—which might have its own code compliance question.” This observation from an engineering review captures another layer of value: even when the specific finding is incorrect, the investigation it triggers often uncovers different real problems.

When AI flags a potential issue, an expert investigates. The investigation may reveal:

Investigation Outcomes

  • The flagged issue is valid

    AI was right. Issue gets fixed. Clear value.

  • The flagged issue is wrong, but investigation finds something else

    AI was wrong about this specific thing, but the expert's attention found a different problem in the same area.

  • The flagged issue is wrong, nothing else found

    True false positive. Expert dismisses in seconds. Minimal cost.

Real Example: The Deck That Led to the Joist

In a recent engineering review, AI flagged a steel deck overhang as potentially exceeding SDI limits. The review unfolded in four steps:

  1. AI flagged deck overhang

    "6-foot overhang may exceed SDI limits for collapse prevention"

  2. Engineer investigated

    Reviewed the deck detail, the supporting structure, and the SDI requirements

  3. Deck finding was partially valid

    "The deck is parallel to the wall at the edge. The detail shows outrigger but doesn't specify size and properties. The cantilever is pretty big for a deck."

  4. But investigation revealed more

    "While looking at this, I noticed the joist—which might have its own code compliance question."

The original finding was about the deck. But the investigation drew attention to a related structural element that warranted review—an issue that might otherwise have gone unnoticed.

Why This Happens

The investigation trigger effect occurs because:

Concentrated Attention

Reviewing a specific flagged area brings expert attention to that section of the drawings. The expert notices things they wouldn't have seen in a general review.

Context Loading

To evaluate the AI finding, the expert must understand the surrounding context. Loading that context into working memory makes related issues visible.

Professional Pattern Recognition

Experienced professionals have pattern recognition that AI doesn't. When AI draws their attention somewhere, their expertise catches things AI missed.

The Multiplier Effect

AI finds issues systematically across thousands of pages. Each finding that draws expert attention potentially reveals additional issues. The value multiplies beyond what AI directly identifies.

Implications for AI Accuracy Metrics

This effect complicates how we measure AI accuracy:

  • True positives: AI finding is correct. Clear value.
  • False positives: AI finding is wrong. But investigation may find different issues.
  • Investigation value: Even "wrong" findings that trigger productive investigation have value.

A pure accuracy metric misses the investigation trigger value. An AI system that's 80% accurate but consistently draws attention to problem-prone areas may deliver more value than one that's 90% accurate but finds only isolated issues.
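
To see how the two measures can diverge, here is a small Python sketch. The class `ReviewStats`, the metric name `issues_per_finding`, and both sets of counts are hypothetical and invented for illustration; they are not measurements of any real system.

```python
from dataclasses import dataclass


@dataclass
class ReviewStats:
    findings: int       # total AI findings reviewed by an expert
    confirmed: int      # findings that were themselves valid
    extra_issues: int   # different real issues found while investigating

    def precision(self) -> float:
        """Share of findings that were correct as flagged."""
        return self.confirmed / self.findings

    def issues_per_finding(self) -> float:
        """Real issues surfaced per finding, counting triggered discoveries."""
        return (self.confirmed + self.extra_issues) / self.findings


# Hypothetical systems, invented for illustration only.
system_a = ReviewStats(findings=100, confirmed=80, extra_issues=30)
system_b = ReviewStats(findings=100, confirmed=90, extra_issues=0)
```

Under these assumed counts, `system_b` has the higher precision while `system_a` surfaces more real issues per finding reviewed. Tracking `extra_issues` alongside confirmed findings is one way to make investigation value visible in a metric.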

Optimal Workflow

To maximize the investigation trigger effect:

Expert + AI Workflow

  1. AI identifies potential issues across all documents
  2. Expert reviews each AI finding in context
  3. Expert confirms, dismisses, or modifies the finding
  4. Expert notes any additional issues discovered during investigation
  5. Curated findings delivered to project team
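
The steps above can be sketched as a small data model. This is one possible shape under stated assumptions, not a description of any actual product: the names `Outcome`, `Finding`, and `curate` are hypothetical, and the finding descriptions in the usage example are paraphrased placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Outcome(Enum):
    CONFIRMED = "confirmed"
    MODIFIED = "modified"
    DISMISSED = "dismissed"


@dataclass
class Finding:
    description: str                  # what the AI flagged (step 1)
    outcome: Outcome                  # expert decision (step 3)
    revised_description: str = ""     # used when the expert modifies a finding
    additional_issues: List[str] = field(default_factory=list)  # step 4


def curate(findings):
    """Return the issue descriptions to deliver to the project team (step 5)."""
    delivered = []
    for f in findings:
        if f.outcome is Outcome.CONFIRMED:
            delivered.append(f.description)
        elif f.outcome is Outcome.MODIFIED:
            delivered.append(f.revised_description or f.description)
        # Dismissed findings are dropped, but anything the expert noticed
        # while investigating them is still delivered.
        delivered.extend(f.additional_issues)
    return delivered


# Placeholder review results, loosely modeled on the deck and joist example.
reviewed = [
    Finding("Deck overhang may exceed SDI limits", Outcome.MODIFIED,
            revised_description="Outrigger size and properties not specified",
            additional_issues=["Joist may need a code compliance check"]),
    Finding("Possible dimension conflict", Outcome.DISMISSED),
]
report = curate(reviewed)
```

Keeping `additional_issues` on the finding that triggered them preserves the link between the AI flag and what the expert found, which is what makes the investigation trigger effect measurable later.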

The Takeaway

AI plan checking shouldn't be evaluated solely on whether each specific finding is correct. The value includes the investigation and expert attention each finding triggers. Even "false positives" that lead experts to discover different real problems deliver value.

AI + Expert Review

Our workflow combines AI identification with expert verification—maximizing both the direct findings and the investigation trigger value.
