Methodology

Provyn Index is designed to be the most rigorous structured database of AI incidents available. This page describes how we collect, verify, and maintain the data.

Data Collection

Incidents are identified through systematic monitoring of multiple source categories:

  1. News reporting: major technology, legal, and industry publications, including Reuters, Bloomberg, The New York Times, WIRED, MIT Technology Review, and The Verge.
  2. Regulatory filings: FTC actions, EEOC complaints, EU DPA decisions, NHTSA investigations, SEC enforcement, and state attorney general actions.
  3. Court records: federal and state court filings, class action complaints, settlement agreements, and judicial opinions.
  4. Academic research: peer-reviewed papers, pre-prints, and technical reports documenting AI system failures, bias audits, and safety evaluations.
  5. Company disclosures: official statements, incident reports, safety cards, and post-mortems from AI providers and deployers.

Structuring Process

Each incident is structured into a standardized record with 30+ metadata fields designed for actuarial and compliance use. The structuring process uses AI-assisted research to extract and organize information from source documents, with specific instructions to:

  • Use only factual information supported by the source material
  • Provide conservative financial estimates based on documented figures
  • Include at least one verifiable source URL per incident
  • Flag uncertainty with null values rather than speculating
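
As a minimal sketch of what such a record might look like, the Python dataclass below uses hypothetical field names; the actual schema has 30+ fields and is not reproduced here. It illustrates two of the instructions above: unknown values stay null (None) rather than being guessed, and a record without a source URL is rejected.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class IncidentRecord:
    # All field names are illustrative assumptions, not the real schema.
    title: str
    source_urls: List[str]
    incident_date: Optional[date] = None         # None when the date is not documented
    documented_cost_usd: Optional[float] = None  # None when no figure is documented
    severity: Optional[str] = None               # None until impact is documented

    def __post_init__(self) -> None:
        # Mirrors the rule "at least one verifiable source URL per incident".
        if not self.source_urls:
            raise ValueError("an incident needs at least one source URL")


# Only the title and one source are known, so every other field stays None.
record = IncidentRecord(
    title="Example incident",
    source_urls=["https://example.com/report"],
)
```

Using None instead of a placeholder such as 0 keeps "unknown" distinguishable from "documented as zero" in later analysis.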

Verification Pipeline

Every incident passes through a Chain-of-Verification (CoVe) quality process:

1. Claim Decomposition

Each record is decomposed into individual verifiable claims (dates, financial figures, entities, outcomes).
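
One simple way to picture this step is to treat every populated field as a separate claim. The sketch below is an assumption about how that could be done, not a description of the production pipeline; the `Claim` type, the `decompose` function, and the field names are all hypothetical.

```python
from typing import Any, Dict, List, NamedTuple


class Claim(NamedTuple):
    field: str   # name of the record field the claim comes from
    value: Any   # the asserted value to be verified against sources


def decompose(record: Dict[str, Any]) -> List[Claim]:
    """Split a record into one claim per populated field.

    Null fields are skipped: they assert nothing, so there is nothing to verify.
    """
    return [Claim(name, value) for name, value in record.items() if value is not None]


claims = decompose({
    "incident_date": "2023-05-01",
    "fine_usd": 250000,
    "entity": "Example Corp",
    "outcome": None,  # unknown, so no claim is produced
})
```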

2. Cross-Field Consistency

Automated checks verify internal consistency: dates are chronologically valid, financial figures fall within plausible ranges, and severity ratings match documented impact.
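
The three kinds of check named above can be sketched as a single function that returns a list of issues. The field names, the severity labels, and both cost thresholds are assumptions chosen for illustration; the real bounds are not published on this page.

```python
from datetime import date
from typing import Any, Dict, List

# Assumed bounds and labels, chosen only for illustration.
MAX_PLAUSIBLE_COST_USD = 1e12
HIGH_IMPACT_COST_USD = 1_000_000
SEVERITIES = ("low", "medium", "high", "critical")


def consistency_issues(rec: Dict[str, Any]) -> List[str]:
    issues = []

    # Chronology: an incident cannot be reported before it occurred.
    occurred = rec.get("occurred_on")
    reported = rec.get("reported_on")
    if occurred is not None and reported is not None and reported < occurred:
        issues.append("reported_on precedes occurred_on")

    # Plausible range for financial figures (nulls are allowed and skipped).
    cost = rec.get("documented_cost_usd")
    if cost is not None and not 0 <= cost <= MAX_PLAUSIBLE_COST_USD:
        issues.append("documented_cost_usd outside plausible range")

    # Severity must be a known label and should not understate documented impact.
    severity = rec.get("severity")
    if severity is not None and severity not in SEVERITIES:
        issues.append("unknown severity label")
    if cost is not None and cost >= HIGH_IMPACT_COST_USD and severity == "low":
        issues.append("severity understates documented cost")

    return issues
```

An empty list means the record passed every check in this sketch; it says nothing about whether the underlying claims are true, which is the job of the source-level steps.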

3. Source Quality Assessment

Source URLs are validated. Incidents with only unverifiable or low-credibility sources are flagged for manual review.
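
The page does not specify how URLs are validated or how credibility is scored, so the following is only a sketch under stated assumptions: validation is reduced to a syntactic check with Python's standard `urllib.parse` (no network request is made), and credibility is approximated by a small, hypothetical domain allowlist.

```python
from typing import List
from urllib.parse import urlparse

# Hypothetical allowlist; a real credibility assessment would be richer.
TRUSTED_DOMAINS = {"reuters.com", "ftc.gov", "sec.gov"}


def is_well_formed(url: str) -> bool:
    # Syntactic check only: an http(s) scheme and a non-empty host.
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def is_trusted(url: str) -> bool:
    # Match the domain itself or any subdomain of it.
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in TRUSTED_DOMAINS)


def needs_manual_review(urls: List[str]) -> bool:
    """True when no URL is both well formed and from a trusted domain."""
    return not any(is_well_formed(u) and is_trusted(u) for u in urls)
```

For example, `needs_manual_review(["https://random.example/post"])` returns True under this allowlist, so the incident would be queued for a human reviewer rather than rejected outright.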

4. Description Quality

Detailed descriptions are evaluated for specificity, factual density, and absence of hallucinated details. Descriptions that fall below the quality thresholds are regenerated with tighter constraints.

5. Auto-Fix & Flagging

Clear data quality issues (format errors, out-of-range values) are auto-corrected. Ambiguous issues are flagged for human review.
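
The split between auto-correction and human review might look like the sketch below. Everything in it is an assumption made for illustration: the field names, the choice of ISO 8601 as the target date format, and the decision to reset a negative cost to null (the page does not say how out-of-range values are corrected).

```python
from datetime import datetime
from typing import Any, Dict, List, Tuple


def triage(rec: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return a corrected copy of the record plus a list of flags for human review."""
    fixed = dict(rec)
    flags: List[str] = []

    # Format error: normalize year-first dates such as 2024/03/01 to ISO 8601.
    raw_date = fixed.get("incident_date")
    if isinstance(raw_date, str):
        candidate = raw_date.strip().replace("/", "-")
        try:
            fixed["incident_date"] = datetime.strptime(candidate, "%Y-%m-%d").date().isoformat()
        except ValueError:
            # Day-first vs month-first cannot be decided safely, so do not guess.
            flags.append("incident_date: unrecognized format")

    # Format error: strip currency symbols and thousands separators.
    cost = fixed.get("documented_cost_usd")
    if isinstance(cost, str):
        try:
            cost = float(cost.replace("$", "").replace(",", "").strip())
            fixed["documented_cost_usd"] = cost
        except ValueError:
            cost = None
            flags.append("documented_cost_usd: not a number")

    # Out-of-range value: a negative cost is reset to null (assumed policy).
    if isinstance(cost, (int, float)) and cost < 0:
        fixed["documented_cost_usd"] = None

    return fixed, flags


fixed, flags = triage({"incident_date": "2024/03/01", "documented_cost_usd": "$1,250,000"})
```

The original record is never modified in place, which keeps the pre-correction values available to a reviewer who needs to audit what was changed.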

Data Quality Standards

Minimum source citations per incident: 1 (target: 2+)
Financial estimates: Documented figures only; null if unknown
Severity classification: Based on documented impact, not media coverage
Verification pass rate required: “Acceptable” or above
Update cadence: Continuous; new incidents added weekly
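
Two of these standards lend themselves to an automated gate. In the sketch below, the minimum of one source and the "Acceptable" threshold come from the list above; the other rating labels and their ordering are hypothetical, since the full scale is not given on this page.

```python
from typing import List

MIN_SOURCES = 1  # documented minimum; the stated target is 2 or more
# Hypothetical scale, lowest to highest. Only "Acceptable" appears in the text.
RATING_SCALE = ["Poor", "Acceptable", "Good", "Excellent"]


def meets_standards(source_urls: List[str], rating: str) -> bool:
    """Check the source-count and verification-rating standards for one incident."""
    if rating not in RATING_SCALE:
        return False  # unknown ratings never pass
    enough_sources = len(source_urls) >= MIN_SOURCES
    passes = RATING_SCALE.index(rating) >= RATING_SCALE.index("Acceptable")
    return enough_sources and passes
```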

Limitations & Transparency

We acknowledge the following limitations:

  • Selection bias. Publicly reported incidents skew toward Western media coverage. Incidents in non-English-speaking regions may be underrepresented.
  • Financial estimates. Documented costs often reflect settlements or fines rather than total economic impact. True costs are likely higher.
  • AI-assisted structuring. While we use AI to help structure data, all records pass through verification. We document this transparently rather than obscuring it.
  • Evolving incidents. Some incidents are ongoing. We track updates but initial entries may not reflect final outcomes.

Corrections & Contributions

We welcome corrections, additional sources, and new incident submissions. Data accuracy is our highest priority.