AI Tutoring Platforms Provided Incorrect Math and Science Answers to Students
Severity
Medium
Multiple AI tutoring platforms, including Khan Academy's Khanmigo and Chegg's AI tutor, provided incorrect answers to math and science problems. Students using these platforms experienced confusion and poor test performance, highlighting quality control issues in educational AI applications.
Category
Hallucination
Industry
Education
Status
Reported
Date Occurred
Jan 1, 2025
Date Reported
Jan 15, 2025
Jurisdiction
US
AI Provider
Other/Unknown
Application Type
chatbot
Harm Type
operational
Human Review in Place
No
Litigation Filed
No
Tags
education, tutoring, math, science, hallucination, student_harmed, tech, accuracy
Full Description
In early 2025, multiple AI-powered tutoring platforms faced scrutiny after research documented significant error rates in their responses to student questions in mathematics and science. The platforms affected included Khan Academy's Khanmigo, Chegg's AI tutoring service, and several other ed-tech companies that had rapidly deployed large language model-based tutoring systems to compete in the growing online education market.
Educators and researchers began documenting instances in which these AI tutors provided fundamentally incorrect answers to problems ranging from basic algebra to advanced calculus and chemistry. The errors were not limited to minor computational mistakes; they included conceptual misunderstandings and flawed reasoning that could mislead students about fundamental principles. Teachers reported receiving student work that reflected these incorrect methodologies, leading to classroom confusion and a need for remedial instruction.
The issue gained prominence when education researchers published studies quantifying error rates across different platforms and subject areas. The research revealed that while AI tutors excelled at providing quick responses and maintaining engaging conversations with students, they frequently failed at the core educational task of providing accurate information. Mathematics problems involving multi-step reasoning and science questions requiring application of principles to novel scenarios showed particularly high error rates.
The incident sparked broader debates about accountability in educational technology and the appropriate use of AI in learning environments. Education advocates argued that the rush to deploy AI tutoring systems prioritized engagement and cost reduction over educational accuracy and student outcomes. The platforms involved faced pressure from educators, parents, and school districts to implement stronger quality control measures and provide transparency about their AI systems' limitations and error rates.
Root Cause
AI models used by tutoring platforms generated mathematically or scientifically incorrect responses due to hallucination and training data limitations, particularly in complex problem-solving scenarios requiring step-by-step reasoning.
Mitigation Analysis
Subject-matter-expert review of AI responses before delivery to students could have prevented most errors. Implementing mathematical verification systems to check computational answers, and establishing feedback loops that let educators flag incorrect responses, would significantly reduce harm. Regular testing against standardized curriculum benchmarks and transparent error-rate reporting would improve platform reliability.
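The verification idea above can be illustrated with a minimal sketch. This is a hypothetical example, not any platform's actual implementation: before a tutor's numeric answer reaches the student, the system substitutes it back into the original equation and delivers it only if both sides agree; otherwise the response is escalated for human review. All function names here are illustrative assumptions.

```python
# Hypothetical answer-verification gate for an AI tutoring pipeline.
# Before a proposed numeric answer is shown to a student, substitute it
# back into the original equation and check that both sides agree.

def verify_answer(lhs, rhs, value, tol=1e-9):
    """Return True if lhs(x) equals rhs(x) at the proposed value of x."""
    return abs(lhs(value) - rhs(value)) <= tol

def gate_response(lhs, rhs, proposed_answer):
    """Deliver the answer only if it verifies; otherwise escalate."""
    if verify_answer(lhs, rhs, proposed_answer):
        return f"The answer is x = {proposed_answer}."
    return "Answer failed verification; routing to a human reviewer."

# Example: the equation 3x + 4 = 19, whose correct solution is x = 5.
lhs = lambda x: 3 * x + 4
rhs = lambda x: 19

print(gate_response(lhs, rhs, 5))  # verified answer is delivered
print(gate_response(lhs, rhs, 6))  # a hallucinated answer is caught
```

A check like this only covers problems with mechanically verifiable answers; conceptual explanations would still require expert review, which is why the mitigations above pair verification with educator feedback loops.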
Lessons Learned
The incident highlighted the critical importance of rigorous testing and validation for AI systems deployed in educational contexts, where accuracy directly affects student learning outcomes. It also demonstrated the need for clear accountability standards in ed-tech and the risks of prioritizing rapid deployment over educational quality.
Sources
AI Tutoring Platforms Face Accuracy Concerns as Errors Impact Student Learning
Education Week · Jan 15, 2025 · news
The AI Tutoring Accuracy Crisis: When Helpful Becomes Harmful
Chronicle of Higher Education · Jan 12, 2025 · news