
DeepSeek AI Exposes 1 Million User Records Including Chat Histories and API Keys in Public Database

High

A security researcher discovered DeepSeek AI's publicly accessible ClickHouse database containing over 1 million records of user chat histories, API keys, and system logs, highlighting critical security gaps in AI startup infrastructure.

Category
Privacy Leak
Industry
Technology
Status
Reported
Date Occurred
Jan 1, 2025
Date Reported
Jan 17, 2025
Jurisdiction
International
AI Provider
Other/Unknown
Model
DeepSeek-V3
Application Type
API Integration
Harm Type
Privacy
People Affected
1,000,000
Human Review in Place
Unknown
Litigation Filed
No
data_exposure, database_security, api_keys, chat_histories, chinese_ai, clickhouse, privacy_breach, startup_security

Full Description

On January 17, 2025, security researcher Gal Nagli of cloud security company Wiz disclosed a significant data exposure involving DeepSeek, the Chinese AI startup that had recently gained global attention for its competitive large language models. Nagli found that DeepSeek had left a ClickHouse database server publicly accessible on the internet without authentication or access controls.

The exposed database contained over 1 million records spanning user interactions, system operations, and sensitive configuration data. The most concerning exposures included complete user chat histories with DeepSeek's AI models, API keys that could grant unauthorized access to DeepSeek's services, system logs containing operational details, and metadata about user sessions and queries. The database appeared to hold data accumulated over several months of DeepSeek's operations.

Wiz's research team identified the exposed database using standard internet scanning techniques, demonstrating that the data was accessible to anyone with basic technical knowledge. The firm noted that the exposure posed a significant privacy risk to DeepSeek users, many of whom may have shared sensitive personal or business information in conversations with the AI system. The exposed API keys could also have enabled malicious actors to access DeepSeek's services without authorization, potentially leading to service abuse or further data compromise.

DeepSeek's rapid rise to prominence in late 2024 and early 2025, particularly following the release of its DeepSeek-V3 model, had attracted millions of users worldwide drawn to the company's claims of matching or exceeding the performance of leading Western AI models at lower cost. This popularity made the exposure particularly significant: it affected a large and growing user base that included both individual users and enterprise customers integrating DeepSeek's APIs into their applications.

Following Wiz's responsible disclosure, DeepSeek acknowledged the security issue and took steps to secure the database. The incident nonetheless raised broader questions about the security practices of rapidly scaling AI startups, particularly those operating across international jurisdictions with varying data protection requirements. It also highlighted the challenge AI companies face in properly securing the vast amounts of user data they collect and process, especially as they scale quickly to meet growing demand for AI services.

Root Cause

DeepSeek misconfigured a ClickHouse database server, leaving it publicly accessible without authentication or access controls. The database contained sensitive user data and system information that should have been protected behind authentication and network-level restrictions.
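The fix pattern for this class of misconfiguration is well established. Below is a minimal sketch of a hardened ClickHouse user configuration, assuming a stock deployment; the password hash and network range are illustrative placeholders, not DeepSeek's actual settings:

```xml
<!-- users.xml (illustrative sketch, not DeepSeek's configuration):
     require a password for the default user and restrict which source
     networks may connect at all. -->
<clickhouse>
  <users>
    <default>
      <!-- SHA-256 hex of a strong password (placeholder value) -->
      <password_sha256_hex>e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855</password_sha256_hex>
      <networks>
        <!-- internal network only; never 0.0.0.0/0 for a production database -->
        <ip>10.0.0.0/8</ip>
      </networks>
    </default>
  </users>
</clickhouse>
```

Note that a stock ClickHouse install listens only on localhost; making it reachable from the internet additionally requires widening `listen_host` in `config.xml`, so an exposure like this typically reflects at least two missed safeguards rather than one.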

Mitigation Analysis

This incident could have been prevented through proper database security configuration including authentication requirements, network access controls, and regular security audits. Implementation of data classification policies would have identified sensitive data requiring additional protection. Automated security scanning tools could have detected the exposed database during routine infrastructure monitoring.
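Such routine monitoring can be as simple as probing one's own public-facing hosts for ClickHouse's default HTTP interface (port 8123) and checking whether it answers queries without credentials. A minimal sketch in Python using only the standard library; the hostname shown in the usage note is hypothetical:

```python
# Sketch: detect an unauthenticated ClickHouse HTTP endpoint.
# Intended for scanning infrastructure you own; hosts/ports are assumptions,
# not details of DeepSeek's actual deployment.
from urllib import error, parse, request

CLICKHOUSE_HTTP_PORT = 8123  # ClickHouse's default HTTP interface port


def probe_url(host: str, query: str, port: int = CLICKHOUSE_HTTP_PORT) -> str:
    """Build the URL ClickHouse's HTTP interface accepts for ad-hoc queries."""
    return f"http://{host}:{port}/?{parse.urlencode({'query': query})}"


def is_openly_queryable(host: str, timeout: float = 3.0) -> bool:
    """Return True if the host answers a trivial query with no credentials."""
    try:
        with request.urlopen(probe_url(host, "SELECT 1"), timeout=timeout) as resp:
            # An open server returns HTTP 200 and the literal result "1".
            return resp.status == 200 and resp.read().strip() == b"1"
    except (error.URLError, OSError):
        # Closed port, auth challenge, timeout, etc. all count as "not open".
        return False
```

In practice a scan like `is_openly_queryable("db.internal.example.com")` would run per-host across an asset inventory on a schedule, alerting whenever it returns True.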

Lessons Learned

This incident demonstrates that even prominent AI startups may lack fundamental cybersecurity practices, particularly around database security and access controls. The exposure underscores the critical importance of implementing security-by-design principles in AI infrastructure and conducting regular security audits as companies scale rapidly.