
DeepSeek AI Exposes 1 Million User Records Including Chat Histories and API Keys in Public Database

High

A security researcher discovered DeepSeek AI's publicly accessible ClickHouse database containing over 1 million records of user chat histories, API keys, and system logs, highlighting critical security gaps in AI startup infrastructure.

Category
Privacy Leak
Industry
Technology
Status
Reported
Date Occurred
Jan 1, 2025
Date Reported
Jan 17, 2025
Jurisdiction
International
AI Provider
Other/Unknown
Model
DeepSeek-V3
Application Type
API Integration
Harm Type
Privacy
People Affected
1,000,000
Human Review in Place
Unknown
Litigation Filed
No
data_exposure, database_security, api_keys, chat_histories, chinese_ai, clickhouse, privacy_breach, startup_security

Full Description

On January 17, 2025, security researcher Gal Nagli of cloud security company Wiz disclosed a significant data exposure involving DeepSeek, the Chinese AI startup that had recently gained global attention for its competitive large language models. Nagli found that DeepSeek had left a ClickHouse database server publicly accessible on the internet without authentication or access controls.

The exposed database contained over 1 million records spanning user interactions, system operations, and sensitive configuration data. The most concerning exposures included complete user chat histories with DeepSeek's AI models, API keys that could grant unauthorized access to DeepSeek's services, system logs containing operational details, and metadata about user sessions and queries. The database appeared to hold data accumulated over several months of DeepSeek's operations.

Wiz's research team identified the exposed database using standard internet scanning techniques, demonstrating that the data was accessible to anyone with basic technical knowledge. The firm noted that the exposure posed a significant privacy risk to DeepSeek users, many of whom may have shared sensitive personal or business information in conversations with the AI system. The exposed API keys could also have enabled malicious actors to access DeepSeek's services without authorization, potentially leading to service abuse or further data compromise.

DeepSeek's rapid rise to prominence in late 2024 and early 2025, particularly following the release of its DeepSeek-V3 model, had attracted millions of users worldwide drawn to the company's claims of matching or exceeding the performance of leading Western AI models at lower cost. This popularity made the exposure particularly significant: it affected a large and growing user base that included both individual users and enterprise customers integrating DeepSeek's APIs into their applications.

Following Wiz's responsible disclosure, DeepSeek acknowledged the security issue and took steps to secure the database. The incident nonetheless raised broader questions about the security practices of rapidly scaling AI startups, particularly those operating across international jurisdictions with varying data protection requirements. It also highlighted the challenge AI companies face in properly securing the vast amounts of user data they collect and process, especially as they scale quickly to meet growing demand for AI services.

Root Cause

DeepSeek misconfigured a ClickHouse database server, leaving it publicly accessible without authentication or access controls. The database contained sensitive user data and system information that should have been protected behind authentication and network-level restrictions.
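The fix pattern for this class of misconfiguration is well established. Below is a minimal sketch of a hardened ClickHouse user configuration, assuming a stock deployment; the password hash and network range are illustrative placeholders, not DeepSeek's actual settings:

```xml
<!-- users.xml (illustrative sketch, not DeepSeek's configuration):
     require a password for the default user and restrict which source
     networks may connect at all. -->
<clickhouse>
  <users>
    <default>
      <!-- SHA-256 hex of a strong password (placeholder value) -->
      <password_sha256_hex>e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855</password_sha256_hex>
      <networks>
        <!-- internal network only; never 0.0.0.0/0 for a production database -->
        <ip>10.0.0.0/8</ip>
      </networks>
    </default>
  </users>
</clickhouse>
```

Note that a stock ClickHouse install listens only on localhost; making it reachable from the internet additionally requires widening `listen_host` in `config.xml`, so an exposure like this typically reflects at least two missed safeguards rather than one.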

Mitigation Analysis

This incident could have been prevented through proper database security configuration including authentication requirements, network access controls, and regular security audits. Implementation of data classification policies would have identified sensitive data requiring additional protection. Automated security scanning tools could have detected the exposed database during routine infrastructure monitoring.
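Such routine monitoring can be as simple as probing one's own public-facing hosts for ClickHouse's default HTTP interface (port 8123) and checking whether it answers queries without credentials. A minimal sketch in Python using only the standard library; the hostname shown in the usage note is hypothetical:

```python
# Sketch: detect an unauthenticated ClickHouse HTTP endpoint.
# Intended for scanning infrastructure you own; hosts/ports are assumptions,
# not details of DeepSeek's actual deployment.
from urllib import error, parse, request

CLICKHOUSE_HTTP_PORT = 8123  # ClickHouse's default HTTP interface port


def probe_url(host: str, query: str, port: int = CLICKHOUSE_HTTP_PORT) -> str:
    """Build the URL ClickHouse's HTTP interface accepts for ad-hoc queries."""
    return f"http://{host}:{port}/?{parse.urlencode({'query': query})}"


def is_openly_queryable(host: str, timeout: float = 3.0) -> bool:
    """Return True if the host answers a trivial query with no credentials."""
    try:
        with request.urlopen(probe_url(host, "SELECT 1"), timeout=timeout) as resp:
            # An open server returns HTTP 200 and the literal result "1".
            return resp.status == 200 and resp.read().strip() == b"1"
    except (error.URLError, OSError):
        # Closed port, auth challenge, timeout, etc. all count as "not open".
        return False
```

In practice a scan like `is_openly_queryable("db.internal.example.com")` would run per-host across an asset inventory on a schedule, alerting whenever it returns True.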

Lessons Learned

This incident demonstrates that even prominent AI startups may lack fundamental cybersecurity practices, particularly around database security and access controls. The exposure underscores the critical importance of implementing security-by-design principles in AI infrastructure and conducting regular security audits as companies scale rapidly.