ChatGPT drives bulk of enterprise generative AI data risk
Harmonic Security has published new analysis of 22.4 million generative AI prompts from 2025 and said a small number of applications account for the vast majority of enterprise data exposure.
The company analysed 671 genAI tools and reported that six applications made up 92.6% of potential data exposure across the dataset.
It said the typical organisation could focus governance on that group to reduce risk.
ChatGPT concentration
ChatGPT accounted for 71.2% of data exposures in the research. Harmonic said ChatGPT represented 43.9% of the 22.4 million prompts reviewed.
The analysis also pointed to Microsoft Copilot and Google Gemini as applications whose share of data exposure exceeded their share of prompts, though at a much smaller scale than ChatGPT. Harmonic put Microsoft Copilot at 2.9% of use and Gemini at 3.2% of use within the dataset.
Harmonic's figures indicate that overall exposure risk did not map evenly to usage volumes across tools. The company positioned that imbalance as a factor in risk prioritisation for security teams.
Personal accounts
Harmonic said most exposure in the dataset occurred via business-provided tools. It also found that 17% of all exposures took place through personal or free accounts.
The company said those accounts sit outside typical corporate controls. It said IT teams have "zero visibility, no audit trails, and data may train public models" in such cases.
Within the 98,034 instances that Harmonic classified as sensitive, it said 87% occurred via ChatGPT Free. It reported the remainder as being concentrated in a small group of widely used services: Google Gemini with 5,935 instances, Microsoft Copilot with 3,416, Claude with 2,412, and Perplexity with 1,245.
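The reported counts can be cross-checked against the 87% figure. A minimal sketch, using only the numbers stated above (the ChatGPT Free count is inferred by subtracting the four named services from the total):

```python
# Rough consistency check on Harmonic's reported sensitive-exposure counts.
# All figures come from the article; the ChatGPT Free count is implied,
# not separately reported.
total_sensitive = 98_034

# Reported counts for the remaining services
others = {
    "Google Gemini": 5_935,
    "Microsoft Copilot": 3_416,
    "Claude": 2_412,
    "Perplexity": 1_245,
}

chatgpt_free = total_sensitive - sum(others.values())
print(f"Implied ChatGPT Free instances: {chatgpt_free:,}")
print(f"Share of total: {chatgpt_free / total_sensitive:.1%}")
```

The implied share comes out at roughly 86.7%, consistent with the 87% figure Harmonic reported.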
Harmonic also said Cloud Access Security Brokers often struggle to differentiate between account types and tend to fall back on broad blocking approaches.
Long-tail governance
Beyond the dominant applications, Harmonic said the remaining 600-plus tools still create a governance burden. It said a blanket approach that blocks AI-related sites can affect embedded AI features in mainstream services.
The company cited examples including Canva, Google Translate, Grammarly and Gamma. It said blocking such sites can create "significant organizational friction," and it added that controls may then get abandoned.
Harmonic also said 4% of genAI usage in the dataset came from China-based applications. It said these apps had "no oversight at all."
Types of data
Harmonic said 579,000 prompts, or 2.6% of the 22.4 million, contained company-sensitive data. It said the nature of that data often makes detection difficult because it is "highly unstructured."
The company broke down the leading categories of exposure: code represented 30% of data exposures and legal discourse accounted for 22.3%. M&A data made up 12.6%, financial projections 7.8%, and investment portfolio data 5.5%. Harmonic also listed access keys, personally identifiable information, and sales pipeline data among the other categories found in the prompts.
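Taken together, the named categories leave a sizeable unattributed remainder. A small tally using the percentages stated above (the remainder covers access keys, PII, sales pipeline data, and other categories for which no shares were reported):

```python
# Tally of the exposure categories reported in the article.
# Percentages are Harmonic's figures, as shares of all data exposures.
categories = {
    "code": 30.0,
    "legal discourse": 22.3,
    "M&A data": 12.6,
    "financial projections": 7.8,
    "investment portfolio data": 5.5,
}

named_share = sum(categories.values())
print(f"Named categories: {named_share:.1f}% of exposures")
print(f"Remaining categories: {100 - named_share:.1f}%")
```

The five named categories account for about 78.2% of exposures, leaving roughly a fifth spread across the unquantified long tail.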
"Regulating access to the 'big six' GenAI apps can mean organizations take a giant step towards controlling their overall AI data exposure. ChatGPT in particular needs to be tightly controlled with a data exposure risk far greater than its use. But, critically, blocking isn't the answer. There are multiple ways for employees to circumvent controls and organizations are at risk from missing out on the huge productivity benefits AI can provide.
Paterson continued: "Organisations need to move to enablement, whereby employees are given access to the best GenAI options for their business but with oversight, so that employees are warned and/or blocked from uploading sensitive information. This is a fast-moving area: six apps may dominate, but we found 661 tools in use, including specialized coding assistants, domain-specific tools, and AI features embedded in existing tools. Four percent of usage also came from China-based apps, which have no oversight at all. The businesses who win tend to focus on the 'big six' first but lean into the long tail with fine-grained data controls."
The analysis reflects employee activity within organisations that use Harmonic's monitoring products, according to the company. Harmonic said no personally identifiable information or proprietary file contents left customer environments. It said it aggregated and sanitised data before analysis.
Harmonic said the findings point to continued pressure on enterprises to manage a growing number of genAI services, and it expects use of specialised assistants and embedded AI features to remain a significant factor in exposure patterns.