
Google AI Overview Is Showing Your Personal Data: Here's What to Do

Discover how Google AI Overview may expose your personal data and learn practical steps to protect your privacy. Take control of your information now.

Written by GhostMyData Team · February 18, 2026 · 15 min read

You've probably searched for yourself on Google before. Maybe you found an old social media profile or a mention in a news article. But now, with Google's AI-powered search overviews rolling out to millions of users, something more concerning is happening: Google's AI is surfacing detailed personal information about you in conversational summaries that appear at the top of search results—often pulling from data broker sites, public records, and scraped web content you never knew existed.

Unlike traditional search results where you could at least see the source and context, AI overviews synthesize information from multiple sources into a single, authoritative-sounding answer. When someone searches your name, Google's AI might generate a summary that includes your age, address history, phone numbers, relatives' names, and more—all presented as verified fact, even when it's outdated or wrong.

Here's what you need to know about how your personal data ends up in AI search results, and more importantly, what you can actually do about it.

How AI Systems Collect and Use Your Data

Google's AI Overview (formerly known as Search Generative Experience or SGE) doesn't create information out of thin air. It's built on massive language models trained on enormous datasets scraped from across the internet. But the collection process for your personal data happens in layers, and understanding these layers is crucial to protecting yourself.

The primary collection methods include:

  • Web crawling and indexing: Google's bots continuously scan publicly accessible websites, including data broker sites, people search engines, public records databases, social media profiles, and any webpage mentioning your name
  • Structured data extraction: AI systems are particularly good at identifying and extracting structured personal information—names, addresses, phone numbers, email addresses, employment history, and family relationships
  • Cross-referencing and inference: Modern AI doesn't just collect what it finds; it makes connections between disparate pieces of information to build comprehensive profiles
  • Real-time synthesis: When someone searches for you, the AI doesn't just retrieve cached results—it actively synthesizes current information from multiple sources into a cohesive narrative
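To make the structured extraction step concrete, here is a minimal Python sketch of how contact details can be pulled out of unstructured page text with regular expressions. The patterns, the function name `extract_contact_fields`, and the sample text are illustrative assumptions, not a description of how Google or any data broker actually works; real systems are far more sophisticated.

```python
import re

# Simplified patterns for US-style phone numbers and email addresses.
# These are illustrative only and will miss many real-world formats.
PHONE_RE = re.compile(r"\(?\b\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def extract_contact_fields(text: str) -> dict[str, list[str]]:
    """Return every phone number and email address found in the text."""
    return {
        "phones": PHONE_RE.findall(text),
        "emails": EMAIL_RE.findall(text),
    }


# Hypothetical page text, similar to what a people search listing might contain.
sample = (
    "Jane Doe, age 42. Phone: (555) 010-1234 or 555-010-9876. "
    "Email: jane.doe@example.com."
)
print(extract_contact_fields(sample))
```

Even patterns this crude recover clean, structured fields from free text, which is why a single unprotected listing can end up feeding many downstream profiles.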

The problem intensifies because Google's AI treats data broker sites as legitimate sources. Sites like Whitepages, Spokeo, BeenVerified, and hundreds of others compile detailed dossiers on virtually every American adult. These sites scrape public records, purchase data from other aggregators, and use algorithmic inference to fill in gaps. When Google's AI pulls from these sources, it's essentially laundering data broker information into seemingly authoritative search results.

What makes AI overviews particularly problematic:

The conversational nature of AI-generated summaries creates a false sense of accuracy and authority. Traditional search results let you evaluate sources—you can see if information comes from a sketchy data broker site versus a legitimate news outlet. AI overviews strip away that context, presenting a unified answer that blends reliable and unreliable sources without distinction.

Moreover, these systems struggle with temporal accuracy. Your old address from five years ago might appear in an AI overview as if it's current. An outdated phone number could be presented as your active contact information. For anyone trying to escape an abusive relationship, hide from stalkers, or simply maintain basic privacy, this is more than inconvenient—it's dangerous.

Where Your Data Ends Up in AI Training Pipelines

Understanding the journey your personal information takes from initial collection to AI-generated search results helps you identify intervention points where you can actually remove or suppress this data.

The Data Broker Ecosystem

At the foundation of most AI-exposed personal data sits the data broker industry—a largely unregulated network of companies that collect, aggregate, and sell personal information. The ecosystem works like this:

Primary data brokers purchase bulk data from sources like credit bureaus, public records offices, retailers, and app developers. They compile this into master databases containing billions of records. Companies like Acxiom, Epsilon, and Oracle Data Cloud operate at this level.

Secondary aggregators purchase data from primary brokers and repackage it for specific use cases—background checks, people search, marketing, fraud prevention, or "identity verification." This is where companies like Spokeo, Intelius, and PeopleFinders operate.

People search engines represent the consumer-facing layer, offering anyone the ability to search for personal information about others, often for free or for a small fee. These sites are specifically designed to rank well in Google search results, making them prime sources for AI overviews.

Training Data vs. Real-Time Retrieval

It's important to distinguish between two ways your data appears in AI systems:

Training data: Large language models are trained on massive datasets that may include archived web content, including snapshots of data broker sites, social media, and public records. This data becomes embedded in the model's parameters. Removing your information from this layer is nearly impossible without retraining the entire model.

Real-time retrieval: Google's AI Overview uses retrieval-augmented generation (RAG), meaning it searches current web content and synthesizes it in real-time. This is actually good news for privacy—it means removing your data from source websites can prevent it from appearing in AI overviews.
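A toy sketch can show why real-time retrieval matters for removals. The code below is a deliberately simplified stand-in for retrieval-augmented generation: it ranks pages by keyword overlap and stitches the matches together, whereas a real system would use a search index and a language model. The `Page` class, the function names, and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    text: str


def retrieve(pages, query, k=2):
    """Rank pages by how many query words they contain; return the top k."""
    terms = set(query.lower().split())
    scored = []
    for page in pages:
        overlap = len(terms & set(page.text.lower().split()))
        if overlap:
            scored.append((overlap, page))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [page for _, page in scored[:k]]


def summarize(pages, query):
    """Stand-in for the generation step: join the retrieved snippets."""
    hits = retrieve(pages, query)
    return " ".join(p.text for p in hits) if hits else "No information found."


web = [
    Page("https://broker.example/jane-doe", "Jane Doe lives at 12 Oak Street"),
    Page("https://news.example/story", "Jane Doe won a local baking contest"),
]
print(summarize(web, "Jane Doe"))

# After the broker listing is removed, the address can no longer be retrieved.
web_after_removal = [p for p in web if "broker.example" not in p.url]
print(summarize(web_after_removal, "Jane Doe"))
```

The second summary no longer contains the address because the only page that carried it is gone, which is the effect a successful opt-out aims for.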

The Public Records Pipeline

Much of the personal information in AI search results originates from public records: court documents, property records, voter registrations, professional licenses, and more. These records are legitimately public, but their aggregation and easy searchability create privacy risks that didn't exist when you had to physically visit a courthouse to access them.

Data brokers employ specialized scrapers that continuously monitor county clerk websites, state databases, and federal records repositories. They normalize this data across different formats and jurisdictions, then link it to individual profiles. When Google's AI searches for information about you, it finds these aggregated records presented in easily digestible formats.
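The normalize-and-link step described above can be sketched in a few lines of Python. This is an assumed, heavily simplified illustration: the record fields, the matching key (normalized name plus birth year), and the function names are hypothetical, and real record linkage uses fuzzier matching across many more attributes.

```python
from collections import defaultdict


def normalize_name(raw: str) -> str:
    """Convert 'DOE, JANE' or 'Jane  Doe' into 'jane doe'."""
    raw = raw.strip().lower()
    if "," in raw:
        last, first = [part.strip() for part in raw.split(",", 1)]
        raw = f"{first} {last}"
    return " ".join(raw.split())


def link_records(records):
    """Group records that share a normalized name and birth year."""
    profiles = defaultdict(list)
    for rec in records:
        key = (normalize_name(rec["name"]), rec["birth_year"])
        profiles[key].append(rec)
    return dict(profiles)


# Hypothetical records in the inconsistent formats different offices use.
records = [
    {"source": "county property records", "name": "DOE, JANE", "birth_year": 1983},
    {"source": "state voter file", "name": "Jane  Doe", "birth_year": 1983},
    {"source": "court index", "name": "Jane Doe", "birth_year": 1990},
]
for key, linked in link_records(records).items():
    print(key, [rec["source"] for rec in linked])
```

Two records with different formatting collapse into one profile, while the third stays separate because the birth year differs.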

Step-by-Step: How to Opt Out or Remove Your Data

Taking control of your personal information in the age of AI search requires a multi-layered approach. No single action will completely remove you from AI overviews, but combining these strategies significantly reduces your exposure.

Immediate Actions You Can Take Today

1. Remove yourself from major people search sites

Start with the largest and most commonly cited people search engines. These sites frequently appear as sources in AI overviews:

  • Whitepages: Visit whitepages.com/suppression-requests, verify your listing, and submit a removal request. Note that Whitepages requires you to create an account to verify removals.
  • Spokeo: Go to spokeo.com/optout, search for your listing, and follow the email verification process. Removals typically process within 72 hours.
  • BeenVerified: Navigate to beenverified.com/app/optout/search, find your profile, and submit the opt-out form. You'll need to verify via email.
  • Intelius: Visit intelius.com/opt-out, locate your listing, and complete the multi-step verification process.
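If you are working through these by hand, a small tracker helps you remember which requests are still pending. The sketch below is one possible approach, not an official tool: the `OptOut` class, the `needs_follow_up` helper, and the seven-day grace period are assumptions, and the URLs are simply the ones listed above, so check that each still works before relying on it.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class OptOut:
    site: str
    url: str
    submitted: Optional[date] = None  # date you sent the request, if any
    confirmed: bool = False           # set to True once the listing is gone


# Opt-out pages as listed in this article; brokers move these pages often.
REQUESTS = [
    OptOut("Whitepages", "https://whitepages.com/suppression-requests"),
    OptOut("Spokeo", "https://spokeo.com/optout"),
    OptOut("BeenVerified", "https://beenverified.com/app/optout/search"),
    OptOut("Intelius", "https://intelius.com/opt-out"),
]


def needs_follow_up(req: OptOut, today: date, grace_days: int = 7) -> bool:
    """True if a request was sent but is still unconfirmed after the grace period."""
    if req.submitted is None or req.confirmed:
        return False
    return today - req.submitted > timedelta(days=grace_days)


# Example: a Spokeo request sent two weeks ago with no confirmation yet.
REQUESTS[1].submitted = date(2026, 1, 1)
for req in REQUESTS:
    if needs_follow_up(req, today=date(2026, 1, 15)):
        print(f"Follow up with {req.site}: {req.url}")
```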

2. Leverage legal opt-out rights

If you're a California resident, the California Consumer Privacy Act (CCPA) gives you the explicit right to request deletion of your personal information. A separate California statute, the data broker registration law, defines data brokers as businesses that knowingly collect and sell to third parties the personal information of consumers with whom they don't have a direct relationship (California Civil Code § 1798.99.80).

For residents of other states:

  • Virginia (Consumer Data Protection Act): Grants deletion rights effective January 2023
  • Colorado (Privacy Act): Provides similar protections as of July 2023
  • Connecticut (Data Privacy Act): Offers deletion rights as of July 2023
  • Utah (Consumer Privacy Act): Grants rights effective December 2023

When submitting requests to data brokers, explicitly cite your state's privacy law and request confirmation of deletion in writing.
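A short script can keep these requests consistent across many brokers. The following is a minimal sketch, assuming the five state laws named above; the letter wording is illustrative rather than vetted legal language, and `build_request` is a hypothetical helper name.

```python
from string import Template

# Statutes as named in this article, keyed by state abbreviation.
STATE_LAWS = {
    "CA": "California Consumer Privacy Act (CCPA)",
    "VA": "Virginia Consumer Data Protection Act",
    "CO": "Colorado Privacy Act",
    "CT": "Connecticut Data Privacy Act",
    "UT": "Utah Consumer Privacy Act",
}

# Illustrative wording only; adapt it to your situation.
LETTER = Template(
    "To: $broker\n\n"
    "I am a resident of $state and I am exercising my rights under the $law.\n"
    "Please delete all personal information you hold about me, $name,\n"
    "and confirm the deletion in writing.\n"
)


def build_request(name: str, state: str, broker: str) -> str:
    """Fill in the letter template, citing the statute for the given state."""
    code = state.upper()
    if code not in STATE_LAWS:
        raise ValueError(f"No statute listed for state {code!r}")
    return LETTER.substitute(
        broker=broker, state=code, law=STATE_LAWS[code], name=name
    )


print(build_request("Jane Doe", "CO", "Example Broker"))
```

Raising an error for unlisted states is a deliberate choice: it avoids sending a letter that cites a law that does not apply to you.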

3. Request removal from Google Search results

Google offers a specific removal request process for personal information in search results. Visit support.google.com/websearch/troubleshooter/9685456 and select the appropriate category:

  • Personal contact information: For phone numbers, addresses, and email addresses
  • Government identification numbers: For SSN, driver's license numbers, etc.
  • Financial information: For bank account or credit card numbers

Google evaluates these requests individually and may remove results from search (which would also prevent them from appearing in AI overviews) without removing the content from the source website.

Long-Term Privacy Strategies

4. Implement ongoing monitoring

Personal information doesn't stay removed. Data brokers continuously refresh their databases with new public records, purchased data, and web scrapes. Information you removed six months ago may reappear.

Manual monitoring means regularly searching for yourself on Google (use incognito mode), checking major people search sites monthly, and setting up Google Alerts for your name, address, and phone number. However, this approach is time-consuming and incomplete—you're only checking sources you know about.
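To make the monthly self-search less tedious, you can generate the queries once and reuse them. This sketch builds quoted-phrase and `site:` queries and turns them into Google search URLs to open in a private window. The site list mirrors the brokers named earlier in this article; the function names and query mix are assumptions to adjust to your own situation.

```python
from urllib.parse import quote_plus

# People search sites mentioned in this article; extend as needed.
SITES = ["whitepages.com", "spokeo.com", "beenverified.com", "intelius.com"]


def build_queries(full_name, city=None, phone=None):
    """Return search queries that combine your name with other identifiers."""
    queries = [f'"{full_name}"']
    if city:
        queries.append(f'"{full_name}" "{city}"')
    if phone:
        queries.append(f'"{phone}"')
    queries += [f'"{full_name}" site:{site}' for site in SITES]
    return queries


def search_url(query):
    """Turn a query into a Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


for query in build_queries("Jane Doe", city="Springfield"):
    print(search_url(query))
```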

5. Reduce your digital footprint going forward

Prevention is more effective than remediation:

  • Use unique email addresses for different services (consider email aliasing services)
  • Provide minimal information on social media profiles and adjust privacy settings to "Friends Only"
  • Opt out of marketing data sharing when creating accounts (look for checkboxes during signup)
  • Use a P.O. box or mail forwarding service instead of your home address for public-facing registrations
  • Request that your information be excluded from public records where legally permitted (some states allow voters to register confidentially if they're at risk)

6. Address the source: public records

While you generally cannot remove legitimate public records, you can sometimes limit their accessibility:

  • Property records: Consider using an LLC or trust to hold property, which obscures personal ownership
  • Voter registration: Many states offer confidential voter programs for domestic violence survivors, law enforcement, and judges
  • Court records: In some cases, you can petition to seal or redact personal information from court documents

What the Law Says About AI and Your Personal Data

The legal landscape governing AI's use of personal data is fragmented and evolving rapidly. Understanding current protections—and their limitations—helps you know what rights you can actually enforce.

Federal Privacy Law: The Gaps

The United States lacks comprehensive federal privacy legislation. The closest things we have are sector-specific laws:

The Fair Credit Reporting Act (FCRA) regulates consumer reporting agencies but exempts many data brokers that don't provide information for credit, employment, or insurance purposes. This is why people search sites can operate largely unchecked—they claim to fall outside FCRA's scope.

The Children's Online Privacy Protection Act (COPPA) protects children under 13 but offers no protections for adults.

The Health Insurance Portability and Accountability Act (HIPAA) covers health information held by healthcare providers and insurers but doesn't apply to health data collected by apps, wearables, or data brokers.

This patchwork approach means that for most Americans, there's no federal right to request deletion of personal information from data brokers or AI training datasets.

State Privacy Laws: Your Best Current Protection

State legislatures have moved faster than Congress. As of 2024, thirteen states have enacted comprehensive privacy laws, with more pending:

California's CCPA and CPRA (California Privacy Rights Act, which amended CCPA) provide the strongest protections:

  • Right to know what personal information is collected
  • Right to delete personal information
  • Right to opt out of sale or sharing of personal information
  • Right to limit use of sensitive personal information
  • Private right of action for data breaches (California Civil Code § 1798.150)

Importantly, California regulates data brokers specifically. State law requires them to register (originally with the Attorney General and, since the Delete Act took effect in 2024, with the California Privacy Protection Agency) and to honor deletion requests from California residents.

Virginia, Colorado, Connecticut, and Utah have similar frameworks with some variations in enforcement mechanisms and covered entities.

AI-Specific Regulations: Emerging Framework

Lawmakers are beginning to address AI specifically:

The EU AI Act (adopted in 2024) classifies AI systems by risk level and imposes obligations accordingly. High-risk AI systems face strict requirements around data governance, transparency, and human oversight. While this primarily affects companies operating in the EU, it creates spillover effects for U.S. companies serving European users.

State-level AI bills are proliferating. Colorado passed the first comprehensive AI regulation in 2024, requiring algorithmic impact assessments for high-risk AI systems. California, New York, and Massachusetts have similar bills under consideration.

The Federal Trade Commission has taken enforcement action against companies using AI in deceptive or unfair ways, relying on its existing authority under Section 5 of the FTC Act. However, this is reactive enforcement, not proactive regulation.

What This Means for AI Search Results

Currently, no law specifically prohibits Google from displaying your personal information in AI overviews, as long as that information is publicly available or obtained from data brokers operating legally.

However, you can leverage existing privacy laws to:

  • Request deletion from data brokers that serve as sources for AI systems
  • Demand removal of inaccurate information under state privacy laws
  • Report violations when companies don't honor legitimate deletion requests

The legal strategy is to cut off AI's sources rather than targeting the AI itself.

What's Coming Next in AI Privacy Regulation

The regulatory landscape is shifting rapidly. Understanding emerging trends helps you anticipate new protections—and new risks.

Federal Legislation in the Pipeline

Multiple federal privacy bills are under consideration, including:

The American Data Privacy and Protection Act (ADPPA), introduced in 2022, would create a national privacy framework with data minimization requirements, opt-out rights for targeted advertising, and restrictions on sensitive data processing. The bill drew bipartisan support but has not advanced, largely because of disputes over federal preemption of state laws.

The Algorithmic Accountability Act would require companies deploying high-risk automated decision systems to conduct impact assessments and submit reports to the FTC. This could force AI search providers to evaluate privacy risks specifically.

Sector-specific AI bills targeting facial recognition, algorithmic hiring tools, and automated decision-making in credit and housing are advancing separately.

State Innovation and Experimentation

Expect more states to follow California's lead:

  • Data broker registration requirements are spreading, making it easier to identify and contact companies holding your data
  • Sensitive data protections are expanding to cover biometric information, precise geolocation, and genetic data
  • Automated decision-making transparency requirements may force AI systems to disclose when personal data influences search results

International Influence

The EU's regulatory approach is influencing global standards:

The Digital Services Act requires large platforms to provide transparency around content moderation and algorithmic recommendations. While focused on content, similar principles could apply to AI search results.

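The so-called right to explanation under GDPR rests on Article 22, which restricts decisions based solely on automated processing, together with the transparency provisions in Articles 13 to 15, which entitle EU residents to meaningful information about the logic involved. This could extend to understanding why specific personal information appears in AI overviews.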

Self-Regulation and Industry Standards

Major tech companies are developing voluntary AI principles, though critics argue these lack enforcement mechanisms:

  • Google's AI Principles commit to avoiding unfair bias and building in privacy protections, but provide no specific commitments around personal data in search results
  • The Partnership on AI brings together companies, researchers, and civil society to develop best practices
  • Technical standards organizations like NIST are developing AI risk management frameworks

The effectiveness of self-regulation remains to be seen, but it may shape regulatory approaches.

What to Watch For

Key indicators that meaningful change is coming:

  • Data broker class action lawsuits that successfully challenge the legal basis for collecting and selling personal information
  • State attorneys general enforcement actions against AI companies for privacy violations
  • Federal Trade Commission guidance specifically addressing AI's use of personal data
  • Platform-level changes where Google, Microsoft, or other AI providers implement opt-out mechanisms in response to user pressure

The trajectory is toward greater regulation, but the timeline remains uncertain. In the meantime, individual action remains your best protection.

How GhostMyData Monitors for AI-Related Data Exposure

The challenge with protecting yourself from AI search exposure is the sheer scale of the problem. There aren't just a handful of data brokers feeding information to AI systems—there are thousands. Manually opting out is like playing whack-a-mole: you remove your data from one site, and it pops up on three others you didn't know existed.

This is where automated monitoring becomes essential. GhostMyData was built specifically to address the scale problem. Instead of covering 35-50 data brokers like most privacy services, GhostMyData scans 2,100+ data broker sites—including the obscure secondary aggregators and niche people search engines that most people never find on their own.

How the System Works

Twenty-four AI agents continuously monitor data broker sites for your personal information. When they detect a listing, they automatically submit removal requests using the specific opt-out procedures for each site. This isn't just form-filling: the system adapts to each broker's unique verification requirements, follows up on pending requests, and re-submits when sites don't respond.

The monitoring is ongoing because data brokers constantly refresh their databases. Information you removed last month may reappear from a new source. GhostMyData's system detects these re-appearances and initiates new removal requests automatically.
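The idea of detecting re-appearances can be illustrated with a simple comparison between two scans. To be clear, this is a generic sketch and not GhostMyData's actual implementation: the `diff_scans` function and the representation of a listing as a `(site, profile_id)` pair are assumptions made for the example.

```python
def diff_scans(previous: set, current: set, removed: set) -> dict:
    """Compare the latest scan against the prior scan and past removals.

    Each listing is a (site, profile_id) tuple.
    """
    return {
        # Listings never seen before, on any scan or removal list.
        "new": current - previous - removed,
        # Listings that were removed earlier but are visible again.
        "reappeared": current & removed,
        # Listings present last time that have since disappeared.
        "gone": previous - current,
    }


# Hypothetical scan results.
previous = {("spokeo.com", "p1"), ("intelius.com", "p9")}
removed = {("whitepages.com", "w3")}
current = {
    ("intelius.com", "p9"),
    ("whitepages.com", "w3"),
    ("newbroker.example", "n1"),
}

for label, listings in diff_scans(previous, current, removed).items():
    print(label, sorted(listings))
```

Anything in the "reappeared" bucket is a candidate for a fresh removal request, which is the step that is easy to miss when monitoring by hand.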

AI Search-Specific Monitoring

As AI search systems evolve, so does the monitoring approach. GhostMyData tracks:

  • Source attribution: When your information appears in search results, identifying which data brokers are being cited as sources
  • AI-specific data brokers: Some newer data aggregators exist specifically to serve AI training and retrieval systems
  • Emerging exposure vectors: As new AI search products launch (Microsoft Copilot, Perplexity, etc.), monitoring expands to cover their data sources

You can start with a free scan to see exactly what information about you is currently available across the data broker ecosystem. The scan reveals not just that your data exists, but specifically where it's located, what details are exposed, and which sites are most likely to be cited by AI search systems.

The Practical Difference

Consider the math: if you manually opt out from data bro
