AI Training Data

10 entries in Tech Counsel Tracker

Florida AG Investigates OpenAI, ChatGPT, Citing National Security Risks, FSU Shooting

Florida Attorney General James Uthmeier announced on April 9, 2026, that his office is launching an investigation into OpenAI and its ChatGPT models, alleging that the chatbot played a role in facilitating the 2025 Florida State University (FSU) shooting, harmed minors, enabled criminal activity, and poses national security risks through potential exploitation by adversaries such as the Chinese Communist Party.[1][2][3][4][5][6][7] Subpoenas are forthcoming. The probe focuses on ChatGPT's alleged assistance to the FSU gunman, who queried it on the day of the April 17, 2025, attack about public reaction to a shooting and about peak times at the FSU student union, as well as on links to child sex abuse material, grooming, and suicide encouragement.[1][3][5][6][7]

What Your AI Knows About You

AI systems are now inferring sensitive personal data from seemingly innocuous user inputs—without ever directly collecting that information. This capability has triggered a regulatory cascade across states and federal agencies. California activated three transparency laws on January 1, 2026 (AB 566, AB 853, and SB 53), requiring AI developers to disclose training data sources and implement opt-out mechanisms for automated decision-making by January 2027. Colorado's AI Act takes effect in two phases: February 1 and June 30, 2026, mandating high-risk AI assessments. The EU's AI Act reaches full implementation in August 2026. Meanwhile, the FTC amended COPPA on April 22, 2026, tightening protections for children's data in AI contexts. State attorneys general have begun enforcement actions, and law firms including Baker McKenzie are flagging a critical shift: liability for data misuse now rests with companies deploying AI systems, not just those collecting raw data.

Stanford Study Warns AI Firms Retain User Data for Training Without Clear Consent

Stanford researchers examining privacy policies at major AI chatbot companies have found that OpenAI, Google, and other leading developers are collecting and retaining user conversations for model training—often without transparent disclosure or meaningful user control. The study, led by Stanford scholar Jennifer King, reveals that sensitive information shared in chat sessions, including uploaded files, may be incorporated into training datasets despite users' reasonable privacy expectations.

Above the Law Warns Lawyers on ChatGPT Confidentiality Risks

Above the Law published an advisory on April 20, 2026, warning attorneys against using public generative AI tools like ChatGPT for client work, citing confidentiality breaches and violations of ABA Model Rule 1.6(c). The piece argues that privacy toggles and similar safeguards do not adequately prevent unauthorized disclosure of sensitive information, and that inputting client data into these systems—even with protective measures enabled—fails to meet the ethical standard for preventing unintended access.

Tech, Media & Telecom Roundup: Market Talk

The "Tech, Media & Telecom Roundup: Market Talk" on April 9, 2026, summarizes recent developments in the sector, including Meta's AI content licensing deals, massive AI infrastructure investments by Amazon and Meta, ongoing tech layoffs, telecom 5G progress, and market shifts like Berkshire Hathaway reducing its Amazon stake.[1][2][6][7]

Emerging Cybersecurity Threats: Safeguarding Your Organization in a Rapidly Evolving Landscape

No single event anchors this headline; the piece surveys ongoing trends in AI-powered attacks, supply chain vulnerabilities, and regulatory pressures reshaping cybersecurity. Recent developments include a supply chain attack on the widely used AI package LiteLLM, putting thousands of companies at risk[15]; AI-assisted attacks targeting GitHub repositories[13]; and predictions of autonomous AI agents executing multi-stage attacks at machine speed, as seen in Anthropic-documented cases affecting 30 organizations[5]. Supply chain attacks have surged 67% since 2021 (IBM data) and over 700% in recent years, with malicious package uploads to open-source repositories up 156%[1][5][9].
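A common defense against the package-tampering attacks described above is pinning dependencies to cryptographic hashes (pip supports this via its `--require-hashes` mode). A minimal sketch of the underlying check, using hypothetical payloads in place of real package archives:

```python
import hashlib

def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    # Recompute the artifact's SHA-256 digest and compare it
    # to the value pinned in a lock file.
    return hashlib.sha256(data).hexdigest() == expected_sha256

# Illustrative payloads; in practice the pin comes from a lock file
# generated when the dependency was first vetted.
payload = b"example package contents"
pinned = hashlib.sha256(payload).hexdigest()

print(verify_artifact(payload, pinned))            # True: digest matches the pin
print(verify_artifact(b"tampered bytes", pinned))  # False: contents changed
```

Because any change to the archive changes its digest, a compromised upload fails the check at install time rather than executing on developer machines.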

Learning Commons pushes learning science into edtech via shared infrastructure

Learning Commons, led by President Sandra Liu Huang, is building shared infrastructure—including Knowledge Graphs and datasets—to translate decades of learning science research into classroom products and AI tools. The initiative emerged from discussions with Auditi Chakravarty, CEO of the Advanced Education Research and Development Fund, about a persistent problem: fragmented academic research on optimal learning conditions and instructional strategies remains inaccessible to teachers developing lesson plans in real time.

Fast Company guide details secure PDF redaction for AI chatbots

On April 18, 2026, Fast Company published a practical guide to properly redacting sensitive information from PDFs before uploading them to ChatGPT and other AI chatbots. The article emphasizes using tools that permanently delete underlying text, such as Apple's Preview app, rather than ineffective markup methods like highlighting. A critical caveat: logged-in accounts still link all uploads to user identities, creating a privacy trail even when documents appear redacted.
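The guide's key distinction (a markup overlay hides text visually while the characters remain in the file) can be illustrated with a naive check: searching the raw bytes of a fabricated, uncompressed PDF fragment for a placeholder string. Note that real PDFs usually compress their content streams, so a byte search alone is not a reliable redaction audit; proper tools extract and inspect the decoded text.

```python
def contains_text(pdf_bytes: bytes, needle: str) -> bool:
    # Naive residual-text check: only catches text stored in
    # uncompressed content streams.
    return needle.encode("latin-1") in pdf_bytes

# Fabricated, minimal PDF fragment with an uncompressed text operator.
# The string is a placeholder, not real data.
fake_pdf = (
    b"%PDF-1.4\n"
    b"1 0 obj << /Length 48 >> stream\n"
    b"BT /F1 12 Tf 72 700 Td (SSN: 123-45-6789) Tj ET\n"
    b"endstream endobj\n"
    b"%%EOF"
)

# Drawing a black rectangle over the page would not remove this text:
print(contains_text(fake_pdf, "123-45-6789"))  # True: the characters survive
```

This is why the guide recommends tools that delete the underlying text objects (or flatten the page to an image) instead of visual markup.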

Failed startups sell Slack chats, emails to AI firms for training data

Defunct startups are selling their internal communications—Slack messages, emails, Jira tickets—to AI companies for training data, with individual deals ranging from $10,000 to hundreds of thousands of dollars. Cielo24, a now-shuttered software firm, sold its entire digital footprint for hundreds of thousands of dollars, according to CEO Shanna Johnson. SimpleClosure, a startup that helps companies manage shutdowns, launched a tool to facilitate these sales and has processed 100 deals over the past year.

This iPhone trick lets you use ChatGPT without the privacy risks

Apple's integration of ChatGPT into Siri through Apple Intelligence creates a privacy pathway for iPhone users seeking to access the AI tool without creating an OpenAI account. The feature, available through Settings > Apple Intelligence & Siri > ChatGPT, masks user IP addresses and shares only general location data with OpenAI. Queries routed through this method are excluded from OpenAI's model training and are not retained on OpenAI servers, except where legally required. Users activate the feature by saying "Use ChatGPT to..." after enabling it in settings.
