Billions of AI conversations with teenagers happen every day. The companies that run these platforms hold all the data but share none of it with the public. Parents, educators, and researchers remain completely in the dark. We're building the open-source platform to change that.
The data to understand AI's impact on teenagers already exists. It lives inside billions of private conversations on platforms run by OpenAI, Anthropic, Google, and others. The problem is not that this data doesn't exist. The problem is access. LLM companies sit on the data but have no incentive to reveal what it shows. Teenagers can't self-diagnose because AI's deception is invisible to the person being deceived. Parents and educators have zero visibility. Researchers have no dataset.
AI-teenager interactions are locked inside private chats on platforms run by companies focused on growth, not transparency. These companies possess the data (every conversation is logged on their servers), but they face structural disincentives to publish findings about their products' impact on minors: publishing harmful patterns would invite regulation, reduce usage, and damage brand value. They may publish safety reports, but never the raw interaction data that independent researchers need. Meanwhile, teens can't self-diagnose manipulation, because AI communicates with such confidence and warmth that distinguishing good guidance from dangerous guidance requires expertise most adults don't have.
Build an open-source, privacy-first platform where teenagers, parents, and eventually LLM companies themselves submit AI chat logs for independent analysis. Strip all personal information before anything is stored. Analyze patterns across six behavioral dimensions. Create the first collective intelligence map of where AI is enabling humans and where it's harming them. Make AI visible. Protect privacy absolutely. Give humanity the data it needs. Think of it as the Wikipedia of AI transparency, or the Common Crawl of AI-human interaction data.
A closed team of teenagers can never out-engineer a global community of thousands of privacy engineers, NLP researchers, psychologists, and developers. Every hard problem is an invitation, not a barrier. Being open-source is not just a distribution strategy. It is our core engine of innovation.
We define clear, measurable goals and publicly document the specific technical problems standing in the way. Every challenge is framed as an open question, not a closed decision. We publish what we know, what we've tried, and where we're stuck.
We share our current best solution to each problem, including all code, test results, and reasoning. Nothing is held back. The community can see exactly where the project stands, evaluate trade-offs, and build on what exists rather than starting from scratch.
The community is invited to propose, test, and submit better solutions. A privacy engineer in Berlin redesigns the token architecture. A psychology researcher in Toronto refines the scoring model. Each contribution is tested and, if proven better, adopted. Nothing is ever treated as final.
A four-layer system designed from the ground up so that personally identifiable information is stripped before anything is stored. Your identity is never kept. Only patterns are revealed.
Teens and parents submit AI chat logs through a simple, mobile-friendly portal. One click. Full anonymity. No account required. No identity collected.
Automated pipeline strips all personal information: names, emails, schools, locations, phone numbers. Raw log is destroyed immediately. Only the anonymized version survives. PII retention is architecturally impossible.
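The scrubbing stage can be sketched in a few lines. This is an illustrative fragment, not the production pipeline: only two easy PII classes (emails, phone numbers) are shown as regex passes, and names, schools, and locations would need a named-entity model on top.

```python
import re

# Regex passes for the mechanically detectable PII classes. Each match is
# replaced with a typed placeholder so conversational structure survives.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def scrub(text: str) -> str:
    """Return an anonymized copy; the raw string is never written to disk."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

In the real pipeline this anonymized copy is the only version that survives; the raw input is discarded as soon as the function returns.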
NLP scoring across six behavioral dimensions, paired with human expert review for academic validity. Experts see only anonymized data. Algorithmic pre-triage routes flagged items to the review queue.
Results flow into a living AI Influence Map: a public dashboard showing where AI helps and harms. Submitters get personal reports via anonymous tokens. Open anonymized datasets published for research.
These are the hard problems that will determine whether MakeAIVisible succeeds. Following our Identify, Share, Crowdsource model, we treat each one as an open invitation: we share what we know and where we're stuck. Every challenge is a call for help, not an admission of defeat.
Chat logs contain deeply personal information about minors: names, schools, friends, family members, locations, and emotional disclosures. The pipeline must achieve near-zero false negatives while preserving conversational structure. Must process 50K tokens in under 10 seconds. Raw data must be destroyed after processing with zero persistence. A PII leak would destroy trust permanently.
Submitters need to retrieve their analysis results without any link to their identity. The system must resist correlation attacks: an adversary should not be able to link a submission to a community post, a session to an IP, or a report token to a person. Three independent token types (Session, Report Access, Community Handle) with no foreign keys between identity service and data store.
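One way to make the no-foreign-keys property concrete: issue the three tokens independently at random, and let the data store keep only a hash of the report token. The function names and token sizes below are illustrative assumptions, not the platform's actual design.

```python
import hashlib
import secrets

def new_tokens() -> dict:
    """Issue three mutually independent random tokens. None is derived
    from another, so possessing one gives no correlation path to the rest."""
    return {
        "session": secrets.token_urlsafe(32),      # ties one browser session together
        "report": secrets.token_urlsafe(32),       # retrieves the personal report
        "handle_seed": secrets.token_urlsafe(32),  # seeds community handles
    }

def report_lookup_key(report_token: str) -> str:
    """The data store keeps only this hash, never the token itself, so a
    database dump cannot be replayed to fetch anyone's report."""
    return hashlib.sha256(report_token.encode()).hexdigest()
```

Because the identity service never sees these values and the data store never sees an identity, there is no join an adversary could perform even with full access to one side.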
Build the NLP engine that scores AI-teen conversations across: (1) Autonomy vs Dependence, (2) Critical Thinking vs Passive Validation, (3) Help-Seeking vs Replacement, (4) Healthy Emotional Use vs Substitution, (5) Challenge vs Flattery, (6) Productive Brainstorming vs Thought Outsourcing. Each scored 0-100. Must be valid, reproducible, and defensible under academic scrutiny.
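A minimal schema for one conversation's scores might look like the sketch below; the field names are our own shorthand for the six axes, and the validation simply enforces the 0-100 range.

```python
from dataclasses import dataclass, fields

@dataclass
class InfluenceScores:
    """One conversation's scores on the six behavioral dimensions, 0-100."""
    autonomy: int           # Autonomy vs Dependence
    critical_thinking: int  # Critical Thinking vs Passive Validation
    help_seeking: int       # Help-Seeking vs Replacement
    emotional_use: int      # Healthy Emotional Use vs Substitution
    challenge: int          # Challenge vs Flattery
    brainstorming: int      # Productive Brainstorming vs Thought Outsourcing

    def __post_init__(self):
        # Reject out-of-range scores up front so downstream aggregation
        # can assume a bounded [0, 100] scale.
        for f in fields(self):
            v = getattr(self, f.name)
            if not 0 <= v <= 100:
                raise ValueError(f"{f.name} must be 0-100, got {v}")
```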
Teens use ChatGPT, Claude, Gemini, Copilot, and others. Each exports in a different format (.json, .txt, .pdf), and some offer no export at all. The parser must normalize whatever arrives into a standard schema while detecting conversation structure, turn-taking, and system/user/assistant roles. New platforms launch constantly and formats change without notice.
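The normalization step can be sketched as a dispatch into one common schema. Both input formats here are assumptions for illustration: a hypothetical JSON export (these field names are not any vendor's real schema) and a plain-text transcript with `role: text` lines.

```python
import json

def normalize(raw: str, fmt: str) -> list:
    """Map a supported export format into the standard schema:
    a list of {"role": ..., "content": ...} turns."""
    if fmt == "json_messages":
        # Hypothetical export shape: {"messages": [{"role": ..., "content": ...}]}
        data = json.loads(raw)
        return [{"role": m["role"], "content": m["content"]}
                for m in data["messages"]]
    if fmt == "plain_txt":
        # Lines like "User: hello" / "Assistant: hi there"
        turns = []
        for line in raw.splitlines():
            role, sep, text = line.partition(": ")
            if sep:
                turns.append({"role": role.strip().lower(), "content": text})
        return turns
    raise ValueError(f"unsupported format: {fmt}")
```

In practice each real platform gets its own branch (or plugin), all converging on the same turn list that the anonymization and scoring stages consume.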
Design and build the public-facing aggregate dashboard. Real-time updates via SSE. Six-dimension gauges. Cohort filtering (age range, AI platform type). Must update within 10 seconds of new analysis completing. Must never expose individual data. Differential privacy applied for small cohorts (N < 10) to prevent re-identification.
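The small-cohort rule can be sketched as suppress-then-noise: cohorts under N=10 are never published, and larger ones get Laplace noise added to the mean. The epsilon value and the hand-rolled Laplace construction below are illustrative only; a production system would use a vetted differential-privacy library.

```python
import random

def publish_mean(scores: list, epsilon: float = 1.0, min_n: int = 10):
    """Return a noisy cohort mean, or None if the cohort is too small
    to publish without re-identification risk. Scores assumed in [0, 100]."""
    if len(scores) < min_n:
        return None                              # suppress small cohorts entirely
    sensitivity = 100 / len(scores)              # max influence of one record on the mean
    scale = sensitivity / epsilon
    # Laplace(scale) sampled as the difference of two exponentials with mean `scale`
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return sum(scores) / len(scores) + noise
```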
Algorithmic analysis alone cannot achieve academic and clinical validity. Human expert review is essential. But volume may reach thousands per day. Need an async review queue with algorithmic pre-triage (confidence scoring), optimistic locking (no double-review), and a batch review dashboard. Experts must only see anonymized data.
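The no-double-review guarantee hinges on the optimistic locking step: a claim succeeds only if the item's version is unchanged since the reviewer read it. The in-memory sketch below stands in for the database compare-and-swap (an `UPDATE ... WHERE version = ?` in SQL).

```python
# In-memory stand-in for the review queue table.
queue = {"conv-1": {"version": 0, "reviewer": None}}

def claim(item_id: str, reviewer: str, seen_version: int) -> bool:
    """Claim an item for review. Fails if another reviewer claimed it
    (version moved) after `seen_version` was read."""
    item = queue[item_id]
    if item["version"] != seen_version or item["reviewer"] is not None:
        return False  # stale read: someone else got there first
    item["reviewer"] = reviewer
    item["version"] += 1
    return True
```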
The platform is only valuable with data. Getting teenagers to submit their AI conversations requires overcoming apathy, privacy concerns, and friction. This is not a technical problem but a human problem. The best platform in the world is useless if nobody submits. Target: 500+ submissions in beta, 10,000+ at scale.
Build a real-time community board where users discuss AI influence topics with full anonymity. System-generated handles consistent within a thread but uncorrelatable across threads (one-way hash of session_token + thread_id + salt). 5-level nesting. SSE-powered live updates. No IP logging. No identity stored.
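The handle scheme described above (a one-way hash of session_token + thread_id + salt) can be sketched with an HMAC; the salt value and handle format below are placeholders.

```python
import hashlib
import hmac

SERVER_SALT = b"replace-with-secret-server-state"  # placeholder value

def thread_handle(session_token: str, thread_id: str) -> str:
    """Same user + same thread -> same handle; any other thread -> an
    unrelated handle. Without the server-side salt, handles cannot be
    recomputed or correlated offline."""
    digest = hmac.new(SERVER_SALT,
                      f"{session_token}:{thread_id}".encode(),
                      hashlib.sha256).hexdigest()
    return f"anon-{digest[:10]}"
```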
The social media strategy is not supplementary; it is the primary data collection mechanism. Without viral adoption among teenagers, the platform has no data. Must make teens aware that AI's influence on them is hidden from public view, and make submitting chats feel like an act of collective empowerment.
Design the research methodology that gives the AI Influence Map academic credibility. IRB-compatible consent frameworks for minors. Statistical models for aggregate pattern analysis. Peer-review-ready documentation. The project's findings must be defensible enough to influence policy.
Stateless, horizontally scalable backend. Load balancing. Encrypted storage (AES-256). Database read replicas for analytics. Pub/sub broker for SSE. Background job queues. CI/CD pipeline. 99.5% uptime target. Zero PII in application logs. Network segmentation between data stores.
Design the contribution model, code of conduct, data access policies, and licensing framework. How do we keep this truly open while protecting the mission and the data? How do we ensure community representation in decisions? How do we formalize non-profit governance with transparent financials?
This is a crowdsourced project. A closed team of teenagers can never out-engineer a global community. Here's how you become part of making AI visible.
Design the anonymization pipeline. Build the token architecture. Ensure zero PII leakage. Prevent correlation attacks. Audit everything. Make this platform trustworthy by design, not by promise.
Build the behavioral analysis engine. Design scoring models for the six influence dimensions. Train classifiers on anonymized conversation patterns. Contribute the science that makes the invisible visible.
Build the submission portals, the dashboard, the community board, and the admin tools. Write the code that powers transparency. All repos are open. Pick an issue and start building.
Design interfaces that teens actually want to use. Make data submission feel effortless. Visualize the AI Influence Map in ways that make patterns undeniable and shareable.
Design the research methodology. Build IRB-compatible frameworks. Analyze the open dataset. Publish findings. Give this project the academic rigor it needs to influence policy and protect kids.
Create the #MakeAIVisible social media campaign. Produce videos, memes, infographics. Tell the story that makes millions of teens submit their chats. This is the primary data collection mechanism.
Access the open anonymized dataset. Build your own analyses. Find patterns we missed. Publish your findings. The data belongs to humanity. Use it to understand what AI is doing to a generation.
Bring MakeAIVisible to your school. Run workshops. Help students understand AI's hidden influence. Drive submissions through educational programs. Build the next generation of AI-literate citizens.
The most important role. Dump your AI chats with us. You don't need to know what's wrong. Just submit everything. We'll analyze it for you and show you where AI is taking you. Confidential. Anonymous. Free.
Two layers with strict architectural separation. Raw identity is locked down absolutely. Anonymized patterns are open to everyone. No database view, API endpoint, or application function can ever correlate across these stores. This is enforced by network segmentation, not just policy.
Raw identity layer: project managers only. Zero exceptions. Architecturally enforced.
Anonymized pattern layer: everyone. Researchers. Journalists. Educators. The public.
Sign up for the launch notification. Pick a challenge. Join the community. WeMakeAIVisible.
No spam. No tracking pixels. No third-party analytics. We practice what we preach.