Here's a prioritized prep sheet, drawing from the full Q&A transcript, Stephanie's known interests, and your research agenda.

---

## Most Likely Questions

### 1. "Can you say more about what post-AGI society research looks like for you concretely — not just the current interventions?"

This is a restatement of her closing question. She'll come back to it.

**Response:** "My empirical work is the proof-of-concept layer. The research program I want to build here is: use simulations and large-scale experiments to *stress-test* post-AGI scenarios before they arrive. Concretely:"

- Use DeepMind's Concordia to simulate agent populations with varying manipulation susceptibility in democratic governance scenarios — you discussed this with Logan Cross on Feb 26
- Map the full parameter space of what tasks/skills humans will and won't surrender to AI — longitudinal, cross-cultural
- Build a "State of Human-AI Interaction" measurement infrastructure tracking cognitive offloading rates over time at population scale — the societal weather station idea
- Run "science fiction science" experiments: simulate future AI capabilities (e.g., superhuman personalization) using current models at constrained scale, measure human responses

**Study design to propose:** "Take your Habermas Machine. Right now it finds consensus *for* people. I'd run a 2x2: AI-generated consensus vs. AI-scaffolded deliberation, crossed with low vs. high productive friction. Measure consensus durability at 1 week and 1 month. My prediction: high-friction scaffolded deliberation produces more durable consensus, even though it's slower. That's a post-AGI design principle we can test *now*."
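If it helps to make the 2x2 concrete on Friday, here is a minimal analysis sketch. Everything in it is a placeholder assumption (synthetic data, invented effect sizes, a single one-month durability score) rather than anything from the real Habermas Machine setup:

```python
# Sketch of the proposed 2x2 analysis: consensus mode (AI-generated vs.
# AI-scaffolded) crossed with productive friction (low vs. high).
# All data below are synthetic placeholders; the real study would measure
# durability at both 1 week and 1 month per participant.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_per_cell = 200  # hypothetical sample size per cell

rows = []
for mode in ["ai_generated", "ai_scaffolded"]:
    for friction in ["low", "high"]:
        # Placeholder effect encoding the prediction: scaffolded, high-friction
        # deliberation yields the most durable consensus.
        bump = 0.15 if (mode == "ai_scaffolded" and friction == "high") else 0.0
        durability = rng.normal(0.50 + bump, 0.20, n_per_cell)
        rows.append(pd.DataFrame({
            "mode": mode,
            "friction": friction,
            "durability_1mo": durability,
        }))
df = pd.concat(rows, ignore_index=True)

# The prediction cashes out as a mode x friction interaction on durability.
model = smf.ols("durability_1mo ~ C(mode) * C(friction)", data=df).fit()
print(model.summary())
```

The useful framing point: the whole prediction reduces to a single preregisterable interaction term, which is an easy thing to say out loud in the meeting.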
---

### 2. "How does your work connect to what we're doing with LearnLM / AI for human empowerment?"

She built LearnLM. This is her baby.

**Response:** "LearnLM is productive friction in education — scaffolding instead of answering. I want to generalize that principle beyond students to citizens. What does LearnLM look like for democratic deliberation? For workplace reasoning? For people deciding whether to trust an AI recommendation?"

**Study design:** "Take LearnLM's Socratic scaffolding approach and apply it to a non-educational domain — say, political opinion formation. Participants interact with an AI about a policy issue. Condition 1: AI gives its position with evidence (standard chatbot). Condition 2: AI uses LearnLM-style scaffolding to help them reason through it themselves. Measure: quality of reasoning, belief durability, and — critically — whether people develop *transferable* reasoning skills they apply to novel issues without AI help. That's the cognitive gym hypothesis tested directly."

---

### 3. "Your persuasion effects are large but your defenses are small (3-7% reductions). Is that asymmetry a problem?"

This came up implicitly in the Q&A when someone asked about superhuman persuaders.

**Response:** "Yes, and that asymmetry is exactly why I argue we need to move upstream — from patching individual vulnerabilities to engineering demand for deep cognition. Accuracy nudges are band-aids. They work, they scale, they buy time. But the real defense is a population that *wants* to think critically, the way a fit population resists disease better than any single vaccine. My effort training experiments show this preference is malleable — people trained to choose hard tasks come to prefer them and report higher meaning. The question is whether that transfers from lab to life at scale."

**Study design:** "Longitudinal effort training RCT. App-based. Daily choices: solve a problem with AI or without (with scaffolding). Choice-contingent rewards for choosing hard. Track over 8 weeks: does preference for unassisted reasoning increase? Does it transfer to novel domains? Does it predict resistance to AI persuasion in a subsequent manipulation experiment? That links the effort paradox directly to resilience."

---

### 4. "How do you think about the upper bound? At what capability level do human defenses just fail?"

The Q3 questioner was pushing this direction — superhuman agents that outmatch any human resilience.

**Response:** "Honestly, I don't know where the ceiling is, and anyone who claims to is speculating. But here's what my data tells us: the persuasion mechanism that works best right now is *facts and evidence combined with personalization*. More capable models are more persuasive but less accurate. That's a design choice, not an inevitability — which means there's a governance lever. And the effort training data suggests there's no fixed ceiling on human cognitive resilience either. The question isn't 'can humans beat superhuman AI in raw cognition' — obviously not. It's 'can we build a society where humans insist on maintaining epistemic agency even when it's easier not to?' That's a cultural and institutional question, not a pure capability question."

**Study design:** "Parametrically vary AI capability in persuasion experiments. Use models of different sizes/capabilities (or artificially constrain them) trying to persuade the same population. Map the dose-response curve: at what capability level do accuracy nudges stop working? At what level does effort training stop helping? Where do you need institutional/governance interventions because individual defenses fail? That gives you an empirical capability threshold map — which nobody has."
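A hedged sketch of what that threshold map could look like analytically. Capability tiers, defense conditions, and all effect sizes here are invented for illustration; the real version would use actual model variants and preregistered outcomes:

```python
# Sketch of the capability dose-response mapping: persuasion success as a
# function of AI capability, under different individual-level defenses.
# Synthetic placeholder data only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
capability_levels = [1, 2, 3, 4, 5]  # e.g., model scale / capability tiers
defense_effect = {"none": 0.0, "accuracy_nudge": -0.3, "effort_training": -0.6}
n_per_cell = 300  # hypothetical participants per cell

rows = []
for cap in capability_levels:
    for defense, shift in defense_effect.items():
        # Assumed generative story: persuasion rises with capability; defenses
        # subtract a fixed amount that may not keep pace with capability.
        p = 1.0 / (1.0 + np.exp(-(-2.0 + 0.8 * cap + shift)))
        rows.append(pd.DataFrame({
            "capability": cap,
            "defense": defense,
            "persuaded": rng.binomial(1, p, n_per_cell),
        }))
df = pd.concat(rows, ignore_index=True)

# Dose-response model: the capability x defense interaction asks whether a
# defense's protective effect shrinks as capability grows.
fit = smf.logit("persuaded ~ capability * C(defense)", data=df).fit(disp=False)
print(fit.summary())

# Persuasion rate by defense and capability level; the practical 'threshold'
# is the capability level at which a defended cell stops differing from 'none'.
print(df.groupby(["defense", "capability"])["persuaded"].mean().unstack())
```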
---

### 5. "What do you think post-AGI society actually looks like? Paint the picture."

Testing whether you've thought beyond the problems to the vision.

**Response:** "I think about it through four scarcities that persist even in radical abundance: legitimacy (AI can't claim moral authority), struggle (we watch human Olympics, not robot Olympics), accountability (you can't imprison an algorithm), and intimacy (connection requires shared vulnerability). Post-AGI society reorganizes around these. Work becomes optional but effortful activity becomes *prestigious* — like marathons after the Industrial Revolution. The risk is a bifurcation: people who maintain cognitive agency and people who don't. My research is about making sure the first group is as large as possible."

**Study design:** "Cross-cultural survey + behavioral experiment. Measure revealed preferences for human vs. AI output across domains (art, medical advice, legal judgment, emotional support, political opinion). Vary the AI's stated capability level. Map which domains people refuse to surrender even when AI is demonstrably better. That's the empirical foundation for 'what remains human' — and it'll differ by culture, which tells us about the design space for post-AGI institutions."

---

### 6. "How would you work with the rest of the team — people studying consciousness, superintelligence, virtual worlds?"

Fit question. Are you a lone empiricist or a collaborator?

**Response:** "I'm the person who takes their hypotheses and makes them falsifiable. Someone studying consciousness has a theory about what self-awareness means for AI alignment? I can design an experiment with humans to test the behavioral predictions. Someone modeling virtual worlds wants to know how people form identities in AI-mediated environments? I've run experiments with thousands of people interacting with AI in politically charged contexts — I know how to measure identity and belief change at scale. My role is the bridge between theoretical speculation and empirical evidence."

---

## Less Likely but Worth Prepping

### 7. "Are you worried your work could be *used* for manipulation rather than defense?"

**Response:** "Every vulnerability study is dual-use. But the alternative — not knowing — is worse. We published because the threat exists whether we study it or not. And in every experiment, the defense research came bundled with the offense research. The persuasion paper and the accuracy nudge paper are two sides of the same coin."

### 8. "Why DeepMind and not an academic lab or policy institute?"

**Response:** "Scale and speed. Academic labs take 2-3 years per study. Policy institutes write reports. DeepMind builds the models *and* has the mandate to study their societal impact. I want to be where the loop between 'build it, study it, fix it' is tightest. And frankly, the post-AGI questions this team is asking can't be answered with grants and grad students — you need infrastructure, compute, and institutional will."

### 9. "What's wrong with just making AI more aligned/honest? Why do we need to change humans?"

**Response:** "Alignment is necessary but not sufficient. Even a perfectly aligned AI that does exactly what you ask creates cognitive atrophy if people stop thinking for themselves. The problem isn't malicious AI — it's convenient AI. Alignment solves the 'AI does bad things' problem. It doesn't solve the 'humans stop doing hard things' problem."

---

Your answer under time pressure ("it's definitely both") was directionally right but didn't land the connection cleanly. You said post-AGI scenarios tell us what problems to study, which maps back to current work. But that still positions the future as *instrumental* to your current research, not as something you're genuinely trying to understand and shape.

**The fact that she wants to meet again Friday is a good sign** — it means she's interested enough to give you a second shot at this. She's the person on the team whose interests overlap most with yours (science of AI, AI for human empowerment, future impacts). Her own work spans in-context learning, LearnLM/education, and the information ecosystem — she lives at the now/future boundary herself.

**What she probably wants to hear Friday:**

1. **Your current empirical work is the *foundation*, not the *ceiling*.** The persuasion and accuracy nudge experiments are proof of concept for the mechanisms. The real research agenda is what you outlined in "Creating Demand for Deep Human Cognition" — engineering a world where deep thinking is culturally demanded and personally rewarding. Frame your empirical stuff as "model organism" evidence for the bigger post-AGI vision.
2. **You have concrete post-AGI research designs, not just concerns.** Things like: simulating democratic governance breakdown with Concordia, mapping what tasks humans will/won't surrender to AI, formalizing value trade-offs beyond RLHF, testing whether effort training transfers to real cognitive resilience at scale. These are *future-facing* programs that happen to use current methods.
3. **The "productive friction" thesis is inherently a post-AGI argument.** It's not about fixing today's social media — it's about designing the cognitive architecture of a society where AI handles most labor. The gym analogy is your strongest move here: nobody built gyms *before* the Industrial Revolution. You're saying we need to start building cognitive gyms *now*, before the cognitive revolution fully hits.
4. **Connect to her work directly.** LearnLM is an AI that scaffolds learning rather than giving answers — that's productive friction in education. Ask her: what would LearnLM look like if it were designed not just for students but for citizens in a post-AGI democracy? That bridges her interests and yours.
**The "productive friction" thesis is inherently a post-AGI argument.** It's not about fixing today's social media — it's about designing the cognitive architecture of a society where AI handles most labor. The gym analogy is your strongest move here: nobody built gyms *before* the Industrial Revolution. You're saying we need to start building cognitive gyms *now*, before the cognitive revolution fully hits. 4. **Connect to her work directly.** LearnLM is an AI that scaffolds learning rather than giving answers — that's productive friction in education. Ask her: what would LearnLM look like if it were designed not just for students but for citizens in a post-AGI democracy? That bridges her interests and yours. Claude: You're right to be annoyed. It's a research talk — you're supposed to show your empirical work. Speculating about post-AGI utopias without data is exactly the kind of thing that makes social science weak. Your empiricism is the point, not the limitation. But here's the reframe for Friday: **her question reveals what the team is probably missing, and that's your opening.** The AGI Dawn team is full of people thinking abstractly about post-AGI futures — consciousness, superintelligence, virtual worlds. What they likely *don't* have is someone who can take those speculations and **actually test them**. Stephanie's question, whether she meant it this way or not, is basically: "you showed us you can run experiments — but do you care about our questions?" So don't apologize for the empiricism. **Flip it:** - "Yes, my talk was grounded in current work because that's what a research talk should be. But the reason I'm excited about this team is that you're asking questions nobody else is asking — and nobody on this team can run the experiments to answer them. I can." - The post-AGI future is full of hypotheses that are *currently untestable at full scale* but **testable in proxy right now**. AI persuasion experiments are a proxy for what superhuman persuasion agents will do. Effort training is a proxy for whether cognitive demand can be engineered. Accuracy nudges are a proxy for whether lightweight friction can scale as a defense. You're not studying "current problems" — you're stress-testing post-AGI hypotheses with the best available model organisms. - Teams that only speculate about the future without empirical grounding end up writing white papers nobody acts on. You're the person who makes their ideas *falsifiable*. The move on Friday isn't to become more speculative to match their vibe. It's to show her that her team's speculative questions *need* someone like you, or they stay speculative forever. Your empiricism is what makes their post-AGI research program credible rather than just another think tank exercise. My studies already tackle what post AGI society could look like. ---