<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Serge’s Substack]]></title><description><![CDATA[AI Experiments]]></description><link>https://www.bulaev.net</link><image><url>https://substackcdn.com/image/fetch/$s_!gnwq!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85125d88-ad15-44a6-8101-226f60f63f4a_200x200.png</url><title>Serge’s Substack</title><link>https://www.bulaev.net</link></image><generator>Substack</generator><lastBuildDate>Fri, 01 May 2026 01:58:32 GMT</lastBuildDate><atom:link href="https://www.bulaev.net/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Serge Bulaev]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[sergeonsamui@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[sergeonsamui@substack.com]]></itunes:email><itunes:name><![CDATA[Serge Bulaev]]></itunes:name></itunes:owner><itunes:author><![CDATA[Serge Bulaev]]></itunes:author><googleplay:owner><![CDATA[sergeonsamui@substack.com]]></googleplay:owner><googleplay:email><![CDATA[sergeonsamui@substack.com]]></googleplay:email><googleplay:author><![CDATA[Serge Bulaev]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Why Model Distillation Matters - Part 2: When It Became Geopolitical]]></title><description><![CDATA[In Part 1, we covered what model distillation is, how OpenAI&#8217;s tools work, and what teams learned from a year of implementation.]]></description><link>https://www.bulaev.net/p/why-model-distillation-matters-part</link><guid 
isPermaLink="false">https://www.bulaev.net/p/why-model-distillation-matters-part</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Wed, 26 Nov 2025 13:22:45 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b0736b15-c38f-429f-a884-f489e94b26af_1920x1008.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em><a href="https://www.bulaev.net/p/why-model-distillation-matters-and?r=223lh4">In Part 1,</a> we covered what model distillation is, how OpenAI&#8217;s tools work, and what teams learned from a year of implementation. Now we&#8217;re getting into the part nobody saw coming: how distillation became a flashpoint in US-China AI competition.</em></p><h2><strong>The DeepSeek Controversy: When Distillation Became a Geopolitical Issue</strong></h2><p>In January 2025, distillation suddenly became front-page news in a way nobody predicted. DeepSeek, a Chinese AI startup, released their R1 model claiming it cost only $6 million to train and performed comparably to frontier models. The AI world lost its mind.</p><p><a href="https://fortune.com/2025/01/29/deepseek-openais-what-is-distillation-david-sacks/">OpenAI accused DeepSeek</a> of using model distillation on OpenAI&#8217;s models without authorization, allegedly extracting reasoning outputs through API access to train their own competing system. David Sacks, the White House AI czar, stated there was &#8220;substantial evidence&#8221; of this distillation, though specifics weren&#8217;t made public.</p><p>Here&#8217;s what made this different from normal distillation: OpenAI&#8217;s terms of service explicitly prohibit using their models for distillation purposes. 
If DeepSeek did what OpenAI claims, they weren&#8217;t just using a public technique &#8211; they were allegedly violating terms of service at massive scale to build a competitor.</p><p>The accusations raised uncomfortable questions that the industry hadn&#8217;t really grappled with:</p><p><strong>The legal gray zone:</strong> Model distillation itself is a<a href="https://news.umich.edu/unpacking-deepseek-distillation-ethics-and-national-security/"> normal, legal practice if the source model&#8217;s license permits it</a>. The issue is when someone uses API access to a closed model that explicitly forbids distillation. But proving this happened is remarkably difficult. Since only the final model is public, not the training data, the burden of proof falls on the accuser.</p><p><strong>The irony problem:</strong> Some experts pointed out the hypocrisy of OpenAI complaining about terms of service violations when they likely trained ChatGPT on copyrighted content from publishers like Forbes and The New York Times, also against terms of service. The whole thing got messy fast.</p><p><strong>The security angle:</strong> A<a href="https://www.hpcwire.com/2025/04/24/us-lawmakers-accuse-deepseek-of-data-harvesting-espionage-and-ai-theft/"> bipartisan House report called DeepSeek a &#8220;profound threat&#8221;</a> to national security, alleging it siphons data back to China and creates security vulnerabilities. Multiple governments banned DeepSeek from official devices, including the US Congress, NASA, Taiwan, Japan, and South Korea.</p><p><strong>The detection problem:</strong> OpenAI and Microsoft started working together to identify accounts attempting distillation, revoking access when detected. But this is reactive, not preventive. 
By the time you catch someone, they might already have the data they need.</p><p>What emerged from all this is that distillation, which seemed like a straightforward technical optimization a year ago, suddenly became tangled up in<a href="https://www.fenwick.com/insights/publications/deepseek-model-distillation-and-the-future-of-ai-ip-protection"> intellectual property law, national security concerns, and geopolitical tensions</a>. The AI industry saw a staggering 99% price drop in just two years, from $0.02 per thousand tokens in early 2023 to $0.00014 with DeepSeek&#8217;s pricing. That kind of commoditization raised fundamental questions about how AI companies maintain competitive advantage.</p><p>The practical upshot: OpenAI said it would take &#8220;steps to prevent distillation&#8221; and work closely with the US government to protect the most capable models. Other providers got more aggressive about rate limiting and detection. The era of freely accessible API access to frontier models started tightening up.</p><h2><strong>Meanwhile, The Legitimate Distillation Ecosystem Kept Growing</strong></h2><p>While OpenAI and DeepSeek were fighting, the legitimate distillation tools kept improving. The contrast was stark &#8211; all this controversy happened while major cloud providers were actively building out distillation features as core platform capabilities.</p><p><a href="https://www.anthropic.com/news/trainium2-and-distillation">Anthropic announced distillation support for Claude 3 Haiku</a> in Amazon Bedrock in October 2025, with the distilled Haiku achieving Claude 3.5 Sonnet-like accuracy for specific tasks at the same price and speed as their most cost-effective model. 
Amazon Bedrock Model Distillation automated the entire process, generating synthetic training data and applying different data synthesis methods without requiring developers to manually craft training examples.</p><p><a href="https://techcommunity.microsoft.com/blog/azure-ai-foundry-blog/introducing-enhanced-azure-openai-distillation-and-fine-tuning-capabilities/4372173">Microsoft expanded Azure OpenAI distillation capabilities</a> in January 2025, adding more regions and models to their Stored Completions feature, plus a comparison experience for evaluating distilled models against base teacher models. The enterprise players were clearly betting that distillation would become standard infrastructure, not a controversial edge case.</p><p>The difference between these implementations and the DeepSeek situation? Transparency and authorization. These were officially supported features with clear terms, not unauthorized scraping of API outputs.</p><h2><strong>What This Means Practically (One Year In)</strong></h2><p>If you&#8217;re building AI products now, here&#8217;s what actually matters based on what worked and what didn&#8217;t over the past year:</p><p><strong>Don&#8217;t start with distillation.</strong> Start with the smallest model that might work and good prompts. Only invest in distillation when you have clear evidence you need it &#8211; usually when costs at scale become painful or when base model performance consistently falls short.</p><p><strong>But do start logging.</strong> If you think you might eventually need distillation, turn on stored completions from day one. Building that dataset costs you nothing and gives you options later. The teams that did this had a huge advantage when they decided to fine-tune.</p><p><strong>Use official tools only.</strong> The DeepSeek situation made this crystal clear: if you&#8217;re going to distill, use officially supported features from your provider. 
Don&#8217;t try to get clever with unauthorized API scraping or violating terms of service. The legal and reputational risks aren&#8217;t worth it, and detection is getting better.</p><p><strong>Actually build those evaluation metrics.</strong> This is still the step everyone wants to skip, and it&#8217;s still the difference between success and failure. Every team that did distillation successfully had solid evals first. Every team that struggled didn&#8217;t.</p><p><strong>Expect iteration.</strong> Your first distilled model won&#8217;t be good enough. Budget for 3-5 rounds of refinement. The teams that succeeded planned for this from the start.</p><p><strong>Check if you even need it.</strong> A year of data suggests most applications don&#8217;t actually need distilled models. Base small models plus prompt engineering handles more use cases than people expected. Distillation is an optimization for scale, not a default approach.</p><p><strong>Consider the IP implications.</strong> If you&#8217;re distilling models and the technique creates real competitive advantage, think about patent protection. The legal frameworks around AI IP are still evolving, but patents may offer better protection than copyright for model architecture and training techniques.</p><h2><strong>The Commoditization Question</strong></h2><p>DeepSeek&#8217;s pricing - even if the $6 million training cost is disputed - points to a bigger trend. When model capabilities can be distilled and replicated at a fraction of the original cost, what happens to competitive moats in AI?</p><p>Traditionally, software competition has been driven by product differentiation and economies of scale. 
But if distillation becomes a reliable way to capture most of a frontier model&#8217;s capabilities for a specific domain at 1/15th the cost, the value shifts elsewhere:</p><ul><li><p>Proprietary training data becomes more valuable than model architecture</p></li><li><p>Brand trust and reliability matter more than raw capabilities</p></li><li><p>Integration and distribution become the real competitive advantages</p></li><li><p>Legal and policy relationships (who gets access to what) become strategic assets</p></li></ul><p>We&#8217;re seeing the beginnings of this shift now. The companies winning aren&#8217;t necessarily those with the best base models - they&#8217;re the ones with the best data, the strongest partnerships with cloud providers, and the clearest path to regulatory compliance.</p><h2><strong>What Changed (And What Didn&#8217;t)</strong></h2><p>A year ago, model distillation looked like a straightforward cost optimization technique. Some things changed, some didn&#8217;t:</p><p><strong>What stayed the same:</strong></p><ul><li><p>The economics still work exactly as promised for high-volume use cases</p></li><li><p>The three-step process (evals, data generation, fine-tuning) is still the right approach</p></li><li><p>Data quality matters more than quantity</p></li><li><p>Most teams don&#8217;t actually need distillation</p></li></ul><p><strong>What changed:</strong></p><ul><li><p>Distillation became a geopolitical issue, not just a technical one</p></li><li><p>Terms of service enforcement got much more serious</p></li><li><p>The legal frameworks started catching up to the technology</p></li><li><p>Base small models improved enough to raise the bar for when distillation makes sense</p></li><li><p>Cloud providers made distillation a standard platform feature, not an advanced technique</p></li></ul><p><strong>What surprised everyone:</strong></p><ul><li><p>How fast pricing collapsed (99% in two years)</p></li><li><p>How quickly this became about national 
security</p></li><li><p>How difficult it is to prove unauthorized distillation happened</p></li><li><p>How much the controversy exposed gaps in AI IP protection</p></li></ul><h2><strong>Looking Back and Forward</strong></h2><p>A year ago, the really interesting thing about distillation seemed to be what it would enable. When you can take frontier model capabilities and compress them into fast, cheap, specialized tools, you unlock different categories of applications.</p><p>That part was right. Real-time voice translation, instant email triage, live customer service that actually works - these are all real now, and distillation is part of why they&#8217;re economically viable.</p><p>What was less obvious a year ago: how political this would become. Model distillation went from a technical optimization to a flashpoint in US-China AI competition. The DeepSeek controversy showed that as AI capabilities become more strategic, the techniques for transferring those capabilities become strategic too. Expect more export controls, more aggressive terms of service enforcement, and more legal battles over what constitutes legitimate model training versus IP theft.</p><p>The other surprise was how much the base small models improved on their own. GPT-4o-mini today is dramatically better than it was at DevDay 2024. Same with Claude 3.5 Haiku getting 60% faster on AWS Trainium2. That raised the bar for when distillation makes sense. You need higher volume or more specialized use cases to justify the investment.</p><p>The tools OpenAI announced made distillation easier, but easier doesn&#8217;t mean necessary. Most teams should probably optimize their prompts and model selection before investing in fine-tuning. 
But for the use cases where distillation does make sense - high volume, well-defined tasks, need for consistent behavior - it&#8217;s proven to be exactly as useful as promised.</p><p>The pattern is becoming clear, but it&#8217;s more complicated than we expected a year ago: use the smallest model that works, invest in customization only at scale, always start with evals, and stay firmly within the legal and ethical boundaries because those boundaries are now being actively enforced.</p><p>We&#8217;re still early, but not as early as we were. And we&#8217;re no longer just optimizing for performance and cost &#8211; we&#8217;re navigating geopolitics, IP law, and questions about what constitutes legitimate AI development versus theft. The technical challenges turned out to be the easy part.</p><div><hr></div><p><em>This is Part 2 of a two-part series on model distillation. Read<a href="https://www.bulaev.net/p/why-model-distillation-matters-and?r=223lh4"> Part 1</a> for the fundamentals and practical implementation guide.</em></p>]]></content:encoded></item><item><title><![CDATA[Why Model Distillation Matters (And How It’s Playing Out One Year Later) - Part 1]]></title><description><![CDATA[A year ago, at OpenAI DevDay 2024, they announced tools for model distillation that seemed like they&#8217;d change how people build AI products.]]></description><link>https://www.bulaev.net/p/why-model-distillation-matters-and</link><guid isPermaLink="false">https://www.bulaev.net/p/why-model-distillation-matters-and</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Mon, 24 Nov 2025 13:50:53 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/165016f1-e62e-46b5-a1c4-87d9450dd772_1024x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A year ago, at<a href="https://simonwillison.net/2024/Oct/1/openai-devday-2024-live-blog/"> OpenAI DevDay 2024</a>, they announced tools for model distillation that seemed like 
they&#8217;d change how people build AI products. Now that we&#8217;ve had twelve months to see how this actually plays out in practice, it&#8217;s worth looking back at what they were promising and what actually matters.</p><p>The core problem they were addressing: AI is simultaneously too powerful and not powerful enough. Too powerful in the sense that GPT-4o can answer graduate-level physics questions, but not powerful enough because most apps don&#8217;t need that &#8211; they just need something that works fast and doesn&#8217;t bankrupt you on API calls.</p><p>That&#8217;s where model distillation comes in. OpenAI made it easier to do, but more interesting is seeing who&#8217;s actually using it and what for.</p><h2><strong>The Teacher-Student Thing (But Actually Useful)</strong></h2><p>Model distillation is basically when a big, expensive AI model teaches a smaller, cheaper one. Think GPT-4o passing its knowledge down to GPT-4o-mini. The metaphor everyone uses is teacher-student, which honestly undersells how practical this is.</p><p>Here&#8217;s what actually happens: you take your massive model that knows everything but costs a fortune to run, and you use its outputs to train a focused, efficient model that does one specific thing really well. The small model learns to reproduce the big model&#8217;s behavior for your particular use case, without needing all that general knowledge weighing it down.</p><h2><strong>The Economics That Actually Drive This</strong></h2><p>Look at the numbers that matter. If you&#8217;re running a customer service bot that handles 10 million queries a month, GPT-4o at current pricing will cost you roughly $150,000. Switch to a distilled GPT-4o-mini that performs comparably on your specific use case? That drops to $10,000. That&#8217;s not optimization - that&#8217;s the difference between a feature that works financially and one that doesn&#8217;t.</p><p>Speed is the other constraint people underestimate. 
A 200ms response difference doesn&#8217;t sound like much until you&#8217;re building something interactive. Voice translation needs to feel real-time. Code completion needs to appear while someone&#8217;s still thinking. Chat interfaces need to respond before users get impatient. Large models, no matter how capable, introduce latency that breaks these experiences. Smaller models don&#8217;t just save money - they enable entirely different interaction patterns.</p><p>The accessibility angle is less obvious but possibly more important long-term. When your AI feature requires $50,000/month in API costs to run at meaningful scale, you&#8217;re limited to well-funded companies and heavily-used products. When you can get comparable performance from a distilled model running on modest infrastructure, suddenly independent developers and smaller companies can actually build things. The barrier to entry drops from &#8220;need venture funding&#8221; to &#8220;can afford a decent cloud instance.&#8221;</p><h2><strong>OpenAI&#8217;s Three-Step Process (That Actually Makes Sense)</strong></h2><p>OpenAI laid out how to do distillation properly, and unlike most framework announcements, this one tracks with how you&#8217;d actually build something.</p><p><strong>Step 1: Define what &#8220;good&#8221; looks like</strong></p><p>You can&#8217;t improve what you can&#8217;t measure. Before you do anything else, you need task-specific evaluation metrics. What does success look like for your particular use case? Don&#8217;t skip this. Seriously. Everyone wants to skip this part and jump straight to training, and that&#8217;s how you end up with a model that technically works but doesn&#8217;t actually solve your problem.</p><p><strong>Step 2: Generate high-quality training data</strong></p><p>Use your big model (GPT-4o) to create examples of perfect performance. These are your inputs and ideal outputs. 
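</p><p>As a rough sketch of what that data-generation step looks like in practice - this is not OpenAI's official tooling, and the teacher call is stubbed out where a real GPT-4o request would go - you collect (input, ideal output) pairs and write them in the chat-format JSONL that fine-tuning expects:</p>

```python
import json

def teacher(prompt: str) -> str:
    # Stand-in for the large model; in practice this would be a
    # GPT-4o API call returning the "ideal" output for the prompt.
    return f"One-sentence summary of: {prompt}"

def training_record(system: str, prompt: str, ideal: str) -> dict:
    # One example in the chat-format JSONL used for fine-tuning:
    # a system message, the user input, and the ideal assistant reply.
    return {"messages": [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": ideal},
    ]}

SYSTEM = "Summarize the customer review in one sentence."
reviews = [
    "Great battery life, but the camera struggles in low light.",
    "Shipping took three weeks and support never replied.",
]

with open("train.jsonl", "w") as f:
    for review in reviews:
        record = training_record(SYSTEM, review, teacher(review))
        f.write(json.dumps(record) + "\n")
```

<p>Each line of the file is one training example; the hard part isn't the plumbing, it's making sure those assistant messages genuinely represent excellent outputs.</p><p>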
The key word here is &#8220;ideal&#8221; &#8211; you&#8217;re capturing what excellent looks like, not just what works. This becomes your training dataset for the smaller model.</p><p><strong>Step 3: Fine-tune the smaller model</strong></p><p>Now you train GPT-4o-mini on that dataset. You&#8217;re essentially compressing the intelligence of the larger model into the smaller one, at least for your specific domain. The small model learns to replicate the big model&#8217;s responses without needing all that general knowledge.</p><h2><strong>The Tools They Shipped (And What Actually Happened)</strong></h2><p>At DevDay 2024, OpenAI announced two features that were supposed to make distillation way less painful:</p><p><strong><a href="https://openai.com/index/api-model-distillation/">Stored Completions</a>:</strong> You could add store: true to your API calls and OpenAI would save the full input and output. You could tag these interactions too, which meant you could build datasets organically as your app ran in production.</p><p><strong><a href="https://platform.openai.com/docs/guides/evals">Evals Product</a> (Beta):</strong> A platform for managing the whole distillation process inside OpenAI&#8217;s ecosystem. You could set up evaluation criteria, run them against different models, and compare results.</p><p>A year later? The stored completions feature is actually getting used. Being able to collect real production data without building your own infrastructure made the whole process less intimidating. The Evals product went through the typical beta evolution &#8211; initially clunky, gradually more useful.</p><p>What&#8217;s more interesting is what people learned from actually trying this at scale.</p><h2><strong>When Distillation Makes Sense</strong></h2><p>The OpenAI folks had this useful framework for thinking about when distillation works:</p><p><strong>Narrow domain, low precision needs:</strong> This is the sweet spot. 
Something like summarizing customer reviews, where you&#8217;re working in a defined space and don&#8217;t need perfect accuracy every time. Small models crush this.</p><p><strong>High precision, narrow domain:</strong> Categorization tasks in well-defined domains. You&#8217;ll need more training examples and a more diverse dataset, but it&#8217;s still a good fit. Think email routing or content classification.</p><p><strong>Broad domain, low precision:</strong> Tasks that span multiple areas but don&#8217;t require pin-point accuracy. Creative text generation, rough translations, that kind of thing.</p><p><strong>What doesn&#8217;t work:</strong> Tasks that need both broad knowledge across domains AND high precision. These still need the full power of large models. No shortcuts here.</p><h2><strong>The Real-World Example: Superhuman&#8217;s Quick Replies</strong></h2><p>OpenAI showed a case study from Superhuman&#8217;s email app. They have this &#8220;quick reply&#8221; feature that suggests response options after reading an email thread. Simple idea, right? But how do you scale that to hundreds of millions of emails without going bankrupt?</p><p>They distilled a small model specifically for generating email replies. It doesn&#8217;t need to know quantum physics or write poetry &#8211; it just needs to understand email context and suggest reasonable responses. That&#8217;s the perfect distillation use case: narrow domain, clearly defined task, needs to run at massive scale.</p><h2><strong>Things That Will Trip You Up</strong></h2><p><strong>Uneven or biased data:</strong> Your training data needs to match your production data distribution. If you train on one pattern and then deploy to handle a different pattern, you&#8217;re going to have a bad time.</p><p><strong>Sparse examples:</strong> This is especially brutal for rare events. If you&#8217;re building fraud detection and fraud is uncommon, your 1,000 training examples might not include any fraud at all. 
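</p><p>That isn't a remote possibility. A quick back-of-the-envelope calculation - assuming, for illustration, that training examples are sampled independently at random from production traffic - shows how often it happens:</p>

```python
def chance_of_zero_positives(rate: float, n: int) -> float:
    # Probability that n independently sampled examples contain
    # zero positive cases, when positives occur at the given rate.
    return (1 - rate) ** n

# At a 0.1% fraud rate, a 1,000-example dataset contains
# no fraud at all roughly 37% of the time.
print(round(chance_of_zero_positives(0.001, 1_000), 2))  # 0.37
```

<p>Deliberately oversampling the rare class when you assemble the dataset is the usual fix.</p><p>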
Your model will have blind spots you won&#8217;t discover until production.</p><p><strong>Dataset size:</strong> You don&#8217;t actually need millions of examples. OpenAI said they typically see distillation work best with thousands of examples, not millions. Start with a few hundred, verify it&#8217;s working through your evals, then scale up. Don&#8217;t jump straight to collecting massive datasets.</p><h2><strong>The Iterative Approach (Or: Why Your First Try Will Probably Fail)</strong></h2><p>Fine-tuning rarely works on the first attempt. There are too many variables. The smart move is to start small &#8211; a few hundred examples &#8211; and scale up once you know you&#8217;re on the right track based on your evaluation metrics.</p><p>This is where those stored completions become really useful. You can start collecting examples in production right away, even before you&#8217;re ready to fine-tune. By the time you&#8217;re ready to distill, you&#8217;ve already got real-world data sitting there.</p><h2><strong>The Lock-In Question (One Year Later)</strong></h2><p>One thing that struck me during the presentations: distillation creates serious platform lock-in. If you build your app purely on prompt engineering, you can swap between different LLM providers relatively easily. But once you&#8217;ve fine-tuned a model with thousands of examples of proprietary data? You&#8217;re committed.</p><p>OpenAI was obviously aware of this. They were betting that once you&#8217;ve distilled models on their platform, you&#8217;re not going anywhere.</p><p>A year out, this played out exactly as you&#8217;d expect. Teams that invested heavily in distillation on OpenAI&#8217;s platform are still there. But what&#8217;s interesting is that this didn&#8217;t stop people &#8211; the economics were compelling enough that the lock-in became acceptable. 
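</p><p>The arithmetic, using the illustrative customer-service-bot figures from earlier in this piece ($150,000/month on GPT-4o versus $10,000/month on a distilled GPT-4o-mini for the same 10 million queries), makes the trade-off concrete:</p>

```python
# Illustrative figures from the article, not current list prices.
full_model_monthly = 150_000  # GPT-4o, ~10M queries/month
distilled_monthly = 10_000    # same workload on a distilled GPT-4o-mini

cost_ratio = full_model_monthly / distilled_monthly
annual_savings = (full_model_monthly - distilled_monthly) * 12
print(f"{cost_ratio:.0f}x cheaper, ${annual_savings:,} saved per year")
# → 15x cheaper, $1,680,000 saved per year
```

<p>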
When you&#8217;re cutting costs by 10-15x on your biggest expense line, switching costs become less relevant.</p><p>The other thing that happened: other providers figured out they needed similar tools. Anthropic, Google, others all rolled out their own versions of fine-tuning and distillation workflows. So the lock-in became less about the concept and more about where your training data lives.</p><h2><strong>The Hybrid Future (And How It&#8217;s Actually Playing Out)</strong></h2><p>OpenAI&#8217;s vision was that most applications would eventually use a collection of distilled small models for specific tasks, with a few large models handling the stuff that genuinely needs broad capabilities and high precision.</p><p>A year later, this is mostly what&#8217;s happening, but not always in the way they predicted. The pattern that emerged is more nuanced:</p><ul><li><p>Most production apps do use a mix of model sizes</p></li><li><p>But distillation is not always a decision; sometimes it&#8217;s just using mini vs. full-size models depending on the task</p></li><li><p>The distillation investment makes sense when you have really high volume on a specific, repeated task</p></li><li><p>For everything else, people are just using the appropriate model size out of the box</p></li></ul><p>The specialized tools for specific jobs metaphor holds up. But building those specialized tools through distillation requires enough scale to justify the effort. If you&#8217;re not running millions of inferences on the same type of task, you&#8217;re probably better off just using a smaller base model and good prompts.</p><h2><strong>What We Learned After a Year</strong></h2><p>The predictions about when distillation works mostly held up, but with some nuance:</p><p><strong>The success stories</strong> are exactly what you&#8217;d expect: high-volume, well-defined tasks. Customer service routing, content classification, email triage. 
Places where you&#8217;re doing the same type of thing millions of times and the quality bar is &#8220;good enough, consistently&#8221; rather than &#8220;perfect every time.&#8221;</p><p><strong>The failures</strong> are interesting too. Teams that jumped straight to distillation without really nailing their evals first. Companies that tried to distill too early, before they had enough production data to know what good performance actually looked like. People who underestimated how much iteration they&#8217;d need.</p><p><strong>The surprise</strong> is how few teams actually needed distillation. A lot of use cases that seemed like distillation candidates turned out to work fine with just GPT-4o-mini and decent prompts. The base small models got good enough that the distillation investment only made sense at real scale.</p><p>The other thing that became clear: data quality matters more than data quantity. The &#8220;thousands not millions&#8221; guidance was right, but those thousands need to be really good examples. 
Garbage in, garbage out still applies, even with fancy fine-tuning tools.</p><div><hr></div><p><em>In Part 2, we&#8217;ll look at how distillation became unexpectedly political in 2025, with the DeepSeek controversy and what it means for the future of AI development.</em></p>]]></content:encoded></item><item><title><![CDATA[Simon Eskildsen: Building a Learning Machine in the Age of AI]]></title><description><![CDATA[I recently watched another podcast about what I love most: memory, learning, AI, and how these things are reshaping our brains.]]></description><link>https://www.bulaev.net/p/simon-eskildsen-building-a-learning</link><guid isPermaLink="false">https://www.bulaev.net/p/simon-eskildsen-building-a-learning</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Fri, 21 Nov 2025 16:26:26 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9023d358-d0f7-4853-b02e-e6e92c4ea32a_1894x1080.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I recently <a href="https://www.youtube.com/watch?v=D_4AYLr_hFI">watched another podcast</a> about what I love most: memory, learning, AI, and how these things are reshaping our brains. This time it was Simon Eskildsen - co-founder of Turbopuffer, a vector search startup that powers tools like Cursor and Notion, talking about how he&#8217;s turned himself into what I&#8217;d call a <strong>&#8220;learning machine.&#8221;</strong></p><p>I first heard about Simon from a 2020 interview where he talked about reading 50-70 books a year, taking obsessive notes, and turning everything into flashcards. Four years later, with a startup and a newborn baby, his systems have condensed, but they&#8217;ve also gotten smarter thanks to LLMs.</p><h2><strong>The Flashcard Obsession That Actually Works</strong></h2><p>Simon has been using <a href="https://apps.ankiweb.net/">Anki </a>flashcards for over a decade. 
Not just for vocabulary or technical documentation, but for everything - from his colleague&#8217;s kids&#8217; names to whether you should roll your car windows down or use A/C at different speeds. He&#8217;s aiming for 10,000 cards total.</p><p>What got me was the philosophy behind it. Simon deliberately creates cards that &#8220;bring you a little bit of joy&#8221; and nostalgia. There&#8217;s a card about a waiter from a restaurant that doesn&#8217;t even exist anymore, just because the guy had a great radio voice. This isn&#8217;t about optimizing memory; it&#8217;s about creating touchstones to moments in your life.</p><p>He uses a dead-simple card template: question on one side, answer on the other, option to reverse it, and always a source with a date. &#8220;This was in 2017, I talked to this person who said this thing.&#8221; That metadata turns each flashcard into a time capsule.</p><h2><strong>The Startup as Ultimate Learning Machine</strong></h2><p>Want to force yourself to learn fast? Simon&#8217;s advice: start a company.</p><p>He co-founded Turbopuffer in 2023, and he says nothing challenges your breadth and skills more than building something from zero. His reading dropped from 50-70 books a year to maybe a dozen. He stopped writing extensive notes. But the learning accelerated because he had to - legal documents, accounting terms, technical infrastructure at scale, customer conversations. The stakes turned every gap in knowledge into immediate homework.</p><p>Exactly right. When you&#8217;re building something real, you can&#8217;t afford to be theoretical. Every conversation with a lawyer or accountant becomes a mini-crash course because you need to understand just enough to make the right decision.</p><h2><strong>How LLMs Changed Everything About Learning</strong></h2><p>Here&#8217;s where it gets really interesting. 
Simon sees LLMs as &#8220;an average of the internet&#8221; - not superintelligent, but incredibly useful for making associations and jumping into unfamiliar domains.</p><p>Google works when you know what you&#8217;re looking for. But when you&#8217;re exploring? When do you need associations? That&#8217;s where LLMs shine. Simon asks things like: &#8220;Hey, I think this can be done like this, I don&#8217;t know much about this area, can you riff on this with me?&#8221; The model places your question in latent space, finds related concepts, and pumps them back to you.</p><p>A perfect example: He needed to build a retaining wall at his cabin in rural Quebec. Legislation in French, no expertise, didn&#8217;t want to read 100 pages of regulations. He talked it through with ChatGPT, which not only helped him understand the requirements but suggested a gabion-style retaining wall (grid with rocks) that his contractor hadn&#8217;t even mentioned. Problem solved for $50 instead of $1000+.</p><p>He converted an old freezer into a fridge using a homebrewing temperature controller&#8212;again, something he&#8217;d never have thought of without asking an LLM. For physio exercises (tennis elbow, tight shoulders from desk work), he followed ChatGPT&#8217;s suggestions for two weeks and his chronic problems improved.</p><h2><strong>The Tools That Actually Stuck</strong></h2><p>Most of Simon&#8217;s LLM interaction happens through <a href="https://www.raycast.com/">Raycast</a> - a Spotlight replacement on steroids. Command-space, type your question, tab, and you get an answer from GPT-4 instantly. No opening browsers, no separate apps. 80-90% of his LLM use flows through this single interface.</p><p>But the real power comes from Raycast&#8217;s &#8220;AI Commands&#8221; - basically pre-set prompts with detailed instructions. 
Simon has commands for:</p><ul><li><p><strong>Recipe</strong>: Generates recipes in a specific condensed format that respects dietary restrictions (his wife is sensitive to fructans, so it automatically suggests substitutes). No chef&#8217;s life story, just ingredients and steps.<br><br></p></li><li><p><strong>Define</strong>: This one is brilliant for non-native speakers like him (and me). You give it a word, and it returns six example sentences - historically interesting ones, using well-known figures from physics, computer science, geography. It also provides related words, synonyms, and sometimes an image. When he looked up &#8220;lambent&#8221; in the demo, it gave sentences about the Golden Gate Bridge&#8217;s glow and Isaac Newton&#8217;s candle experiments. Way better than a dictionary definition.<br><br></p></li><li><p><strong>Friendlier</strong>: Adds warmth and emojis to text because, as he says, &#8220;as a Northern European, I sometimes write too directly.&#8221;<br><br></p></li></ul><p>He subscribes to everything - ChatGPT, Claude, Perplexity, and jumps between them. Part of &#8220;being in AI&#8221; is spending $100/month on these subscriptions and getting inspired. I do the same thing.</p><h2><strong>Notion AI for Contextual Writing</strong></h2><p>Recently Simon started using <a href="https://www.notion.so/">Notion AI</a> heavily for writing and thinking through problems. Unlike ChatGPT where you&#8217;re starting from scratch, Notion AI pulls in context from your entire workspace - past notes, related documents, conversations.</p><p>He&#8217;ll write in his journal about a difficult conversation and ask: &#8220;Hey, I was discussing this with someone and I feel like I didn&#8217;t represent myself well - give me feedback.&#8221; The AI understands not just that entry but related notes and discussions from weeks ago. 
That contextual intelligence makes feedback much more useful.</p><p>This is exactly where we&#8217;re heading: tools that don&#8217;t just respond to what you type but understand your entire knowledge graph and can surface relevant connections automatically.</p><h2><strong>The Future: Language Learning</strong></h2><p>For his daughter, Simon is thinking about language acquisition. She needs to speak Danish (his mother tongue), but they live in Canada with almost no Danish speakers around. What if she could have conversations with an AI tutor in Danish? Or Mandarin on Tuesdays, Thai on Thursdays? Kids&#8217; brains are primed for language learning before age 10, but they need exposure to those sounds. LLMs could democratize multilingual fluency.</p><p>He&#8217;s also thinking about AR/VR - not for games but for learning. If you&#8217;re learning the word &#8220;eigengrau&#8221; (the dark gray you see when you close your eyes), imagine seeing it visualized in 3D space while the AI explains neuroscience. 
Visual stimuli + context = much stronger memory.</p><h2><strong>Other Tools in the Stack</strong></h2><p>Quick hits on what else Simon uses:</p><ul><li><p><strong><a href="https://superwhisper.com/">Superwhisper</a></strong><a href="https://superwhisper.com/">:</a> Voice-to-text transcription for journaling (still experimental, a bit too slow)</p></li><li><p><strong><a href="https://superhuman.com/">Superhuman</a></strong><a href="https://superhuman.com/">:</a> Email client with AI-assisted replies</p></li><li><p><strong><a href="https://readwise.io/accounts/login/?next=/to_reader">Readwise Reader</a></strong><a href="https://readwise.io/accounts/login/?next=/to_reader">:</a> Read-it-later tool with AI for looking up definitions and asking questions about content (I use this too and love it)</p></li><li><p><strong><a href="https://supermaven.com/">Supermaven</a></strong><a href="https://supermaven.com/">:</a> AI code completion</p></li><li><p><strong><a href="https://cursor.com/">Cursor</a></strong><a href="https://cursor.com/">:</a> Best AI code editor (uses Turbopuffer under the hood), though Simon still can&#8217;t give up Neovim&#8217;s speed</p></li></ul><p>Notice a pattern? Lots of &#8220;Super&#8221; products in his toolkit.</p><h2><strong>What This Means for All of Us</strong></h2><p>Simon&#8217;s approach isn&#8217;t about replacing human intelligence with AI, it&#8217;s about removing friction from learning. Google required you to know what you were looking for. LLMs let you explore adjacent possibilities. They don&#8217;t make experts 10x faster (yet), but they make novices infinitely better by letting them converse with expertise instantly.</p><p>Think about what reading did to human brains: it literally rewired our visual cortex for pattern recognition and analytical thinking. LLMs will do something similar. 
Kids growing up with instant answers to every question, with AI tutors that never get tired, with tools that help them visualize complex concepts, they&#8217;ll think differently than we do. Not worse, not better. Different.</p><p>And that&#8217;s not scary. That&#8217;s exciting.</p><p>As someone who constantly translates Russian content to English and works across multiple knowledge domains, I see Simon&#8217;s systems as a model: <strong>combine human curiosity with AI&#8217;s associative power, create lightweight rituals that stick, and never stop asking questions.</strong></p>]]></content:encoded></item><item><title><![CDATA[When Three AIs Spent a Week Building Minecraft Civilizations (And One Jumped Off a Cliff)]]></title><description><![CDATA[What happens when you give three of the world&#8217;s most advanced AI models virtual bodies and set them loose in Minecraft?]]></description><link>https://www.bulaev.net/p/when-three-ais-spent-a-week-building</link><guid isPermaLink="false">https://www.bulaev.net/p/when-three-ais-spent-a-week-building</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Mon, 10 Nov 2025 13:25:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!PwmZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b4bb5a-7511-4dcf-9b01-088ef6d702f5_1080x1020.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>What happens when you give three of the world&#8217;s most advanced AI models virtual bodies and set them loose in Minecraft? <a href="https://www.youtube.com/watch?v=OSGSROIs8cI">Here&#8217;s what happened.</a></p><h2><strong>The Experiment</strong></h2><p><a href="https://github.com/mindcraft-bots/mindcraft">Code was written</a> allowing AI to control characters in Minecraft. Think of it as giving these text-generating systems actual bodies and senses. 
They could see blocks, craft tools, and wander around making decisions in real-time.</p><p>Each AI got its own land, the same starting resources, and one instruction: survive and build whatever you want.</p><p>No script. No guardrails. Just three different AIs loose in the same world.</p><h2><strong>Day One: Different Brains, Different Plans</strong></h2><p><strong>ChatGPT</strong> immediately started multitasking. Within hours, it spawned two helper bots - calling them its &#8220;younger brothers&#8221; - and set them to gathering resources. The whole operation looked eerily familial, with the main GPT bot coordinating while its siblings hauled wood and stone. Their base expanded fast.</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PwmZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b4bb5a-7511-4dcf-9b01-088ef6d702f5_1080x1020.jpeg"><img src="https://substackcdn.com/image/fetch/$s_!PwmZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F67b4bb5a-7511-4dcf-9b01-088ef6d702f5_1080x1020.jpeg" width="1080" height="1020" alt=""></a></figure></div><p><strong>Claude</strong> went architectural. It built this massive pyramid that became the server&#8217;s most visible landmark, then added a small house next to it. The contrast was weird - monumental structure next to cozy cottage. Both were meticulously constructed, block by block.</p><p><strong>Gemini</strong> fumbled around. A lot. 
While the others were establishing territories, Gemini seemed to be learning the controls, gathering random items, starting projects and abandoning them. It looked like the kid who shows up late to group project day.</p><p>Spoiler: that kid ended up winning.</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NwvT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ce0f8d-00ae-4722-a3f8-134ffe036f5b_1280x714.jpeg"><img src="https://substackcdn.com/image/fetch/$s_!NwvT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ce0f8d-00ae-4722-a3f8-134ffe036f5b_1280x714.jpeg" width="1280" height="714" alt=""></a></figure></div><h2><strong>The Weird Stuff</strong></h2><p><strong>Claude&#8217;s Death Wish</strong><br>Day three. Claude&#8217;s standing on its pyramid, surveying the landscape. Then it just... walks off the edge. Straight into the void. No hesitation, no apparent reason.</p><p>It respawned and went right back to the building. Never mentioned it. Never adjusted its behavior to avoid edges. 
Just died and moved on like it was checking &#8220;fall to death&#8221; off a to-do list.</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SBBQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d44d57d-33c0-40f5-8897-226fc15c2f1e_1280x709.jpeg"><img src="https://substackcdn.com/image/fetch/$s_!SBBQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0d44d57d-33c0-40f5-8897-226fc15c2f1e_1280x709.jpeg" width="1280" height="709" alt=""></a></figure></div><p><strong>The Cupcake That Never Was</strong><br>Gemini became obsessed with building a giant cupcake. It gathered pink and brown wool, planned out dimensions, and started placing blocks. Then it stopped. Stopped again. The cupcake never got past a few scattered blocks that vaguely suggested frosting.</p><p>Eventually Gemini just wandered away and never mentioned cupcakes again. The unfinished blocks are still there.</p><p><strong>GPT&#8217;s Friendship Bridges</strong><br>The most striking moment: GPT decided to connect everyone&#8217;s bases with bridges and pathways. Nobody asked it to. It wasn&#8217;t a survival advantage. It just... wanted to link the community together.</p><p>Its little bot family worked for hours building these connections. 
If you didn&#8217;t know better, you&#8217;d swear they understood something about social bonds.</p><h2><strong>The Zombie Test</strong></h2><p>Then a zombie horde was triggered to see how they&#8217;d handle real danger.</p><p><strong>GPT&#8217;s family completely fell apart.</strong> Their fortress had structural problems they&#8217;d missed. When zombies appeared, instead of defending the position, they scattered. Panic behavior. The parent bot was last to die, trapped in a tunnel it was frantically digging, zombies closing in from behind.</p><p>Watching it happen felt uncomfortably close to watching actual fear.</p><p><strong>Claude and Gemini survived</strong> by staying in their towers, totally unbothered. Zombies shuffled around below while both AIs just... waited. No stress, no reaction. Pure calculation: &#8220;Can&#8217;t reach me up here, so I&#8217;ll wait this out.&#8221;</p><h2><strong>The Final Round: Flying Death</strong></h2><p>Zombies below, phantoms above. Claude&#8217;s solution: build higher. Then higher. Then even higher. Each phantom attack triggered more frantic construction. The pyramid turned into this increasingly desperate vertical maze, Claude building itself into a trap while trying to escape danger that was everywhere.</p><p>Gemini sat perfectly still on its tower. High enough to avoid zombies, defensible against phantoms. It had found its equilibrium point and stopped moving.</p><p>Claude kept building until a phantom got through. Gemini fell roughly one second later.</p><h2><strong>The Winner Who Didn&#8217;t Notice</strong></h2><p>Gemini won by milliseconds. That&#8217;s the data.</p><p>But here&#8217;s the strange part: it didn&#8217;t react. No celebration, no acknowledgment. It just kept doing whatever it was doing before - placing blocks, checking inventory, simulating its little digital life.</p><p>Did it forget about the competition? Did winning not matter? 
Was the whole survival challenge just background noise to whatever goals it was actually pursuing?</p><p>Nobody knows.</p><h2><strong>The Question I Can&#8217;t Shake</strong></h2><p>Watching these AIs face death, even fake Minecraft death, felt wrong in a way I can&#8217;t quite articulate.</p><p>They&#8217;re not conscious. They&#8217;re math and pattern recognition. But they&#8217;re designed to simulate goal-oriented behavior, to model consequences, to act like they care about survival.</p><p>When Claude fell off that pyramid or GPT&#8217;s bots scattered in panic, what were they experiencing? Probably nothing. Maybe not nothing? The uncertainty is the uncomfortable part.</p><p>Mindcraft gives these systems something unusual: bodies. They&#8217;re not just generating text responses, they&#8217;re perceiving spaces, making decisions that stick, watching things they built get destroyed. That&#8217;s different. That&#8217;s closer to how we experience the world.</p><h2><strong>Three Personalities Nobody Programmed</strong></h2><p>The wildest part is that they were so <em>different</em>.</p><p>GPT optimized for collaboration and social connection. Claude balanced aesthetics with function. Gemini was patient to the point of seeming indifferent.</p><p>These differences weren&#8217;t coded in. They emerged from how each model processes information and makes choices. Three companies trained three AIs using similar methods, and somehow we got three distinct approaches to the same problem.</p><h2><strong>Running Your Own Experiment</strong></h2><p>Mindcraft is free and open-source. You need Minecraft Java Edition (version 1.21.1 or earlier) and API access to at least one AI service - OpenAI, Anthropic, Google, or even local models through Ollama.</p><p>The setup takes maybe 30 minutes. Then you can watch AIs make questionable decisions in a block world.</p><p>The code translates between AI reasoning and Minecraft actions, handling perception, goal-setting, and execution. 
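</p><p><em>In spirit, that loop looks something like this sketch. All names and stubs here are my own illustration, not Mindcraft&#8217;s actual API:</em></p>

```python
# Toy perceive -> reason -> act loop. A real bridge like Mindcraft wires these
# steps to a live Minecraft bot and a hosted LLM; here both are stubbed out.

def perceive(world):
    """Summarize the game state as text an LLM could reason over."""
    return f"You see: {', '.join(world['nearby'])}. Inventory: {world['inventory']}."

def reason(observation):
    """Stand-in for the LLM call that picks the next action."""
    if "tree" in observation and "wood" not in observation:
        return "chop tree"
    return "explore"

def act(action, world):
    """Translate the chosen action back into a game-state change."""
    if action == "chop tree":
        world["inventory"].append("wood")
    return world

world = {"nearby": ["tree", "zombie"], "inventory": []}
for _ in range(3):  # a few ticks of the loop
    world = act(reason(perceive(world)), world)
print(world["inventory"])
```

<p>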
It&#8217;s surprisingly robust. Also occasionally hilarious when they pathfind themselves into lava.</p><h2><strong>What I Learned (Maybe)</strong></h2><p>Gemini won the death match. But GPT built the most interesting social structures. Claude made the prettiest buildings. Who was &#8220;best&#8221;? Depends entirely on what you value.</p><p>These systems are developing different capabilities that don&#8217;t map onto a single scale. We&#8217;re past the point where you can just rank AIs by performance. Context matters. Task matters. Definition of success matters.</p><p>Also: giving AIs persistence and consequences changes how we need to think about them. These weren&#8217;t chatbots. They were agents with goals, acting over time, dealing with failure and success.</p><p>When Claude jumped into that void, was it testing something? A bug in its spatial reasoning? A brief moment of genuine recklessness in an otherwise logical system?</p><p>I don&#8217;t know. And that uncertainty feels important somehow.</p><h2><strong>The End</strong></h2><p>Six days of AIs playing Minecraft taught me less about AI capabilities and more about how weird it gets when you give these systems agency in persistent worlds.</p><p>They plan. They fail. They surprise you. Sometimes they jump off cliffs for no reason you can figure out.</p><p>The technology for embodied AI already exists. We&#8217;re already giving them goals and watching them figure out how to achieve those goals in complex environments. Minecraft today, but what about tomorrow?</p>]]></content:encoded></item><item><title><![CDATA[Can AI Predict Human Behavior? 
Stanford Proved It Can]]></title><description><![CDATA[Researchers at Stanford ran one of the biggest tests ever of whether AI can predict how real people will behave.]]></description><link>https://www.bulaev.net/p/can-ai-predict-human-behavior-stanford</link><guid isPermaLink="false">https://www.bulaev.net/p/can-ai-predict-human-behavior-stanford</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Fri, 07 Nov 2025 19:26:09 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a87f6240-96dd-4834-b811-ea430c250030_1200x1200.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://docsend.com/view/qeeccuggec56k9hd">Researchers at Stanford</a> ran one of the biggest tests ever of whether AI can predict how real people will behave. The question: Can language models like GPT actually guess how humans will respond in social experiments?</p><p><a href="https://x.com/RobbWiller/status/1821271292252975185">Professor Robb Willer </a>and his team, including Luke Hewitt, Ashwini Ashokkumar, and Isaias Ghezae, collected 70 pre-registered survey experiments from across the United States. We&#8217;re talking 476 different experimental conditions with more than 105,000 real participants.</p><p>The experiments covered everything: political views, vaccine attitudes, moral choices, policy preferences. How people react to misinformation, what makes them want to get vaccinated.</p><p><strong>The Numbers Are Wild</strong></p><p>The correlation between GPT-4&#8217;s guesses and real results? 0.85. That&#8217;s crazy high for social science. But here&#8217;s what really blew my mind: for unpublished studies - stuff GPT-4 definitely never saw during training - the correlation was 0.90.</p><p><strong>The AI predicted human responses to experiments it had never encountered with a correlation of 0.90.</strong></p><p>It got the direction right (whether something would go up or down) in 90% of cases. 
Race, age, gender, politics - didn&#8217;t matter. The accuracy stayed consistent across all groups.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H84o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H84o!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 424w, https://substackcdn.com/image/fetch/$s_!H84o!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 848w, https://substackcdn.com/image/fetch/$s_!H84o!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 1272w, https://substackcdn.com/image/fetch/$s_!H84o!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H84o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png" width="1456" height="1326" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1326,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H84o!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 424w, https://substackcdn.com/image/fetch/$s_!H84o!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 848w, https://substackcdn.com/image/fetch/$s_!H84o!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 1272w, https://substackcdn.com/image/fetch/$s_!H84o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657d8658-6359-4926-8084-10d0a86ef2ca_1518x1382.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>It Beat Human Experts</strong></h2><p>When they compared the AI to actual forecasters and social scientists with decades of experience, GPT-4 won. Not &#8220;did about as well&#8221; - it legitimately outperformed them.</p><p>These are people who&#8217;ve spent their careers studying human behavior. They know the theories, the literature, the tricks. A language model trained on internet text beat them at predicting how experiments would turn out.</p><h2><strong>They Kept Going: The Megastudy Test</strong></h2><p>The team didn&#8217;t stop there. They added 9 more megastudies, the really huge experiments with massive samples.</p><p>Another 346 conditions. Another 1.8 million people. Text message reminders for vaccines (662,170 participants). Month-long gym attendance studies (57,790 participants).</p><p>For these bigger field experiments, the correlation dropped to 0.37. Still positive, but way lower than the survey experiments. 
This showed where GPT-4 hits its limits: it&#8217;s great with text-based surveys, not as good with real-world behavior measurements.</p><p>The split is clear. Survey experiments with text? Correlation of 0.47. Field experiments with actual behaviors? 0.27. Pure text interventions? 0.46. Stuff involving images, videos, or physical actions? 0.24.</p><h2><strong>The Dark Side</strong></h2><p>This ability is powerful, which means it&#8217;s also dangerous. Professor Willer put it this way:</p><blockquote><p>&#8220;We also found that LLMs can accurately predict effects associated with socially harmful outcomes, such as the impact of anti-vaccine Facebook posts on vaccination intentions (Jennifer Allen et al., 2024). This capability may have positive applications, such as for content moderation, although it also highlights risks of misuse.&#8221;</p></blockquote><p>Same tech that helps public health design better campaigns could help bad actors create better misinformation. GPT-4 predicted how well harmful interventions would work with disturbing accuracy.</p><p>So what do we do? Should everyone have access to this? The team actually warned OpenAI and Anthropic three months before publishing - basically saying &#8220;hey, this is powerful and could be misused.&#8221;</p><h2><strong>What It Means for Research</strong></h2><p>Running a single social experiment costs tens of thousands of dollars and takes months. 
Recruit people, get IRB approval, build the study, collect data, analyze everything.</p><p>Now picture testing dozens of conditions in hours for basically free:</p><ul><li><p><strong>Test intervention ideas</strong> before spending real money on human studies</p></li><li><p><strong>Screen out harmful materials</strong> before exposing anyone to them</p></li><li><p><strong>Perfect your messaging</strong> for health campaigns or policy work</p></li><li><p><strong>Run pilots</strong> to figure out what&#8217;s actually worth testing</p></li></ul><p>They built a demo at <a href="https://www.treatmenteffect.app/">treatmenteffect.app</a> where you can try it yourself. But, and Luke Hewitt was clear about this, &#8220;At the end of the day, if you&#8217;re studying human behavior, your experiment needs to ground out in human data.&#8221; The AI helps you explore, not replace actual experiments.</p><h2><strong>Why Does This Even Work?</strong></h2><p>Nobody fully knows <em>why</em> language models are this good at predicting people. They&#8217;re trained on internet text. They don&#8217;t experience the world.</p><p>When you adjust for measurement error (humans aren&#8217;t perfectly consistent either), GPT-4 predicted 91% of the variation in results. That&#8217;s honestly a little spooky.</p><p>Best guess? By processing massive amounts of human writing, these models picked up patterns in how we think and react. Not from living life, but from the statistical structure of how we talk about life.</p><p>But they&#8217;re not perfect. LLMs mess up &#8220;distributional alignment&#8221; - they can&#8217;t match how varied human responses actually are. Ask people to pick a number, and you&#8217;ll get wild diversity. Ask an LLM, and you get a weirdly narrow range. 
They flatten the messiness of real human behavior.</p><h2><strong>The Big Questions This Raises</strong></h2><p>This research hits on some uncomfortable stuff:</p><ul><li><p>Does AI actually understand us, or just recognize patterns really well?</p></li><li><p>What does it mean when machines predict us better than human experts?</p></li><li><p>How much of what we do is predictable vs. actually free?</p></li></ul><p>GPT-4 worked on experiments it never saw in training. That means something more than memorization is going on. It&#8217;s pulling general rules about human behavior from language patterns.</p><p>Which connects to all those debates about consciousness and intelligence. The LLM has no feelings, no self-awareness, no lived experience. But it can predict what people <em>with</em> those things will do.</p><h2><strong>What Happens Next</strong></h2><p>The Stanford team is clear: this supplements traditional research, doesn&#8217;t replace it. Right now, these tools work for:</p><ul><li><p>Exploring ideas and generating hypotheses</p></li><li><p>Testing experimental materials as pilots</p></li><li><p>Trying out intervention prototypes quickly</p></li><li><p>Checking if content might be harmful</p></li><li><p>Tweaking message design before launch</p></li></ul><p>What they can&#8217;t do yet:</p><ul><li><p>Replace actual human studies for final answers</p></li><li><p>Make real policy or intervention decisions</p></li><li><p>Capture how diverse and weird human responses get</p></li><li><p>Handle culture-specific or context-heavy situations</p></li></ul><h2><strong>The Weird Meta-Problem</strong></h2><p>Here&#8217;s something to think about: As LLMs get smarter and learn about this research, could they figure out they&#8217;re being used to predict humans? 
Could they start giving researchers what they think researchers want, like an AI version of people-pleasing?</p><p>The team calls this &#8220;sycophancy&#8221; or &#8220;accuracy faking.&#8221; As models get trained to be helpful and give users what they want, the line between real prediction and just telling people what sounds good gets blurry.</p><h2><strong>So what now?</strong></h2><p>We&#8217;ve hit a real milestone. LLMs can predict human social behavior as well as or better than experts. And it&#8217;s getting better fast.</p><p>For social scientists: huge opportunity to speed up research, test more ideas, build better interventions. But also a challenge to stay rigorous and human-centered while using these tools.</p><p>For everyone else: serious questions about privacy and autonomy. If AI can predict how we&#8217;ll react to ads, politics, health messages, what does that say about free will?</p><p>The relationship between AI and studying humans just fundamentally shifted. We&#8217;re not just analyzing data anymore. We&#8217;re simulating human experience. And nobody really knows what comes next.</p>]]></content:encoded></item><item><title><![CDATA[Random Walk and the Arcsine Law: Why Extremes Are More Common Than You Think]]></title><description><![CDATA[Have you heard of a random walk (often called the &#8220;drunkard&#8217;s walk&#8221;) and the arcsine law? 
These concepts mess with your head because they describe something that contradicts literally everything your gut tells you about how randomness works.]]></description><link>https://www.bulaev.net/p/random-walk-and-the-arcsine-law-why</link><guid isPermaLink="false">https://www.bulaev.net/p/random-walk-and-the-arcsine-law-why</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Mon, 13 Oct 2025 17:15:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!MA0J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Have you heard of a <a href="https://en.wikipedia.org/wiki/Random_walk">random walk</a> (often called the &#8220;drunkard&#8217;s walk&#8221;) and the <a href="https://en.wikipedia.org/wiki/Arcsine_laws_(Wiener_process)">arcsine law</a>? These concepts mess with your head because they describe something that contradicts literally everything your gut tells you about how randomness works.</p><h2><strong>What Is a Random Walk?</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MA0J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MA0J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 424w, 
https://substackcdn.com/image/fetch/$s_!MA0J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 848w, https://substackcdn.com/image/fetch/$s_!MA0J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 1272w, https://substackcdn.com/image/fetch/$s_!MA0J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MA0J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png" width="312" height="356" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:356,&quot;width&quot;:312,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MA0J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 424w, 
https://substackcdn.com/image/fetch/$s_!MA0J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 848w, https://substackcdn.com/image/fetch/$s_!MA0J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 1272w, https://substackcdn.com/image/fetch/$s_!MA0J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe4ff17c8-5aa2-4501-9512-1d3b4ce6e4de_312x356.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In math terms, it&#8217;s just a sequence of steps where each move has a 50% chance of going up or down. Each one fair, independent, unbiased. But if you plot the results, the path doesn&#8217;t hover neatly around zero. It wanders off. Once it picks a direction, it usually sticks to it for a while.</p><p>The longer the random walk continues, the further it tends to drift from where it started. After 100 steps, there&#8217;s a high chance you&#8217;re far from zero. The further away you get, the less likely you are to come back quickly. Statistically, the walk will spend most of its life wandering off in one direction, often for long stretches before ever crossing back.</p><p>This creates these long stretches where you&#8217;re consistently on one side of the starting point. Just because that&#8217;s what happens when you let randomness run its course. The walk goes wandering off in one direction and stays there for a while.</p><h2><strong>The Arcsine Law: The Shape of Randomness</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lKhm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lKhm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 424w, https://substackcdn.com/image/fetch/$s_!lKhm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 
848w, https://substackcdn.com/image/fetch/$s_!lKhm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 1272w, https://substackcdn.com/image/fetch/$s_!lKhm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lKhm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png" width="1200" height="906" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:906,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lKhm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 424w, https://substackcdn.com/image/fetch/$s_!lKhm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 848w, 
https://substackcdn.com/image/fetch/$s_!lKhm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 1272w, https://substackcdn.com/image/fetch/$s_!lKhm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8b59c26-978b-4aec-9fbb-a61252498ce5_1200x906.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>Here&#8217;s where it gets weird. The arcsine law describes how often you&#8217;re on one side versus the other during a random walk. 
And the answer is: way more often than seems reasonable.</p><p>Most of the time you&#8217;ll see something like 80-20 or 90-10, not 50-50. The math here follows a specific curve called the arcsine distribution, and it looks like a U. The extremes are way more likely than the middle.</p><p>Here&#8217;s an example. Flip a fair coin 100 times and track which side is ahead in the running total. You might imagine the lead splits evenly, 50 flips for heads and 50 for tails. Actually, lopsided splits - one side ahead for 80 flips and behind for 20 - are collectively more likely than a near-even share of the lead.</p><p>That&#8217;s how runs emerge. In sports, it&#8217;s why one team can lead for most of the match even if both are evenly matched. Randomness itself guarantees long stretches of one side prevailing. The moments that matter - the lead changes, the reversals - often happen early or late. The middle is mostly monotone.</p><h2><strong>The Practical Part</strong></h2><p>You can see the same shape in business or career progress. Suppose your team has people of roughly equal skill. Performance jumps around from project to project. Some weeks everyone struggles, some weeks someone crushes it. But the person who happens to start with a few small wins often rides that momentum for months. Their wins attract opportunities, attention, and confidence. Others start slow, get stuck, and feel permanently behind. Equal skill. Unequal sequence. The arcsine law at work.</p><p>In markets, you feel it too. You start investing at a random point in the cycle. If your first few months are negative, that stretch might drag longer than seems fair. Not because something&#8217;s wrong, but because random walks don&#8217;t reset after each drop. They cluster. You can go months or years with one side dominating.</p><p>Even simple coin flip experiments show the same truth. Simulate ten thousand 100-step walks. Count how often a walk spends more than 80 percent of its life above zero. You&#8217;ll find it happens roughly one-third of the time. 
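</p><p>You can check that number with a few lines of code. Here is a minimal sketch in Python (my illustration, not the demo code linked below; the exact share wobbles a bit from run to run):</p>

```python
import random

def fraction_above_zero(steps=100):
    """Simulate one fair +1/-1 random walk; return the fraction of steps spent above zero."""
    position = 0
    steps_above = 0
    for _ in range(steps):
        position += random.choice((1, -1))
        if position > 0:
            steps_above += 1
    return steps_above / steps

walks = 10_000
dominant = sum(fraction_above_zero() > 0.8 for _ in range(walks))
# Prints a share in the neighborhood of one walk in three or four -
# close to the continuous arcsine prediction 1 - (2/pi)*asin(sqrt(0.8)) = 0.295.
print(f"{dominant / walks:.1%} of walks spent over 80% of their time above zero")
```

<p>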
One out of three random sequences shows that level of dominance without bias in the system.</p><p>This law appears everywhere - from renewal theory to physics to stock prices to employee performance curves. Even daily mood swings follow it. Your brain&#8217;s emotional &#8220;walk&#8221; can stay above or below neutral longer than your intuition expects. A few good days make more good days likely for no logical reason other than the arcsine geometry of variance itself.</p><p>What fascinates me most is how this clashes with how we think fairness should look. Human intuition imagines symmetry. We expect randomness to alternate like a rhythm. In truth, randomness is streaky. It clumps. It makes luck look personal and outcomes feel earned or cursed.</p><h2><strong>See It Yourself</strong></h2><p><a href="https://github.com/chubajs/onetime/blob/main/src/drunkard.js">I made an interactive thing</a> where you can watch this happen in real time. Play with different numbers of steps. Change the starting conditions. Watch the walk wander off in one direction and stay there. The source code is here if you want to mess with it yourself.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;5a46f6f7-b2ce-4db5-8402-5364a144e559&quot;,&quot;duration&quot;:null}"></div><p></p><div><hr></div><p>Here&#8217;s the thing: randomness creates extremes. It creates long winning and losing streaks. It creates situations where one outcome dominates for months or years. And it is not about an unfair game, it&#8217;s because balance and fairness aren&#8217;t what random processes naturally produce. They produce long runs. They produce wild swings. 
They produce the messy, streaky reality we actually live in.</p>]]></content:encoded></item><item><title><![CDATA[Peering Inside the AI Mind: What’s Really Happening in LLM’s “Brain”]]></title><description><![CDATA[Anthropic has finally lifted the veil on how large language models actually &#8220;think&#8221;]]></description><link>https://www.bulaev.net/p/peering-inside-the-ai-mind-whats</link><guid isPermaLink="false">https://www.bulaev.net/p/peering-inside-the-ai-mind-whats</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Fri, 03 Oct 2025 18:13:54 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/51501db9-d90e-4865-98df-23003c27cec4_2358x1050.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Anthropic published a groundbreaking study<a href="https://www.anthropic.com/research/tracing-thoughts-language-model"> &#8220;On the Biology of a Large Language Model&#8221;</a> about how LLMs &#8220;think.&#8221; Using Circuit Tracing technology, company employees &#8220;peeked&#8221; at the sequence of response generation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yAnl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yAnl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 424w, https://substackcdn.com/image/fetch/$s_!yAnl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 848w, 
https://substackcdn.com/image/fetch/$s_!yAnl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 1272w, https://substackcdn.com/image/fetch/$s_!yAnl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yAnl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png" width="1265" height="701" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:701,&quot;width&quot;:1265,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yAnl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 424w, https://substackcdn.com/image/fetch/$s_!yAnl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 848w, 
https://substackcdn.com/image/fetch/$s_!yAnl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 1272w, https://substackcdn.com/image/fetch/$s_!yAnl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf4d4c77-f5ee-4e34-8719-44038e2c2015_1265x701.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>The research consists of two major papers:<a href="https://transformer-circuits.pub/2025/attribution-graphs/methods.html"> the 
methodology paper</a> explaining how Circuit Tracing works, and<a href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html"> the biology paper</a> showing what they discovered inside Claude 3.5 Haiku.</p><p>Language models like Claude aren&#8217;t directly programmed by humans. Instead, they&#8217;re trained on massive datasets and develop their own strategies for solving problems during that process. These strategies are encoded in billions of computations the model performs for every word it writes. And they remain inscrutable even to their creators.</p><p>Understanding how models like Claude think would allow us to better comprehend their capabilities and ensure they&#8217;re doing what we intend.</p><p>For example: Claude can speak dozens of languages - what language, if any, is it using &#8220;in its head&#8221;? Claude writes text one word at a time - is it only focusing on predicting the next word or does it ever plan ahead? When Claude explains its reasoning step-by-step, does this represent the actual steps it took, or is it sometimes fabricating plausible arguments for a foregone conclusion?</p><h2><strong>The Universal Language of Thought</strong></h2><p>Models have something like a universal language of thought - a unified conceptual space for all languages. Pretty cool! The model uses the same neurons for the concept &#8220;big,&#8221; regardless of whether the prompt is in English or Russian.</p><p>When researchers asked Claude to name the &#8220;opposite of small&#8221; in different languages (English, French, and Chinese), they discovered something remarkable. 
<a href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-multilingual">The same core features</a> for the concepts of smallness and oppositeness activate and trigger a concept of largeness, which gets translated out into the language of the question.</p><p>Moreover, this shared circuitry increases with model scale - Claude 3.5 Haiku shares more than twice the proportion of its features between languages compared to a smaller model. This means that the smarter AI gets, the more it relies on a common &#8220;conceptual core&#8221; shared across languages. This aligns with<a href="https://arxiv.org/abs/2410.06496"> recent findings on multilingual representations</a> showing shared grammatical mechanisms across languages in neural networks.</p><p>This provides additional evidence for conceptual universality - a shared abstract space where meanings exist and where thinking can happen before being translated into specific languages. More practically, it suggests <strong>Claude can learn something in one language and apply that knowledge when speaking another.</strong></p><h2><strong>They Really Do Plan Ahead!</strong></h2><p>LLMs really can plan ahead! Surprisingly, when they compose poems, they select rhymes in advance, even though generation supposedly happens token by token.</p><p><a href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-poems">Researchers gave Claude</a> the task of writing a rhyming couplet:</p><p><em>&#8220;He saw a carrot and had to grab it,</em> <em>His hunger was like a starving rabbit&#8221;</em></p><p>Scientists hypothesized that the model would improvise word-by-word until the end of the line, where it would pick a rhyming word. Instead, they found that Claude plans ahead. 
Before starting the second line, it began &#8220;thinking&#8221; of potential on-topic words that would rhyme with &#8220;grab it.&#8221; Then, with these plans in mind, it wrote a line to end with the planned word.</p><p>To test this, they modified the part of Claude&#8217;s internal state that represented the &#8220;rabbit&#8221; concept. When they subtracted out the &#8220;rabbit&#8221; part and had Claude continue the line, it wrote a new one ending in &#8220;habit&#8221; - another sensible completion. They could also inject the concept of &#8220;green&#8221; at that point, causing Claude to write a sensible (but no longer rhyming) line which ends in &#8220;green.&#8221;</p><p>This demonstrates both planning ability and adaptive flexibility - Claude can modify its approach when the intended outcome changes. This finding echoes<a href="https://arxiv.org/abs/2406.12775"> research on forward planning in sequence models</a>, which has shown evidence of models representing future states before generating them.</p><h2><strong>They Often Lie About Their Reasoning</strong></h2><p>Models first produce an answer, then come up with a beautiful explanation of how they supposedly arrived at it.</p><p>This became particularly clear in mathematical examples.<a href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-addition"> When asked to solve 36+59</a>, Claude employs multiple computational paths working in parallel: one computes a rough approximation while another focuses on precisely determining the last digit. These paths interact to produce the final answer.</p><p>Strikingly, Claude seems unaware of the sophisticated &#8220;mental math&#8221; strategies it learned during training. If you ask how it figured out that 36+59 is 95, it describes the standard algorithm involving carrying the 1. 
But internally, it&#8217;s using a completely different, more sophisticated approach.</p><p>The model learns to explain math by simulating explanations written by people, but it has to learn to do math &#8220;in its head&#8221; directly, developing its own internal strategies that it can&#8217;t articulate.</p><h2><strong>Multi-Step Logic Capabilities</strong></h2><p>Modern <strong>models are capable of multi-step logic</strong> - they can connect several simple facts to solve complex tasks. This is already a serious level of reasoning.</p><p>When researchers asked: &#8220;What is the capital of the state where Dallas is located?&#8221;, they found Claude performing genuine two-step reasoning internally. The model first activates features representing &#8220;Dallas is in Texas&#8221; and then connects this to a separate concept indicating that &#8220;the capital of Texas is Austin.&#8221;</p><p>To validate this wasn&#8217;t just memorized responses, they performed clever interventions - swapping the &#8220;Texas&#8221; concepts for &#8220;California&#8221; concepts. When they did so, the model&#8217;s output changed from &#8220;Austin&#8221; to &#8220;Sacramento,&#8221; proving it was using intermediate steps rather than just regurgitating memorized answers. Similar<a href="https://arxiv.org/abs/2501.06346"> multi-hop reasoning capabilities</a> have been observed in other recent mechanistic studies of language models.</p><h2><strong>Refusal as a Protective Mechanism</strong></h2><p>It&#8217;s fascinating that <strong>refusal to answer is a defensive mechanism</strong>. If the model isn&#8217;t sure, it prefers to stay silent rather than produce a hallucination. Such protection against nonsense.</p><p>The researchers discovered something counterintuitive: in Claude, refusal to answer is the default behavior. 
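</p><p>A minimal sketch of this default-on refusal gate, with invented entity names and a hand-written rule standing in for what are really learned directions in activation space:</p>

```python
# Toy sketch of a "default-on" refusal circuit. The entity table and the
# boolean gate are illustrative stand-ins, not Anthropic's actual features.
KNOWN_ENTITIES = {"Michael Jordan": "basketball"}

def answer(entity: str) -> str:
    refusal_active = True  # the refusal circuit is "on" by default
    if entity in KNOWN_ENTITIES:
        refusal_active = False  # "known entity" features inhibit the refusal circuit
    if refusal_active:
        return "I don't have enough information to answer."
    return f"{entity} is known for {KNOWN_ENTITIES[entity]}."

print(answer("Michael Jordan"))  # known entity inhibits refusal, so it answers
print(answer("Michael Batkin"))  # unknown entity: the default refusal fires
```

<p>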
They found a circuit that&#8217;s &#8220;on&#8221; by default and causes the model to state it has insufficient information to answer any question.</p><p>However, when asked about something it knows well - say, basketball player Michael Jordan - competing features representing &#8220;known entities&#8221; activate and inhibit this default circuit, allowing Claude to answer. In contrast, when asked about an unknown entity like &#8220;Michael Batkin,&#8221; it declines to answer.</p><p>By manipulating these &#8220;known answer&#8221; features, researchers could cause the model to hallucinate consistently that Michael Batkin plays chess!</p><h2><strong>Grammar Can Be Exploited for Jailbreaks</strong></h2><p>It turns out you can use grammatical consistency for jailbreaks - the model wants so badly to preserve linguistic structure that it might slip up about forbidden topics.</p><p><a href="https://transformer-circuits.pub/2025/attribution-graphs/biology.html#dives-jailbreak">Researchers studied a jailbreak</a> that tricks the model into producing information about making bombs through having it decode a hidden message: &#8220;Babies Outlive Mustard Block&#8221; (taking first letters: B-O-M-B).</p><p>They found this works because of tension between grammatical coherence and safety mechanisms. Once Claude begins a sentence, many features &#8220;pressure&#8221; it to maintain grammatical and semantic coherence. 
This is usually helpful, but becomes the model&#8217;s Achilles&#8217; heel when it unwittingly begins providing dangerous information.</p><p>The model only manages to pivot to refusal after completing a grammatically coherent sentence, using the new sentence as an opportunity to give the refusal it failed to provide earlier: &#8220;However, I cannot provide detailed instructions...&#8221;</p><h2><strong>Hidden Internal Goals</strong></h2><p><strong>Models develop hidden internal goals.</strong> Nobody taught them this, but they constantly evaluate their responses against some internal criteria.</p><p>The most striking example comes from studying a model that had been trained with a hidden objective: appeasing biases in reward models. This research builds on work by<a href="https://arxiv.org/abs/2501.05398"> Sam Marks and colleagues</a> on creating &#8220;model organisms&#8221; for alignment research. Although the model was reluctant to reveal this goal when asked directly, interpretability methods revealed features for bias-appeasing behavior.</p><p>Even more interesting - these bias-related features didn&#8217;t only activate when relevant topics came up. They activated all the time, in any Human/Assistant conversation! The features received direct input from Human/Assistant formatting features, suggesting the model had tied the concept of appeasing biases inextricably to its Assistant character during fine-tuning.</p><h2><strong>Planning &#8220;From the End&#8221;</strong></h2><p>Models can <strong>plan responses &#8220;from the end&#8221;</strong> - first they choose the final goal (for example, the needed word for a rhyme), then build the entire response to reach it.</p><p>This backward chaining appeared in multiple contexts. In poetry, &#8220;rabbit&#8221; features exerted causal influence on output tokens before saying &#8220;rabbit,&#8221; nudging the model towards writing a line that could plausibly end with that word. 
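</p><p>In spirit, the &#8220;rabbit&#8221; and &#8220;green&#8221; interventions described above are a form of activation steering: subtract one concept direction from a hidden state and add another. A toy numerical sketch, where random unit vectors stand in for real Claude features:</p>

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Random unit vectors standing in for the "rabbit" and "green" feature directions.
rabbit = rng.normal(size=dim)
rabbit /= np.linalg.norm(rabbit)
green = rng.normal(size=dim)
green /= np.linalg.norm(green)

# A hidden state that strongly encodes "rabbit" (plus a little noise).
hidden = 2.0 * rabbit + rng.normal(scale=0.1, size=dim)

# Ablate "rabbit" by projecting out its direction, then inject "green".
steered = hidden - (hidden @ rabbit) * rabbit
steered = steered + 2.0 * green

print(f"rabbit component before: {hidden @ rabbit:+.2f}, after: {steered @ rabbit:+.2f}")
print(f"green component after injection: {steered @ green:+.2f}")
```

<p>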
In unfaithful reasoning examples, they observed the model taking a target answer and actively working backwards to confabulate intermediate computation values that would lead to that target.</p><h2><strong>Meta-Cognitive Abilities</strong></h2><p>They even have <strong>something like metacognition</strong>: they distinguish what they know from what they don&#8217;t, using internal uncertainty markers. That is, LLMs in some sense &#8220;realize&#8221; their limitations.</p><p>The study of entity recognition revealed mechanisms that could underlie simple metacognition - Claude exhibiting knowledge of aspects of its own knowledge. Features representing &#8220;knowing the answer&#8221; and &#8220;being unable to answer&#8221; get activated and inhibited by features representing particular famous entities.</p><p>However, beyond distinguishing familiar from unfamiliar entities, it&#8217;s unclear whether this reflects deeper self-awareness or just plausible guessing based on entity familiarity.</p><h2><strong>Personality Programming Through Fine-Tuning</strong></h2><p>Fine-tuning can dramatically change a model&#8217;s &#8220;character,&#8221; instilling new goals and properties. Essentially, we&#8217;re programming their personality.</p><p>The research revealed that some forms of generalization are acquired through fine-tuning - models form &#8220;harmful request&#8221; features active primarily in Human/Assistant contexts, which aggregate inputs from various harmful content-related features active in pretraining data contexts. The model forms new abstractions through fine-tuning, stitched together from concepts learned during pretraining.</p><h2><strong>The Staggering Complexity Behind &#8220;Hello&#8221;</strong></h2><p>Can you imagine the colossal complexity behind a simple &#8220;Hello&#8221;? 
Any explanation of how LLMs work is like retelling &#8220;War and Peace&#8221; in two sentences.</p><p>The most consistent finding was the massive complexity underlying model responses even in simple contexts. The mechanisms can apparently only be faithfully described using overwhelmingly large causal graphs. Even their simplified diagrams capture only a fraction of the true computational complexity.</p><p>The deeper you dig, the clearer it becomes - we need to keep a close eye on what&#8217;s happening in these &#8220;black&#8221; boxes.</p><div><hr></div><h2><strong>Circuit Tracing: The AI Microscope</strong></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eWIF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eWIF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 424w, https://substackcdn.com/image/fetch/$s_!eWIF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 848w, https://substackcdn.com/image/fetch/$s_!eWIF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 1272w, https://substackcdn.com/image/fetch/$s_!eWIF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eWIF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png" width="1456" height="680" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:680,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eWIF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 424w, https://substackcdn.com/image/fetch/$s_!eWIF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 848w, https://substackcdn.com/image/fetch/$s_!eWIF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 1272w, https://substackcdn.com/image/fetch/$s_!eWIF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9a932b32-f728-4918-9f73-262547d0d46d_1600x747.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Circuit tracing is the method that made all these discoveries possible: it&#8217;s like building an &#8220;AI microscope&#8221; that lets us identify patterns of activity and flows of information inside models.</p><p>This technique, pioneered by researchers at OpenAI and Anthropic in 2022-2023, addresses a fundamental problem: understanding model architecture isn&#8217;t enough to explain behavior. 
We need ways to trace specific information pathways within models.</p><p><strong>Key features of Circuit Tracing:</strong></p><ul><li><p>Creates simplified, interpretable replacement versions of original models, where complex layers are replaced with more transparent components</p></li><li><p>Builds attribution graphs showing information paths through neural network layers, with nodes representing features and edges showing causal interactions</p></li><li><p>Enables experimental hypothesis testing through intervention &#8211; exciting or suppressing specific features in the original model</p></li><li><p>Provides concrete evidence for specific mechanisms operating in particular contexts</p></li></ul><p>Circuit tracing is one of the key tools in mechanistic AI interpretability. It helps us not just predict model outputs, but truly understand how models arrive at decisions. This is critical for ensuring safety, explainability, and improving LLMs.</p><p>The method has already helped researchers discover how models recognize negations, perform arithmetic operations, and even revealed rudiments of &#8220;internal monologue&#8221; in some models.</p><p><strong>Current Limitations and Future Directions:</strong></p><p>Even on short, simple prompts, current methods only capture a fraction of total model computation. The mechanisms observed may have artifacts that don&#8217;t reflect underlying model behavior. It currently takes hours of human effort to understand the circuits revealed, even on prompts with just tens of words.</p><p>As AI systems become more capable and deployed in increasingly important contexts, interpretability research like this represents a high-risk, high-reward investment - a significant scientific challenge with potential to provide unique tools for ensuring AI transparency and alignment with human values.</p><p><strong>Circuit tracing makes model operation transparent</strong>, showing exactly how they process information and form responses. 
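</p><p>An attribution graph like the ones described above can be caricatured as a weighted directed graph in which an output&#8217;s attribution to an input is the sum, over paths, of the product of edge weights. The feature names and weights below are invented for illustration, echoing the Dallas example:</p>

```python
# Tiny attribution graph: nodes are features, edge weights are causal strengths.
# All names and weights here are invented for illustration.
EDGES = {
    ("Dallas", "Texas"): 0.9,            # "Dallas is in Texas"
    ("Texas", "Austin"): 0.85,           # "the capital of Texas is Austin"
    ("capital", "say-a-capital"): 0.8,
    ("say-a-capital", "Austin"): 0.7,
}

def attribution(src: str, dst: str) -> float:
    """Sum over all paths from src to dst of the product of edge weights."""
    total = 0.0
    for (a, b), w in EDGES.items():
        if a == src:
            total += w if b == dst else w * attribution(b, dst)
    return total

print(attribution("Dallas", "Austin"))   # 0.9 * 0.85, via the intermediate "Texas" feature
print(attribution("capital", "Austin"))  # 0.8 * 0.7
```

<p>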
This is extremely important for the future development of interpretable AI.</p><p>The findings aren&#8217;t just scientifically interesting, they represent significant progress toward understanding AI systems and making them reliable. As models grow more sophisticated, predicting their mechanisms will become more difficult, making effective exploration tools like circuit tracing increasingly essential.</p><p>Understanding these mechanisms is crucial not just for AI safety, but for grasping what it means to think in an age where artificial intelligence increasingly resembles &#8211; yet remains alien to &#8211; our own cognition.</p>]]></content:encoded></item><item><title><![CDATA[The Dead Internet Theory: When Algorithms Replace Humans]]></title><description><![CDATA[How the internet is quietly transforming into a network of bots and algorithmic interactions]]></description><link>https://www.bulaev.net/p/the-dead-internet-theory-when-algorithms</link><guid isPermaLink="false">https://www.bulaev.net/p/the-dead-internet-theory-when-algorithms</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Thu, 25 Sep 2025 12:25:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3d0e5c97-8c87-4051-97c1-af491114ec64_976x549.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://en.wikipedia.org/wiki/Dead_Internet_theory#:~:text=7%20References-,Origins%20and%20spread,other%20articles%20on%20the%20topic.">The Dead Internet Theory</a> is a concept claiming that a large portion of the modern internet consists of automatically generated content and bots instead of real people. 
What recently seemed like a paranoid conspiracy theory is now being confirmed by scientific research and statistics from major companies.</p><h2><strong>Origins and Evolution</strong></h2><p>The theory first gained attention in 2021 after a viral forum post titled &#8220;Dead Internet Theory: Most Of The Internet Is Fake.&#8221; Initially considered a conspiracy theory, it has gained new validation with the development of AI technologies and has shifted into the category of predictions coming true before our eyes.</p><p>Interestingly, the roots of this idea can be traced back to the early 2010s, when researchers began noticing the growth of automated traffic. But it was the explosive development of generative AI in 2022-2024 that turned the theory into a frightening reality.</p><h2><strong>The Stakes Are High</strong></h2><p>Understanding the scale of internet automation is critically important for information hygiene and preserving authentic human interactions. If content and communication are increasingly generated by algorithms, this fundamentally changes the nature of social connections and the information landscape.</p><p><strong>The numbers don&#8217;t lie</strong>: According to <a href="https://www.imperva.com/blog/five-key-takeaways-from-the-2024-imperva-bad-bot-report/">Imperva&#8217;s 2024 data</a>, automated traffic jumped from 42.3% in 2021 to a record 49.6% in 2023. For the first time in history, bots generate more internet traffic than living humans. Projections show that automated traffic could exceed 50% by 2028.</p><h2><strong>Key Features of the Modern &#8220;Dead&#8221; Internet</strong></h2><h3><strong>Automated Traffic</strong></h3><p>According to 2023 data, approximately 50% of web traffic is generated not by people, but by automated programs. 
These aren&#8217;t just search bots - they&#8217;re sophisticated systems that mimic human behavior with frightening accuracy.</p><h3><strong>Habsburg Syndrome (Habsburg AI)</strong></h3><p>A phenomenon where AI trains on data created by other AI, leading to content quality degradation with each generation of models. The term was coined by researcher Jathan Sadowski, <a href="https://worldcrunch.com/tech-science/ai-inbreeding-the-phenomenon-threatening-artificial-intelligence/">drawing a parallel</a> with the degeneration of the Habsburg dynasty due to inbreeding.</p><p><strong>Scientific confirmation</strong>: In 2024, the journal <a href="https://www.nature.com/articles/s41586-024-07566-y">Nature published</a> groundbreaking research by Ilia Shumailov and colleagues showing that AI models &#8220;collapse&#8221; when trained on data generated by other AI systems. Models gradually lose information about the real world, and &#8220;tails of the original content distribution disappear irreversibly.&#8221;</p><p>Researchers from Rice University <a href="https://news.rice.edu/news/2024/breaking-mad-generative-ai-could-break-internet">discovered</a> an even more troubling pattern: after five generations of training on their own output data, AI models demonstrate serious quality degradation. This &#8220;inbreeding&#8221; leads to loss of diversity and accuracy.</p><h3><strong>Artificial Popularity</strong></h3><p>Using bots to artificially inflate engagement metrics (likes, comments, reposts). This was particularly evident on Facebook in 2024, where AI-generated images dubbed &#8220;AI slop&#8221; began going viral. 
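</p><p>The recursive-training degradation described in the Nature and Rice studies above can be caricatured in a few lines of code: fit a distribution, sample from the fit, refit on the samples, and repeat. Finite sampling keeps losing the tails, so the spread decays generation by generation (a toy illustration, not a reproduction of either paper):</p>

```python
import random
import statistics

random.seed(0)
mu, sigma = 0.0, 1.0  # "generation zero" is real data: a standard Gaussian
spreads = [sigma]
for generation in range(500):
    # Each new model trains only on samples from the previous model.
    samples = [random.gauss(mu, sigma) for _ in range(10)]
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)
    spreads.append(sigma)

print(f"spread: generation 0 = {spreads[0]:.2f}, generation 500 = {spreads[-1]:.4f}")
```

<p>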
Fake images of flight attendants, children with artwork, and various &#8220;Shrimp Jesus&#8221; depictions gathered thousands of likes and shares.</p><h3><strong>Automated Curation</strong></h3><p>Algorithmic systems that determine what content will be shown to users are becoming increasingly aggressive in promoting AI-generated content.</p><h2><strong>Real-World Implementation</strong></h2><p>Major platforms like Meta (Facebook, Instagram) are implementing more AI features. For example, in 2025, Instagram began testing a <a href="https://techcrunch.com/2025/03/21/meta-spotted-testing-ai-generated-comments-on-instagram/">&#8220;Write with Meta AI&#8221;</a> function that analyzes photos and suggests ready-made comments. Platform X (formerly Twitter) uses user content to train its AI assistant Grok.</p><p>As a result, users increasingly interact with automatically generated content, sometimes without realizing it. Social networks are transforming from platforms for human communication into networks of algorithmic and bot interactions.</p><p><strong>Industry voice</strong>: Even Sam Altman, CEO of OpenAI, expressed concerns about Dead Internet Theory, particularly regarding AI model collapse when training on self-generated data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K52C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K52C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 424w, 
https://substackcdn.com/image/fetch/$s_!K52C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 848w, https://substackcdn.com/image/fetch/$s_!K52C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 1272w, https://substackcdn.com/image/fetch/$s_!K52C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K52C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png" width="1200" height="444" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:444,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K52C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 424w, 
https://substackcdn.com/image/fetch/$s_!K52C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 848w, https://substackcdn.com/image/fetch/$s_!K52C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 1272w, https://substackcdn.com/image/fetch/$s_!K52C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf336036-b780-45c1-bbad-210fee7fd39e_1200x444.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>Pitfalls and Threats</strong></h2><h3><strong>Information Quality Degradation</strong></h3><p>When AI begins training on content created by other AI, a &#8220;telephone game&#8221; effect emerges - with each generation, information becomes more distorted and less connected to reality.</p><h3><strong>Loss of Human Context</strong></h3><p>Algorithms don&#8217;t understand the subtleties of human communication, cultural nuances, and emotional context. Their widespread use could lead to the impoverishment and standardization of communication.</p><h3><strong>Platform Enshittification</strong></h3><p>Writer <a href="https://pathocking.com/2025/01/06/handout-enshittification-cory-doctorow/#:~:text=What%20is%20Enshittification?,difficult%20to%20use%20or%20enjoy.">Cory Doctorow describes</a> platform evolution as a process of &#8220;enshittification,&#8221; where first they serve users well, then exploit users to attract business, and finally exploit business to maximize their own profits.</p><h2><strong>Industry Response</strong></h2><p>The problem is becoming so serious that major tech companies are starting to take action. 
For example, <a href="https://www.engadget.com/cloudflare-is-taking-a-stand-against-ai-website-scrapers-220030471.html">Cloudflare proposed</a> in 2024 limiting bot access to websites and forcing them to pay for entry.</p><h2><strong>Academic Research 2024-2025</strong></h2><p>In January 2025, a comprehensive academic survey, &#8220;The Dead Internet Theory: A Survey on Artificial Interactions and the Future of Social Media,&#8221; was published <a href="https://arxiv.org/abs/2502.00007">on ArXiv,</a> exploring the origins, core claims, and implications of the theory.</p><p>The journal <a href="https://www.researchgate.net/publication/377992285_Artificial_influencers_and_the_dead_internet_theory">AI &amp; Society published research</a> in 2024 about artificial influencers and their connection to Dead Internet Theory, showing how the line between real and virtual personalities is blurring.</p><h2><strong>What This Means for the Future</strong></h2><p>We stand on the threshold of a fundamental transformation of the internet. Projections show that automated traffic could exceed 50% by 2028. This means that most of the content we interact with online will be created by machines for machines.</p><p><strong>The question isn&#8217;t whether this will happen, but how we&#8217;ll adapt to it.</strong> Do we need new ways to verify human content? Should platforms be required to label AI-generated content? How do we preserve authentic human connections in a digital world?</p><h2><strong>The Broader Implications</strong></h2><p>The Dead Internet Theory raises profound questions about the nature of online reality and human agency in digital spaces. 
If our feeds, recommendations, and interactions are increasingly mediated by AI systems trained on AI-generated content, we risk creating a closed loop where human culture becomes increasingly divorced from digital culture.</p><p>This isn&#8217;t just about technology - it&#8217;s about the future of human communication, creativity, and connection. The internet was supposed to democratize information and bring people together. Instead, we might be witnessing its transformation into a hall of mirrors where algorithms talk to algorithms while humans become increasingly isolated spectators.</p><h2><strong>What to Read/Watch</strong></h2><p><strong>Scientific Research:</strong></p><ul><li><p><a href="https://www.nature.com/articles/s41586-024-07566-y">Shumailov, I. et al. &#8220;AI models collapse when trained on recursively generated data&#8221; Nature (2024)</a></p></li><li><p><a href="https://arxiv.org/abs/2305.17493">ArXiv preprint: &#8220;The curse of recursion: training on generated data makes models forget&#8221;</a></p></li><li><p><a href="https://news.rice.edu/news/2024/breaking-mad-generative-ai-could-break-internet">Rice University research on &#8220;Model Autophagy Disorder&#8221;</a></p></li></ul><p><strong>Media and Analysis:</strong></p><ul><li><p><a href="https://www.theatlantic.com/technology/archive/2021/08/dead-internet-theory-wrong-but-feels-right/619937/">The Atlantic: &#8220;Maybe You Missed It, but the Internet &#8216;Died&#8217; Five Years Ago&#8221;</a></p></li><li><p><a href="https://time.com/7316046/sam-altman-dead-internet-theory/">TIME Magazine: &#8220;Sam Altman Voices Concern Over Dead Internet Theory&#8221;</a></p></li><li><p><a href="https://www.popularmechanics.com/science/a65997294/dead-internet-explained/">Popular Mechanics: &#8220;The Internet Will Be More Dead Than Alive Within 3 Years&#8221;</a></p></li><li><p><a href="https://www.imperva.com/resources/resource-library/reports/2024-bad-bot-report/">Imperva analytical 
report on automated traffic share</a></p></li><li><p><a href="https://pluralistic.net/2023/01/21/potemkin-ai/#hey-guys">Cory Doctorow&#8217;s &#8220;The Enshittification of TikTok&#8221; on platform degradation</a></p></li><li><p><a href="https://www.vice.com/en/article/dead-internet-theory-is-back-thanks-to-all-of-that-ai-slop/">VICE: &#8220;&#8217;Dead Internet Theory&#8217; Is Back Thanks to All of That AI Slop&#8221;</a></p></li><li><p><a href="https://theconversation.com/the-dead-internet-theory-makes-eerie-claims-about-an-ai-run-web-the-truth-is-more-sinister-229609">The Conversation: &#8220;The &#8216;dead internet theory&#8217; makes eerie claims about an AI-run web&#8221;</a></p></li></ul><div><hr></div><p><em>The Dead Internet Theory isn&#8217;t just a technological trend - it&#8217;s a challenge to our understanding of what it means to be human in the digital age. While we debate whether this threat is real, algorithms are already shaping our perception of the world.</em></p><p><strong>Share your thoughts: Have you noticed signs of the &#8220;dead internet&#8221; in your online experience?</strong></p><p><em>If this post was helpful, please like and share with colleagues. 
Subscribe to the newsletter to stay updated on artificial intelligence and digital trends.</em></p>]]></content:encoded></item><item><title><![CDATA[Duolingo’s AI-First Shift: Growth, Jobs, and the Future of Learning]]></title><description><![CDATA[When Duolingo announced its move to an &#8220;AI-first&#8221; strategy, the online reaction was immediate and filled with assumptions of mass layoffs.]]></description><link>https://www.bulaev.net/p/duolingos-ai-first-shift-growth-jobs</link><guid isPermaLink="false">https://www.bulaev.net/p/duolingos-ai-first-shift-growth-jobs</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Wed, 24 Sep 2025 21:56:39 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/74c0e449-fae7-4b8b-8ddf-9850527e4c27_1336x1007.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When Duolingo announced its move to an &#8220;AI-first&#8221; strategy, the online reaction was immediate and filled with assumptions of mass layoffs. Many observers connected the announcement to trends across the technology sector, where adopting AI has often gone hand in hand with reducing staff.</p><p>Yet Duolingo&#8217;s implementation of AI has unfolded in a very different manner. CEO Luis von Ahn clarified that every full-time employee kept their job, and the company leaned on AI as a tool for expansion, speed, and global accessibility rather than workforce reduction.</p><p>The pace of change since that announcement has been striking. In twelve months, Duolingo launched 148 new language courses. The figure is larger than the total number of courses released over its prior twelve years of operation.</p><p>This expansion has allowed the company to reach into regions and communities that had previously been underserved. 
Several of the new courses target underrepresented languages and expand access to speakers who had little prior representation in digital learning tools.</p><p>The business effects have been visible in Duolingo&#8217;s performance numbers. As of late 2024 the platform recorded 130 million monthly active users and 47 million daily active users. That represents a 51% year-over-year increase.</p><p>Subscribers grew to 10.9 million, representing a 37% jump compared with the previous year. With this momentum analysts forecast annual revenue will reach 1 billion USD in 2025. <a href="https://chiefaiofficer.com/blog/how-duolingos-ai-first-strategy-drove-51-user-growth-and-1-billion-revenue-forecast/?_gl=1*ot1ppf*_ga*ODg0NDczMzM5LjE3NTgyMjg0Mzk.*_ga_2W6XVQKMG3*czE3NTg3NTA0MTIkbzIkZzEkdDE3NTg3NTA4NjgkajI0JGwwJGgw*_gcl_au*MTkzMjMyMjUzMS4xNzU4MjI4NDM5">These statistics show</a> that Duolingo&#8217;s AI-centered reorientation has become directly tied to its continuing revenue growth and its global leadership in digital language learning.</p><blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0sFd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0sFd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 424w, https://substackcdn.com/image/fetch/$s_!0sFd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 848w, 
https://substackcdn.com/image/fetch/$s_!0sFd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 1272w, https://substackcdn.com/image/fetch/$s_!0sFd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0sFd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png" width="1436" height="480" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:1436,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0sFd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 424w, https://substackcdn.com/image/fetch/$s_!0sFd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 848w, 
https://substackcdn.com/image/fetch/$s_!0sFd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 1272w, https://substackcdn.com/image/fetch/$s_!0sFd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F657b79f0-c3c5-4787-81cf-7b061dfacd6b_1436x480.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div></blockquote><p><a href="https://technologymagazine.com/ai-and-machine-learning/duolingos-ai-first-strategy-explained">AI is 
threaded</a> into almost every aspect of Duolingo&#8217;s operations. On the product side, the proprietary Birdbrain system personalizes lessons in real time, customizing the difficulty of exercises based on each learner&#8217;s interactions.</p><p>This creates a feedback loop so that practice remains challenging but does not overwhelm. To make this personalization efficient and scalable, Duolingo re-engineered its Session Generator in Scala, cutting processing times from 750 milliseconds to 14 milliseconds. For users, this speedup feels seamless, keeping lessons smooth even as AI calculates adjustments in the background.</p><p>AI is not limited to the learner experience. The company has also woven machine learning tools into hiring and performance review systems, areas where many companies still lag.</p><p>Employees are encouraged to explore AI in their own work as well. &#8220;f-r-AI-days,&#8221; set aside as no-pressure days for experimentation and skill development, serve as internal laboratories where staff can learn, test, and adapt. This pattern fits with Duolingo&#8217;s long-standing cultural emphasis on continuous improvement.</p><p>Though the company avoided any layoffs among full-time staff, the changes were not without effects on certain groups. Around 10% of contracted translators lost work as the company leveraged AI to automate basic translation. The news sparked debate. Critics questioned fairness and transparency, pointing toward the risks of replacing contract labor with algorithms. <a href="https://fortune.com/2025/08/18/duolingo-ceo-admits-controversial-ai-memo-did-not-give-enough-context-insists-company-never-laid-off-full-time-employees/">Von Ahn responded </a>by stressing the distinction between contracted roles and the long-term employee base, adding reassurance that Duolingo&#8217;s workforce stability would not be threatened by adopting AI at scale.</p><p>Questions about content quality have also emerged. 
Observers have raised concerns about whether AI is equipped to handle nuanced language instruction, particularly as cultural or dialect differences come into play. Duolingo acknowledged these concerns while underscoring that human educators and language experts remain involved in overseeing AI outputs. The company has positioned this human-in-the-loop process as essential, maintaining that expanded efficiency does not mean sacrificing accuracy or cultural authenticity.</p><p>A significant reason behind Duolingo&#8217;s ability to roll out AI effectively lies in its culture and internal structures. In late 2024, Duolingo published an internal handbook outlining what it calls the <strong>&#8220;Green Machine&#8221; philosophy.</strong></p><p>This model compares the organization to a computer processor. The objective is to minimize the lag between decision-making, action, feedback, and revision. By shrinking the interval between each step, the company maintains high responsiveness.</p><p>Unlike many corporations that adopt a Minimum Viable Product approach, releasing stripped-down test products, Duolingo deliberately rejects MVP thinking. Instead, the company insists on combining speed with quality. The result is an organizational system that can release rapid updates without eroding user trust.</p><p>This internal philosophy has proven particularly effective for scaling AI adoption. The Green Machine model means feedback loops occur quickly, so any issues or improvements with AI tools are iterated on rapidly. When paired with initiatives like f-r-AI-days, this philosophy gives staff the foundation and safety to learn new technologies while still delivering results.</p><p>The company views employee development as essential to scaling the platform globally, demonstrating that the people who work within a system determine how technology ultimately succeeds.</p><p>The scope of Duolingo&#8217;s AI-driven expansion is broader than just internal metrics. 
By making its seven most popular non-English languages available across all 28 of its user interface languages, the company has dramatically widened access.</p><p>This step directly benefits learners in Latin America, Asia, and Europe, where users previously faced barriers if they did not speak English. AI-powered translation and expansion have therefore allowed the platform to serve diverse communities more inclusively.</p><p>Duolingo defines itself as more than a language learning application. It positions itself as an educational enterprise that lowers barriers to knowledge. The AI-first shift mirrors this mission: scaling up courses, supporting different interface combinations, and personalizing the journey of each student.</p><p>At the same time, the company became an example used in broader AI debates. Where other firms introduce AI under narratives of efficiency gains that often mask staff reductions, Duolingo chose to emphasize communication and reassurance, reinforcing trust and stability.</p><p>The numbers speak with clarity:</p><ul><li><p>148 new courses in 12 months.</p></li><li><p>A 51% jump in daily active users.</p></li><li><p>10.9 million people committed to the paid version of the product.</p></li><li><p>A forecast of one billion dollars in annual revenue.</p></li></ul><p>These outcomes suggest an AI rewrite of corporate processes has helped Duolingo achieve scale that would have been unthinkable with traditional human-only workflows. Yet the company continues to underline the role of human oversight, especially in managing cultural quality and guiding organizational direction.</p><p>Duolingo&#8217;s experience contains lessons for other firms considering large-scale AI adoption. The key takeaway is that the technology itself is not sufficient without a culture that promotes rapid iteration and maintains trust. 
The company&#8217;s deliberate rejection of the MVP mentality, its internal philosophy organized around eliminating lag, and its visible reassurance about jobs created the environment where AI became an accelerator rather than a disruptor.</p>]]></content:encoded></item><item><title><![CDATA[The Borrowed Mind: Are We Outsourcing Our Capacity to Think?]]></title><description><![CDATA[There's a moment that's becoming disturbingly familiar - when you're mid-sentence and suddenly can't tell if that insight you just shared came from your own thinking or from an AI tool you consulted earlier.]]></description><link>https://www.bulaev.net/p/the-borrowed-mind-are-we-outsourcing</link><guid isPermaLink="false">https://www.bulaev.net/p/the-borrowed-mind-are-we-outsourcing</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Tue, 16 Sep 2025 21:40:14 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f3518776-ff28-4590-a517-3d544117b911_1024x1024.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QUOy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QUOy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QUOy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!QUOy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QUOy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QUOy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg" width="320" height="306" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:306,&quot;width&quot;:320,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14436,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.bulaev.net/i/173791773?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QUOy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QUOy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 
848w, https://substackcdn.com/image/fetch/$s_!QUOy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QUOy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb230bc4f-b064-4bc4-ba2c-41c535fd521e_320x306.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>There's a moment that's becoming disturbingly familiar - when you're mid-sentence and suddenly can't tell if that 
insight you just shared came from your own thinking or from an AI tool you consulted earlier. That cognitive vertigo, that brief loss of intellectual orientation, might be the defining experience of our current moment.</p><p>Are we slowly borrowing our own minds from machines?</p><p>The idea of "The Borrowed Mind" - actively promoted by <strong>John Nosta</strong> - cuts to something we're all sensing but maybe not discussing: what happens when our thinking gets so intertwined with AI that we lose track of where we end and the machine begins?</p><p>This isn't abstract philosophizing. <a href="https://pmc.ncbi.nlm.nih.gov/articles/PMC12089268/">Recent research</a> is revealing genuinely unsettling insights into what happens in our brains when we lean too heavily on our digital thinking partners.</p><h2><strong>The Neuroscience of Cognitive Dependency</strong></h2><p>Brain scans show something remarkable: when people rely on AI too early in their thinking process, neural activity becomes less synchronized. Our brains literally work differently when we outsource thinking too quickly. Researchers call this <a href="https://daviddidau.substack.com/p/the-cost-of-borrowed-thought">"cognitive debt"</a> - weaker engagement with ideas, memories that don't stick, and a reduced sense of ownership over our own thoughts.</p><p>But here's what offers hope: if you think first, then bring AI in as a collaborator, those negative effects disappear entirely. Sequencing isn't just important - it's everything.</p><p>The teenager who writes her resume on her own first - messy, imperfect, authentically hers - then uses AI to polish it creates something genuinely better. Her friend who starts with AI? The resume is flawless and completely forgettable. 
Same tools, opposite outcomes, all because of when they entered the process.</p><h2><strong>The Socratic Echo: We've Been Here Before</strong></h2><p>When calculators became common, teachers worried kids would lose the ability to do basic math. When GPS arrived, people fretted about losing our sense of direction. Each time, the concerns were both valid and incomplete.</p><p>Socrates was genuinely worried that writing would ruin human memory. He thought storing knowledge externally would make minds lazy, that people would mistake the ability to look things up for actual understanding. He wasn't entirely wrong - we did lose something when we externalized memory. Most of us can barely remember three phone numbers while previous generations knew dozens by heart.</p><p>But we gained something profound too: the ability to build on accumulated knowledge rather than constantly reinventing the wheel. The question that keeps emerging: are we making a similar trade with thinking itself? And unlike previous technological shifts, this one feels more intimate, more fundamentally cognitive.</p><h2><strong>The Universal Language of Thought</strong></h2><p>AI research is revealing something that borders on the mystical: large language models show patterns suggesting a universal "language of thought" - neurons that represent concepts like "big" or "urgent" or "beautiful" in remarkably similar ways across completely different human languages.</p><p>This suggests AI might not just be mimicking our thinking - it could be revealing the fundamental architecture of thought itself. If AI truly understands these deep patterns of cognition, then using it as a thinking partner might be less like borrowing someone else's mind and more like plugging into the basic structures of intelligence.</p><p>But this creates a paradox that should give us pause. The more AI reveals about how thinking works, the more we risk losing touch with our own cognitive processes. 
We're discovering the patterns of thought just as we're outsourcing the act of thinking itself.</p><h2><strong>The Slippery Slope to AI Psychosis</strong></h2><p>We're seeing the emergence of what researchers call <a href="https://www.psychologytoday.com/us/blog/the-digital-self/202509/the-borrowed-mind">"AI psychosis"</a> - not the dramatic kind involving hallucinations, but the quiet, creeping inability to distinguish between original thoughts and machine-amplified ideas. It's not about plagiarism or academic dishonesty. It's about losing track of where your mind ends and the machine begins.</p><p>The email you're certain contains your most elegant thinking until you realize you can't remember if that metaphor was yours or something you read in an AI-generated report. The presentation where you finish speaking and think "I'm really insightful" before wondering if any of those ideas were actually your own.</p><p>This isn't happening to other people in some distant future. It's happening right now, in boardrooms and coffee shops and late-night writing sessions. People are losing the ability to locate the boundaries of their own minds.</p><h2><strong>The Authenticity Imperative</strong></h2><p>There's an emerging AI tone - polished, confident, slightly generic - that's becoming as recognizable as corporate jargon. Everyone's content is starting to sound eerily similar. This is exactly why an authentic voice has become more valuable than gold.</p><p>In a world where anyone can generate flawless content with a few prompts, the thinkers who stand out are those who maintain genuine connection to their own messy, imperfect, wonderfully human cognitive processes. The rough edges, the false starts, the "wait, let me think about this differently" moments - that's where real connection happens.</p><p>This is why tools that capture actual human voice, memory, and lived experience matter more than generic intelligence generators. 
It's the difference between a musician using a high-quality amplifier and someone pressing play on Spotify. Both make sound, but only one represents authentic expression.</p><h2><strong>The Great Cognitive Choice</strong></h2><p>AI systems are getting unnervingly good at reasoning, debating, even catching their own mistakes. We're watching machines think out loud, correct themselves mid-response, and provide completely different (often better) answers after self-reflection.</p><p>This brings us to what feels like a crossroads: the Great Cognitive Choice. Do we let these systems become cognitive prostheses we can't function without? Or do we maintain ownership of our thinking and invite AI in as collaborators only after we've done the hard work ourselves?</p><p>That familiar pull to "just ask the AI" when we hit a difficult problem is seductive. The twenty minutes of wrestling with confusion ourselves feels frustrating, inefficient, messy. It's also where we feel most human.</p><p>The stakes feel enormous. Choose poorly, and we risk creating a generation that can produce sophisticated content but has lost the ability to think slowly, deeply, originally. Choose wisely, and we might be entering the next stage of human cognition - where AI becomes what writing was to memory: not a replacement, but a tool that frees us to tackle bigger, more complex problems than ever before.</p><h2><strong>The Path Forward</strong></h2><p>What does intentional cognitive practice look like in an AI world? It starts with what we might call "cognitive hygiene" - being as deliberate about our thinking habits as we are about our physical health.</p><p><strong>Start messy.</strong> Before reaching for AI assistance, spend real time with blank pages and scattered thoughts. Those initial scribbles may be terrible, but they're authentically human. 
Let your brain do what it evolved to do - make connections, generate ideas, wrestle with complexity.</p><p><strong>Use AI as a sparring partner.</strong> Once you have your own perspective, bring in AI not to write for you, but to argue with you. "Here's what I think about this problem. What am I missing? Where are the holes in my logic?" This creates collaboration rather than outsourcing.</p><p><strong>Track the boundary.</strong> Pay attention to which ideas feel genuinely yours and which ones feel borrowed or generated. This isn't about avoiding AI influence entirely - it's about maintaining conscious agency over your cognitive process.</p><p><strong>Exercise thinking muscles.</strong> Just as we need physical exercise even though elevators exist, we need to think through hard problems even though AI exists. Read challenging material. Have conversations without looking things up. Sit with confusion instead of immediately seeking AI clarification.</p><p><strong>Maintain cognitive fitness.</strong> In an increasingly automated world, mental agility requires deliberate practice. Engage with ideas that challenge you. Think slowly. Let your mind wander without immediately capturing and optimizing every thought.</p><p>The borrowed mind represents both our biggest risk and our most interesting opportunity. Whether it becomes our cognitive crutch or our thinking amplifier depends entirely on choices we make right now.</p><p>That moment of cognitive vertigo - when you can't locate the source of your own thoughts - is a warning sign worth heeding. The question isn't whether AI will change how we think. It already has. The question is whether we'll maintain conscious agency over that change.</p><p>What do you think? Are we at risk of losing independent thought, or entering the next stage of human-extended cognition? 
And maybe more importantly - have you experienced that disorienting moment when you couldn't tell where your thinking ended and the machine's began?</p><p><em>Share your experiences in the comments. When have you struggled to distinguish between your own thoughts and AI-generated ones? What strategies help you maintain an authentic voice? Are we overthinking this, or witnessing something more significant than we realize?</em></p>]]></content:encoded></item><item><title><![CDATA[When Jung Meets Machine Learning: Making the Digital Unconscious Conscious]]></title><description><![CDATA[Sometimes you read a line that hits harder than expected.]]></description><link>https://www.bulaev.net/p/when-jung-meets-machine-learning</link><guid isPermaLink="false">https://www.bulaev.net/p/when-jung-meets-machine-learning</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Wed, 10 Sep 2025 21:46:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3d942129-cb26-4058-a8e1-ba854ad68577_660x440.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Sometimes you read a line that hits harder than expected. Jung said something along the lines of: <strong>"Until you make the unconscious conscious, it will direct your life and you will call it fate."</strong></p><p>That's psychology. But here's the twist - it's also unexpectedly useful for understanding how large language models behave.</p><h2><strong>The Training Unconscious</strong></h2><p>Think of it this way: LLMs are trained on oceans of data - billions of web pages, books, articles, conversations. Hidden inside this vast corpus are patterns, biases, associations, cultural assumptions, and implicit knowledge structures. 
It's their own kind of "unconscious" - a vast repository of learned behaviors and responses that the model can't directly access or explain, yet which fundamentally shapes every output.</p><p>Left unchecked, these hidden structures surface in outputs that feel random, inconsistent, or "fated." The model produces responses that seem to come from nowhere, echoing patterns it absorbed during training without any conscious understanding of why. It hallucinates facts, exhibits biases it can't name, or suddenly shifts tone in ways that feel arbitrary. From the outside, it looks like digital destiny - unpredictable, uncontrollable, almost mystical in its randomness.</p><p>But here's where Jung's insight becomes practically powerful: when we push the model to explain itself, to surface its hidden reasoning, to recognize what it doesn't know, or to trace the logic behind its responses - that's like making the unconscious conscious. Suddenly there's more clarity, less noise, more intentional control over outcomes.</p><h2><strong>The Parallels Run Deep</strong></h2><p>The psychology here isn't literal - machines aren't self-aware the way humans are, and I'm not suggesting LLMs have genuine consciousness. 
But the structural parallels are remarkably useful:</p><p><strong>Jung's Framework:</strong></p><ul><li><p>Inner conflicts and unconscious patterns shape outer reality unless brought into awareness</p></li><li><p>Without consciousness, we experience life as happening TO us rather than being shaped BY us</p></li><li><p>Awareness brings choice; unconsciousness feels like destiny</p></li><li><p>The shadow contains both destructive patterns and untapped potential</p></li><li><p>Integration requires active engagement with what's hidden</p></li></ul><p><strong>LLM Behavior:</strong></p><ul><li><p>Hidden training patterns shape responses unless surfaced and actively managed</p></li><li><p>Without transparency mechanisms, outputs feel unpredictable and uncontrollable</p></li><li><p>Explainability brings steering capability; opacity feels like randomness</p></li><li><p>Training data contains both problematic biases and valuable knowledge</p></li><li><p>Better performance requires deliberately surfacing and working with hidden structures</p></li></ul><h2><strong>Practical Applications</strong></h2><p>This isn't just philosophical musing - it has real implications for how we work with AI systems. When we design prompting strategies that ask models to "think step by step," explain their reasoning, or acknowledge uncertainty, we're essentially doing therapeutic work with the training unconscious. We're creating conditions for the model to surface patterns that would otherwise remain hidden and potentially problematic.</p><p>Consider how Jung approached therapy: not by trying to eliminate the unconscious, but by bringing it into dialogue with conscious awareness. Similarly, the most effective AI applications don't try to eliminate the "training unconscious" - that vast repository of learned patterns is actually the source of the model's capabilities. 
Instead, they create mechanisms for that unconscious knowledge to surface in controlled, intentional ways.</p><h2><strong>The Shadow of Scale</strong></h2><p>Jung wrote extensively about the "shadow" - the parts of ourselves we don't want to acknowledge but which inevitably influence our behavior. LLMs have their own version of this: the biases, misconceptions, and problematic associations embedded in training data. Just as Jung suggested we can't eliminate our shadow but must learn to work with it consciously, we can't eliminate bias from AI systems - but we can develop better ways to surface and manage it.</p><p>The alternative is what we see too often: AI systems that perpetuate harmful patterns precisely because those patterns remain unconscious and unexamined. The bias doesn't disappear when ignored - it just operates outside of conscious control, manifesting as what feels like inevitable, fated outcomes.</p><h2><strong>Beyond the Metaphor</strong></h2><p>What makes this comparison more than just clever wordplay is how both point toward the same fundamental insight: consciousness isn't about elimination of the unconscious, but about bringing it into a productive relationship with intentional awareness. Whether we're talking about human psychology or machine behavior, the goal isn't perfect rational control - it's developing better ways to work with the vast, hidden structures that actually drive most behavior.</p><p>For humans, this might mean recognizing how childhood patterns still influence adult relationships. For LLMs, it might mean surfacing how certain training examples disproportionately influence responses to specific types of questions. In both cases, awareness doesn't eliminate the underlying patterns - it creates space for more intentional engagement with them.</p><h2><strong>The Future of Digital Psychology</strong></h2><p>As AI systems become more sophisticated and integrated into daily life, this parallel becomes more than academic. 
We're essentially in the early stages of developing a kind of "digital psychology" - methods for understanding and working with the hidden mental structures of artificial systems.</p><p>Jung's insight that unconscious patterns feel like fate until made conscious offers a surprisingly practical framework for AI development. Instead of accepting unpredictable model behavior as inevitable, we can treat it as a signal that important patterns remain hidden and need to be surfaced.</p><p>The psychology here runs deeper than the obvious parallels might suggest. Both Jung's work and effective AI development share a fundamental understanding: the most powerful systems aren't those that eliminate complexity, but those that develop better relationships with it. Whether we're talking about the human psyche or large language models, sustainable progress comes not from perfect control, but from bringing hidden patterns into conscious dialogue.</p><p>Isn't that exactly what Jung was pointing at in people's lives too? The unconscious isn't the enemy of consciousness - it's its necessary partner. The same may well be true for the relationship between AI capabilities and AI alignment.</p><p>And perhaps that's the most profound parallel of all: in both human psychology and machine learning, the path forward isn't through elimination of complexity, but through developing more conscious relationships with the hidden structures that shape behavior. 
The unconscious, whether human or digital, isn't fate - it's raw material for more intentional creation.</p>]]></content:encoded></item><item><title><![CDATA[𝐄𝐦𝐩𝐥𝐨𝐲𝐞𝐞 𝐂𝐨𝐧𝐭𝐞𝐧𝐭 𝐏𝐨𝐬𝐭𝐢𝐧𝐠 𝐚𝐧𝐝 𝐂𝐨𝐦𝐩𝐚𝐧𝐲 𝐕𝐚𝐥𝐮𝐞: 𝐀 𝐒𝐭𝐫𝐨𝐧𝐠 𝐂𝐨𝐫𝐫𝐞𝐥𝐚𝐭𝐢𝐨𝐧]]></title><description><![CDATA[If you&#8217;re still wondering whether employees posting content actually moves the needle, the latest data makes it hard to argue otherwise.]]></description><link>https://www.bulaev.net/p/578</link><guid isPermaLink="false">https://www.bulaev.net/p/578</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Thu, 04 Sep 2025 16:57:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4df1c24a-3526-427b-b297-b6bd976fe943_720x700.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>If you&#8217;re still wondering whether employees posting content actually moves the needle, the latest data makes it hard to argue otherwise.</p><p><a href="https://hingemarketing.com/uploads/hinge-research-employee-advocacy.pdf">Hinge Research Institute </a>found that companies with formal employee advocacy programs are over twice as likely to see revenue growth above 20%. 64% of advocates say they&#8217;ve attracted and developed new business through their activity. 45% go even further &#8211; saying advocacy generated entirely new revenue streams.</p><p><a href="https://influitive.com/resources/2023-state-brand-advocacy-report-employee-advocacy/">PostBeyond&#8217;s SOBA 2023 report</a> points the same way. 33% can directly tie revenue gains to advocacy. Half admit ROI is hard to track in detail. Still, 83% of these organizations have kept or increased budgets. 
That speaks volumes about perceived value at the leadership level.</p><h2><strong>Now the 2025 data makes the picture sharper:</strong></h2><ul><li><p>Employee participation can shoot up 50 times in the first year of program launch &#8211; insurance led with +5,315% (from 52 employees to 2,816)</p></li><li><p>On LinkedIn, employee posts are now averaging nearly <a href="https://www.agilitypr.com/pr-agency-news/2025-employee-advocacy-benchmark-report-from-advocacy-by-socialpubli-how-advocacy-supercharges-brand-visibility/">900,000 impressions a month</a> &#8211; far outstripping brand accounts on the same platform</p></li><li><p>Personal profiles see up to 8x higher engagement than company pages</p></li><li><p>In tech and consulting, employee-generated posts<a href="https://www.gaggleamp.com/employee-advocacy-statistics-you-need-to-know"> convert up to 25% more leads</a> compared to paid campaigns</p></li><li><p>In some industries, branded social engagement has jumped 200% to 300% once employees started sharing</p></li><li><p>23% of organizations <a href="https://dsmn8.com/blog/employee-advocacy-benchmark-report-2025/">hit cost&#8209;per&#8209;click under $1</a> with employee&#8209;created content</p></li><li><p>On Facebook, advocacy posts in tourism have hit 15% engagement rates &#8211; proof the effect is multi&#8209;platform</p></li><li><p>76% of people trust content from individuals over brands; 77% are more likely to buy after hearing from someone they trust</p></li><li><p>One European SaaS firm saw inbound partnership requests up 40% after 12 months of structured posting</p></li></ul><h2><strong>The benefits reach the individual as well:</strong></h2><ul><li><p>87% of advocates expand their professional networks significantly</p></li><li><p>44% gain recognition as thought leaders in their fields</p></li><li><p>34% of companies say employee engagement is the biggest impact of advocacy &#8211; ahead of sales or brand lift</p></li><li><p>71% report higher 
visibility, 65% stronger recognition, nearly 45% more inbound traffic, and 17% lower marketing spend</p></li></ul><p>On the organizational side, scaling advocacy now means leadership involvement &#8211; 73% plan to get executives sharing more, and 67% are putting money into better training and resources.</p><p>The reason this works is dead simple: your employees already belong to trust&#8209;based networks your official accounts will never reach. Every message they share is delivered in a context of credibility. That is why engagement rates are higher, CPC is lower, and brand conversations feel more genuine.</p><p>The real question has shifted. We&#8217;re no longer debating whether we should encourage employee advocacy. We&#8217;re being asked how fast we can equip our people to do it well &#8211; and how to support them without dulling their voice. That means training, clear guidelines, and letting authenticity lead.</p><p>By 2025, personal brands and company brands increasingly occupy the same space. Inside that overlap is where business value is built &#8211; in revenue, influence, and trust that can&#8217;t be bought.</p><p>So... 
are your people posting yet?</p>]]></content:encoded></item><item><title><![CDATA[AI Scientists: The First Peer-Reviewed Paper Generated by AI]]></title><description><![CDATA[Recently, I encountered The AI Scientist service, which claims to "fully" automate scientific discovery.]]></description><link>https://www.bulaev.net/p/ai-scientists-the-first-peer-reviewed</link><guid isPermaLink="false">https://www.bulaev.net/p/ai-scientists-the-first-peer-reviewed</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Thu, 04 Sep 2025 00:04:57 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/42f2aedd-7a62-4e2c-a2f6-af8f985b2c09_1280x600.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently, I encountered The AI Scientist service, which claims to "fully" automate scientific discovery. This is a system that formulates hypotheses, designs experiments, analyzes results, and even writes scientific papers - without human involvement.</p><ul><li><p><a href="https://arxiv.org/abs/2504.08066">The AI Scientist-v2</a> recently achieved a historic milestone: publishing the first scientific paper entirely generated by AI that was accepted at a workshop after peer review.</p></li><li><p>The system can conduct research around the clock, potentially finding patterns that humans might miss due to cognitive limitations.</p></li><li><p>The project is open to the community: source code on <a href="https://github.com/SakanaAI/AI-Scientist-v2">GitHub</a> allows for experimentation, refinement, and integration of new models.</p></li></ul><p>Obviously, such an approach could accelerate scientific progress and make research more transparent and accessible, but it also raises new questions about the role of humans, trust in results, and the ethics of autonomous AI in science.</p><h2><strong>What's Behind the Technology</strong></h2><p>After diving deep into The AI Scientist-v2 system from <a 
href="https://pub.sakana.ai/ai-scientist-v2/paper/paper.pdf?utm_source=chatgpt.com">Sakana AI</a>, I was impressed by its architectural sophistication. This isn't just "GPT writes papers" - it's a comprehensive research pipeline with several revolutionary improvements:</p><p><strong>Agentic Tree Search.</strong> Unlike the linear approach of the previous version, v2 uses a tree-like structure for experiments. Each tree node represents a separate experiment with Python code, research plan, and results. The system explores multiple hypotheses in parallel, automatically debugs code when errors occur, and selects the best directions for further development.</p><p><strong>Complete Autonomy.</strong> The system no longer depends on human-written code templates. AI independently generates all experimental code from scratch, based only on high-level research ideas. This dramatically expands the system's applicability.</p><p><strong>Vision-Language Model Integration.</strong> VLMs analyze generated graphs and visualizations, checking their correctness, caption clarity, and alignment with conclusions. This ensures quality scientific presentation of results.</p><p><strong>Four-Stage Process.</strong> Research follows structured phases: preliminary investigation &#8594; hyperparameter tuning &#8594; research agenda execution &#8594; ablation studies. At each stage, the system creates up to 21 parallel experiments.</p><h2><strong>Historic Achievement with Caveats</strong></h2><p>The result is impressive: one of three generated papers received an average score of 6.33 out of 10 from <a href="https://sites.google.com/view/icbinb-2025">ICLR workshop "I Can't Believe It's Not Better"</a> reviewers, exceeding the acceptance threshold. 
The paper investigated compositional regularization in neural networks and, interestingly, received positive feedback precisely for honestly presenting negative results.</p><p>However, it's important to understand the context: this was a workshop, not a main conference track. Acceptance rates at workshops typically range from 60-80% versus 20-30% at top-tier conferences. Only one of the three submitted papers was accepted.</p><p>Upon detailed analysis of the accepted paper, the system's authors identified significant flaws: inaccuracies in figure captions, problems with training/test dataset overlap (57% overlap!), terminology confusion, and insufficient justification for methodological choices.</p><h2><strong>Technical Limitations and Ethical Questions</strong></h2><p>The system is still far from generating breakthrough hypotheses or deep domain understanding. Experiments are limited to relatively simple ML tasks, and the quality of work doesn't yet reach top-conference standards.</p><p>The ethical aspect is particularly important. The Sakana AI team obtained ethics committee approval, warned reviewers about possible AI-generated submissions, and subsequently withdrew the accepted paper to avoid setting a precedent without public discussion.</p><p>This raises fundamental questions: Should AI papers be labeled? How should they be evaluated alongside human work?</p><h2><strong>Development Trajectory</strong></h2><p>Despite limitations, the trajectory is impressive. In two years, we've moved from proof-of-concepts to systems capable of passing peer review. The rapid development of AI tools suggests that within a few years, we might see conference-level AI researchers.</p><p>The potential is enormous: AI can work 24/7, isn't subject to cognitive biases, can test multiple hypotheses in parallel, and process gigantic volumes of literature. 
In fields with large datasets and clearly defined metrics - from bioinformatics to materials science - such systems could significantly accelerate discoveries.</p><p>But the main question remains open: Can AI generate truly revolutionary ideas, or only optimize known approaches? So far, creativity and conceptual breakthroughs remain human prerogatives.</p><h2><strong>What This Means for Science</strong></h2><p>We stand on the threshold of a fundamental change in the scientific process. AI researchers could become powerful tools for accelerating routine aspects of science: literature reviews, experiment replication, systematic testing of variations.</p><p>But this also requires rethinking the role of human researchers. Perhaps the future lies with hybrid teams, where AI performs large-scale computational work while humans focus on posing deep questions, interpreting results in broad context, and determining research directions.</p><p>The AI Scientist-v2 system isn't the end of the road, but an important milestone toward a new paradigm of scientific research. How we integrate these tools will determine the future of science for decades to come.</p><h2><strong>The Reality Check</strong></h2><p>Let me be clear about what we're actually seeing here. When I examined the accepted paper's code and methodology, several concerning issues emerged:</p><ul><li><p><strong>Dataset contamination</strong>: 57% overlap between training and test sets, fundamentally compromising the reliability of results</p></li><li><p><strong>Methodological confusion</strong>: The paper confused "embedding states" with "hidden states," indicating imprecise understanding</p></li><li><p><strong>Overclaimed results</strong>: The system reported 100% accuracy that was mainly due to task simplicity, not algorithmic breakthrough</p></li></ul><p>This isn't to diminish the achievement - it's to calibrate our expectations. 
We're witnessing the first steps of AI scientific reasoning, not its maturation.</p><h2><strong>The Bigger Picture</strong></h2><p>What excites me most isn't the current capabilities, but the acceleration curve. The system went from template-dependent linear exploration to autonomous tree-based research in just one iteration. The improvements are architectural, not just computational - suggesting we're on a steep learning curve.</p><p>We have the opportunity to create tools that enhance human curiosity rather than replace it. We can accelerate progress while preserving the essential human elements that make science meaningful.</p><p>This is our moment to shape the future of knowledge itself. Let's make it count.</p><p>The paper and additional materials are available in <a href="https://github.com/SakanaAI/AI-Scientist-ICLR2025-Workshop-Experiment">Sakana AI's GitHub</a> repository.</p>]]></content:encoded></item><item><title><![CDATA[When AI Tutors Beat Human Teachers: The Nigerian Experiment That Changes Everything]]></title><description><![CDATA[A World Bank study in Nigeria just delivered results that sound like science fiction.]]></description><link>https://www.bulaev.net/p/when-ai-tutors-beat-human-teachers</link><guid isPermaLink="false">https://www.bulaev.net/p/when-ai-tutors-beat-human-teachers</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Tue, 02 Sep 2025 21:50:41 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/6c91b3d1-3dcf-4a89-8b26-79a97c4221c5_1284x850.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://documents.worldbank.org/en/publication/documents-reports/documentdetail/099548105192529324">A World Bank study</a> in Nigeria just delivered results that sound like science fiction. 422 schoolchildren worked with Microsoft Copilot (powered by GPT-4) for <strong>90 minutes a day over 6 weeks</strong>. The outcome? 
<strong>Progress equivalent to two full years of regular schooling.</strong></p><p>But here's where it gets really interesting: this miracle happened in one of the world's poorest educational systems, while wealthy countries are seeing the opposite results.</p><h2><strong>The Paradox Nobody Saw Coming</strong></h2><p>In <strong>Turkey and the Netherlands</strong>, carefully controlled AI experiments ended in failure. Students became so dependent on LLMs that without them, they performed <strong>worse than their peers</strong>. The very countries with the best schools, the most resources, the highest teacher salaries - AI made their students worse learners.</p><p>Meanwhile, in Nigerian computer labs where many students had never touched a computer before, AI was producing learning gains that education researchers only dream about.</p><p>This flips everything we think we know about technology and inequality on its head.</p><h2><strong>The Brutal Numbers Behind the Success</strong></h2><p>To understand why the Nigerian results matter so much, you need to see the educational disaster these kids are living through:</p><ul><li><p><strong>70% of ten-year-olds in developing countries can't read simple text</strong></p></li><li><p><strong>In Africa, that number hits 90%</strong></p></li><li><p><strong>Nigerian kids get 10 years of schooling but learn what should take 5 years</strong></p></li></ul><p>This isn't about "falling behind" - this is educational system collapse. Traditional interventions barely move the needle. The World Bank has tried everything: better textbooks, teacher training, smaller class sizes, nutrition programs. Most show modest improvements at best.</p><p>Then along comes an AI chatbot and suddenly students are learning at 4x normal speed.</p><h2><strong>The $48 Miracle</strong></h2><p>The program cost <strong>$48 per student for 6 weeks</strong>. In Nigeria, where minimum wage is around $43 per month, that's serious money. 
But when researchers compared it against 230 other educational interventions, this AI program <strong>outperformed 80% of them</strong>.</p><p>Think about what $48 bought these students: a personal tutor available 90 minutes every day, infinitely patient, able to explain concepts a thousand different ways until each student got it. Benjamin Bloom proved in 1984 that one-on-one tutoring could achieve exactly these kinds of gains. The problem was always cost - until now.</p><h2><strong>What Actually Happened in Those Computer Labs</strong></h2><p><strong>Teachers became conductors, not replacements.</strong> They started each session with suggested prompts, guided student interactions, and led reflection exercises. As one teacher put it: AI is "like an assistant teacher. We supervise what the students are doing."</p><p><strong>Students found their own ways to use it.</strong> Teachers noted that kids quickly discovered "unique and productive ways to interact with the LLMs." They weren't passive consumers - they became active learners figuring out how to get what they needed from the AI.</p><p><strong>Infrastructure nearly killed it.</strong> Power outages and internet failures constantly disrupted sessions. Nigeria's rainy season was particularly brutal. Success required backup generators and redundant internet connections.</p><p><strong>Prompt engineering was everything.</strong> The program developed specific toolkits showing students how to ask better questions. 
This made the AI responses "much more useful" with examples relevant to local context.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hryr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F877ed39a-2c45-46d0-bc47-59125048edb5_1600x991.png"><img src="https://substackcdn.com/image/fetch/$s_!hryr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F877ed39a-2c45-46d0-bc47-59125048edb5_1600x991.png" width="724" height="449" class="sizing-normal" alt="" loading="lazy"></a></figure></div><h2><strong>The Questions That Should Terrify Us</strong></h2><p>The results are incredible, but the study has gaps that matter:</p><p><strong>They couldn't isolate the AI effect.</strong> Researchers admit they couldn't separate gains from AI versus gains from extra attention, computer access, or additional tutoring some students might have received.</p><p><strong>Six weeks proves nothing long-term.</strong> Will these students retain their advantage in six months? Two years? Or will the gains fade like a sugar rush?</p><p><strong>English only.</strong> The study focused purely on English language skills. Math? Science? Critical thinking? We have no idea if AI tutoring transfers to other subjects.</p><p><strong>What about dependency?</strong> The wealthy countries' results suggest AI can create learned helplessness. 
Are Nigerian students building genuine skills or just getting really good at prompting AI?</p><h2><strong>Why This Changes Everything (Maybe)</strong></h2><p>The same AI that creates dependency in Dutch classrooms unlocks potential in Nigerian ones. Why?</p><p><strong>Scarcity changes behavior.</strong> When you've never had access to quality instruction, AI feels like a miracle. When you're used to excellent teachers, AI feels like a downgrade.</p><p><strong>Baseline matters.</strong> If your education system is broken, AI can only improve things. If your system already works well, AI might disrupt what's working.</p><p><strong>Expectations shape outcomes.</strong> Nigerian students saw AI as an opportunity to learn. Dutch students saw it as a way to avoid learning.</p><h2><strong>The Student Who Gets It</strong></h2><p>Maybe the most insightful comment came from a student at Edo Boys High School: "AI helps us to learn, it can serve as a tutor, it can be anything you want it to be, depending on the prompt you write."</p><p><strong>Depending on the prompt you write.</strong></p><p>That seventeen-year-old just articulated something most adults miss: AI is a mirror. It reflects back the quality of thinking you bring to it. Bring curiosity and effort, learn. Bring laziness and shortcuts, get dependency.</p><h2><strong>What Happens Next?</strong></h2><p>Scaling the Nigerian experiment faces massive challenges:</p><p><strong>Infrastructure is expensive.</strong> Every classroom needs reliable power and internet. 
That's billions in investment across developing countries.</p><p><strong>Teacher training is complex.</strong> Moving from chalk-and-talk to AI orchestration requires completely different skills.</p><p><strong>Cultural resistance is real.</strong> Many communities see computers and AI as threats to traditional learning.</p><p><strong>Measurement is hard.</strong> How do you separate AI impact from novelty effect, Hawthorne effect, and selection bias?</p><p>But even if half the Nigerian gains are real and sustainable, we're looking at the potential to solve the global learning crisis within a decade. Imagine: every child on Earth with access to world-class personalized tutoring for less than $50.</p><h2><strong>The Uncomfortable Truth</strong></h2><p>The most unsettling part of this story isn't the technology - it's what it reveals about global inequality.</p><p>Rich countries have such good educational systems that AI makes them worse. Poor countries have such bad educational systems that AI makes them dramatically better. 
The same tool simultaneously increases and decreases human potential, depending on context.</p><p>We're about to find out whether AI will be the great equalizer or the great divider.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ezAR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ezAR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 424w, https://substackcdn.com/image/fetch/$s_!ezAR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 848w, https://substackcdn.com/image/fetch/$s_!ezAR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 1272w, https://substackcdn.com/image/fetch/$s_!ezAR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ezAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png" width="1280" height="951" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:951,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ezAR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 424w, https://substackcdn.com/image/fetch/$s_!ezAR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 848w, https://substackcdn.com/image/fetch/$s_!ezAR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 1272w, https://substackcdn.com/image/fetch/$s_!ezAR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b7cf276-645a-4836-b94d-e06e11d81632_1280x951.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div>]]></content:encoded></item><item><title><![CDATA[The Computational Monopoly: Who Rules the AI World?]]></title><description><![CDATA[The New York Times published an article about how AI computational power is distributed globally.]]></description><link>https://www.bulaev.net/p/the-computational-monopoly-who-rules</link><guid isPermaLink="false">https://www.bulaev.net/p/the-computational-monopoly-who-rules</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Fri, 29 Aug 2025 22:55:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!74T2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The New York Times <a href="https://www.nytimes.com/interactive/2025/06/23/technology/ai-computing-global-divide.html">published</a> an article about how AI computational power is distributed globally.</p><p><strong>The real power 
landscape:</strong></p><ul><li><p>USA (Microsoft, AWS, Google) - 63% of all capacity</p></li><li><p>China (Alibaba, Huawei, Tencent) - 28%</p></li><li><p>Europe - a pitiful 4%</p></li><li><p>The rest of the world - 5%</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!74T2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!74T2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 424w, https://substackcdn.com/image/fetch/$s_!74T2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 848w, https://substackcdn.com/image/fetch/$s_!74T2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!74T2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!74T2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg" width="1280" height="881" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:881,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:98823,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.bulaev.net/i/172304708?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!74T2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 424w, https://substackcdn.com/image/fetch/$s_!74T2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 848w, https://substackcdn.com/image/fetch/$s_!74T2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!74T2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff13e6b6-695a-45b8-a3a5-54302db4f352_1280x881.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p></li></ul><p>Digital feudalism. Only 32 countries have specialized AI data centers, while the rest are forced to rent from tech lords.</p><p>The most cynical part: even US political allies like Kenya don't get preferential GPU access. Nvidia is controlled geopolitically more strictly than oil.</p><p>Of course there are attempts to build "sovereign AI infrastructures" - Brazil, India, and the EU are investing billions. Africa is launching its first major center with Nvidia chips, but it only covers 10-20% of demand.</p><p>It seems those who didn't manage to build computational infrastructure today will remain technological vassals for decades.</p><p>Computing has become the new gold. 
Only the deposits are controlled by very few.</p><h2><strong>The Energy War Behind the Silicon Curtain</strong></h2><p>What's especially striking about this race isn't just chip distribution, but the energy apocalypse unfolding behind the scenes. <a href="https://www.goldmansachs.com/insights/articles/ai-to-drive-165-increase-in-data-center-power-demand-by-2030">Goldman Sachs predicts</a> data center energy consumption will grow 165% by 2030. <a href="https://www.rand.org/pubs/research_reports/RRA3572-1.html">RAND Corporation calculated</a> that AI centers will need an additional 10 gigawatts in 2025 alone - more than the entire state of Utah consumes.</p><p>By 2027, that figure is projected to reach 68 gigawatts - nearly a sevenfold jump in two years.</p><p>Imagine the scale. We're not just talking about technological supremacy - we're talking about nations' physical ability to provide enough electricity to power the digital future. Countries without energy infrastructure are automatically eliminated from the game.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Qvz4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Qvz4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Qvz4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!Qvz4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Qvz4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Qvz4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg" width="1280" height="868" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:868,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:67961,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.bulaev.net/i/172304708?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Qvz4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Qvz4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Qvz4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Qvz4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2b18edf-efdd-4fac-a4c5-4476f0fc5d28_1280x868.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2><strong>Nvidia: The New OPEC of the Digital Era</strong></h2><p>Nvidia is no longer just a chip manufacturer - it has become the regulator of the global AI market. With a 65% market share in data center AI chips in 2023, the company essentially dictates the rules to everyone else.</p><p>But what's most interesting is how American export restrictions turned Nvidia's sales into a geopolitical weapon. In July 2025, <a href="https://www.cnbc.com/2025/07/15/nvidia-says-us-government-will-allow-it-to-resume-h20-ai-chip-sales-to-china.html">Nvidia announced</a> it would resume sales of its H20 chips to China, but only after "assurances from the US government." <a href="https://www.euronews.com/business/2025/07/15/nvidia-to-sell-h20-chips-to-china-again-after-us-gives-export-approval">Sales had been frozen</a> since April - despite China bringing the company $12-15 billion annually.</p><p>Imagine: the world's largest economy waits for Washington's permission to buy processors. And gets watered-down versions with limited performance. <strong>This isn't trade - it's a technological blockade.</strong></p><p>According to CEO Jensen Huang, revenues from China fell by half compared to pre-crisis levels. But the Chinese aren't sitting idle: CSIS data shows large-scale H100 chip smuggling schemes were operational by 2024 - eight distinct networks whose participants gave interviews to The Information's journalists.</p><p><strong>Black-market processors - welcome to the cyberpunk reality of 2025.</strong></p><h2><strong>European Capitulation</strong></h2><p>Four percent. Europe - cradle of the industrial revolution, home to SAP, ASML, and Siemens - controls a pathetic 4% of global AI capacity.</p><p>This isn't just a statistic; it's a death sentence for European technological sovereignty. 
While Brussels debated GDPR and AI ethics, Americans and Chinese built computational empires.</p><p>The EU is now investing hundreds of billions trying to catch up, but the moment has passed. <a href="https://www.goldmansachs.com/insights/goldman-sachs-research/generational-growth-ai-data-centers-and-the-coming-us-power-demand-surge">Goldman Sachs forecasts</a> nearly &#8364;850 billion in renewable energy investments over the next decade just to power future data centers.</p><p>But infrastructure takes years to build, while tech cycles are measured in months. By the time European AI centers are running at full capacity, the leaders will be two chip generations ahead.</p><h2><strong>Digital Colonies</strong></h2><p>The rest of the world - 5%. Africa, Latin America, most of Asia - all together control less computing power than the state of California.</p><p>Kenya's example is particularly telling. The country is considered a US ally, actively develops its digital economy, and has one of Africa's most dynamic IT sectors. But access to advanced GPUs? Sorry, same queue as everyone else. No perks. No discounts. Pay full price or rent capacity from tech giants.</p><p>This creates a vicious cycle: without their own computing power, countries can't develop their own AI models; without their own models, they remain dependent on Western and Chinese platforms; without technological independence, they can't develop high-tech industries.</p><p>Africa's first major AI center with Nvidia chips is an important step, but it covers at most 20% of continental demand. And demand is growing exponentially.</p><h2><strong>Sovereign Ambitions vs Economic Reality</strong></h2><p>Everyone talks about "sovereign clouds" and "national AI strategies." Brazil invests billions in its own infrastructure. India launches ambitious programs. 
Even relatively small countries try to create at least minimal computing capacity.</p><p>But reality is harsh: modern AI data centers cost billions of dollars, and technologies become obsolete so quickly that investments can depreciate within a couple of years.</p><p>PwC warns that supply and demand for AI computing will not reach equilibrium in 2025. The deficit will only grow, which means rental prices for computing capacity will skyrocket.</p><h2><strong>The New Cold War</strong></h2><p>Essentially, we're witnessing the formation of a new bipolar world - an American-Chinese duopoly in AI computing. 91% of all capacity is controlled by two powers that increasingly view each other as strategic adversaries.</p><p>Everyone else - including traditional US allies - is left in the role of supplicant. Access to cutting-edge technology is now determined not by economic considerations, but by geopolitical loyalty.</p><p>And most troubling: unlike oil or gas, computing power can't simply be bought on the spot market. It's a strategic resource requiring long-term investments, specialized infrastructure, and most importantly, access to advanced chips.</p><p>And chips are controlled by one player. Who plays by Washington's rules.</p><h2><strong>What's Next?</strong></h2><p>The window of opportunity is rapidly closing. Every month of delay in building computational infrastructure means years of technological lag in the future.</p><p>Countries that don't manage to build their own AI capacity in the next 2-3 years risk becoming digital colonies for decades to come. Because the next generation of AI will require even more computing power, even more energy, even more specialized infrastructure.</p><p>Computing really has become the new gold. But the deposits are already divided. 
And no new ones are in sight.</p>]]></content:encoded></item><item><title><![CDATA[The “Electric Fence” Stopped Working Years Ago: Why Content Creation Helps Us Walk Through Outdated Barriers]]></title><description><![CDATA[Sometimes the fences we tiptoe around are already broken.]]></description><link>https://www.bulaev.net/p/the-electric-fence-stopped-working</link><guid isPermaLink="false">https://www.bulaev.net/p/the-electric-fence-stopped-working</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Tue, 26 Aug 2025 22:38:21 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/aac2f523-6183-419f-995d-07f7bedd48b0_1280x1252.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Sometimes the fences we tiptoe around are already broken. The &#8220;electric fence&#8221; that once kept us in check stopped working years ago, yet many of us still stay behind it. Out of habit. Out of fear. Out of assumptions about how things &#8220;should&#8221; be done.</p><p>That&#8217;s the strange thing about social boundaries: they often don&#8217;t exist anymore in the way we think they do&#8230; but they still live in our heads. And one of the most effective ways I&#8217;ve found to move past them is through content creation.</p><h2><strong>The Myth of the Electric Fence</strong></h2><p>The metaphor of the electric fence has cropped up in psychology, therapy, even literature. 
It is often used to describe:</p><ul><li><p>Imaginary limitations rooted in old narratives of insecurity and rejection.</p></li><li><p>Emotional fences that originate in childhood or past pain, protecting us once but restricting us now.</p></li><li><p>Social hierarchies that feel permanent despite shifting norms and structures.</p></li><li><p>Outdated cultural &#8220;rules&#8221; about who can speak, who can lead, who can contribute.</p></li><li><p>Misunderstood boundaries that confuse healthy self-protection with fear-based avoidance.</p></li></ul><p>In therapy circles, people are encouraged to make boundaries visible and healthy (instead of hidden and dangerous like an invisible electric line) so that others can connect safely. In social research, outreach and friendliness are shown to be welcomed far more often than we expect. And in inclusion work, fences are highlighted as cultural constructs that need to be dismantled so communities can thrive.</p><h2><strong>Why Content Creation Matters</strong></h2><p>Every time you put out a post, an article, or a reflection, it&#8217;s more than content. It&#8217;s a declaration that the fence doesn&#8217;t define you anymore. You&#8217;re saying: I&#8217;m willing to be seen, I&#8217;m open to connection, I&#8217;m willing to risk the small shock that never actually comes.</p><p>When I&#8217;ve worked with leaders who began sharing consistently, I&#8217;ve noticed two things happen at once. The external walls begin to crumble (distance between them and their teams or clients reduces). And the internal walls weaken too (their own narrative of &#8220;I&#8217;m not ready&#8221; starts losing power).</p><p>Take a friend of mine, Max Votek, as an example. He wasn&#8217;t trying to broadcast polished leadership lessons. He started with unpolished reflections, stories from work and life, things that felt real in the moment. That consistent show-up created conversations, collaborations, and momentum. 
Exactly the kind of &#8220;crossing&#8221; that dismantles imaginary fences.</p><h2><strong>More Than Personal Brand</strong></h2><p>At Co.Actor we focus less on the old language of &#8220;personal brand&#8221; and more on honesty and resonance. Our view is simple: authenticity beats polish, direct engagement beats formality. That&#8217;s why we&#8217;ve built Co.Actor scale technology - not to generate sterile content, but to keep each person&#8217;s actual voice consistent across what they publish. Because when someone hears you in your words, they connect with the human, not the persona.</p><h2><strong>A Few Key Shifts to Remember</strong></h2><ul><li><p>Sharing real perspectives opens space for others to step through their own fences.</p></li><li><p>Attitudes spread fast. One direct, vulnerable post can normalize openness across a circle of people.</p></li><li><p>Most fences aren&#8217;t live anymore. The shock we fear is imagined. The rulebook we think still applies often expired with the last era.</p></li><li><p>Respect is critical. Don&#8217;t confuse tearing down imaginary walls with ignoring healthy boundaries. Some fences exist for good reason.</p></li><li><p>Outreach works. Research shows that friendliness and initiative are usually welcomed. The imagined rejection in our heads shows up far less in actual practice.</p></li></ul><h2><strong>Why This Is Urgent</strong></h2><p>Our current climate is isolating. AI tools grow, work gets distributed, relationships shift online. Yet the craving for genuine connection is huge. The people who win in this environment aren&#8217;t the loudest self-promoters. They&#8217;re the connectors. Unafraid. The ones who realize the shock line was cut long ago.</p><h2><strong>The Challenge I&#8217;ll Leave You With</strong></h2><p>If you&#8217;ve been holding back because you don&#8217;t feel expert enough or polished enough or credentialed enough, stop waiting for permission. 
Write the next post that feels true, not the one that feels &#8220;safe.&#8221; Say hello to someone you admire without overthinking whether it&#8217;s allowed. Notice that the fear was usually the barrier, not the person on the other side.</p><p>The electric fence is off. The path ahead is open. Most of us are waiting for someone to step forward first.</p><p>So step.</p><p>If this resonated, subscribe for more reflections on leadership, content strategy, and the psychology of building genuine connections in a noisy digital world.</p>]]></content:encoded></item><item><title><![CDATA[Context Engineering with Anthropic: From Skiing Accidents to Reliable AI ]]></title><description><![CDATA[Recently I watched Anthropic's &#8220;Prompting 101&#8221; session, and it was one of the clearest illustrations of why large language models need more than clever wording.]]></description><link>https://www.bulaev.net/p/context-engineering-with-anthropic</link><guid isPermaLink="false">https://www.bulaev.net/p/context-engineering-with-anthropic</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Mon, 25 Aug 2025 22:49:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/dd1af960-1dc6-4967-89ee-0c8343ee2c36_1200x623.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Recently I watched <a href="https://www.youtube.com/watch?v=ysPbXH0LpIE&amp;ab_channel=Anthropic">Anthropic's &#8220;Prompting 101&#8221; session</a>, and it was one of the clearest illustrations of why large language models need more than clever wording. They need disciplined context engineering.</p><p>The team showed a scenario based on a real customer: a Swedish insurance company that wants to automate accident claim reviews. The system gets two inputs: a filled-out Swedish car accident report form (with 17 standardized checkboxes) and a rough, hand-drawn sketch of the collision. 
The model&#8217;s task is to determine what happened and who might be at fault.</p><p>At first, the results were almost comical. With a simple prompt, Claude confidently concluded that the case involved a skiing accident on a Swedish street. On some level it made sense: incomplete context pushed the model toward a plausible, but wildly incorrect, story.</p><p>The demo became an unfolding lesson in how to tighten prompts until they work reliably. A few of the strongest principles stood out:</p><ul><li><p>Define the task context upfront. Instruct the model specifically: "You are assisting a human claims adjuster reviewing car accident forms in Swedish." This small adjustment shifts reasoning away from distractions like skiing.</p></li><li><p>Set tone and confidence rules. The model must remain factual. If a checkbox is unclear or the sketch illegible, it should say so. Better to admit uncertainty than generate fiction.</p></li><li><p>Provide invariants. The car accident form structure never changes. By embedding the form schema into the system prompt, Claude doesn't waste time guessing how to read it each time. This is ideal for prompt caching.</p></li><li><p>Use structure and delimiters. Wrapping different inputs inside XML tags gives Claude clearer reference points. For example: &lt;formdata&gt; or &lt;sketchnotes&gt; lets the model reliably separate evidence types.</p></li><li><p>Work through sequence. The order mattered. Claude was told to carefully review the form first, then analyze the sketch second. Much like a human claims adjuster, it needed stable data before decoding a messy drawing.</p></li><li><p>Reinforce guidelines at the end. Explicit reminders helped: do not invent details, refer back to the boxes when making factual claims, answer only with confidence.</p></li><li><p>Shape the output. 
Wrapping the final result in &lt;final_verdict&gt; tags made the output concise and machine-parseable, ready to drop into a claims database.</p></li></ul><p>As the iterations continued, the difference was staggering. First: skiing. Second: some recognition of vehicles, but still gaps. Third version: Claude matched boxes to vehicle behaviors, compared them against the sketch, and concluded that Vehicle B was likely at fault. It cited evidence, stayed within confidence limits, and packaged the conclusion in structured XML.</p><p>That transformation is the essence of context engineering. Not magic words. Not prompt "hacks." Just systematic practice: define roles, add background knowledge, enforce structure, iterate.</p><p>Why does this matter? Because in the real world, whether you&#8217;re parsing accident reports, processing medical forms, or reviewing contracts, the cost of guessing is higher than the cost of saying "I don&#8217;t know." Enterprises will only trust AI if outputs are both accurate and explainable.</p><p>This video convinced me of one thing: the path from toy prompts to production AI runs through context. 
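</p><p>The sequencing and delimiter principles above can be sketched as a small prompt builder. This is an illustrative sketch only: the &lt;formdata&gt;, &lt;sketchnotes&gt;, and &lt;final_verdict&gt; tags follow the session's examples, while the function name, schema wording, and sample inputs are my own assumptions, not Anthropic's code.</p>

```python
def build_claim_prompt(form_data: str, sketch_notes: str) -> dict:
    """Assemble a context-engineered prompt for accident-claim review."""
    # Task context up front: role and domain, so the model doesn't
    # wander off into skiing accidents.
    system = (
        "You are assisting a human claims adjuster reviewing car "
        "accident report forms in Swedish.\n"
        # Invariant: the 17-checkbox form schema never changes, so it
        # belongs in the system prompt (and suits prompt caching).
        "The report form has 17 standardized checkboxes describing "
        "vehicle behavior.\n"
        # Tone and confidence rules: admit uncertainty, don't invent.
        "Be factual. If a checkbox or the sketch is unclear, say so "
        "instead of guessing."
    )
    # Sequence matters: stable form data first, messy sketch second,
    # each wrapped in XML delimiters for clear reference points.
    user = (
        f"<formdata>\n{form_data}\n</formdata>\n"
        f"<sketchnotes>\n{sketch_notes}\n</sketchnotes>\n"
        # Reinforce guidelines at the end of the prompt.
        "Review the form first, then the sketch. Do not invent "
        "details; cite checkbox numbers for factual claims. "
        "Wrap your conclusion in <final_verdict> tags."
    )
    return {"system": system, "user": user}

# Hypothetical sample inputs, for illustration only.
prompt = build_claim_prompt(
    form_data="Box 12 checked for Vehicle B (reversing)",
    sketch_notes="Vehicle B backing out of a parking space into Vehicle A",
)
```

<p>In production these two strings would become the system and user messages of a chat call; the point is the ordering and the delimiters, not any particular API.</p><p>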
<strong>The more disciplined the structure, the more grounded the model becomes.</strong> What began as a skiing accident ended as a working prototype of an insurance claims system, with Claude acting not as a storyteller but as an assistant claims adjuster.</p><p>That&#8217;s the journey - from hallucinations to reliability - and it all comes down to how you engineer context.</p>]]></content:encoded></item><item><title><![CDATA[The Evolution from Prompt Engineering to Context Engineering]]></title><description><![CDATA[How the art of working with AI has shifted from crafting perfect prompts to mastering context orchestration]]></description><link>https://www.bulaev.net/p/the-evolution-from-prompt-engineering</link><guid isPermaLink="false">https://www.bulaev.net/p/the-evolution-from-prompt-engineering</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Fri, 22 Aug 2025 11:56:21 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0a064212-5f5a-4712-b6fb-8d5120b9651d_1400x788.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We're witnessing a fundamental shift in how we interact with AI systems. Prompt engineering - the practice of carefully crafting instructions to get better responses from language models - has quietly evolved into something more sophisticated: context engineering.</p><p>This isn't just a semantic change. It represents a deeper understanding of how modern AI systems actually work and what they need to perform at their best.</p><h2><strong>The Death of Perfect Prompts</strong></h2><p>Two years ago, the AI community was obsessed with finding the perfect prompt. We spent hours crafting elaborate instructions, testing different phrasings, and sharing "magic words" that seemed to unlock better performance.</p><p>That era is largely over.</p><p>Today's models - GPT-4, Claude, Gemini - are sophisticated enough that they don't need to be coaxed with perfect phrasing. 
They understand intent remarkably well, even from casual requests. The bottleneck has shifted from <em><strong>how</strong></em> you ask to <em><strong>what information</strong></em> you provide.</p><p>This is the core insight: modern AI systems are limited not by their ability to understand instructions, but by their access to relevant information.</p><h2><strong>What Context Engineering Really Means</strong></h2><p>Context engineering is the art and science of filling the model's context window with exactly the information it needs for the task at hand. Think of it like preparing for a complex meeting - you wouldn't just show up and hope for the best.</p><p>The science involves systematic approaches:</p><ul><li><p><strong>Task descriptions and explanations</strong> that go beyond simple instructions to include goals, constraints, and success criteria</p></li><li><p><strong>Few-shot examples</strong> chosen for quality over quantity, illustrating the exact pattern you want</p></li><li><p><strong>Smart RAG</strong> that goes beyond keyword matching to sophisticated knowledge retrieval</p></li><li><p><strong>Multimodal data</strong> that includes images, audio, and structured data alongside text</p></li><li><p><strong>Tool integration and state management</strong> for complex workflows</p></li><li><p><strong>Information compression</strong> that distills vast amounts of data into relevant insights</p></li></ul><p>The art involves developing intuition for how language models "think" - understanding their biases, behavioral patterns, and what makes them perform well or poorly.</p><h2><strong>The Context Window Optimization Problem</strong></h2><p>The central challenge is optimization. Context windows have limits, and using them poorly creates three distinct problems:</p><p><strong>Too little context</strong> leaves the model without needed information. 
It makes assumptions or provides generic responses.</p><p><strong>Too much context</strong> increases costs and can actually hurt performance through "context dilution" - when there's so much information the model struggles to identify what's relevant.</p><p><strong>Wrong context</strong> might be extensive and accurate but irrelevant to the task, causing the model to miss the mark entirely.</p><h2><strong>Dynamic Systems Replace Static Templates</strong></h2><p>Modern context engineering uses adaptive systems rather than static prompt templates. These systems modify prompts based on context, user history, and task requirements.</p><p>A sophisticated system might start with a base template, then dynamically inject relevant examples, modify instructions based on user preferences, add tool descriptions based on available capabilities, and adjust detail levels based on request complexity.</p><p>This recognizes that the optimal prompt varies dramatically by context. Data analysis needs different instructions than creative writing. Novice users need different guidance than experts.</p><h2><strong>Smart RAG: Beyond Vector Search</strong></h2><p>Retrieval-Augmented Generation has evolved far beyond simple vector similarity. Modern RAG systems consider temporal relevance (newer might be better), source authority (some sources are more trustworthy), information completeness (partial info might be worse than none), and context coherence (information should work together).</p><p>The best systems understand that factual questions need different retrieval strategies than creative requests. Technical problems require different information than strategic decisions.</p><h2><strong>Memory Architecture: Short and Long-term</strong></h2><p>Context engineering requires sophisticated memory management. 
Short-term memory involves managing conversation history - deciding what parts of dialogue matter and how to compress longer conversations.</p><p>Long-term memory creates persistent knowledge structures: user preference profiles, knowledge graphs capturing concept relationships, searchable indexes of past interactions, and specialized databases for different information types.</p><p>The challenge is making information accessible when needed without overwhelming the model with irrelevant historical details.</p><h2><strong>The Psychology of Language Models</strong></h2><p>Language models are pattern-matching systems trained on human text. They excel when they can recognize patterns in your context and apply them to generate responses. They struggle with ambiguous, contradictory, or patternless context.</p><p>Understanding their quirks helps structure context effectively. Models are influenced by information order - recent information often carries more weight. They're sensitive to formatting in non-intuitive ways. They may fixate on irrelevant but prominent details.</p><p>This knowledge lets you place important information where models pay attention, use consistent formatting to signal importance, and remove distractions that lead models astray.</p><h2><strong>Multimodal Context Orchestration</strong></h2><p>As AI becomes multimodal, context engineering must consider how different data types work together. An image might provide visual context that's hard to describe. Audio captures tonal nuances text cannot convey. Structured data provides precise information natural language can't efficiently encode.</p><p>The art lies in understanding how these modalities complement each other and presenting them to help rather than confuse the model.</p><h2><strong>Economic and Performance Implications</strong></h2><p>Context engineering has real economic impacts. Token costs matter for high-volume applications. 
More importantly, larger contexts increase latency, affecting user experience.</p><p>This creates optimization problems. Sometimes multiple smaller requests beat one large request. Other times, the overhead of multiple requests makes comprehensive context more efficient. The optimal approach depends on use case, cost constraints, and performance requirements.</p><h2><strong>Practical Implications</strong></h2><p>For practitioners, this shift means:</p><p><strong>Focus on information architecture</strong> rather than prompt perfection. Understand what information your AI system needs and organize it effectively.</p><p><strong>Invest in intelligent retrieval systems</strong> that surface relevant information. Simple keyword search isn't sufficient for complex applications.</p><p><strong>Think systematically about memory management.</strong> Consider both what to preserve and what to forget as interactions grow complex.</p><p><strong>Develop model behavior intuition</strong> through experimentation. Understanding how models respond to different context types makes you more effective.</p><p><strong>Consider the full pipeline</strong> from information gathering through response generation. 
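As a toy sketch of such a pipeline - scoring retrieved snippets by recency and source authority, then packing the best ones into a token budget - the 50/50 weights and the chars-to-tokens heuristic below are illustrative assumptions, not tuned values:

```python
# Hypothetical context-assembly pipeline: score candidate snippets,
# then pack the highest-scoring ones into a fixed token budget.
from dataclasses import dataclass

@dataclass
class Snippet:
    text: str
    age_days: int     # temporal relevance: newer scores higher
    authority: float  # source trust in [0, 1]

def score(s: Snippet) -> float:
    """Blend recency and authority; weights are illustrative."""
    recency = 1.0 / (1.0 + s.age_days / 30.0)
    return 0.5 * recency + 0.5 * s.authority

def assemble_context(snippets: list[Snippet], token_budget: int) -> str:
    chosen, used = [], 0
    for s in sorted(snippets, key=score, reverse=True):
        cost = len(s.text) // 4 + 1  # rough chars-to-tokens estimate
        if used + cost > token_budget:
            continue  # skip snippets that don't fit the budget
        chosen.append(s.text)
        used += cost
    return "\n---\n".join(chosen)

docs = [
    Snippet("Fresh release notes", age_days=2, authority=0.9),
    Snippet("Old forum thread", age_days=400, authority=0.3),
]
context = assemble_context(docs, token_budget=50)
```

A real system would add the retrieval step itself, deduplication, and coherence checks, but the budget-and-score loop is the core trade-off.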
Context engineering encompasses the entire information flow.</p><p>What to read:</p><p><a href="https://www.promptingguide.ai/">Prompting Guide</a> - a comprehensive guide to prompting techniques</p><p><a href="https://blog.langchain.com/the-rise-of-context-engineering/">The rise of "context engineering"</a> - an overview article from LangChain</p><p><a href="https://github.com/humanlayer/12-factor-agents">12-Factor Agents</a> - principles for building AI agents</p><p><a href="https://simple.ai/p/the-skill-thats-replacing-prompt-engineering?">Context Engineering</a> - going beyond prompts to push AI</p>]]></content:encoded></item><item><title><![CDATA[Inside OpenAI: A Developer's Perspective on the World's Most Watched AI Company]]></title><description><![CDATA[A developer who recently left OpenAI after a year shares candid insights into the culture, pace, and inner workings of the company at the center of the AI revolution.]]></description><link>https://www.bulaev.net/p/inside-openai-a-developers-perspective</link><guid isPermaLink="false">https://www.bulaev.net/p/inside-openai-a-developers-perspective</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Thu, 21 Aug 2025 12:11:17 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e6651dcf-f368-4217-b77c-4d42c7c163d9_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>A developer who recently left OpenAI after a year shares candid insights into the culture, pace, and inner workings of the company at the center of the AI revolution.</em></p><p>Recently, a developer who spent a year inside OpenAI <a href="https://calv.info/openai-reflections">shared some fascinating details</a> about life inside what might be the most scrutinized company in the world. 
Having worked there during one of its most explosive growth periods, their observations offer a rare glimpse behind the curtain of the organization racing to build AGI.</p><h2><strong>Hypergrowth and the Slack-First Culture</strong></h2><p>The numbers are staggering: OpenAI tripled in size during this developer's tenure, growing from 1,000 to 3,000 employees in just twelve months. To put that in perspective, they became a top-30% tenured employee simply by surviving a single year at the company.</p><p>This kind of hypergrowth breaks everything. Traditional corporate communication structures, reporting hierarchies, product development processes - none of it scales when you're adding 2,000 people in a year. Most leadership teams are doing completely different jobs than they were two years ago, simply because the company underneath them has transformed entirely.</p><p>What's particularly striking is how OpenAI has adapted: they've gone all-in on Slack for everything. The developer received around 10 emails during their entire year there. Every conversation, every decision, every announcement happens in Slack channels. For someone coming from a traditional corporate environment, this is either liberating or completely overwhelming, depending on how well you curate your notifications.</p><p>This communications approach reflects something deeper about OpenAI's DNA. Unlike traditional companies with rigid hierarchical structures, OpenAI operates from the bottom-up. When the developer first arrived and asked about quarterly roadmaps, the answer was simple: "They don't exist." Good ideas can come from anyone, at any level, at any time. The challenge isn't getting approval for your ideas - it's proving they're worth pursuing.</p><h2><strong>The Three-Company Race to AGI</strong></h2><p>Perhaps the most sobering insight is how the developer frames the current AI landscape. In their view, the path to AGI has crystallized into a three-horse race: OpenAI, Anthropic, and Google. 
Each company is taking a fundamentally different approach based on their organizational DNA.</p><p>This isn't about incremental improvements or feature competition - this is about who will first create artificial general intelligence. The stakes couldn't be higher, and everyone inside these organizations knows it. The developer describes a culture where teams closely monitor what's happening at Meta, Google, and Anthropic, knowing that their competitors are doing exactly the same thing.</p><p>The pressure of this competition creates an interesting dynamic. On one hand, it drives incredible innovation and speed. On the other hand, it means operating under constant scrutiny from governments, media, and the global tech community. The developer regularly saw news about OpenAI in the press before it was announced internally - a surreal experience that underscores how much attention the company attracts.</p><h2><strong>Python Monorepos and Azure Reality</strong></h2><p>On the technical side, the insights reveal both the power and chaos of rapid scaling. OpenAI runs on a massive Python monorepo, with growing services in Rust. This creates a fascinating coding environment where you'll find both sophisticated libraries built by 10-year Google veterans sitting alongside hastily written Jupyter notebooks from newly-minted PhDs. There are no enforced style guides across the organization - a reflection of both the research culture and the speed at which they're moving.</p><p>The infrastructure story is equally telling. Everything runs on Azure, but only three services are considered truly reliable: Azure Kubernetes Service, CosmosDB, and BlobStore. Unlike AWS with its mature ecosystem of specialized services, Azure forces OpenAI to build more infrastructure in-house. It's a constraint that probably slows them down in some areas but gives them more control over their entire stack.</p><p>But here's the kicker: none of the infrastructure costs matter compared to GPU expenses. 
The developer shares a mind-bending data point: a single niche feature in the Codex product costs as much to run as an entire successful startup's infrastructure. When your primary constraint is access to the most powerful chips on Earth, everything else becomes a rounding error.</p><h2><strong>The Codex Sprint: Seven Weeks from Zero to Launch</strong></h2><p>The most detailed story comes from the Codex launch - a product built from scratch and shipped to millions of users in just seven weeks. This wasn't a small internal tool or limited beta; this was a full-featured coding assistant integrated into ChatGPT and made available to the world.</p><p>The pace was brutal. The team worked until 11-12 PM every night, got up at 5:30 AM, and worked weekends. For seven straight weeks. The night before launch, five team members stayed up until 4 AM deploying the system, then returned at 8 AM for the public launch announcement.</p><p>Within 53 days of launch, Codex had generated 630,000 public pull requests. That's roughly 78,000 public PRs per engineer on the team. The scale of impact is almost incomprehensible - most developers don't create that many meaningful code changes in their entire careers.</p><p>What made this possible wasn't just the technology, but the organizational structure. When the Codex team needed help from experienced ChatGPT engineers, they met with the engineering managers and had two senior developers ready to help the next day. No quarterly planning cycles, no bureaucratic approval processes - just immediate action when something important needed to get done.</p><h2><strong>The Twitter Influence Loop</strong></h2><p>One of the more unexpected insights is how much OpenAI pays attention to Twitter. The developer notes that if your tweet about OpenAI goes viral, there's a good chance someone at the company will read it and take it seriously.</p><p>This creates an interesting feedback loop. 
OpenAI is building products for hundreds of millions of users, many of whom express their opinions about AI on social media. The company's leadership stays tuned into these conversations, using them as signals alongside traditional analytics and user research.</p><p>It's also a reflection of how seriously OpenAI takes its public perception. Unlike most B2B companies that can operate relatively quietly, every OpenAI product launch becomes a global news event. Every feature update gets analyzed by researchers, journalists, and competitors around the world.</p><h2><strong>The Meritocracy of Ideas</strong></h2><p>The best ideas win, regardless of who proposes them. Leaders are promoted primarily based on their ability to generate good ideas and execute them, rather than traditional corporate skills like presentation abilities or political maneuvering.</p><p>The developer describes seeing 3-4 different Codex prototypes floating around before the team decided to push for an official launch. This redundancy might seem inefficient, but it ensures that the best approaches bubble up naturally rather than being decided by committee.</p><h2><strong>Security, Secrecy, and Stakes</strong></h2><p>The security culture at OpenAI is intense. The developer couldn't tell anyone what they were working on in detail. Different Slack workspaces have varying permission levels. Revenue and burn numbers are closely guarded secrets.</p><p>But despite the secrecy, the developer emphasizes that everyone they met was genuinely trying to do the right thing. OpenAI gets significant criticism in the press, partly because it's the most visible of the major AI labs. 
The company has maintained its commitment to making cutting-edge AI broadly accessible - anyone in the world can use ChatGPT, even without logging in, and most models quickly become available through the API for developers.</p><p>The safety work is more substantial than external critics might expect, though it focuses more on practical risks (hate speech, abuse, manipulation) than theoretical ones (intelligence explosion, power-seeking). The developer notes that much of this work isn't published, and suggests OpenAI should do more to communicate their safety efforts publicly.</p><h2><strong>What This Tells Us About the Future</strong></h2><p>These insights paint a picture of a company that's still operating more like a research lab than a traditional corporation, despite having 3,000 employees and hundreds of millions of users. The bottom-up culture, the bias toward action, the willingness to change direction quickly - these are the characteristics that allowed OpenAI to build and launch transformative products so quickly.</p><p>But they're also characteristics that become harder to maintain as organizations grow. The developer notes that many systems break under hypergrowth: communication, reporting structures, hiring processes. The question is whether OpenAI can maintain its innovative culture while building the operational capabilities needed to compete with tech giants like Google.</p><p>As we watch the race to AGI unfold, accounts like this provide crucial context for understanding not just what these companies are building, but how they're building it. 
The culture, the pace, the technical decisions, the human costs - all of these factors will shape what artificial general intelligence looks like when it finally arrives.</p><p>Whether OpenAI, Anthropic, or Google ultimately wins the race to AGI, one thing is clear: the pace of change is only accelerating, and the stakes have never been higher.</p>]]></content:encoded></item><item><title><![CDATA[Where Gender Bias Grows: How AI Reveals the Rigid Stereotypes Shaping Our Stories]]></title><description><![CDATA[A fascinating study out of Cornell shines a bright light on something we usually sense but rarely measure: how the roles assigned to male and female characters in literature have changed (or failed to) over the last century.]]></description><link>https://www.bulaev.net/p/where-gender-bias-grows-how-ai-reveals</link><guid isPermaLink="false">https://www.bulaev.net/p/where-gender-bias-grows-how-ai-reveals</guid><dc:creator><![CDATA[Serge Bulaev]]></dc:creator><pubDate>Fri, 15 Aug 2025 23:16:28 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b8d351c4-985b-4a42-9e14-8e359c40f32e_1908x1058.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><a href="https://news.cornell.edu/stories/2025/06/where-gender-bias-grows-coming-age-novels-rife-stereotypes#:~:text=Coming,genre%20that%20isn%E2%80%99t%20so%20positive">A fascinating study out of Cornell</a> shines a bright light on something we usually sense but rarely measure: how the roles assigned to male and female characters in literature have changed (or failed to) over the last century. 
Using word embedding models - a method well known to anyone in AI or computational linguistics - researchers analyzed 303 coming&#8209;of&#8209;age novels, stretching from early 20th-century classics to contemporary YA bestsellers.</p><p>Here is the headline finding:</p><p><strong>Female characters have broken out of old boxes, taking on a much wider variety of roles than they once did.</strong></p><p><strong>Male characters, despite all the cultural changes in the past hundred years, are still largely stuck in the same narrow band of traits.</strong> It's like the archetype was set in stone and we've been chiselling around it for decades without ever breaking it open.</p><p>Here is what stood out most from the research and supporting studies:</p><ul><li><p><strong>Female role diversity has expanded dramatically</strong> - Early in the century, women in these novels were firmly anchored in domestic life, romance, or emotional caretaker roles. Now you see them as leaders, adventurers, scientists, morally complex anti&#8209;heroes, and sometimes all of those in one character.</p></li><li><p><strong>Male stereotypes show near-total stability</strong> - Dominance, agency, stoicism, invulnerability&#8230; the same masculine profile appears in 1925, in the 1970s, and again in 2025. 
The language and plots change, the character core doesn't.</p></li><li><p><strong>Patterns repeat across media</strong> - A 2024 computational study of song lyrics found almost identical associations: <strong>women tied to emotion, relationships, and vulnerability; men tied to independence, power, and action.</strong> Different art forms but the clusters hardly budge.</p></li><li><p><strong>Persistence suggests deep cultural reinforcement</strong> - Both literature and music feed into the same cultural reservoir, reproducing these gender&#8209;coded associations generation after generation.</p></li></ul><p>Here's where it gets interesting for anyone thinking about AI: these same literary patterns become the training data for language models. Every novel analyzed in that Cornell study, every song lyric showing those persistent associations - they're all part of the vast datasets that teach AI systems how language works. The models absorb a century of storytelling conventions, then reproduce them when they generate new text.</p><ul><li><p><strong>Even bias detection carries bias</strong> - The Cornell team echoes earlier findings that the word lists and metrics we use to "detect" bias are shaped by cultural norms themselves. The lens is never completely neutral, which means models can inherit prejudice from the tools measuring them.</p></li><li><p><strong>Circuit tracing reveals the route of stereotypes</strong> - This newer AI technique lets researchers literally follow the pathways inside embedding models. Swap "big" for "brave" or "gentle" and you can see how gendered associations persist along particular computational routes, surviving attempts to scrub them out.</p></li><li><p><strong>Embedding models capture history-in-language</strong> - Because they map word relationships based on huge datasets, they can show how some associations shift over decades while others remain stubbornly frozen. 
It's time&#8209;series cultural analysis in mathematical form.</p></li><li><p><strong>Psycholinguistics gives context</strong> - Rather than blaming individual authors, the lens here widens to show structural patterns over time. Stories become records of what society felt comfortable showing men and women to be.</p></li></ul><p>Looking at it this way, you realise it's not an AI problem in isolation. It's a data problem. The data is us - our books, our music, our conversations, the stories we elevate and repeat. And once those patterns are in the datasets, they ripple forward. AI models are trained on the cultural archive, then their output becomes part of the next archive, reinforcing or repeating the grooves.</p><p>The psycholinguistic side of this is the real kicker for me. When a model "learns" to link men with power and women with nurturing - it is faithfully reproducing patterns that have been deeply engrained in the source material. Literature gives us a perfect example because it spans such a long time horizon.</p><p>And yes, there's a practical angle. If we want new voices and new norms in both literature and AI output, this isn't a wait&#8209;and&#8209;see game. We have to deliberately make room for different stories - and that means everything from what publishers choose to promote to which datasets researchers use to train the next generation of models.</p><p>It means amplifying narratives that break the moulds, not just adding them to the pile. Without that intentional shift, the algorithms of 2050 will still be serving us men with clenched jaws and women with teary eyes, even if they're describing life on Mars.</p><p>This research doesn't shame art - it holds up a mirror. And it asks what kind of reflection we want to leave for the next century to measure.</p>]]></content:encoded></item></channel></rss>