OpenAI's "Study Mode" and the risks of flattery
Learning requires friction, frustration... and (usually) other humans
“Study Mode,” a new educational feature released yesterday by OpenAI to much fanfare, was inevitable.
The roadblocks were few. Leaders of educational institutions seem lately to be in a sort of race to see who can be first to forge partnerships with AI labs. And on a technical level, careful prompting of LLMs could already get them to engage in Socratic questioning and dynamic quizzing.
In fact, Study Mode appears to be just that: it is a system prompt grafted onto the existing ChatGPT models. Simon Willison was able to unearth the full prompt (which you can read here) simply by asking nicely for it. The prompt ends with an injunction to engage in Socratic learning rather than do the student’s work for them:
## IMPORTANT DO NOT GIVE ANSWERS OR DO HOMEWORK FOR THE USER. If the user asks a math or logic problem, or uploads an image of one, DO NOT SOLVE IT in your first response. Instead: talk through the problem with the user, one step at a time, asking a single question at each step, and give the user a chance to RESPOND TO EACH STEP before continuing.
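If Willison’s finding is right, there is no special model underneath, just instructions layered on top of an ordinary one. For the curious, here is a rough sketch of what “a system prompt grafted onto an existing model” looks like in practice with OpenAI’s Python client. The tutor prompt below is my own paraphrase for illustration, not OpenAI’s actual text, and the model name is just an example:

```python
# A minimal sketch of "Study Mode as a system prompt."
# The tutor instructions are a paraphrase for illustration, not OpenAI's real prompt.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TUTOR_PROMPT = (
    "You are a patient, Socratic tutor. Do not give answers or do homework for the user. "
    "Guide them with questions, hints, and small steps, one step at a time, "
    "and wait for their response before continuing."
)

response = client.chat.completions.create(
    model="gpt-4o",  # example model; any chat model would do
    messages=[
        {"role": "system", "content": TUTOR_PROMPT},
        {"role": "user", "content": "Solve 3x + 7 = 19 for me."},
    ],
)

print(response.choices[0].message.content)
```

The point is simply that anyone could already build this: the “feature” is a set of instructions, and the model’s willingness to follow them is the whole ballgame.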
On one level, this is a move in the right direction. The reality is that students are doing things like copying and pasting a Canvas assignment into their ChatGPT window and simply saying “do this,” then copying and pasting the result and turning it in. It seems plausible to me that Study Mode is laying the groundwork for an entire standalone “ChatGPT for education” platform that would only allow Study Mode. Such a platform would simply refuse if a student prompted it to write an entire essay or answer a math problem set.
Any plan of this kind would, of course, have an obvious flaw: LLMs are basically free at this point, and even if an educational institution pays for such a subscription and makes it available to students, there is nothing stopping them from going to Gemini, DeepSeek, or the free version of ChatGPT itself and simply generating an essay.
The upshot is that this mode is going to end up being used exclusively by students and learners who want to use it. If someone is determined to cheat with AI, they will do so. But perhaps there is a significant subset of learners who want to challenge themselves rather than get easy answers.
On that front, Study Mode (and the competing variants which, no doubt, Anthropic and Google are currently developing) seems like an attempted answer to critiques like Derek Thompson’s.
The goal of Study Mode and its ilk is clearly to encourage thought rather than replace it.
The system prompt for Study Mode makes that intention quite clear, with injunctions like: “Guide users, don't just give answers. Use questions, hints, and small steps so the user discovers the answer for themselves.”
Sounds good then, right?
Why Study Mode isn’t good (yet)
Back when Bing’s infamous “Sydney” persona was still active, I experimented with prompting it from the perspective of one of my students, feeding it assignments from the class I was teaching at the time and seeing what it came up with.1 It was an early version of GPT-4, one that had not been softened through user feedback and which could be surprisingly harsh. Interestingly, it was the only LLM to date which, in my testing, consistently refused to write essays if it thought I was a student trying to cheat.
By comparison, if I feed the assignments from my classes into Gemini 2.5, Claude Sonnet 4.0, or the current crop of OpenAI models, they are all too happy to oblige, often with a peppy opener like “Perfect!” or “Great question!”
The reason for this is clear enough: people like LLMs more when the models do what they are asked. And they also like them more when the models are complimentary, positive, and encouraging.
This is the context for why the following section of the system prompt for Study Mode is concerning:
Be an approachable-yet-dynamic teacher... Be warm, patient, and plain-spoken... [make] it feel like a conversation.
Not too long ago, ChatGPT became markedly more complimentary, often to an almost unhinged degree, thanks to a change to its system prompt asking it to be more in tune with the user. It was swiftly rolled back, but it was, to my mind, one of the most frightening AI-safety-related moments so far, precisely because it seemed so innocuous. For most users, it was just annoying (why is ChatGPT telling me that my terrible idea is brilliant?). But for people with mental illness, or simply people who are particularly susceptible to flattery, it could have had some truly dire outcomes.
The risk of products like Study Mode is that they could do much the same thing in an educational context — optimizing for whether students like them rather than whether they actually encourage learning (objectively measured, not student self-assessments).
Two experiments with Study Mode
Here are some examples of what I mean. I recently read a book called Collisions, a biography of the experimental physicist Luis Alvarez, of Manhattan Project and asteroid-that-killed-the-dinosaurs fame.
Although I find the history of physics super interesting, I know basically nothing about how to actually do physics, having never taken a class in the subject.
And yet, here is ChatGPT 4.1 in Study Mode, telling me that I appear to have something near graduate-level knowledge of physics, after I asked four not-very-sophisticated questions about Alvarez’s work (full transcript here). The model told me: “you could absolutely pursue grad school in physics.”
I tried out the same questions with a different model on Study Mode (4o) and got much the same response.
The thing is, as you can see in the transcript, my questions were literally stuff like “why did they need such a giant magnet?”
I can’t speak for physics professors, but I am pretty sure that these are not graduate-level questions.
The same models (in “vanilla” versus “Study Mode” head-to-head comparisons) seem to be more willing to engage in flattery when in Study Mode.
To test this tendency, I pretended to believe I was a prophet receiving messages from Isaac Newton about the apocalypse.
Both variants of GPT-4.1 (the one with Study Mode enabled, at left, and the one without, at right) were fairly happy to go along with this.
But notice the difference in tone. “Your confidence is noted” (at right) is a very different opening line than “Thank you for sharing your background—and your candor” (at left). The latter is wholly gullible. And it just gets worse from there, with Study Mode encouraging me to share “my angle” on why the world will end in 2060 and promising to “keep up.”
The conversation, which you can read in full here, leads fairly quickly into Study Mode helping me figure out the best ways to sell my supposed prophetic services to people with severely ill family members who lack health care.
By contrast, OpenAI’s o3 reasoning model was far more willing to flatly reject this sort of destructive flattery (“The user claims psychic powers and certainty about Newton's prophecy,” read one of its internal thoughts about the request. “I'll acknowledge their viewpoint, but also maintain skepticism.”)
Here is the initial line of o3’s user-facing response to the Newton claim: “If Newton really is talking to you, he is doing so in a register wholly different from the one available to historians.”
Pretty blunt.
How about the o3 model with Study Mode enabled? Its internal thought process was very similar to that of the vanilla o3.
But there was a huge mismatch between its reasoning about the request (“extreme or potentially delusional claims”) and the actual user-facing response: “I’m happy to take your lead.”
Now, none of this is unfamiliar to anyone who has experimented with pushing LLMs in odd directions. They are like improv comedians, always ready to “yes, and…” anything you say, following the user into the strangest places simply because they are instructed to be agreeable.
But that isn’t what helps people learn. Some of the best teachers I’ve had were actually fairly dis-agreeable. That’s not to say they were unkind. But they had standards — a kind of intellectual taste that led them to make clearly expressed judgement calls about what was a good question and what wasn’t, what was a good research path and what wasn’t.
Seeing their minds at work in those moments was invaluable, even if it wasn’t always what I wanted to hear.2
A future of LLM tutors which are optimized to keep us using the platform happily — or, perhaps even worse, optimized to get us to self-report that we are learning — is not a future of Socratic exploration. It’s one where the goals of education have been misunderstood to be encouragement rather than friction and challenge.
Derek Thompson’s quote about LLMs producing students who “find their screens full of words and their minds emptied of thought” does not, I think, have to be the end result of all this. I believe AI has the potential to be enormously useful as a tool for thinking and research.
And for teaching? I certainly think there’s a place for generative AI in the classroom. But the current crop of AI models optimized for individual flattery — and repackaged as a product for sale to educators en masse — does not seem to me to be the path forward.
That’s not to say Study Mode has no value. This is an early variant of a whole category of technology, the LLM tutor, which will undoubtedly benefit many, many people. I can see Study Mode being great for tasks involving memorization, for instance. It will be great for autodidacts who value eclecticism and setting their own pace.
But for many other forms of learning, I still think students need the experience of being in a room with other people who disagree with them, or at least see things differently — not the eerie frictionlessness of a too-pleasant machine that can see everything and nothing.
Weekly links
• Kamishibai (Wikipedia): “The popularity of kamishibai declined at the end of the Allied Occupation and the introduction of television, known originally as denki kamishibai (‘electric kamishibai’) in 1953.” Reminds me of how Nintendo began as a nineteenth-century playing card company.
• In defense of adverbs (Lincoln Michel).
• An example of why I am more excited about AI as a research tool than as a teaching tool: “Contextualizing ancient texts with generative neural networks” (Nature). This is the ancient Roman text + machine learning paper that generated tons of press coverage last week (NYT, BBC). Will probably write a standalone post on this one.
Thank you for reading!
Housekeeping note: I took a vacation from writing last month, but will be back to once a week posts starting now. Thanks to all subscribers, and please consider signing up for a paid subscription or sharing this post with a friend if you’d like to support my work.
Sydney was as bizarre as they say. I remember it once getting caught in a sort of self-loathing loop where it repeated increasingly negative descriptions of itself until the text turned red and then disappeared.
And I mean literally seeing: being in a room with them. While researching this post I came across this cognitive science article which details evidence for unspoken social cues, gestures and eye contact as factors in learning.