Designing safe AI systems for education
‘Learning does not happen when the machine does the thinking for you.’ With generative AI reshaping the educational landscape, Teacher columnist and OECD Director of Education and Skills Andreas Schleicher discusses the importance of age-appropriate guardrails, data protection and human judgement to ensure GenAI becomes a scaffold for learning, not a crutch.
For teachers, GenAI offers a powerful set of supports. It can help prepare lessons, personalise curricula, design assignments and exams, draft feedback, and even assist with grading. It can also support administrative work such as documentation and routine reporting.
In the best-case scenario, GenAI buys teachers time to build relationships, give richer feedback, and focus on the human judgement and care that no machine can replicate. OECD data show that in some contexts, it may even support less experienced tutors in real time, raising the overall quality of instruction.
Students, too, are already using GenAI as a cognitive companion. Large language models have become a kind of supercharged search engine. They explain concepts, answer follow-up questions, brainstorm ideas, polish drafts, debug code and suggest solutions. When designed specifically for learning, educational GenAI tools can scaffold understanding much like intelligent tutoring systems, guiding students step by step rather than simply delivering finished answers. At the system level, GenAI can help develop assessments, expand automated evaluation to oral tasks, tag learning resources, and provide administrative support across institutions.
But here’s the catch: Learning does not happen when the machine does the thinking for you.
The risk of skill atrophy
Writing, problem-solving, and teaching are not just tasks; they are cognitive workouts. Outsource them to GenAI, and you may get a polished product, but you lose the learning that comes from struggle.
The OECD’s 2026 Digital Education Outlook shows the risk of skill atrophy: tools designed to help us think can end up thinking for us instead. The danger is especially acute with general-purpose GenAI, which is built to perform tasks efficiently, not to support educational growth.
This raises thorny questions. When is it appropriate to use GenAI, and for what? Should young children use general-purpose GenAI for homework? Probably not. But educational AI tools are a different story. They can be purpose-built, safe and effective – such as tutoring systems for mathematics or reading, which have long been used successfully in primary education and are now becoming more conversational through GenAI interfaces. However, to learn with GenAI, students need enough foundational knowledge to judge its outputs critically. Without that foundation, the tool can do more harm than good.
Elevating, not replacing, human relationships and learning
There is another risk that matters just as much: the erosion of human relationships. Education is not only about acquiring knowledge; it is about learning how to interact with others – such as receiving feedback, earning approval and navigating disagreement. If teachers stop giving personal feedback, or if students increasingly interact with bots instead of peers and adults, education risks becoming efficient but empty.
The lesson is simple and urgent. GenAI should elevate human learning – not act as a replacement. It should strengthen human connection, judgement and understanding – not quietly take their place. If we use GenAI with intention and care, it can become a tool that reinforces what makes education profoundly human rather than eroding it.
But one thing is clear: we cannot simply offload all responsibility onto the users – students and teachers. We must equally ensure that safety, trust and accountability are placed at the heart of these systems. Of course, it is hard to think about digital safety in absolute terms. If we do, the likely consequence is paralysis, and we will end up doing little with GenAI at all. Instead, we need to look at GenAI safety in relation to human capabilities and ensure that students have learned to think before they prompt.
Understanding student preparedness and capabilities
That focus on capability is why the OECD is currently concentrating the efforts of PISA (the Programme for International Student Assessment) on framing AI literacy. By this, we mean the capacity of young people to engage meaningfully with AI – to create with it, to question it and to manage it with critical thinking and self- and social awareness.
We have also built that concept of AI literacy into our next PISA assessment, so that we can track how student preparedness for AI is evolving, and how that varies across countries and by social background, gender and age. Understanding capabilities gives us a good sense of the safety risks, and of where education systems need to act.
AI literacy is one side of the equation – the other is digital safety. And it is clear we are playing catch-up on this. Generative AI did not knock politely on the schoolhouse door. It came in through the wi-fi, and the question every system now faces is: Will GenAI drift into classrooms by accident, or will we govern it by design?
Guardrails, data protection and human judgement
When we build roads, we don’t give toddlers the same rules as truck drivers. We build pavements. We lower speed limits near schools. We add guardrails. Not because children are weak, but because they are still growing. GenAI demands the same kind of thinking.
In schools, we don’t give every student unrestricted access to every tool. Scissors in the early grades are blunt. Chemistry labs are locked until students are trained. Internet access is filtered by age. Not because students are untrustworthy, but because powerful tools require context, guidance and gradual release. GenAI calls for the same approach.
A 6-year-old and a 16-year-old may tap the same screen, but they are not engaging with the same reality. When we talk about age-appropriate design and safeguards, we are not talking about censorship; we are talking about developmental fit.
For younger learners, GenAI must behave less like an oracle and more like a guardrail. That means strong filters. No emotional manipulation. No pretending to be a friend, a confidant or – worse – a replacement for social interaction. Children are wired to trust authority. If AI speaks with confidence, they will believe it.
As students grow older, the rules can loosen, but they should not disappear. Teenagers can handle uncertainty but need help seeing where it lives. GenAI should make any ambiguity clear and say: ‘Here’s a suggestion, not the truth’, ‘Here’s a pattern, not a verdict’. That’s how you teach critical thinking in an age where machines can sound confident even when they are wrong.
Data protection must also be a central focus. Children’s data is not oil to be extracted; it is DNA to be protected. There should be no emotional profiling. No long-term memory. No recycling of classroom interactions into commercial training datasets. If we would not allow a stranger to follow a child home and take notes, we should not allow an algorithm to do so silently.
And above all, we must be clear about authority. GenAI should assist professional judgement, not replace it. The moment we allow machines to make high-stakes educational decisions on their own, we don’t just automate; we abdicate. Strong guardrails early. Growing autonomy over time. Human judgement always in charge.
If we get this right, GenAI becomes a scaffold, not a crutch. A tool for thinking, not a substitute for it. And education remains what it has always been at its best – a human endeavour, powered by judgement, relationships and trust. That is the future worth designing for.
References
OECD. (2026). OECD Digital Education Outlook 2026: Exploring Effective Uses of Generative AI in Education. OECD Publishing. https://doi.org/10.1787/062a7394-en