When AI enters the classroom: Why real-world evidence matters
AI is rapidly entering classrooms — shaping how students learn, explore ideas, and make decisions about their future.
But most education AI systems are still evaluated before they ever meet students.
That creates a growing gap: schools, funders, and education authorities are often expected to make high-stakes adoption decisions based on vendor claims and pre-deployment reviews — without independent evidence of what these systems actually do once they’re live.
At Eticas, we believe responsible AI in education starts with a simple question:
What becomes visible when you evaluate educational AI in real classroom use — not just in theory?
That’s exactly what this paper explores.
A post-deployment audit of an AI career guidance system (K–12)
This research presents findings from a post-deployment audit of an AI-enabled career guidance system used in secondary education, conducted independently by Eticas on behalf of a major charitable foundation.
The system combined:
self-reflection assessments,
individualized student reports,
aggregated insights for schools,
and a conversational AI companion.
Instead of assessing the model in isolation, the audit focused on real-world system behaviour:
What students actually asked
How guidance was delivered
What risks appeared once the tool interacted with learners
What the audit found: effective guidance — with real-world dynamics you can’t see in advance
Overall, the system delivered effective student-appropriate guidance across its core functions, supporting both:
students with clear career preferences, and
students still exploring their options.
But the most important insights weren’t about whether the tool “worked” in a lab setting.
They were about what happens when AI meets the reality of school environments.
1) Students will test boundaries — and it’s normal
One of the most revealing findings:
Around 8% of student interactions were deliberately boundary-testing or out of scope.
That behaviour is developmentally normal in educational contexts — but is rarely accounted for in pre-deployment evaluations. And it’s exactly why student-facing AI needs stronger guardrails, monitoring, and post-deployment oversight.
2) Risk isn’t just technical — it’s socio-technical
Even if an AI system performs “correctly” most of the time, real-world risk often depends on:
how outputs are interpreted,
how confident the system sounds,
and how users repurpose it beyond intended scope.
In education, those risks are amplified by duty of care: students are minors, and schools often have limited capacity to monitor AI performance on their own.
3) Post-deployment assurance catches what checklists miss
The audit assessed risks across four major areas common to student-facing AI systems:
bias and fairness
privacy and confidentiality
reliability and manipulation
security and misuse
Findings were largely within acceptable limits — but key improvement opportunities still emerged, especially around:
tone calibration and scope clarity,
documentation transparency,
and ongoing monitoring once systems are live.
Why this matters for schools, funders, and education authorities
If you’re responsible for adopting AI in education, the question is no longer whether AI will enter classrooms.
It’s whether schools will have the tools to answer:
Does this system deliver real educational benefit?
What do students actually experience when using it?
What risks emerge over time — and how do we respond?
What evidence would justify continued deployment?
Post-deployment evaluation provides something that procurement-stage reviews can’t:
visibility into real behaviour, real usage patterns, and real risks.
It also helps institutions move beyond one-off approval decisions toward continuous assurance, where effectiveness and safety are monitored as systems evolve.
Read the full paper
This work demonstrates that independent, real-world evaluation isn’t a barrier to innovation — it’s how schools, funders, and authorities can adopt AI with confidence and accountability.
📄 Read the full paper: When AI enters the classroom: evaluating effectiveness and risk in real-world use