31 items with this tag.
Language models can transform documents into interactive tools in minutes. This post walks through a concrete example, turning a 21-page Word questionnaire into a working web app, and reflects on what that capability makes possible.
Higher education's response to AI has focused on the artefact: detecting it, restricting it, and restoring confidence in what students produce. This essay argues that the structural features of problem-based learning — problem-driven inquiry, collaborative knowledge construction, facilitation over instruction, and metacognitive reflection — are the same conditions under which AI integration becomes educationally productive rather than substitutive. The alignment is structural, not retrospective: PBL was designed around these conditions before AI existed. The argument extends further: AI shifts what category of problem PBL can engage with, expanding access to wicked problems previously beyond students' reach. Investing in PBL's structural conditions is simultaneously investing in AI readiness.
Academic offences committees are investigating the wrong party. When AI is integral to authentic professional practice, assessment that excludes it does not protect rigour — it tests performance in a professional context that no longer exists. Valid assessment measures what graduates will actually need to do; for most health professions graduates in 2025, that includes thinking well with AI. The accountability for assessment design lies with educators, not students.
AI-generated text is fluent regardless of whether its content is accurate or well-reasoned. Fluency was once a reasonable trace of genuine thinking — a student who wrote clearly had usually thought clearly. That relationship no longer holds. Worse, the AI literacy response of teaching output evaluation is a temporary fix: as models improve, output quality converges on expert-level across every artefact we care to measure. The question isn't how to spot current failure modes. It's what you'll do when those failure modes are gone.
When AI could write everything I'd ever written, I had to ask: what had I been doing all this time? The answer changed how I understand both writing and AI — and what it means to be a scholar in a world where words are cheap.
Claude produced the word "contribuves" in a piece of writing, which is obviously not a real word. This is a different kind of error from hallucination, and the distinction matters.
A presentation for students participating in an EU-funded Blended Intensive Programme at Thomas More Hogeschool in Belgium. Examines how AI separates the production of artefacts from the learning they were meant to evidence, what problem-based learning already does differently, how AI changes group work and inquiry, and three practical shifts students can make in how they use AI within PBL.
The structural features of an information source that enable its knowledge claims to be challenged, traced back to evidence, and evaluated against the source's track record. Traditional sources carry it; generative AI largely does not.
Most advice on AI effectiveness focuses on prompt engineering. The real leverage comes from somewhere less obvious: knowing your professional commitments clearly enough to turn them into context an AI can work within. This post describes how to build AI personas for professional practice — structured documents that compress your values, frameworks, and evidence into a form an AI agent can actually use.
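As a rough sketch of what such a persona document might contain (the field names below are illustrative, not the post's own template):

```python
# A hypothetical persona file: professional commitments compressed
# into structured context an AI agent can be given verbatim.
persona = {
    "role": "Health professions educator",
    "commitments": [
        "Assessment should sample authentic professional practice.",
        "Feedback exists to improve the student, not the grade.",
    ],
    "frameworks": ["programmatic assessment", "self-determination theory"],
    "evidence_style": "cite peer-reviewed sources; flag uncertainty",
}

def as_context(p: dict) -> str:
    """Flatten the persona into a block of text for a system prompt."""
    lines = [f"Role: {p['role']}"]
    lines += [f"Commitment: {c}" for c in p["commitments"]]
    lines.append("Frameworks: " + ", ".join(p["frameworks"]))
    lines.append("Evidence style: " + p["evidence_style"])
    return "\n".join(lines)

print(as_context(persona))
```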
A field note on the time Claude deleted my file. The agent followed my instructions precisely and that was the problem. A reflection on a different kind of AI failure mode, and what the model's apology reveals about where responsibility actually sits.
An internal staff development session for the CPC team introducing AI through Microsoft Copilot. Covers what AI is and isn't, safe working practices, structured prompting with the RGID heuristic, and hands-on practice — with the goal of each participant leaving with one specific task to try that week.
A field note on what the recent Claude outage revealed about where I am on the dependency curve, and what the difference between a session limit and an outage tells you about infrastructure.
The previous posts described what makes agentic workflows coherent at the individual level: a plan, documentation as infrastructure, and domain expertise that can evaluate outputs. Together, these form an informal harness: the conditions within which delegation stays accountable. At institutional scale, a personal harness is not enough: multiple people directing agents without shared constraints produce compounding drift that no amount of human oversight can track. This post examines what AI agent governance in higher education actually requires, and why a harness, not better oversight, may be the right frame.
Vibe coding describes using AI tools without maintaining genuine accountability for what they produce: accepting outputs without the scrutiny, direction, or judgement needed to evaluate and improve them. Simon Willison drew the key distinction: vibe coding versus vibe engineering, where the latter uses the same tools while remaining genuinely accountable for every output.
A field note on switching from Claude Opus 4.6 to Sonnet 4.6 as my default in Claude Code, and what I'm noticing after the first hour.
Bearman, M., Tai, J., Dawson, P., Boud, D., & Ajjawi, R. (2024).
Corbin, T., Dawson, P., & Liu, D. (2025). Talk is cheap: why structural assessment changes are needed for a time of GenAI.
Corbin, T., Bearman, M., Boud, D., & Dawson, P. (2025). The wicked problem of AI and assessment.
A mathematical framework demonstrating that AI tutoring systems with 10–15% error rates can achieve superior learning outcomes through dramatically increased engagement compared to more accurate but largely unused alternatives. Drawing on evidence from health professions education, this essay shows that the multiplicative relationship between accuracy and utilisation creates an accessibility paradox: imperfect but engaging systems outperform perfect but unused ones. The argument carries three critical qualifications—errors vary in consequence and safety-critical content demands high accuracy; generative AI poses distinctive epistemic challenges that may undermine conventional error correction mechanisms; and engagement is necessary but not sufficient for learning, with superficial use patterns capable of nullifying predicted benefits entirely. A framework for calibrating accuracy requirements to context and consequence is proposed.
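The multiplicative relationship at the heart of the framework can be stated compactly. A minimal sketch, with hypothetical numbers chosen for illustration rather than taken from the essay's evidence:

```latex
% Effective learning benefit L as the product of accuracy a and utilisation u
L \propto a \cdot u
% Hypothetical comparison:
%   imperfect but widely used:  a = 0.85, u = 0.60  =>  L \propto 0.51
%   accurate but rarely used:   a = 0.99, u = 0.10  =>  L \propto 0.10
% The imperfect system wins on the product despite losing on accuracy.
```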
Learned numerical representations of text that capture semantic meaning, enabling similarity-based search and retrieval.
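A minimal sketch of what similarity between embeddings means in practice, using toy three-dimensional vectors (real embedding models produce vectors with hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: in reality these come from an embedding model.
query = [0.9, 0.1, 0.3]
doc_similar = [0.8, 0.2, 0.4]    # semantically close to the query
doc_unrelated = [0.1, 0.9, 0.2]  # semantically distant

print(cosine_similarity(query, doc_similar))    # high, ~0.98
print(cosine_similarity(query, doc_unrelated))  # lower, ~0.27
```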
A technique that improves LLM responses by retrieving relevant information from external sources and including it in the prompt.
A database that stores embeddings for similarity-based retrieval, serving as the knowledge layer for RAG systems.
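A toy sketch of how this entry and the previous two fit together: an in-memory stand-in for a vector store, queried by similarity, with the best match spliced into the prompt. The `embed` function here is a crude keyword counter, not a real embedding model, and everything about it is illustrative:

```python
# Toy retrieval-augmented generation: embed, retrieve, augment.

def embed(text: str) -> list[float]:
    """Fake embedding via crude keyword counts. A real model returns
    dense vectors that capture meaning, not surface words."""
    keywords = ("assessment", "feedback", "ai")
    return [float(text.lower().count(k)) for k in keywords]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# The "vector store": (embedding, source text) pairs in a plain list.
store = [(embed(t), t) for t in (
    "Programmatic assessment aggregates many low-stakes data points.",
    "Feedback literacy is the capacity to act on comments received.",
    "AI tutors can give feedback at a scale humans cannot match.",
)]

def retrieve(query: str) -> str:
    """Return the stored text most similar to the query."""
    q = embed(query)
    return max(store, key=lambda pair: dot(q, pair[0]))[1]

question = "How does AI change feedback?"
prompt = (
    f"Answer using this context:\n{retrieve(question)}\n\n"
    f"Question: {question}"
)
print(prompt)  # the augmented prompt that would go to the LLM
```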
A lightweight programme that exposes specific data sources or capabilities through the Model Context Protocol standard, acting as an adapter between AI systems and diverse data sources.
An open standard enabling AI systems to access diverse data sources through standardised interfaces with fine-grained permission control.
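For a sense of what the standardised interface looks like on the wire: MCP frames messages as JSON-RPC 2.0. The sketch below follows the method and result shape in the published spec, but the `search_library` tool and its arguments are invented for illustration:

```python
import json

# Rough shape of an MCP tool-call exchange (JSON-RPC 2.0 framing).
# The "search_library" tool and its arguments are hypothetical.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_library",
        "arguments": {"query": "problem-based learning"},
    },
}

# A server replies with a result keyed to the same id.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "3 matching records ..."}],
    },
}

print(json.dumps(request, indent=2))
```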
Persistent context included in every message to an AI model, establishing consistent behaviour, knowledge, or constraints across interactions.
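A minimal sketch of how that persistence works in the common chat-message format; the tutoring prompt and the `ask` helper are invented for illustration:

```python
# The system prompt is prepended to every request, so the model's
# behaviour stays consistent across an entire conversation.
SYSTEM_PROMPT = (
    "You are a tutor for first-year physiology students. "
    "Ask guiding questions; never give the answer outright."
)

history: list[dict[str, str]] = []

def ask(user_message: str) -> list[dict[str, str]]:
    """Build the full message list sent to the model for this turn."""
    history.append({"role": "user", "content": user_message})
    return [{"role": "system", "content": SYSTEM_PROMPT}, *history]

messages = ask("What drives ventilation rate during exercise?")
# messages[0] is always the system prompt, whatever the turn number.
```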
Large language models are deep learning models with billions of parameters, trained on vast text corpora using self-supervised learning, capable of general-purpose language tasks.
Rich Sutton's 'Bitter Lesson' from AI research—that general methods leveraging computation outperform human-crafted knowledge—has a direct parallel in education. When AI can produce the kinds of artefacts that assessments have traditionally relied on, it exposes a fundamental problem we have long ignored: we were never really measuring learning, we were measuring the difficulty of producing certain artefacts. This post explores what the Bitter Lesson means for assessment design in health professions education, and why AI makes it impossible to continue pretending otherwise.
When AI can generate text, images, and ideas at scale, what remains distinctively human? This post argues that evaluative judgement—the capacity to assess what is worth creating, what deserves attention, and what matters—becomes the core human contribution in knowledge work. Drawing on research into evaluative judgement in health professions education, it explores how educators can make this capacity explicit and deliberately develop it, rather than treating it as an invisible by-product of experience.
Generative AI presents serious ethical challenges in education—to academic integrity, to equity, to the nature of learning itself. This post acknowledges these concerns while arguing that AI also represents an unprecedented opportunity for learning at scale, particularly for the kinds of personalised, adaptive learning that have always been theoretically desirable but practically impossible to deliver. For health professions educators committed to expanding access to quality education, this opportunity deserves serious, open-minded consideration.
Most commentary on AI in education focuses on what AI cannot do, or catalogues its failures as warnings. This post argues for a different approach—instead of performative critique, demonstrate thoughtful use in your own practice. By modelling considered, reflective engagement with AI tools, health professions educators can critique from experience rather than speculation, help shape how AI is integrated into professional education, and play a better game than the one they're currently losing.
A system-level discipline focused on building dynamic, state-aware information ecosystems for AI agents.