The rise of generative AI is changing the way education providers, awarding bodies and employers think about written assessment. The issue is not simply whether students can produce fluent work; it is whether assessment systems can continue to measure originality, judgement and independent thinking at scale while maintaining confidence in the results.
That creates a market opportunity for businesses with established assessment technology and practical experience in high-volume evaluation. RM’s work in this area highlights how digital tools can support a more consistent approach to judging written content, particularly where large numbers of scripts need to be reviewed quickly and reliably.
The challenge is becoming more visible as education systems seek to place greater value on skills such as creativity, problem solving and unconventional thinking. These qualities are increasingly important in the workplace, but they are harder to assess than factual recall. Written work has traditionally been one way to evaluate them, yet generative AI introduces a new risk: it can produce polished responses that appear credible, making it harder for assessors to distinguish human originality from machine-generated content.
RM and the Independent Schools Examinations Board tested this problem through an exercise linked to the ISEB’s annual Time to Write competition. The project used 3,017 creative writing entries from students of mixed ages, alongside 18 AI-generated stories that were added without the judges being told which entries were machine-created. The purpose was to examine whether human assessors would reward originality and whether they could identify AI-generated submissions.
The entries were judged using RM Compare, RM’s adaptive comparative judgement tool, which produced a stable ranking of the submitted work. The content was then analysed further using RM Echo, a tool designed to compare large volumes of written material and identify areas of similarity. This combination is relevant because it addresses two connected requirements in modern assessment: ranking quality in a consistent way and reviewing originality or integrity across a large body of text.
The results point to both opportunity and risk. Assessors identified only a small proportion of the AI-generated entries, which underlines the difficulty facing assessment providers as AI tools become more accessible. However, most of the AI-generated stories were judged less favourably than the human-created submissions.
RM plc (LON:RM) is a global EdTech provider of learning and assessment solutions, supporting the full learning journey, from early years through to higher education and professional qualifications.