AI in Assessment
Assessments allow us to measure and evaluate student knowledge and competencies in a subject discipline, often by assessing a product or artefact – most commonly in written form. However, Generative AI technologies pose a significant risk to the integrity of those assessments by hindering our ability to accurately and confidently determine if students have met their intended learning outcomes.
This page provides guidance to help you consider your approaches to assessment in a world where students have ready access to AI tools. It brings together academic and professional recommendations from across the Higher Education sector.
The need for change
Awards at Newcastle University are made, and classified, based on evidence that students have met or exceeded the learning outcomes for their programme of study. However, as stated by the QAA, “the rapid rise and ubiquity of Generative Artificial Intelligence software means that some or all the assessments that currently contribute to the evidence base may no longer be confidently ascribed to an individual student.” This situation will only become more challenging as AI platforms evolve and become more seamlessly embedded in the systems we use every day.
In line with Newcastle University's new Education Strategy and 5 Principles for the Use of AI, we therefore need to review our assessments to ensure they measure student knowledge and competencies in a rigorous and AI-resilient manner. However, colleagues must consider their programme's full assessment journeys, and only move to high-stakes exams where absolutely necessary to ensure award integrity. Beyond that, a programme should feature a range of diverse, inclusive and authentic assessment types that challenge students and help them to develop their skills.
Reviewing your assessment strategies
Many colleagues will choose to focus on redesigning current assessments so that they are less likely to be influenced by AI technology, while others will transition to assessments that permit the use of, or actively incorporate, generative AI tools in their completion. Either approach is acceptable, but the latter will ensure greater relevance and sustainability in the long term.
Whichever approach you take, your assessment strategy should be planned, designed, and implemented at programme level. Mapping assessments to programme-level outcomes can help you identify which assessments need to be secured/assured to guarantee award integrity, but also where AI susceptibility can be tolerated if the assessment approach benefits learning. This approach can also serve to expose assessment gaps, assessment bunching and assessment redundancy.
Unsure if your assessments are AI-resilient? Download our PASS AI checklist below to find out.
Selecting new modes of assessment
If your assessments are identified as susceptible to AI, you should look to redesign them. Working with your programme or faculty education teams, you can either change them to a more secure format or add additional checks to assure student work. High-risk open assessments that are critical for learning may also be tolerated, so long as key knowledge, skills and competencies are assessed elsewhere in a stage via more secure forms of assessment.
Some simple ways to increase the security of your assessments include:
- Putting the student in front of you: Use short oral defences, presentations, observed assessments, quick code walk-throughs, or “explain your method” interviews to verify student knowledge and ability.
- Assessing the process: Mark progress and decision-making in addition to the final product; speak to students, spot-check their work, or use staged submissions or portfolios to continuously assess plans, drafts, responses to feedback, and reflections on choices.
- Asking for proof: Require students to submit evidence that verifies the originality of their work, including raw data, calculations, notebooks, test outputs, or design sketches.
- Keeping it in-class: Use lab work, field work, or in-person problem-solving, writing, and collaborative activities to anchor and verify larger take-home pieces.
- Adjusting credit weightings: Consider giving more weight to lower risk assessment components, or move high risk assessments to formative/lower credit weighting.
- Reviewing your learning outcomes: Check that your assessments effectively measure student progress against learning outcomes, and consider adjusting those learning outcomes if it makes assessing competencies easier.
- Moving to an exam: In-person invigilated exams may not always be the most appropriate assessment format, but they do have their place. However, switch to an exam with caution – they can cause high levels of stress and anxiety for students, and often fail to capture their full range of competencies.
Authentic coursework activities that include AI by design
Well-designed coursework activities can offer a variety of authentic assessment opportunities, but consideration needs to be given to their AI resilience. A simple way to do this is to design authentic use of AI tools into coursework using problem-based learning activities relevant to the discipline and aligned with graduate skills. However, in these circumstances, ensure your marking criteria assess the student’s work and not the outputs of AI tools. Consider also that reflective work and the critiquing of AI-produced content are well within the capabilities of today’s AI, and thus do not “AI proof” your assessments.
Hybrid submissions that combine the output from AI tools with a student’s own work are also becoming increasingly commonplace, but these require more scaffolding to make it clear where the AI ends and the student’s work begins. In these circumstances, the contribution of AI needs to be fully acknowledged in accordance with our institutional policies and guidelines. We provide detailed guidance to students on Acknowledging use of AI which you are encouraged to reference and mandate.
Entering your own assignment briefs into generative AI tools, and working with them to produce an example student submission, can reveal how capable AI is of completing your assignments. However, consider this a baseline check; prompt quality, the use of powerful (and often paywalled) platforms, and the daisy-chaining of AI tools can usually produce far stronger responses. Also consider that, due to the speed of AI developments, what may have been “AI resilient” yesterday may not be today or tomorrow. Stay vigilant.
Assessing academic writing
Academic writing is crucial in higher education. It not only encourages students to explore and learn about a subject, but also teaches them to demonstrate critical thinking and synthesise ideas and knowledge – essential metacognitive skills for future education and the modern workplace. However, AI presents a significant risk to the integrity of written work (with research indicating that misuse can give students an unfair advantage), presenting a dilemma for colleagues who rely on this form of assessment.
Assessments featuring written components will need to be reviewed and adjusted. Generally, it's best to assess the process rather than the product of a writing assignment, unless the originality and authorship of the submission can be guaranteed. One approach is to incorporate short-form writing exercises into the teaching process – nested tasks that feed forward and build toward a larger submission, providing ample formative opportunities for students to practise their writing and receive timely, constructive feedback on drafts.
Moving an assessment to an exam
Traditional in-person invigilated exams are an obvious solution to the threat of AI. However, they need to be used carefully and only where required to ensure award integrity. If we prioritise assessment security at the expense of authenticity, equity and alignment to learning outcomes, we risk compromising our assessments in ways that disadvantage the majority of students. Colleagues are encouraged to use a diverse mix of assessment types instead, determined and agreed at programme level.
However, there will always be scenarios in which in-person examinations are necessary and appropriate – especially for large cohorts in lower stages. In these circumstances, you are encouraged to adopt a digital exam approach for accessibility reasons and ease of implementation. If you would like to learn more about creating a digital exam, please refer to our online guidance or speak to a member of the LTDS Digital Exams Team.
Conducting oral exams and vivas
In-person oral exams, vivas and structured interviews offer a means to assess a student’s knowledge and understanding. They can act as a powerful deterrent to the use of AI, but can also be very stressful for students, and so appropriate safeguards need to be put in place (e.g. clearly designed rubrics, consideration for different native languages and accents, and alternative arrangements for vulnerable and disabled students). They can also be very resource intensive for colleagues to conduct, especially for large cohorts.
Oral examinations can also help you confirm that the student was responsible for a written submission. This can be done on an individual basis or, more effectively, as part of a group assessment. But for both, the assessment criteria are key – be sure to measure student knowledge and competencies (e.g. presentation skills and the ability to answer questions) and not the product of AI (e.g. the text content of their presentation).
Running observed exams
Observed exams require students to complete one or more authentic tasks related to their discipline or future employment, which are assessed (usually by multiple examiners) against a well-defined and inclusively designed rubric. Students can also be interviewed after completing their tasks, with assessment taking the form of an oral exam. This approach explores student understanding of related principles and the application of knowledge.
Observed exams are very common in medical disciplines (e.g. Objective Structured Clinical Examinations), where students progress between a series of stations completing work-related tasks which they must orally defend. This approach has also been successfully applied to other subjects, especially in education, the sciences and languages. However, consider that these types of exams are often resource-intensive, and careful consideration must be given to scheduling to prevent students from sharing interview questions.
Clear assessment briefs
When setting assignment and coursework tasks, be clear to students about which AI technologies can and cannot be used – and why! Talk to students in clear, unambiguous terms about your expectations for AI usage and acknowledgement, and remind students of our comprehensive AI guidance.
Take a look at our Writing an Effective Assessment Brief to see examples of how to include AI guidance in your assignment and coursework briefs, which you are encouraged to copy and customise for your context.
Reinforcing the importance of the learning process
In most learning design methodologies, assessment should be used for learning, not of learning. Help students to understand why they are being assessed, how it will build their skills and competencies, and how it will help them achieve success in their future education and careers. Continually stress the importance and value of the assessment process, and share links to any marking criteria and rubrics. Where possible, give examples of good practice with AI, and make explicit your wider expectations and recommendations for collaboration and the use of digital tools.
Students must also learn the fundamentals of their discipline to be able to understand and critically evaluate AI outputs. Continually remind them of this, and use careful scaffolding of formative and low-stakes assessments to ensure learning outcomes are met.
Identifying and detecting AI
Newcastle University does not recommend or support the use of automatic AI checkers. Text and digital media generated by AI cannot be reliably detected (Weber-Wulff et al.; Scarfe et al.), and such tools provide insufficient detail about how “scores” are generated and what they mean. Low detection rates and a high incidence of false positives in recent research have also confirmed our concerns, and so the AI detection features of our current platforms have been disabled. Due to GDPR and data security concerns, you should also avoid entering student submissions into any third-party AI tools yourself.
Expecting markers to identify AI-generated text is also difficult in most scenarios, and this is not an expectation we should place on colleagues. However, we have a shared responsibility to ensure the academic integrity of our assessments, and so we all need to stay vigilant, be aware of obvious red flags, and respond to any suspicions of AI misuse by following the University’s misconduct procedures.
AI for marking and feedback
There are many tools emerging on the market which claim to assist with the marking of essays and automated provision of feedback. Newcastle University’s AI working groups are currently exploring these platforms and how – if recommended – they can integrate with current systems. In the meantime, colleagues are reminded to review our AI Boundaries of Use guidance and only use AI where permitted for marking and feedback.