TRAINING AND DEVELOPMENT Guidance
1.4 errors per 10 minutes, which was also about twice as fast as the people reviewing manually. Importantly, this was not achieved by flagging everything in sight: the F1 score, which balances recall (how many errors were found) against precision (how many of the flagged issues were genuinely errors), was also twice as high. However, at this point, the AI tool also had limitations. When it came to writing the corrections, there was a positive trend, but no statistically significant difference between the groups. The AI was functioning as a high-recall cueing system rather than an auto-rewriter. There are several possible explanations, although only further studies could bring certainty. One could be the time constraint of the experiment: participants might have been able to spot the error the AI tool was pointing at, but did not have time to read the guideline and implement the fix. Another could be the lack of training provided on the AI tool. Some participants noted difficulties in understanding all the capabilities of the tool they were given, which might have led to results below its potential.
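To make the point about the F1 score concrete, here is a minimal sketch of how the measure is conventionally calculated: as the harmonic mean of precision and recall. The counts are hypothetical and chosen only for illustration; they are not figures from the experiment, and the function name is our own.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw counts of flagged issues."""
    flagged = true_positives + false_positives        # everything the reviewer flagged
    actual_errors = true_positives + false_negatives  # every error really in the document
    precision = true_positives / flagged if flagged else 0.0
    recall = true_positives / actual_errors if actual_errors else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical document containing 20 real errors.
# A careful reviewer flags 10 issues, 8 of which are genuine.
careful = precision_recall_f1(true_positives=8, false_positives=2, false_negatives=12)

# A "flag everything" reviewer flags 200 issues and catches all 20 errors.
flag_everything = precision_recall_f1(true_positives=20, false_positives=180, false_negatives=0)

print("careful:         precision=%.2f recall=%.2f F1=%.2f" % careful)
print("flag everything: precision=%.2f recall=%.2f F1=%.2f" % flag_everything)
```

In this illustration, the reviewer who flags everything reaches perfect recall but a much lower F1 score than the careful reviewer, which is why a doubled F1 score cannot be obtained simply by flagging more.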
Implications for practice
These results suggest that AI assistance tackles the hardest part of the job: finding ‘the needle in the haystack’. Once issues are surfaced, human reviewers are more than capable of applying judgement and drafting an appropriate correction. This division of labour fits compliance work, where accountability must remain with the reviewer.
Practitioners reported that the linked source references were particularly useful, allowing them to confirm that the AI’s suggestion was grounded in the correct regulation before acting on it.
Governing the process
As with most software, deploying AI into policy review does not succeed overnight; it requires governance. Extensive training is required to understand that the AI assistant is a guide, not an oracle. It should be used to accelerate and structure the review, not to replace the professional judgement of subject-matter experts. A participant noted: “It would help if I knew this HTM in more detail. I am not as confident in the answers as the AI was, but I must defer to the greater knowledge.” This example shows the potential for time savings and quality improvements, but also hints at the risk of staff being less critical of the AI’s response than necessary. Without time constraints, and with structured onboarding, this participant could have had the chance to manually review the specific section the AI pointed at and apply personal judgement.
Another participant concluded: “Attempting the test on a 15" laptop was slow and cumbersome,” raising another problem for deployment. In the panel sessions, participants with decent IT infrastructure, and two monitors instead of one, found the experience much more satisfying.
Looking ahead
Our experiment shows that AI-assisted review can more than double error detection within the same time window while leaving human reviewers firmly in control of the fixes. For NHS estates and facilities teams, this can offer a way to shorten review cycles, reduce late rework, and raise confidence that local policies match current national requirements. AI is not yet perfect, but it offers a promising path forward to achieving compliance and identifying issues that might otherwise slip through the cracks.
November 2025 Health Estate Journal 15