Is it time to scrap single-word Ofsted judgements?

Research on a range of measures shows mixed performance across local authorities that often does not align with Ofsted judgements


By David Wilkins

In a recent report, the educational think-tank EDSK argued that Ofsted’s ratings of schools were probably unreliable and invalid; that inspectors looked at the wrong things; and that there was no good evidence that inspections helped schools improve.

This is not the first time such problems have been identified. Previous studies have shown that Ofsted inspections can have an adverse effect on exam results, and that some schools put on ‘a performance’ for Ofsted and behave differently when inspectors are not present (surely not…).

But what research has been done about Ofsted in relation to children’s services? The answer is relatively little. This seems odd, given the political weight placed upon Ofsted’s judgements. And it seems even more odd when what we do know is not necessarily reassuring.

No clear evidence of improved performance

La Valle et al and Hood et al both looked at a range of measures and asked whether better Ofsted ratings were associated with better performance.

These measures included whether young adults with care experience were in education, employment or training; the number of repeat child protection plans; the timeliness of assessments; re-referral rates; the offending behaviour and substance misuse of children in care; and workloads and staff vacancies.

Perhaps surprisingly, they found no clear evidence that authorities with better Ofsted ratings performed better on these measures than authorities with worse ratings.

La Valle’s study even found that some of the best performing authorities (in relation to these measures) were ‘inadequate’, while one of the worst performing authorities was ‘good’.

Predictors of a ‘good’ or ‘outstanding’ rating

Hood’s study did find that more timely assessments lowered the risk of an inadequate rating, while more agency staff and more re-referrals increased the risk – but it found nothing significant in relation to some of the more important measures it looked at.

In my own work, we looked at inspection results between 2014 and 2016 and considered some of the same measures again and a few new ones as well. We found that only three things helped predict whether an authority would receive a good or outstanding rating – low deprivation, more timely assessments and more children recorded as missing from care.

More recently, we have started looking at inspections carried out since 2018 under the new ILACS framework. Despite Professor Eileen Munro finding that the previous framework was well received by the sector, Ofsted introduced the ILACS, in part, because its inspections had previously been too concerned with process and not enough with the experiences of children. We wanted to know if this new framework might more clearly distinguish between better and worse authorities (according to Ofsted) in relation to the kinds of ‘outcome’ measures we currently have the data for.

Small differences between authorities

Largely, the answer is no. Good and outstanding authorities still tend to complete more assessments on time and hold more child protection reviews within timescales too. They also ensure more children in care have up-to-date health assessments and dental check-ups. These are certainly important aspects of good practice.

But on the flip side, inadequate authorities, and those in need of improvement, have fewer children in need absent from education, fewer child protection plans lasting 2+ years and fewer children in care with a criminal conviction. And these are important aspects of good practice too.

These differences are small in almost all cases, and our data needs to be properly tested via academic peer review. It’s possible we might find that some of these small differences occur by chance (meaning we do not really know whether either group is better than the other). But even so, the real question isn’t whether good and outstanding authorities are better by small degrees in relation to some measures – it’s why good and outstanding authorities aren’t considerably better.

System complexity

There is no suggestion that good and outstanding authorities are not working hard and doing an excellent job for children and families. But either Ofsted’s ratings bear little relation to the kinds of outcomes we currently measure (such as whether adults with care experience are living in suitable accommodation and how many children in care have substance abuse problems) or the difference between very good and very poor practice is, on average, insufficient to ‘move the needle’ very much.

Either way, it highlights the incredible complexity of the social care system.

In our previous analysis, we found that the single best predictor of a poor Ofsted rating was high deprivation. In our current study, it looks like the biggest differences between authorities are to be found in their numbers of children in need and children in care.

This could indicate that good and outstanding authorities are better at ensuring children can live safely at home. However, the evidence hints it is unlikely to be that straightforward. For example, a recent report by the What Works Centre for Children’s Social Care found that authorities with fewer poor families had lower rates of children in care.

The impact of deprivation

We also know from Bywaters et al that children in relatively deprived areas are much more likely to come into care than other children, and that some authorities have experienced much bigger funding cuts than others.

So, it could also mean that better Ofsted ratings are at least in part a consequence of lower numbers of children in care, which itself is at least in part associated with poverty and deprivation.

Another possibility is that we are simply measuring the wrong things. Perhaps there is a set of ‘outcomes’ that reliably differ between authorities with better and worse Ofsted ratings. Or perhaps differences will emerge over time now that Ofsted has improved its inspection framework. Or perhaps what Ofsted means by ‘inadequate’ and ‘outstanding’ matters despite the limited evidence of any meaningful differences in outcomes.

There are clearly no easy answers here and the question of outcomes in children’s services is notoriously complicated.

Views of children and families

Given the extent to which Ofsted’s judgements are based on forms of self-report (most often by professionals), with all the problems this entails for evaluation, it would make sense to at least prioritise the views of the people who really matter – children, parents, other family members and adults with care experience. From reading Ofsted’s inspection reports, it is not at all clear that such views are central to how judgements are made at the moment (for example, I could not find any reference to speaking with parents in any of the 48 reports we have looked at so far).

Overall, the evidence about Ofsted inspections and children’s services remains limited and the sector should be careful about drawing too many firm conclusions. Yet, I believe what we do know is enough to suggest we need to move beyond the problematic use of single-word judgements for a start (there are many other inspectorates which manage fine without them). We also need to start taking into account wider social factors when assessing services and finding ways of measuring practice that, as with educational progress, seek to account for baseline need.

Doing good social work in areas of high deprivation with limited resources is much harder than it is in affluent areas with more resources – and it should not be politically contentious, nor seen as a criticism of more affluent authorities, to acknowledge this fact.

Inspection is an important function of public protection – although I question whether it can be both a function of protection and a ‘force for improvement’ (in our current study, we’re not seeing any evidence that authorities improve in relation to outcome measures between inspections). We cannot afford to dismiss the research findings in this area, nor should we stint in our efforts to find the best indicators of good social work practice and improved quality of life for children and families in need of help and support.

David Wilkins is a senior lecturer in social work at Cardiff University and assistant director of the Children’s Social Care Research and Development Centre (CASCADE).


Richmond, T (2019) Requires Improvement: A new role for Ofsted and school inspections. EDSK

La Valle, I, Holmes, L, Gill, C, Brown, R, Hart, D, Barnard, M (2016) Improving Children’s Social Care Services: Results of a feasibility study. London: CAMHS Press

Hood, R et al (2016) A study of performance indicators and Ofsted ratings in English child protection services. Children and Youth Services Review

Wilkins, D, Antonopoulou, V (2019) Ofsted and Children’s Services: What performance indicators and other factors are associated with better inspection results? The British Journal of Social Work

Bywaters, P, Brady, B et al (2017) Identifying and understanding inequalities in child welfare intervention rates. Child Welfare Inequalities Project

Wijedasa, D, Warner, N, Scourfield, J (2018) Exploratory analyses of the rates of children looked after in English local authorities (2012-2017). London: What Works Centre for Children’s Social Care


2 Responses to Is it time to scrap single word Ofsted judgements?

  1. Phil Sanderson April 26, 2019 at 6:24 pm #

    Is it time to scrap Ofsted? They are the main driver behind school exclusion and all the problems that causes young people in terms of crime and exploitation. They have virtually destroyed extracurricular activities as schools teach for tests and league tables. The same malign influence is now happening in social care, and we should have an inspection system that looks to assist professionals in achieving the best outcomes, not screaming blame.

  2. Daydreamer May 2, 2019 at 2:04 pm #

    The same system has been used by the CQC for care homes for years. It appears that our regulators all seem to think that very complicated organisations dealing with complicated needs can be assessed and valued in this way. I wonder who taught them that this was possible?