The coding maze: mortality ratios and real life

Much anxiety has been caused by the publication over the past decade of hospital mortality ratios.

They aim to compare hospitals by calculating how many patients die in each, correcting the figures by allowing for factors outside the hospitals’ control – severity of disease, age, sex, route of admission (emergency or elective) and comorbidities.

But a hospital’s quality of care measured in this way often differs sharply from the assessments made by the regulator – formerly the Healthcare Commission, now the Care Quality Commission (CQC). Many statisticians question whether it is possible for mortality ratios to capture the full complexity of variations in care, and doubt that the results are meaningful.

A leading company in the production of mortality data, Dr Foster Intelligence, publishes an annual Good Hospital Guide which in 2009 identified five hospitals as among the most improved over the past three years. These included Mid-Staffordshire NHS Foundation Trust, which at about the same time was being branded “appalling” by the CQC.

How could this conflict of evidence arise? More important, what did it mean to patients worried by hospitals with apparently high mortality ratios, or comforted by Dr Foster’s evidence into believing that a previously-poor hospital had turned the corner? Such were the discrepancies and the embarrassment caused to the Department of Health that Sir Bruce Keogh, Clinical Director of the NHS, has set up a committee to investigate, chaired by Ian Dalton, chief executive of NHS North East.

My own investigations, aided by an analysis (1) carried out by one of Dr Foster’s rivals in the marketplace, CHKS, suggest that the diagnostic codes given to patients by some hospitals may have had undue influence on the published outcomes. The results of these inquiries have now been published in the BMJ (2). This version of the story is aimed at those who do not read the BMJ.

Dr Foster’s version of mortality ratios developed from the work of Professor Sir Brian Jarman at Imperial College, London. He is acknowledged as the leader in the field. From administrative data such as crude death rates and the diagnostic codes given to patients by hospitals to reflect the illnesses from which they are suffering, Professor Jarman developed his Hospital Standardised Mortality Ratios (HSMRs).

These aim to make allowance for every important variable that determines whether a patient admitted to hospital lives or dies, so that what is left is a measure of the quality of care. HSMRs have become an important index, used by many hospitals to determine how well they are doing, and for the Good Hospital Guide they form one of a number of measures used to produce a “league table” of the best hospitals in England.

Like others, I was puzzled when Mid-Staffs emerged in the top ten hospitals in the league table in November 2009, using data gathered during a period when its chief executive was admitting that much still needed to be done to turn the hospital round. Looking closely at the data published by Dr Foster, it appeared that Mid-Staffs had found a new and better way of treating broken hips (known as fractured neck of femur, or FNOF).

This is an injury often suffered by elderly people, and it has a significant death rate of around 10 per cent which has changed little over decades. Confined to bed, patients with fractured neck of femur often develop other conditions and die. There was no real reason to believe that Mid-Staffs had discovered a magic solution to this problem, yet its standardised death rate for FNOF was very low – 19.87, against a national average set at 100. That meant that less than a fifth as many patients with FNOF admitted to Mid-Staffs were dying as in the average English hospital. This is implausible. No other hospital in Dr Foster’s top 30 came close.
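The arithmetic behind a figure such as 19.87 can be sketched in a few lines. This is an illustration only, not Dr Foster's model: it assumes a standardised ratio is simply the deaths recorded divided by the deaths a risk model expects, multiplied by 100. The function name and the patient counts are hypothetical.

```python
def standardised_ratio(observed_deaths, expected_deaths):
    """Observed deaths as a percentage of expected deaths (100 = national average)."""
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return 100.0 * observed_deaths / expected_deaths


# Hypothetical figures: a risk model expects 40 deaths, and 8 are recorded.
ratio = standardised_ratio(8, 40)
print(round(ratio, 2))  # 20.0
```

A ratio near 20 therefore means that only about a fifth of the deaths the model predicted were recorded. The result depends as much on the expected figure, which is built from the coded data, as on the deaths themselves.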

I asked the trust if this figure was right, and it confirmed that it was, attributing the result to “substantially improved coding procedures”. That meant that its improved rating for FNOF – well below that for other emergency conditions treated by the hospital – was the result of changes in the way it attributed diagnostic codes to patients.    

Coding is not an exact science. From reports by doctors, the coding departments at hospitals attach codes appropriate to the conditions suffered by the patients. Codes are used to determine what hospitals are paid under the payment by results system. A single patient may have many codes – up to a dozen – but the average number of codes per patient across the NHS was, according to CHKS, around four in June 2009. This represented a rise from about three in April 2005.

There is also a big variation in the number of diagnostic codes per patient from hospital to hospital, from a low of less than three to a high of over five. This may simply reflect a variation between hospitals in the number of comorbidities suffered by their patients – some hospitals treating sicker patients than others – or it may reflect a difference in coding practice. It is the view of Professor Jarman and of Dr Foster that coding depth does not have a significant effect on HSMRs.

The graph below is taken from a response made by the Dr Foster Unit at Imperial College to criticisms of HSMRs published in the BMJ by a team at Birmingham University (3). The points are widely scattered but it does show a link, such that a hospital using only 2.5 codes per patient would show an HSMR about 15-20 points higher than one using 5.5 to 6 codes per patient.

Could increasing coding depth be the cause of the steady fall in risk-adjusted mortality detected by both the Dr Foster Unit at Imperial College and CHKS? In 2009, Dr Foster reported that HSMRs had declined by 7 per cent in a single year, while CHKS has seen a 50 per cent decline in its own mortality ratio measure, called RAMI (risk-adjusted mortality index), over five years. Actual deaths in NHS hospitals in England have not fallen at all (see table below; figures from CHKS).

So if crude deaths have not fallen but risk-adjusted deaths have, something else must be happening. Professor Jarman’s explanation is that the severity of the conditions treated by hospitals has indeed risen, as milder cases are increasingly treated in the community or as day cases. So he sees the fall (below) as a real effect. CHKS is unconvinced.
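A rough calculation shows what the risk models must be doing for these figures to hold. The sketch below assumes that a risk-adjusted ratio is observed deaths divided by expected deaths, and that observed deaths stay constant; the function name is hypothetical and neither Dr Foster's nor CHKS's actual model is reproduced here.

```python
def expected_rise_needed(ratio_decline):
    """Fractional rise in expected deaths that yields the given fractional
    fall in an observed/expected ratio when observed deaths do not change."""
    if not 0 <= ratio_decline < 1:
        raise ValueError("ratio_decline must be at least 0 and below 1")
    return 1.0 / (1.0 - ratio_decline) - 1.0


# The 7 per cent and 50 per cent declines quoted in the text.
for decline in (0.07, 0.50):
    rise = expected_rise_needed(decline)
    print(f"{decline:.0%} fall in ratio needs a {rise:.1%} rise in expected deaths")
```

On these assumptions a 7 per cent fall needs expected deaths to rise by about 7.5 per cent, and a 50 per cent fall needs them to double. Whether that rise reflects sicker patients or deeper coding is exactly the point in dispute.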
At Mid-Staffs, it appears that a substantial part of its improvements in HSMRs is due to deeper coding. That may be because it was under-coding before, or because it is over-coding now, but it certainly suggests that coding changes are a much easier way of improving a hospital’s results than actually improving care.
In the case of a handful of hospitals, the key to greatly-improved risk-adjusted death rates is the use of the code for palliative care. Over the past five years there has been a big increase in the use of this code, Z51.5, from under 400 deaths a month in 2004 to 1,800 a month by 2009 (CHKS figures). Patients coded Z51.5 are assumed to have come into hospital to die, so allowances are made in correcting death rates. Otherwise, hospitals would be blamed for failing to save the lives of patients whose lives cannot be saved.

Again, there is a big variation from hospital to hospital. In 2007-08, 5 per cent of deaths were coded Z51.5 in English hospitals, but in a handful it was as high as 20 per cent. By 2008-09, the average had risen to 7.8 per cent, but some hospitals were coding more than 50 per cent of deaths as palliative care. By April-June 2009, the all-England average had reached 11.3 per cent, with five hospitals coding more than 30 per cent of their deaths as palliative care. Each bar in the graph below represents a trust, showing the huge range across the NHS in England in the use of code Z51.5.

Professor Jarman believes that a handful of hospitals have reduced their risk-adjusted death rates significantly by heavy use of Z51.5. This is not to say they were wrong to do so: some areas lack palliative care facilities and more people die in hospital. But even if the use of the code was appropriate, the powerful effect it has on HSMRs suggests that some hospitals’ improved performance is less to do with better care and more to do with better coding.
One such hospital is Walsall, identified as one of Dr Foster’s “most improved hospitals” over the past three years. It uses the Z51.5 code heavily, and has seen its HSMR fall by more than 30 per cent in the past three years. Dr Mike Browne, the Medical Director at Walsall, says that huge efforts have been made to tackle high death rates identified there in 2002 by Dr Foster. These changes include improving coding, and Dr Browne believes that increased palliative care coding contributed a third of the hospital’s recent improvements in HSMR.
Dr Browne, like CHKS, questions whether risk-adjusted death rates are valuable for making comparisons between hospitals. He sees their main use as pointing to particular services within hospitals that need improving, and says that at Walsall they have proved very useful in this way.
Certainly hospitals that do badly in the Dr Foster guide tend to look to improved coding, as well as improved care, to boost their status. Basildon and Thurrock had the worst HSMR in England in the 2009 guide, despite being rated “double excellent” by the Healthcare Commission just a month before. Its decision on hearing the figures? To recruit not more doctors, but more coders.
The arguments about HSMRs have come to a head because of the health department’s review. It wants to put an end to situations in which the inspectorate finds a hospital satisfactory, only for Dr Foster to declare it dangerous. Last month Professor Jarman issued a list of 25 hospital trusts that had high HSMRs, which he said had contributed to 4,600 excess deaths in 2007-08. What are patients meant to make of such claims? 
Given the wide range of HSMRs recorded across English hospitals, it is far from clear that the differences mean very much at all. Identifying outliers ought to be possible using funnel plots with 99.8 per cent control limits. If variations were random, then just 0.2 per cent of hospitals would lie outside those limits; in fact 40 per cent do. But it is not clear whether this big variation is actually due to differences in quality of care, or to other factors such as coding and data quality.
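For readers who want to see where such limits come from, here is a minimal sketch. It assumes that a hospital's observed deaths follow a Poisson distribution around the expected number and uses a normal approximation; published funnel plots may use exact limits or allow for extra variation, so the function names and figures here are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def funnel_limits(expected_deaths, coverage=0.998):
    """Approximate lower and upper control limits for a ratio centred on 100."""
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    # Two-sided limits: 0.1 per cent in each tail for 99.8 per cent coverage.
    z = NormalDist().inv_cdf(1 - (1 - coverage) / 2)
    half_width = 100.0 * z / sqrt(expected_deaths)
    return 100.0 - half_width, 100.0 + half_width


def is_outlier(ratio, expected_deaths):
    """True if the ratio lies outside the approximate 99.8 per cent limits."""
    lower, upper = funnel_limits(expected_deaths)
    return ratio < lower or ratio > upper


# Hypothetical hospitals: the same ratio of 110, different numbers of expected deaths.
print(is_outlier(110, 400))   # False: limits are roughly 85 to 115
print(is_outlier(110, 2000))  # True: limits are roughly 93 to 107
```

The limits narrow as expected deaths grow, which gives the plot its funnel shape. A large hospital can be flagged for a departure from 100 that would pass unnoticed in a small one.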
Two recent papers in the BMJ (4,5) concluded that HSMRs were unfit for purpose because the signal to noise ratio is so unfavourable, and the variance simply too great to be attributable to quality of care. HSMRs are, concluded Richard Lilford and Peter Pronovost, “a bad idea that just won’t go away”.

1. Hospital Standardised Mortality Ratios and their use as a measure of quality, by Dr Paul Robinson (CHKS, available online)
2. Patient coding and the ratings game, by Nigel Hawkes (BMJ 2010;340:c2153)
3. Monitoring hospital mortality: a response to the University of Birmingham report on HSMRs
4. Using hospital mortality rates to judge hospital performance: a bad idea that just won’t go away, by Richard Lilford and Peter Pronovost (BMJ 2010;340:c2016)
5. Assessing the quality of hospitals, by Nick Black (BMJ 2010;340:c2066)