Retention of "innocent" DNA: some figures, but too few

The Home Secretary, Alan Johnson, has opened up a little bit on the successful use of "innocent" DNA profiles to achieve subsequent convictions for serious crimes.

On January 18, he told the House of Commons: "I should provide some statistics to the House, because the statistics are interesting" (see Hansard column 31). Terrific!

Recall that there are probably 500,000 to 1 million "innocent DNA profiles" on the UK’s National DNA Database, because recently they seem to have been acquired at a rate of 100,000 to 200,000 per annum (see past articles on this website). This could mean that, by 2009/10, the UK’s retention tally exceeded 3 million "innocent DNA profile-years". And we’d stand to add another 1 million innocent DNA profile-years in 2010/11, but for the recent intervention of the European Court of Human Rights.
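
The profile-years arithmetic can be checked with a rough sketch. It assumes, purely for illustration, a constant annual intake accumulated over five whole years with no deletions; the function name, the five-year window and the whole-year counting convention are assumptions made for this example, not Home Office figures.

```python
def profile_years(annual_intake, years_of_accrual):
    """Cumulative profile-years under a constant intake and no deletions.

    Counting convention (an assumption): a profile acquired in year j of
    the window is credited one profile-year for every year from j to the
    end of the window, inclusive.
    """
    return sum(annual_intake * held for held in range(1, years_of_accrual + 1))


# Illustrative inputs only: the upper intake rate quoted above, accumulated
# over a hypothetical five-year window.
intake = 200_000
window = 5

profiles_held = intake * window
tally = profile_years(intake, window)

print(profiles_held)  # 1000000 profiles on the database
print(tally)          # 3000000 profile-years accumulated
```

On the same assumptions, each further year of retention adds one profile-year per profile already held, which is roughly where a figure of 1 million extra profile-years for 2010/11 would come from.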

The Home Secretary gave three DNA case studies, two of them about detection from innocent DNA profiles, neither of which had holding times as long as 6 years, the period for which the Home Office plans to retain such profiles (see box, below).
Alan Johnson then got down properly to the business of quantification. In 2008-09 (a single financial year?), there had been 79 matches of innocent DNA profiles to cases of rape, murder or manslaughter; and in 36 of those 79 cases the DNA match was vital to securing a conviction. The Home Secretary’s statistical account would have been exemplary had he also provided offence/date/sex/age details for the crucial 36 ‘detections from innocent DNA’ (see my Table).

Yes, this table is empty of data. It needs to be filled before any proper examination of the case for retaining DNA profiles from the innocent can be made. 

I very much hope that Frank Field, David Davis and other MPs who pressed the Home Secretary for statistical information on detections from innocent DNA will ensure that it is completed - not only for the 36 cases in 2008-09 but for all detections from innocent DNA in murder/manslaughter/rape cases since the police’s holding of innocent DNA-profiles was enabled by parliament. Prospectively, completion would be assured if reporting on the National DNA Database came under the auspices of National Statistics, as I have proposed that it should.
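
For orientation, the two headline figures already quoted (79 matches, 36 of them vital to a conviction) can be put on a common scale. The sketch below is a back-of-envelope calculation, not an official statistic: the denominators are the rough 500,000 to 1 million range for retained innocent profiles given earlier, and treating that range as the number of profiles at risk during 2008-09 is an assumption, as is the helper name `per_100k`.

```python
matches = 79   # matches to rape, murder or manslaughter cases, 2008-09
vital = 36     # matches described as vital to securing a conviction

share_vital = vital / matches


def per_100k(count, profiles_held):
    """Crude count per 100,000 retained profiles."""
    return count / profiles_held * 100_000


# Assumed denominators: the rough range of innocent profiles held.
low_estimate, high_estimate = 500_000, 1_000_000

print(f"{share_vital:.1%} of matches were vital to a conviction")
print(
    f"{per_100k(vital, high_estimate):.1f} to "
    f"{per_100k(vital, low_estimate):.1f} vital detections "
    "per 100,000 innocent profiles held"
)
```

A crude rate of this kind says nothing about offence, date, sex or age, which is why the itemised table matters.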

Intriguingly, the Home Secretary also informed the House (see Hansard column 33) that: "The latest research we have is being independently peer-reviewed as we speak." 

Chris Huhne (column 60) reminded members about trenchant criticism (including by me) of the May 2009 public consultation, Keeping the Right People on the DNA Database: Science and Public Protection, not least because its use of statistical science did not enhance public trust. The Home Office published an updated analysis in November 2009 which still did not settle matters satisfactorily.
I am therefore delighted that a third report seems to be in hand, and under review. Key issues that were outstanding in the November 2009 analysis and which I hope will now have been addressed to meet both parliamentary and statistical scrutiny are the following:
1. What is the hazard of conviction (not merely of re-arrest) with elapsed time since registration of an innocent DNA profile? Unless conviction follows re-arrest, re-arrest merely resets the start-date for holding an ‘innocent DNA profile’ and risks the perversity of ‘indefinite holding’ by instalments.

2. How do analysts handle the ‘in limbo period’ between registration of a subject’s DNA-profile on the National DNA Database and its labelling, or recognition, as ‘innocent’? Detections during the ‘in limbo period’, on which parliament should set a maximum, are not forfeited whatever decision is made on the retention of innocent DNA profiles.

3. Hazard analyses are needed separately for adults and children (for example: below 18 years of age when the DNA sample was obtained) if policy is age-specific. In the population at large, the hazard of conviction rises in the late teens, and decreases from the mid-twenties. Thus, the counterfactual hazard of conviction (re-arrest) will differ according to the age-mix of subjects from whom an ‘innocent DNA profile’ was obtained.

4. Whether analysis relates to adults or children, we need estimates of how the hazard of conviction is influenced by the sex & current age & ethnicity (or just: white/non-white) of the subject; and by the seriousness of the offence in respect of which the ‘innocent DNA profile’ was obtained. For example, how does the hazard of conviction at 22 years of age compare for an unconvicted subject whose ‘innocent DNA-profile’ was obtained at 17 versus 20 years of age?

5. Descriptive statistics (on sex & age + calendar year when DNA sample was obtained & ethnicity), separately for adults and children, were missing for the subjects whose ‘innocent DNA profile’ is retained and for whom hazard of conviction (re-arrest) is being analysed. Descriptive statistics are needed also on the seriousness of the crime for which conviction is obtained in the first, second, third, fourth, fifth and sixth year since registration of an ‘innocent DNA profile’. Always describe the basics of what you are analyzing before recourse to sophisticated statistical methods!

6. Any adopted counterfactual hazards, whether of conviction or arrest, need to match the sex and current age distribution of residual members of the ‘innocent DNA profile’ cohort. That match was doubtful, and not demonstrated, in the November 2009 analysis.

7. Extrapolation beyond actual data is always tricky. Consider doing so by formal methods, such as the fitting of parametric survival distributions.
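
Issues 1, 5 and 7 can be made concrete with a minimal sketch. It computes an actuarial (life-table) hazard of conviction for each year since registration, then fits the simplest parametric survival model, a constant hazard, to extrapolate. The records are invented toy values, the function names are hypothetical, and the exponential model is a stand-in chosen for brevity rather than the Home Office's method; a real analysis would stratify by sex, age and ethnicity and compare richer distributions such as the Weibull.

```python
import math


def yearly_hazard(records, max_years=6):
    """Actuarial hazard of conviction for years 1..max_years since registration.

    records: list of (years_observed, convicted) pairs, where years_observed
    runs from registration to first conviction or to the end of follow-up.
    Subjects censored within a year count as at risk for half of it.
    """
    hazards = []
    for k in range(max_years):
        entering = sum(1 for t, _ in records if t > k)
        events = sum(1 for t, e in records if k < t <= k + 1 and e)
        censored = sum(1 for t, e in records if k < t <= k + 1 and not e)
        at_risk = entering - censored / 2
        hazards.append(events / at_risk if at_risk > 0 else float("nan"))
    return hazards


def constant_hazard(records):
    """Maximum-likelihood rate for an exponential model: events / person-years."""
    events = sum(1 for _, e in records if e)
    person_years = sum(t for t, _ in records)
    return events / person_years


def survival(rate, years):
    """Probability of remaining unconvicted to `years` under a constant hazard."""
    return math.exp(-rate * years)


# Invented toy cohort: NOT real National DNA Database data.
toy = [(0.5, True), (1.5, False), (2.5, True), (3.0, False)]

print(yearly_hazard(toy, max_years=3))  # hazard in years 1, 2 and 3
rate = constant_hazard(toy)
print(survival(rate, 6))  # extrapolated well beyond the 3 years observed
```

Because a re-arrest without conviction is not counted as an event here, the sketch targets the hazard of conviction raised in issue 1; treating re-arrest as the event would answer a different question.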

David Davis (see Hansard column 52) called for robust hazards-analysis of the DNA database so that rational decisions could be made about ‘different strategies for minimizing impingement on people’s liberty while maximizing effectiveness’.
How wonderful to read statistical lucidity from a Home Secretary, a former shadow Home Secretary and a serving shadow Home Secretary. It augurs well for statistical science in parliament, I think.
(Declaration of interest: SMB serves on the Home Office’s Surveys, Design and Statistics Subcommittee, but writes in a personal professional capacity.)