Retention of "innocent" DNA: some figures, but too few
The Home Secretary, Alan Johnson, has opened up a little bit on the successful use of "innocent" DNA profiles to achieve subsequent convictions for serious crimes.
On January 18, he told the House of Commons: "I should provide some statistics to the House, because the statistics are interesting." (see Hansard Column 31). Terrific!
Recall that there are probably 500,000 to 1 million "innocent DNA profiles" on the UK’s National DNA Database, because recently they seem to have been acquired at a rate of 100,000 to 200,000 per annum (see past articles on this website). This could mean that, by 2009/10, the UK’s retention-tally exceeded 3 million "innocent DNA profile-years". And we’d stand to add another 1 million innocent DNA profile-years in 2010/11, but for the recent intervention of the European Court of Human Rights.
The Home Secretary gave three DNA case studies, two of them about detection from innocent DNA profiles, neither of which had holding times as long as 6 years, the period for which the Home Office plans to retain such profiles (see box, below).
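As a rough sketch of how a tally of this order could arise, the short Python example below accumulates profile-years under assumptions that are illustrative rather than taken from any Home Office source: a constant number of innocent profiles added each year, no deletions, five years of accumulation (500,000 at 100,000 per annum, or 1 million at 200,000 per annum), and each retained profile counted once at every year end. The function name `profile_years` is hypothetical.

```python
def profile_years(annual_additions, years):
    """Return (stock, cumulative profile-years) after `years` years.

    Assumptions (illustrative only): a constant number of profiles is
    added each year, none are deleted, and every profile on the database
    at a year end contributes one profile-year for that year.
    """
    stock = 0   # profiles currently retained
    total = 0   # cumulative profile-years
    for _ in range(years):
        stock += annual_additions
        total += stock
    return stock, total


# Lower and upper acquisition rates quoted in the text, over an assumed 5 years.
for rate in (100_000, 200_000):
    stock, total = profile_years(rate, 5)
    print(f"{rate:,} per annum: stock {stock:,}, profile-years {total:,}")
```

Under these assumptions the upper rate gives a stock of 1 million profiles and 3 million profile-years after five years, and holding that stock for one further year would add another 1 million. A different counting convention (for example, crediting each new profile with half a year in its year of entry) would give a lower total.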
Alan Johnson then got down properly to the business of quantification. In 2008-09 (a single financial year?), there had been 79 matches of innocent DNA profiles to cases of rape, murder or manslaughter; and in 36 out of 79 cases the DNA match was vital to securing a conviction. The Home Secretary’s statistical account would have been exemplary had he also provided offence/date/sex/age details for the crucial 36 ‘detections from innocent DNA’, see my Table.
Yes, this table is empty of data. It needs to be filled before any proper examination of the case for retaining DNA profiles from the innocent can be made.
I very much hope that Frank Field, David Davis and other MPs who pressed the Home Secretary for statistical information on detections from innocent DNA will ensure that it is completed - not only for the 36 cases in 2008-09 but for all detections from innocent DNA in murder/manslaughter/rape cases since the police’s holding of innocent DNA-profiles was enabled by parliament. Prospectively, completion would be assured if reporting on the National DNA Database came under the auspices of National Statistics, as I have proposed that it should.
Intriguingly, the Home Secretary also informed the House (see Hansard column 33) that: "The latest research we have is being independently peer-reviewed as we speak."
Chris Huhne (column 60) reminded members about trenchant criticism (including by me) of the May 2009 public consultation, Keeping the Right People on the DNA Database: Science and Public Protection, not least because its use of statistical science did not enhance public trust. The Home Office published an updated analysis in November 2009 which still did not settle matters satisfactorily.
I am therefore delighted that a third report seems to be in hand, and under review. Key issues that were outstanding in the November 2009 analysis and which I hope will now have been addressed to meet both parliamentary and statistical scrutiny are the following:
1. What is the hazard of conviction (not merely of re-arrest) with elapsed time since registration of an innocent DNA profile? Unless conviction follows re-arrest, re-arrest merely resets the start-date for holding an ‘innocent DNA profile’ and risks the perversity of ‘indefinite holding’ by instalments.
2. How do analysts handle the ‘in limbo period’ between registration of a subject’s DNA-profile on the National DNA Database and its labelling, or recognition, as ‘innocent’? Detections during the ‘in limbo period’, on which parliament should set a maximum, are not forfeited whatever decision is made on the retention of innocent DNA profiles.
3. Hazard analyses are needed separately for adults and children (for example: below 18 years of age when DNA sample was obtained) if policy is age-specific. In the population at large, the hazard of conviction rises in late teens, and decreases from the mid twenties. Thus, the counterfactual hazard of conviction (re-arrest) will differ according to the age-mix of subjects from whom an ‘innocent DNA profile’ was obtained.
4. Whether analysis relates to adults or children, we need estimates of how the hazard of conviction is influenced by the sex & current age & ethnicity (or just: white/non-white) of the subject; and by the seriousness of the offence in respect of which the ‘innocent DNA profile’ was obtained. For example, how does the hazard of conviction at 22 years of age compare for an unconvicted subject whose ‘innocent DNA-profile’ was obtained at 17 versus 20 years of age?
5. Descriptive statistics (on sex & age + calendar year when DNA sample was obtained & ethnicity), separately for adults and children, were missing for the subjects whose ‘innocent DNA profile’ is retained and for whom hazard of conviction (re-arrest) is being analysed. Descriptive statistics are needed also on the seriousness of the crime for which conviction is obtained in the first, second, third, fourth, fifth and sixth year since registration of an ‘innocent DNA profile’. Always describe the basics of what you are analyzing before recourse to sophisticated statistical methods!
6. Any adopted counterfactual hazards, whether of conviction or arrest, need to match the sex and current age distribution of residual members of the ‘innocent DNA profile’ cohort. Doubtful, and not demonstrated, in the November 2009 analysis.
7. Extrapolation beyond actual data is always tricky. Consider doing so by formal methods, such as the fitting of parametric survival distributions.
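To make point 7 concrete, here is a minimal Python sketch of one simple parametric approach: a Weibull survival curve fitted by least squares on the transformed scale, using the identity log(-log S(t)) = k log(t) - k log(scale). This is only one of several possible methods (maximum likelihood on individual-level data would usually be preferred), and it is not the method of the November 2009 analysis. The input is the proportion of a cohort not yet convicted at the end of each year since registration. No real data are used: the "observed" proportions are generated from a Weibull curve with known, arbitrarily chosen parameters, purely to check that the fit recovers them. All function names are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Proportion not yet convicted by time t under a Weibull model."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Instantaneous hazard of conviction at time t under a Weibull model."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def fit_weibull_from_survival(times, survival):
    """Least-squares fit of log(-log S(t)) against log(t).

    The slope estimates the shape k; the intercept equals -k * log(scale).
    Every survival proportion must lie strictly between 0 and 1.
    """
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(s)) for s in survival]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    shape = sxy / sxx
    intercept = mean_y - shape * mean_x
    scale = math.exp(-intercept / shape)
    return shape, scale


# Synthetic self-check only: parameters chosen arbitrarily, NOT estimates.
TRUE_SHAPE, TRUE_SCALE = 0.7, 40.0
years = [1, 2, 3, 4, 5, 6]
observed = [weibull_survival(t, TRUE_SHAPE, TRUE_SCALE) for t in years]

shape, scale = fit_weibull_from_survival(years, observed)
print(f"fitted shape {shape:.3f}, scale {scale:.1f}")

# Extrapolate beyond the six observed years, with the usual caveats.
for t in (8, 10):
    print(f"year {t}: S = {weibull_survival(t, shape, scale):.3f}, "
          f"hazard = {weibull_hazard(t, shape, scale):.4f}")
```

A fitted shape below 1 corresponds to a hazard that declines with elapsed time. With real data, the fit would need to be done separately for the groups listed in points 3 and 4, and the extrapolated values reported with their uncertainty.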
David Davis (see Hansard Column 52) called for robust hazards-analysis of the DNA database so that rational decisions could be made about ‘different strategies for minimizing impingement on people’s liberty while maximizing effectiveness’.
How wonderful to read statistical lucidity from a Home Secretary, a former shadow Home Secretary and a serving shadow Home Secretary. It augurs well for statistical science in parliament, I think.
(Declaration of interest: SMB serves on Home Office's Surveys, Design and Statistics Subcommittee, but writes in a personal professional capacity.)
Nicholas Bohm (not verified) wrote,
Mon, 25/01/2010 - 16:01
A question (which I am unclear whether or not you have covered in your comments) which ought to be answered is just why presence of an innocent person's profile on the database predicts their future increased likelihood of conviction (if in fact it does).
If, for example, presence of a person's profile leads to their inclusion among "the usual suspects", and that fact, rather than any greater involvement with criminal activity than their peers, is what leads to future convictions; then it seems to me that no justification for including the profile has been established.
Robert Whiston (not verified) wrote,
Thu, 28/01/2010 - 02:24
The hyperlink of 'analysis' (just below the red insert) is very much worth a read. It's called "DNA RETENTION POLICY: RE-ARREST HAZARD RATE ANALYSIS" and, as the article implies, its premise is most 'iffy'. (http://www.homeoffice.gov.uk/documents/cons-2009-dna-database/dna-retent...)
The DNA article reports that records of people arrested, but not cautioned, or convicted of any crime have their DNA sample taken. By this wrinkle Harriet Harman avoided her DNA being put on the police database in perpetuity.
The people arrested have no previous conviction yet their personal liberty is grossly infringed.
If, in 2008/09, there were 413 rapes and 226 murders and manslaughters that involved DNA matching from a database of 1 million and only 36 successful hits, why is the DNA of people with no previous conviction being tested along with known felons ?
The assumption being made by the HO appears to be that if you are arrested, say, as a teenager for a really petty crime it will be easier to trace you when you grow up to be a rapist later in life.
Apparently the 'exploration' revealed that 46 (11%) and 23 (10%) respectively of the rape and homicide matched samples belonged to individuals who did not have a conviction at the time of the match.
Robert Whiston (not verified) wrote,
Thu, 28/01/2010 - 02:25
This post follows on from the one above:
The reason given for this blanket DNA sampling is that 'offenders tend to be relative generalists across the course of their careers'. This would seem to run contrary to the serious crime group where serial and multiple murderers and rapists lurk. Why involve those falsely accused or wrongly suspected when they were juveniles ? The leap from illegally smoking cigarettes will not be straight to rape and homicide, if their theory is true, but to robbery, car theft and violence etc which is the proper stage at which to collect DNA.
And why will no HO report discuss false allegations ? There are approx. 12,000 rape claims per annum yet only a few hundred, even with DNA evidence, end in a conviction. What happens to those remaining 11,500 men arrested but falsely accused ?
So for the sake of 10% we abandon long held practices of fair play, innocence until proven guilty, presumption of innocence etc, and protection from an overly interfering state machine ? As the report itself has to admit "a zero retention [policy]... might result in around 10% fewer DNA matches for these serious offences."
Anonymous (not verified) wrote,
Fri, 12/02/2010 - 10:11
This may have been answered in your previous posts (if so, please point me in the right direction), but what is the probability of a false positive match? The general public seems to assume that a match means 'it was you', but my understanding of the science is that a match has a non-zero (but small) probability of being a false positive.
Robert Whiston (not verified) wrote,
Mon, 08/03/2010 - 22:46
"Anonymous" raises a valid point. I can't speak for this article or the author but the possibility was explored in a 2004 article entitled "DNA Doppelganger".
The possibility of false positives was hinted at even then and later confirmed by DNA's 'inventor', simply because the number of markers used was too few.
Whether the number of markers has since been increased would be worth knowing.