Beyond 2011: a privacy campaigner’s nightmare?

I see Capita is recruiting 35,000 people to help deliver the Census next year. At least the exercise will help a few people off the unemployment register for a few months – even if that is all it will achieve.
 
In the 2001 Census, there were missing pieces of personal data for 28 per cent of the population, while 6 per cent of the population was estimated to have been missed altogether and their values had to be “imputed” (guessed). Then another 230,000 had to be added when it became clear that the survey to survey how comprehensive the surveying had been, had, er, failed to survey the same people that the census had failed to survey.
 
Further number-crunching suggested many young men and non-English speakers had "disappeared", while there were a high number of missing households in inner city areas in the midst of regeneration, and in local authorities with lots of gated communities whose inhabitants are hard for collectors to reach. So another 278,000 people were added. After all that, some of the data had to be swapped around so that researchers using the census data would be unable to identify any person or local business from it.

None of this is very surprising: you try and survey the entire population of the UK without a population register or a complete address list.

But it does raise the question of whether there isn’t a better way of doing this. No other countries seem to go through quite the 32-page palaver that we do (I stand ready to be corrected on that). The US census collects only brief details – age, sex, race, tenure and relationship – while carrying out a more detailed survey of 3 million households. Canada asks seven questions, Israel five, New Zealand 48. In 2001, Spain offered everyone the chance to complete their census online; the take-up rate of 0.1 per cent was, as the ONS noted, “disappointing”. In countries with population registers, such as much of northern Europe, the process is simpler and more reliable.

And so the ONS has been examining whether there might not be a better way of doing things here. The proposal, initially called the Integrated Population Statistics System (IPSS), is a combination of census, survey and administrative data linked at the individual and household level and updated over time. It would be made available to researchers, government and the private sector: a superdatabase of personal details, medical records, school results, benefit claims and earnings.

A consultation conducted by the ONS, which reported last year, found that respondents overwhelmingly supported the IPSS – which, when you look at who the respondents were, is less than surprising: councils, social and market researchers, and a couple of charities (so useful for targeted fundraising).

There were, however, concerns raised over public acceptability: “there would need to be absolute guarantees that data will not be misused, but even so many people might refuse to complete forms if they knew that their personal data, provided in confidence for a particular purpose could be linked via a unique reference number to personal data provided for other purposes to other organisations.” You’re telling me.

Look at what the plans envisaged: “The IPSS vision is to combine census, survey, and administrative data, linked at individual person level, to create a single, comprehensive population statistics database, which is updated over time.”
 
A “full 2011 census operation” was described as one of the key elements of the proposal in a note last year, together with the creation of a linked statistical database, combining administrative and survey data.  
 
 “The aim would be that, following the 2011 Census, the linked statistical and census databases would be combined to create a linked population statistics database. This linked population statistics database would be updated using administrative records, survey data, the address register, and any future population register.” It’s a privacy campaigner’s nightmare.
 
Things have changed a little since last year. The IPSS is now called Beyond 2011 but is still very much in the pipeline. ONS officials stress, however, that it will not use data provided in your Census return; instead the 2011 Census is to be used as a benchmark, to examine what sort of longitudinal survey might best complement all that data from government departments, the Inland Revenue and the NHS. You can see a note on it, produced for a UN conference this July, here.
 
The note expresses understandable concern that publicity about Beyond 2011 will affect returns to the 2011 Census: “There is therefore a risk that public communication on alternatives to a census or modifications to the current approach has an adverse effect on the success of the 2011 Census which takes place on 27th March 2011. User engagement tasks will therefore need to be very carefully planned and carried out in order to minimise this risk.”

It’s surprising that you haven’t heard a squeak about this in the media, during a period of mass hysteria about databases and ID cards. Perhaps it’s because planning for the new superdatabase is complicated, long-winded and jargon-filled and comes with incomprehensible charts such as this:

[Chart: image not available]

In the meantime, the Economic and Social Research Council has started its own longitudinal voluntary study of 40,000 households, which will combine data from questionnaires with information from NHS, educational and eventually economic records - including benefits, savings and earnings.
 
Called Understanding Society, it includes questions about lifestyle, life satisfaction, crime, consumption, social life and attitudes to the environment. Due to the depth and breadth of the study, it is likely to prove more useful to researchers than the information collected by those 35,000 temporary Capita employees ever will.