How common is research misconduct in the UK?

Scientific misconduct is anti-science and deserves to be treated seriously. But attempts to measure how serious it is by means of a survey require serious attention to how that survey is designed.

Last Friday a survey by the BMJ concluded that such misconduct is “alive and well” in the UK, with more than one in ten (13 per cent) of UK-based scientists and doctors having witnessed colleagues intentionally altering or fabricating data, and 6 per cent saying they were aware of research misconduct at their own institution that had not been properly investigated. How sound is the basis on which these claims are made?

I have submitted my criticisms as a Rapid Response to the BMJ, but others will be aware of the claims, which were reported in a number of papers, including The Independent and the Financial Times. For those who read those papers but not the BMJ, it is worth reiterating those criticisms here.

The response rate among the surveyed UK authors/reviewers was low at 2782/9036 (31 per cent). Does the BMJ ordinarily accept for publication surveys whose response rates fall as far below 60 per cent as its own survey's did?
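To see why a response rate this low matters, the figures can be worked through directly. The sketch below is an illustration only, not part of the BMJ's analysis: it recomputes the response rate and then brackets the question 2 affirmative rate (using the 163 affirmations discussed further on) under the two extreme assumptions that every non-respondent would have answered "no" or would have answered "yes". The function and variable names are hypothetical.

```python
from typing import Tuple

# Figures quoted in this article from the BMJ survey.
SURVEYED = 9036    # UK authors/reviewers who were sent the survey
RESPONDED = 2782   # those who replied
Q2_YES = 163       # affirmations to question 2


def response_rate(responded: int, surveyed: int) -> float:
    """Proportion of those surveyed who responded."""
    return responded / surveyed


def worst_case_bounds(yes: int, responded: int, surveyed: int) -> Tuple[float, float]:
    """Bounds on the 'yes' proportion among everyone surveyed.

    Lower bound: every non-respondent would have said 'no'.
    Upper bound: every non-respondent would have said 'yes'.
    Both are deliberately extreme assumptions, used for illustration.
    """
    non_respondents = surveyed - responded
    lower = yes / surveyed
    upper = (yes + non_respondents) / surveyed
    return lower, upper


rate = response_rate(RESPONDED, SURVEYED)
observed = Q2_YES / RESPONDED
low, high = worst_case_bounds(Q2_YES, RESPONDED, SURVEYED)

print(f"Response rate: {rate:.1%}")
print(f"Question 2 'yes' among respondents: {observed:.1%}")
print(f"Question 2 'yes' among all surveyed: between {low:.1%} and {high:.1%}")
```

The truth is unlikely to sit at either extreme, but the width of the interval shows how little the respondents alone can tell us about the surveyed population when more than two thirds did not reply.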

Three questions were asked, one of which was to ascertain whether each respondent was primarily clinician, academic or both. However, the information on type of respondent could not be used for covariate adjustment (that is: to determine if report-rate on misconduct differed by type of respondent) because the free software that had been used for the survey apparently did not allow access to respondent-type. Thus, BMJ has collected data that it cannot disclose.

Question 2 asked if the respondent was aware of “any cases of possible research misconduct at your institution that, in your view, have not been properly investigated?”. The balancing question – asking for awareness of “any cases of possible research misconduct at your institution that, in your view, have been properly investigated?” – was not asked. Answers to the balancing question could potentially have been cross-checked institutionally. Survey design requires objectivity and balance in how questions are selected.

Question 2 elicited 163 affirmations of awareness of possible misconduct. As the BMJ's slides acknowledge, there may have been multiple reporting of the same institutional ‘cause célèbre’; or a respondent may have been aware of more than one case in his/her institution. The phrasing of survey questions matters intensely. Because of poor analytical and reporting standards, we cannot be sure that the 163 affirmations do not emanate from a single institution or relate to a single case.

Time-related information can be essential for the interpretability of surveys. In general, the longer a scientist’s career in a particular institution, the longer the period over which the scientist may have observed research misconduct there. Other things being equal, a scientist whose career at institution A has lasted 20 years should be 10 times more likely to have encountered research misconduct at institution A than a fellow-scientist whose career length at institution A is 2 years. Well-designed surveys build in checks such as this, which assure data quality, because an analysis plan has been thought out in advance.

Such analytical forethought immediately points to the necessity of knowing how long each respondent has worked in his/her current institution, as well as the number of different “cases” of misconduct that he/she has been aware of during that institutional period, so that “number of misconduct-cases per 1,000 science-career-years” can be computed.
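The rate described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the respondent records are invented, and the field names `years_at_institution` and `cases_known` are hypothetical, since the BMJ survey collected neither quantity.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Respondent:
    years_at_institution: float  # hypothetical field: tenure at current institution
    cases_known: int             # hypothetical field: distinct cases known of in that period


def cases_per_1000_career_years(respondents: Iterable[Respondent]) -> float:
    """Total reported cases divided by total career-years, scaled to 1,000 years."""
    respondents = list(respondents)
    total_years = sum(r.years_at_institution for r in respondents)
    total_cases = sum(r.cases_known for r in respondents)
    if total_years <= 0:
        raise ValueError("total career-years must be positive")
    return 1000 * total_cases / total_years


# Invented data, for illustration only.
sample = [
    Respondent(years_at_institution=20, cases_known=2),
    Respondent(years_at_institution=2, cases_known=0),
    Respondent(years_at_institution=8, cases_known=1),
]

rate = cases_per_1000_career_years(sample)
print(f"{rate:.0f} cases per 1,000 science-career-years")
```

This naive sum counts a case once for every respondent who reports it, so the multiple-reporting problem noted earlier would still have to be dealt with before such a rate could be trusted.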

Worst of all was question 1. Unlike question 2, question 1 is not restricted to one’s own current institution and so encompasses, for example, any nefarious analytical adjustment that I, as a statistician, may detect – and seek to sort out – as a referee. That being so, it is amazing, indeed incredible, that the affirmative rate for question 1 is as low as 13 per cent (versus 6 per cent affirmative rate for question 2). Did most respondents emanate from somewhat troubled institutions or are they less critical as peer-reviewers than as institutional-observers?

But the different phrasing between question 1 (Have you witnessed, or do you have firsthand knowledge of, UK-based scientists inappropriately doing Y . . . ) and question 2 (Are you aware of any cases of X . . . at your institution) probably defies the sort of comparison that I sought to draw above. Of course, the survey should have been decently designed so that some form of comparison could be made, for example by asking if the most recently witnessed/firsthand knowledge behaviour had occurred in the respondent’s own institution.

Back to the wording of question 1: the witnessed/firsthand knowledge behaviours which the BMJ asked about range from inappropriately i) adjusting, ii) excluding, iii) altering data to iv) fabricating data either a) during their research or b) for the purposes of publication. Question 1 is, in effect, a set of eight questions!
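The count of eight follows from crossing the four behaviours with the two contexts. As a small illustration (the wording of each combined sub-question is my paraphrase of the summary above, not the BMJ's text), the combinations can be listed explicitly:

```python
from itertools import product

# The four behaviours and two contexts in question 1, as summarised in this article.
behaviours = ["adjusting", "excluding", "altering", "fabricating"]
contexts = ["during their research", "for the purposes of publication"]

# Each (behaviour, context) pair is a distinct question that a respondent
# could reasonably answer differently.
sub_questions = [
    f"inappropriately {behaviour} data {context}"
    for behaviour, context in product(behaviours, contexts)
]

for number, text in enumerate(sub_questions, start=1):
    print(f"{number}. {text}")
```

A single "yes" to question 1 cannot be traced back to any one of these eight combinations.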

If there is any consulting statistician who has not helped scientist-colleagues to do some regression analysis (ie statistical adjustment) more appropriately, I’d be concerned about their inexperience. “Inappropriately adjusting”, as stated in question 1, does not specifically rule out, as justification for a “yes” answer, the incorrect application of regression methods. Detecting and correcting such misapplications is part of the job of statistician-referees, and they are more often errors of comprehension than of malice aforethought.

Likewise, the BMJ’s errors in survey design, analysis and reporting are due to lack of forethought, not malice aforethought. However, the first sentence in Tavare’s BMJ news piece is mischievously wrong – he omits to mention that question 1 asked about behaviours i) and ii) as well as about iii) and iv). Correction in the second paragraph is too late . . . a headline has been achieved by dint of ii) (if I’m generous) or iii) (if I’m sceptical).

As Dr Godlee has remarked in another context: UK science and medicine deserve better. And statistical science requires better for the design, analysis and reporting of surveys. Designers of surveys with response rates as low as the BMJ’s should look to their laurels, and not use statistics as the drunk uses a lamp-post: for support rather than illumination.

  1. Tavare A. Scientific misconduct is worryingly prevalent in the UK, shows BMJ survey. British Medical Journal 2012; 344: e377