Anti-Social or Anti-Statistical Behaviour?

Eight police forces yesterday launched six-month trials of "what works best" in logging calls and responding to complaints about anti-social behaviour.

But Mark Easton of the BBC has revealed, on BBC News and on his blog, that there will be no common method for assessing the results of such research. Without such a method, the results will not be comparable, and the trials will be less effective than they might have been at discovering what really does work best.

A common definition of anti-social behaviour (ASB) is the first issue. Easily said, less easily codified, because in contention are: the nature of the alleged Offence, the offence-Location, the Victim, the Caller, the Perpetrator (if known or suspected), and potential Witnesses or other crime-scene evidence.
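To fix ideas, here is a minimal sketch – in Python, with invented field names rather than any force's actual schema – of a call record structured around those six dimensions:

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical record for a single ASB call, structured around the six
# contested dimensions: Offence, Location, Victim, Caller, Perpetrator,
# Witnesses (OLVCPW). Field names are illustrative only.
@dataclass
class ASBCall:
    call_id: str
    offence: str        # O: nature of the alleged offence
    location: str       # L: where it allegedly occurred
    victim: str         # V: who was harmed or threatened
    caller: str         # C: who reported it (may differ from the victim)
    perpetrator: Optional[str] = None                   # P: known/suspected only
    witnesses: list[str] = field(default_factory=list)  # W: potential witnesses
    free_text: str = ""                                 # the telephonist's narrative
```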

Importantly, Mark’s television report showed the free text recorded about a particular ASB call; and made clear the forces’ intention that, in future, calls relating to a first-ASB-event should be distinguished from serial-ASB-events. Some readers may be astonished that this was not already the case . . .

The second issue is therefore how free text will be analysed (per-event, across linked serial-events, or across the entire ASB call-database) or codified. Codification takes extra time – for example, if each of OLVCPW (see above) were to require 10 questions for its proper description, then 60 questions would need to be answered to codify each call. Analysis would then rely on the codified data, rather than on free text. Of course, the eight forces may initially adopt different 10-question sets . . . See below for how methodology can help to square that circle without undue burden.
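Assuming, purely for illustration, ten questions per dimension with single-code answers, a codified call might be held as 6 × 10 = 60 slots:

```python
# Illustrative codification: each OLVCPW dimension carries 10 questions,
# so a fully codified call is 6 x 10 = 60 coded answers. Question texts
# and codes are invented placeholders, not any force's question set.
DIMENSIONS = ["O", "L", "V", "C", "P", "W"]
QUESTIONS_PER_DIMENSION = 10

def blank_codification() -> dict[str, list[int | None]]:
    """One slot per question; None until the telephonist records a code."""
    return {d: [None] * QUESTIONS_PER_DIMENSION for d in DIMENSIONS}

codified = blank_codification()
codified["O"][0] = 3  # e.g. question O1 answered with code 3

assert len(DIMENSIONS) * QUESTIONS_PER_DIMENSION == 60
```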

Third in a long list of technical issues is how the relatedness of calls is itself to be defined – by O or L or V or C or P? Proper linkage analysis should envisage all of these possibilities. Even if we consider just Location – does this mean an address, a street, a ward, or some natural geography, such as a park, by which to link ASB calls?
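The choice matters: linking on address, street or ward yields progressively coarser clusters of "related" calls. A sketch, with invented location fields:

```python
from collections import defaultdict

# Invented location fields; a real gazetteer would differ.
calls = [
    {"call_id": "c1", "address": "12 Elm St", "street": "Elm St", "ward": "North"},
    {"call_id": "c2", "address": "14 Elm St", "street": "Elm St", "ward": "North"},
    {"call_id": "c3", "address": "3 Oak Rd",  "street": "Oak Rd", "ward": "North"},
]

def link_by(calls: list[dict], key: str) -> dict:
    """Cluster call_ids by a chosen location key."""
    clusters = defaultdict(list)
    for c in calls:
        clusters[c[key]].append(c["call_id"])
    return dict(clusters)

print(link_by(calls, "address"))  # three clusters: no calls linked
print(link_by(calls, "street"))   # two clusters: c1 and c2 linked
print(link_by(calls, "ward"))     # one cluster: all three linked
```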

Fourth, once the call has been recorded, the ASB event needs to be risk-classified; and different risk-classification schemes are likely to be adopted for first-events (tier 1) versus serial-ASB-events (tier 2). The eight police forces may, of course, adopt different 2-tier risk-classifiers (A to D, say). Forces may also differ in their expert judgement on how to respond to a tier-1 risk-A event . . .
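In code, a two-tier scheme could be as simple as a pair of lookup rules, one per tier; the A-to-D rules below are invented stand-ins, not any force's actual classifier:

```python
# Invented two-tier rules; each of the eight forces might substitute its own.
def classify_tier1(call: dict) -> str:
    """Risk class A-D for a first ASB event."""
    return "A" if call.get("victim_vulnerable") else "C"

def classify_tier2(series: list[dict]) -> str:
    """Risk class A-D for a serial ASB event; here, escalating with series length."""
    return "A" if len(series) >= 5 else "B"
```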

Per police force, thus far we have 10 × 10 × 10 × 10 × 10 × 10 event-codifications (for OLVCPW), which map onto 2 tiers × 4 risk-classifications, to each of which is associated a recommended response-grade (I, II or III). Thus, 1 million event-codifications are each to be summarised, either programmatically or by professional judgement, into one of 2 × 4 × 3 = 24 tier/risk/response categories. For practical operational reasons, recommended and actual response-grade may differ, so both need to be recorded. Finally, a system is required by which officers who deliver a grade II or III response sign off on the ASB-event by providing an outcome-Assessment, for which, again, there may be up to 10 codes.
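The arithmetic bears checking, and the record-keeping is easy to sketch (field names again invented):

```python
# 10 codes per OLVCPW dimension -> 10**6 possible event-codifications.
event_codifications = 10 ** 6
tiers, risks, responses = 2, 4, 3
categories = tiers * risks * responses  # 2 x 4 x 3 = 24
print(f"{event_codifications:,} codifications -> {categories} categories")

# Hypothetical triage record for one call: recommended and actual
# response-grade can differ, so both are stored, as is the sign-off
# outcome-Assessment (one of up to 10 codes) for grade II/III responses.
triage = {
    "tier": 1,                  # 1 = first event, 2 = serial
    "risk": "A",                # A to D
    "recommended_grade": "II",  # I, II or III
    "actual_grade": "III",      # what was actually delivered
    "outcome_code": 7,          # officer's outcome-Assessment, 0-9
}
```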

How can methodology help? Telephonists will not have time both to record free text and to undertake codification. Thus, ASB calls (or telephonists) could be randomized between free-text recording and codification if the force wants to determine which approach serves it, or the public, better.
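A minimal sketch of that randomization, assuming individual calls (rather than telephonists) are the unit of allocation:

```python
import random

def assign_recording_arm(rng: random.Random) -> str:
    """Allocate one incoming ASB call to a recording method at random.
    Randomizing telephonists instead would be a cluster design."""
    return rng.choice(["free_text", "codification"])

rng = random.Random(42)  # fixed seed keeps the allocation auditable
for call_id in ["c1", "c2", "c3", "c4"]:
    print(call_id, "->", assign_recording_arm(rng))
```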

Codification may prolong calls unduly, but it ensures that every question is asked and answered; it does not deal well with unanticipated detail, at which free text excels. Free text, however, may overlook key basics because the caller does not volunteer them. Forces may therefore opt to have a random sample of ASB calls audited – for example, by someone else telephoning the caller back to repeat the data elicitation – so that both codification and free text (versions 1 and 2) are available for that sample.

Sampling may be so designed that if the ASB call pertains to a serial (not first) ASB event, then the data elicitation is audited for all ASB calls in the series.
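Combining the two rules – sample calls at random for call-back audit, but once any call in a series is selected, audit every call in that series – a hedged sketch:

```python
import random

AUDIT_PROBABILITY = 0.05  # illustrative sampling fraction

def select_for_audit(calls: list[dict], rng: random.Random) -> set[str]:
    """Return call_ids chosen for call-back audit; 'series_id' is an
    invented field that is None for first (non-serial) events."""
    audited_series = set()
    audited_calls = set()
    for c in calls:
        if rng.random() < AUDIT_PROBABILITY:
            if c.get("series_id"):
                audited_series.add(c["series_id"])  # flag the whole series
            else:
                audited_calls.add(c["call_id"])
    # extend the audit to every call in any sampled series
    for c in calls:
        if c.get("series_id") in audited_series:
            audited_calls.add(c["call_id"])
    return audited_calls
```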

The police force may opt to base its analyses on a database comprising only the randomly sampled first and serial ASB events, in order to reduce the burden of data-recording. Thus, only for these calls does the officer need to provide a fully coded outcome-Assessment.

Methodology can also be used to check whether the recommended response-grade (I, II, III) – as initially envisaged by the force – is actually correct. Thus, a subsidiary sampling scheme can be established, separately for tier 1 and tier 2, in which a number of ASB calls per week (25, say) that were recommended for response-grade I (lowest) are promoted to response-grade II; and, similarly, a number of calls per week that were recommended for response-grade II are promoted to response-grade III (highest). Random downgrading could also be considered.
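A sketch of that subsidiary scheme, taking 25 promotions per week, as above, in each stratum (field names invented):

```python
import random

PROMOTIONS_PER_WEEK = 25  # illustrative, per stratum

def weekly_promotions(calls: list[dict], tier: int, rng: random.Random) -> None:
    """Promote a random 25 grade-I recommendations to grade II, and a
    random 25 grade-II recommendations to grade III, within one tier."""
    for from_grade, to_grade in [("I", "II"), ("II", "III")]:
        eligible = [c for c in calls
                    if c["tier"] == tier and c["recommended_grade"] == from_grade]
        for c in rng.sample(eligible, min(PROMOTIONS_PER_WEEK, len(eligible))):
            c["actual_grade"] = to_grade  # recommended_grade kept for comparison
```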

Officers’ outcome-Assessments would then be available for some 100 randomly selected tier-1 ASB calls monthly (25 per week) that would otherwise have had a minimal response; on that basis, force-commanders can adjudicate whether any of the extra responses paid off in a manner that would lead them to adjust their original schema for grading which tier-1 ASB calls warrant only response-grade I. And similarly for tier 2. How else, in a formal sense, can forces learn whether their expert judgement was as expert as they’d hoped?

Finally, forces whose codification, risk-classification or response-grading differed might pair up and agree to adopt the other force’s method for a random sample of their own ASB calls, to determine for themselves whether the other force’s approach had merit in their patch. And vice versa.
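A final sketch of such a paired-force swap, in which each force applies its partner’s classifier to a random fraction of its own calls (the classifiers stand in for the forces’ real methods):

```python
import random

def crossover(calls: list[dict], own_classifier, partner_classifier,
              fraction: float, rng: random.Random) -> None:
    """Apply the partner force's classifier to a random fraction of calls,
    flagging them so the two schemes can later be compared."""
    for c in calls:
        use_partner = rng.random() < fraction
        c["risk"] = (partner_classifier if use_partner else own_classifier)(c)
        c["classified_by_partner"] = use_partner

# e.g. a 10% swap using toy classifiers:
calls = [{"call_id": "c1"}, {"call_id": "c2"}]
crossover(calls, lambda c: "C", lambda c: "B", 0.10, random.Random(7))
```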

Any outline plan can be improved by adopting formal design principles . . . lacking even an outline plan for how to make inter-force comparisons or for intra-force auditing is anti-statistical behaviour!