Speaking Data: January 2011

Statutory Statistical

The Statistics and Registration Service Act 2007 has to be one of the shorter bits of legislation at only 44 pages long. Its purpose is promoting and safeguarding the production and publication of official statistics that serve the public good (Part 1, Section 7(1)).

That public good being: informing the public about social and economic matters, and assisting in the development and evaluation of public policy. And to promote and safeguard:

(a) the quality of official statistics;

(b) good practice in relation to official statistics, and

So put 17,000 words of the Act into a (wordle) wordcloud and this is what you get....

The Act created the UK Statistics Authority (technically the Board), which is at the heart of the Act, and which has produced the Code of Practice for Official Statistics as its primary tool. The code has real breadth and depth. While the code is wide in scope (8 principles and 3 protocols) it’s also specific in terms of practices, with 74 practices to support those principles and protocols. And those practices are strong on the “what” rather than the more prescriptive “how”.

And here's what that Code looks like...

It is of course aligned with the international context. This includes the Fundamental Principles of Official Statistics (from the United Nations Statistics Division) and the EU’s European Statistics Code of Practice: For National and Community Statistical Authorities. Plus it is specific about how the Cabinet Office Civil Service code is to be interpreted for statistics:

Integrity: putting the public interest above organisational, political or personal interests.

Honesty: being truthful and open about statistics and their interpretation

Objectivity: using scientific methods to collect statistics and basing statistical advice on rigorous analysis of the evidence

Impartiality: acting solely according to the merits of the statistical evidence, serving equally well all aspects of the public interest.

So as well as that broader context, in practice the code represents a very effective consolidation of best practice for the management of statistics and data more generally. Of course there would be an implicit degree of proportionality applied here, but no real counter arguments. As a standard, it’s accepted that it is a high one, with 74 real business practices with which to be technically compliant for any set of data. So the picture take on this....

get the graphics

While designed as a code of practice for “official statistics”, the early emphasis has been on the more specific and formal national statistics, typically those produced by government departments. The scope of official statistics is a bit more tricky to pin down, and technically includes “statistics produced by any other person acting on behalf of the crown” (Part 1, Section 6(1,a,vi)).

There’s now plenty of substantive extra material that sits around that central code of practice. If the Code is the “law” then there’s plenty of subsequent “precedent”. Firstly here is the emerging structural emphasis across that code. The early strong emphasis has been around users, and engaging with them and understanding their needs. In short “meeting users' needs is at the heart of the Code of Practice”. Here's the wordcloud of that (monitoring) report, with something visual here about official stats focussing towards users....

Then perhaps less conspicuously, but arguably more profound, has been the emphasis on the value of statistical commentary, as itself being a public good for the public domain. And then some reinforcement of standards for statistical releases. That is all reinforced by the “value” emphasis in the value for money equation…the value of statistical work refers to the “benefit delivered by the use of the statistics to influence decisions and actions, now and in the future”. And that is reinforced by the description of the Use of Official Statistics……

Then there’s the rolling work programme with the practical audits. There was an early run on “breach of code of practice” reports (17), although less so now. Now there are 84 Assessment reports and counting, which works out at one a week, over 50 a year. Lots of official confirmation of national statistics status, reflected in the wordcloud...

Scratch under the surface of operational statistics to see some powerful progress towards establishing a new management model. The machinery is modernising. There is a stronger independence for the statistical head of profession, with a management air gap from the policy and media folk. At the same time, there’s less ‘pre-release’ time for exclusive internal access before stats are made public, and publication is increasingly consistent at 9:30 in the morning. Then there is the conscience role: pronouncing on formal misuse in the public eye, which has even resulted in Ministerial apologies.

One might reasonably expect that at some point the open statistics movement might helpfully join with the open data and visualisation momentum and maybe spawn an open insight movement. After all, the whole can be greater than the sum of the parts…..So blend some open data, engaging presentation and some statistical process and get some collective insight to drive change and even transformation.

Meta 'Sweta' Data

We now have the idea of meta data – data about data – relatively well established. That is information which is used to describe the attributes of a data set. This might include filename, formats, date of creation and version numbers for example. Equally this can include attributes of the data variables themselves, such as a variable name, description, format, validation limits and so on.

As well as meta data, I now propose another quite distinct sort of “data about the data” – that which describes its usage. This I’ll call “sweta” data. While meta is derived from the Greek, sweta is derived from the more colloquial “sweat the asset or data”. In keeping with the analogy, this relates to the degree of exertion, the extent to which data is ‘exercised’ or used.

So as well as the data itself, we have…

Meta Data… the information necessary to be able to make use of the data.

Sweta Data…the information about the extent of use of the data.

get the graphic

There are really two sorts of sweta data, macro and micro. Micro is about the use of individual fields within a data set. Macro is about the overall use of a data set, which might then be compared to other data sets.

get the graphic

Use and Value

Having managed several annual cycles of national data collection requirements, I have found that usage of fields is a very strong practical driver of what stays and what goes to make way for the new. There’s an emerging opportunity to strengthen this right now. If we apply some of the more generic thinking about prioritisation of data, then that of ‘used and useful’ is a fair test. That ‘extent of use’ will be a proxy for value, although biased towards quantity rather than quality. Given that value can be such an elusive thing to measure, more measurable usage at least provides a starting point for that debate. The usage is more a sense of output, and value more a sense of outcome.

With the overall transparency agenda, and more specifically with the national data portals, there’s a real opportunity to understand usage. So www.data.gov.uk has over 5500 links to data sets after the first year. And there’s the web-based data hub of the Office for National Statistics, providing a wide scope of national and regional level data on a daily basis. Plus there are increasing numbers of other independent data consolidator sites that may well bring these sweta matters into sharper focus sooner, especially those with an eye to future financial viability. That would be a first port of call for the Public Data Corporation; after all, the first place to look for the biggest revenues will be the biggest uses.

And in an age of austerity, tougher prioritisation and decisions need a tighter evidence base, at both macro and micro data levels. So why not put the collection cost beside the usage? A starting proposition might reasonably be that low-cost-high-use data sets might well be of more value than high-cost-low-use. And if something has to go (….). So some usage information – macro sweta data - on the extent to which specific data sets are more or less used will be more centre stage.

Apply some Web Analytics

Because the web has become the default portal for data, this might not be as difficult as it seems. This is because the web has inbuilt counting mechanisms. These used to be simply ‘hit counters’ but this is now an emergent analytical domain in its own right – web analytics. The really helpful starting point here is that these web site usage statistics are generally prepared to industry standards (ABCe), the electronic equivalent of the established media circulation, viewing and listening figures. At the simplest level, the measures include the number of web pages viewed (in a specific period), the number of visits (each visit may view more than one page) and unique visitors (as one visitor may make multiple visits). So plenty of potential for some easy usage assessment.
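As a rough illustration of those three simple measures, here is a minimal Python sketch. The log format (a visitor identifier, a visit number and a page) and all the sample records are assumptions made up for the example; real web analytics tools and the ABCe standards define these measures much more carefully.

```python
from collections import namedtuple

# Hypothetical log record: who viewed which page during which visit.
PageView = namedtuple("PageView", ["visitor_id", "visit_id", "page"])


def usage_summary(page_views):
    """Count page views, visits and unique visitors for one period."""
    views = len(page_views)
    # A visit is identified here by the (visitor, visit number) pair.
    visits = len({(pv.visitor_id, pv.visit_id) for pv in page_views})
    visitors = len({pv.visitor_id for pv in page_views})
    return {
        "page_views": views,
        "visits": visits,
        "unique_visitors": visitors,
        # Pages per visit: a crude indicator of how much each visit explores.
        "pages_per_visit": views / visits if visits else 0.0,
    }


# Made-up sample: one visitor returns for a second visit.
log = [
    PageView("anna", 1, "/crime-data"),
    PageView("anna", 1, "/crime-data/download"),
    PageView("anna", 2, "/crime-data"),
    PageView("ben", 1, "/census"),
]
summary = usage_summary(log)
print(summary)
```

In this made-up sample there are four page views across three visits by two visitors, which shows why the three counts need to be kept distinct.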

This also applies more broadly, not just to the data web portals. At the simplest level, a web site is a tool for communication, often a primary tool for many organisations or functions. These usage analytics provide a starting point for evaluating its extent and effectiveness, especially given their relative ease and availability. For example, www.direct.gov.uk (the established government portal which provides public information about government services) has 30 million visits a month. The government’s transparency programme is using the web as the prime means of publishing all sorts of government data for open use and challenge.

Clearly, with the web being used as a key channel of information provision to users, we might expect to see more on usage of public service web sites.

Official Statistics….about official statistics

Certainly in the world of UK official and national statistics there has been an increasing emphasis on understanding user engagement (UK Statistics Authority), so understanding in a quantitative way the degree of data set usage is a helpful starting point to understand frequency of use and to some extent even user behaviour (repeat visitors for example).

Perhaps more important - although less conspicuously driven - is the (public) value attributed to interpretation of the data (UK Statistics Authority). This is perhaps the most profound indicator of the importance of the output and value end of the data production chain. So much closer to the purpose for having the data in the first place – to extract messages, meaning, and implications - and that data being a means to that end rather than an end in itself. Usage is a window on that value end of that data production chain.

So that’s mainstreaming these other data related aspects, as opposed to just the data. Then it’s a simple leap to see how the quantification of usage is also something that is equally critical for the public domain and for public debate.

Now this raises the question of whether those usage statistics for official statistics should be available in their own right….oh yes…official statistics about official statistics… that would make them meta statistics.

This usage information could easily be meshed with the costs of data collection to provide a simple if crude index of usage per pound spent….meta money statistics. With the consolidation of the national data asset through data.gov.uk it won’t be long before these two dimensions start to be seen side by side. Especially since a lot of the public data is collected under contract, and the value of such contracts is becoming available.
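A crude usage-per-pound index of this kind might be sketched as follows. The data set names, visit counts and collection costs are entirely hypothetical, and using visits as the usage measure is an assumption; any other usage count could be swapped in.

```python
def usage_per_pound(visits, collection_cost_pounds):
    """Crude index: recorded uses per pound spent on collecting the data."""
    if collection_cost_pounds <= 0:
        raise ValueError("collection cost must be positive")
    return visits / collection_cost_pounds


# Hypothetical figures, purely for illustration.
datasets = {
    "low_cost_high_use": {"visits": 12000, "cost": 3000.0},
    "high_cost_low_use": {"visits": 500, "cost": 50000.0},
}

index = {
    name: usage_per_pound(d["visits"], d["cost"])
    for name, d in datasets.items()
}

# Rank data sets from most to least used per pound spent.
ranked = sorted(index, key=index.get, reverse=True)
for name in ranked:
    print(f"{name}: {index[name]:.2f} visits per pound")
```

As the text says, this is biased towards quantity of use rather than quality, so the ranking is a starting point for debate rather than a verdict.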

In fact the official web site for the Prime Minister (www.number10.gov.uk) has been publishing its usage statistics on a monthly basis, typically within one or two days of the end of the month. That’s the “official usage statistics for the official web site”. Here’s the pages viewed per visit, a crude “interesting index”….

get the graphic

So ‘roll on’ better 'sweta' meta data.

Project Prognosis

Busy, busy, busy. There seems to be a natural tendency to be busy, and to like to see others busy too. But the key question is whether that is ‘right busy’ or ‘wrong busy’?

‘Right busy’ is doing the right things in the right way. ‘Wrong busy’ is anything else. The difference between right busy and wrong busy is mostly planning. Exemplified by the longstanding quote “failure to plan is planning to fail”, as used by Winston Churchill.

get the graphic

So this is about sufficient planning to improve the chances of doing the right thing in the right way. Doing the right thing is effectiveness. Doing it in the right way is efficiency.

Doing the right thing in the right way is about smart use of resources (of people, time and money) to achieve purpose. So herewith the two key tests with some simple ‘treasure’ and ‘travel’ examples.

Q1. Are we doing the right thing?

In short this is about the “What”. Defining what we are trying to achieve. Led by a specific purpose or objective and seeking a specific outcome. This is effectiveness.

Treasure: If the purpose is to find the treasure, this is knowing where to dig the hole. (Dug a hole in the wrong place? Now there are potentially two jobs: fill in the wrong hole and dig another.)

Travel: If the purpose is to get to Newcastle from London, it’s about going in the right direction, in this case north.

Q2. Are we doing it the right way?

This is about the “How”. How we go about achieving the “what”. This might be thought about as doing it right first time or quickly. More broadly about the right tools, skills, experience, resources, time, and processes that get brought to bear. This is efficiency.

Treasure: This is about knowing how best to dig the hole. Digger, spade or spoon?

Travel: This is about using the right mode of transport. London to Newcastle by train for speed, rather than cycle.

So if we leave this to luck and toss a coin to decide on the answer to each of these questions, we end up with four possible outcomes…..

A. Right thing, right way… efficient success

Treasure: Digging the hole in the right place with the right tools and techniques. First to find the treasure!

Travel: From London, heading north to Newcastle by train.

B. Right thing, wrong way... inefficient success

Treasure: Digging the hole in the right place but with the wrong tools and techniques. Might find the treasure eventually with time, perseverance and plenty of spoons.

Travel: Heading to Newcastle by cycle.

C. Wrong thing, right way… efficient failure

Treasure: Digging the hole in the wrong place really well. We build a good hole, really quickly, but in the wrong place. No treasure, but at least we find out soon.

Travel: From London heading to Plymouth (south) by train.

D. Wrong thing, wrong way... inefficient failure

Treasure: Digging the hole in the wrong place with the wrong approach.

Travel: From London, heading to Plymouth (south) by cycle.

So three out of four (75%) of these outcomes lead to some degree of failure. So leaving this to luck, the prognosis for a project at the start is only 25% success, at best 50% if we’re happy to waste time, energy and money by being inefficient. So at the very least this means wasted time and effort, at the worst going in entirely the wrong direction. So a sensible degree of planning on the what and the how is the mitigation.
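The arithmetic behind that prognosis can be checked with a short Python sketch. It assumes, as the coin-toss framing suggests, that each of the two questions is answered right or wrong with equal chance and independently of the other.

```python
from fractions import Fraction
from itertools import product

# Two independent fair coin tosses: one for the "what", one for the "how".
outcomes = list(product(["right thing", "wrong thing"],
                        ["right way", "wrong way"]))
chance_each = Fraction(1, len(outcomes))  # four equally likely outcomes

# A: right thing, right way.
efficient_success = sum(
    chance_each
    for what, how in outcomes
    if what == "right thing" and how == "right way"
)
# A or B: right thing, whichever way.
any_success = sum(
    chance_each for what, how in outcomes if what == "right thing"
)
# B, C or D: everything except efficient success.
some_degree_of_failure = 1 - efficient_success

print(f"Efficient success:      {float(efficient_success):.0%}")
print(f"Success of any kind:    {float(any_success):.0%}")
print(f"Some degree of failure: {float(some_degree_of_failure):.0%}")
```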

get the graphic

Good to keep a watchful eye out for the use of “efficient and effective” as a glibly used double act, and especially in that order too. Should at least be effective before efficient, after all it’s probably preferable to go in the right direction slowly than the wrong direction quickly….

It’s not all bad news. Perhaps best exemplified by Popper’s “Theory of Falsification”. In short to find out that something is definitely not true is better than not knowing either way. We now know it’s a blind alley, rather than not being sure….the treasure is definitely not at the bottom of this hole. That is still some progress.

There are more upsides to lack of planning of course, “The nicest thing about not planning is that failure comes as a complete surprise, rather than being preceded by a period of worry and depression.” (Sir John Harvey-Jones). And of course the educational benefits….“mistakes are the usual bridge between inexperience and wisdom” (Phyllis Theroux). And that points to applying that experience and wisdom to achieve just the right amount of planning.

So step boldly forward with a plan. Be ever so slightly curious about the fate of previous bold ventures.....

The art of the “prossible”

‘Prossible’ is my term for a particular interaction of possibility and probability, which often emerges when attempting to explain findings from data. Interaction is perhaps a little generous, more an innate mix-up. It’s where a single explanation for circumstances has been reached which is quite possible but not necessarily the most probable…. improbably most possible.

Possible – capable of existing, taking place, or proving true without contravention of any natural law. (Collins)

Probable – likely to be or to happen to be but not necessarily so. (Collins)

get the graphic

Apparently the human brain is hard wired to seek an explanation for circumstances. A possible explanation is sufficient. Having found one it then moves on to another question. Of course action and resources may well then flow from that explanation. If that’s not the most probable explanation then that means at best inefficient use of resources, and at worst counter-productive activity.

Magic works on the same principle. Of course there is nothing impossible, just generally improbable solutions based on a combination of skill, technique and tools. What captures our interest is that desire to seek an explanation. And once we have come up with an explanation we can rest easy, rather than seek a range of possible explanations and evaluate the most likely.

So we continually need some ‘soft-wiring’ to give a little more rigorous focus on ‘probability’ to compensate for that hardwiring. There are of course complex cause and effect interactions at play, and it’s often the case that there are multiple explanations. Nowhere is this more true than in trying to explain local crime trends for example, a mix of national, regional and local activity, both criminal and preventative.

So worth dipping a toe into the cause and effect thinking….

1. What are the influencing factors?

2. Which are positive and which are negative influencers?

3. How large or small are each of these influencers?

So before resting on that early explanation, worth a moment next time to reflect on it. Is it possible? Almost certainly. But to what extent is it probable? And certainly avoid taking action on a merely ‘prossible’ explanation.

Units of Mis Measurement

We have a great set of units for measuring and communicating… height, weight, speed, money, light, time and so on. And very cleverly they have become very scalable over time. As soon as we have 60 minutes we start counting hours, when we get to 7 days, we start counting weeks, at 100 pence we start counting pounds and so on.

However these are not always applied as sensibly as they might. That just creates unnecessary barriers to understanding.

A publicly odd one is that associated with roadworks….. “Works starts here from 20 January for 15 weeks”. This seems part information, part quiz. To effectively triangulate this – and help us more easily understand the impact on us - we also need the end date. Deducible but not explicit. So here we are counting weeks when months might be more logical. Maybe it’s something about making the end date less explicit to more easily cope with a possible over-run.

This information is presented in a way that makes you do some unnecessary mental maths while driving. It might go something like this... “15 weeks…that means 15 divided by 4 weeks and a bit per month on average, which means about 3 and a half months. And 30 days hath November etc… which means they should be done by…………. 5th May”. That’s concentration that’s not being applied to driving, and especially in an area which needs extra care as there are roadworks. This is one way to improve national levels of numeracy, but probably not the safest.

So I propose a standard sign, with start, duration and finish. No distracting mental maths here.
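The sum the sign leaves to the driver is trivial for whoever produces the sign. A minimal Python sketch, assuming the year is 2011 (the sign gives only the day and month) and that the finish is simply the start date plus the stated number of weeks:

```python
from datetime import date, timedelta


def sign_text(start, weeks):
    """Return sign wording that states start, duration and finish."""
    finish = start + timedelta(weeks=weeks)
    return (f"Start {start.isoformat()}, duration {weeks} weeks, "
            f"finish {finish.isoformat()}")


start = date(2011, 1, 20)  # assumed year; the sign only says 20 January
finish = start + timedelta(weeks=15)
print(sign_text(start, 15))
```

Doing the calculation once, at the point the sign is made, saves every passing driver from doing it at the wheel.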

Ready, Aim, Fire

Targets are about achieving things in the future. The future is inherently unknown, so targets can be inherently problematic. The secret to a good target is having a good process to set it.

There are plenty of tests to define what a good target might look like. Perhaps the most popular being the SMART test, that a target is Specific, Measurable, Achievable, Realistic and Time specific. However there’s less to steer the process to get to a target which passes all these tests.

Targets are one component of a balanced assessment of performance, but only one. The others are (a) trend – the levels of performance until this point and (b) benchmark – the level of performance being achieved by comparable others. See What a performance on how to do that.

get the graphic

Simply using targets as a measure of success or failure can be quite crude, unless they’ve been very well constructed. After all the measure is binary – achieved or failed, and to be fair, the world is typically not that simple. The secret to setting a good target is that it should be determined by first assessing performance on those other two dimensions of trend and benchmark.

So three parts….

Part 1: Trend

As this is typically using like-for-like data over time, this is probably the best starting point. The current level, past levels and a trajectory. It can be helpful to consider a range of time periods, rather than blandly compare one end of year figure with a previous one. Much better to understand the underlying trend, and help separate the real trend from any noise (natural variation). Simple visual tools - such as plotting the measure over time - can give quick and easy insight, without having to rely on clever statistical techniques, as the eye is very good at seeing patterns.
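Plotting the measure and looking at it is the approach described above. Where a chart isn’t to hand, a simple moving average is one modest alternative for separating the underlying trend from month-to-month noise. The sketch below is that swapped-in technique, not a method from this article, and the monthly figures are made up for illustration.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values to smooth out noise."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly performance figures: noisy, but drifting upwards.
monthly = [100, 104, 98, 107, 101, 110, 105, 113]
smoothed = moving_average(monthly, window=4)
print(smoothed)
```

The raw series goes up and down from month to month, while the smoothed series rises steadily, which is the underlying trend the eye would also pick out from a plot.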

Part 2: Benchmarks

This may not be a perfect like-for-like comparison but at the least provides an indication of the levels of performance being achieved elsewhere. In most cases there will be no single comparison, rather a range of sensible comparators. In corporate terms this might be geographical (national, regional, local) or other organisations, or other sectors.

The next level of sophistication would be to assess the trend for these benchmarks - are they going in the same direction as your trend - to provide a more thorough benchmark assessment. Again simple visual tools can be both easy and effective.

Part 3: Previous target performance

A look at previous targets, in light of the above trend and benchmark, will provide a sense of the extent to which they may have been about right, or over or under ambitious.

Next Steps

The next level of refinement would be to give a sense of relative emphasis to these three dimensions. Here’s a starting point.

Trend. This is perhaps the strongest predictor of future possibility, and should have the strongest weight. There is also some implicit assumption that in setting the target there is a very reasonable degree of control over it. Weight = 3

Benchmark. The risk here is comparing apples and pears, but still should be seen as broadly indicative. Weight = 2

Previous Target. Given its arbitrary nature, and unless set well previously, the least significant indicator. Weight = 1.
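One way those weights might be used is as a weighted average of the level each dimension points to. The weights below are the ones suggested above; combining them as a weighted average is an assumption for illustration, and the three indicated levels are hypothetical.

```python
# Weights as suggested in the text.
WEIGHTS = {"trend": 3, "benchmark": 2, "previous_target": 1}


def weighted_target(indications, weights=WEIGHTS):
    """Blend the level suggested by each dimension into one target."""
    total_weight = sum(weights[name] for name in indications)
    weighted_sum = sum(
        level * weights[name] for name, level in indications.items()
    )
    return weighted_sum / total_weight


# Hypothetical levels (say, percentage satisfaction) suggested by each dimension.
indications = {"trend": 82.0, "benchmark": 85.0, "previous_target": 88.0}
target = weighted_target(indications)
print(f"Suggested target: {target}")
```

The result sits closest to the trend figure, reflecting its heavier weight, while still being pulled a little towards the benchmark and the previous target.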

get the graphic

So ready, aim, fire:

... ready – set a sensible target using a process

... aim – focus activity to achieve that target, ‘targeted effort’

... fire – a sensible target and focussed effort give the best chance of success.

The assessment of trend and benchmark is really just a proxy for understanding the cause and effect relationships at work which influence the measure for which a target is being set. After all a target is only sensible where there is a reasonable degree of control of the measure. These relationships can be quite tricky to map and measure exactly, so the trend and benchmark simply summarise the current and past state of play.

There are alternatives of course….“to be sure of hitting the target, shoot first, and call whatever you hit the target” (Ashleigh Brilliant).

Data Provenance

In the world of art and antiquities, the quality of the 'provenance' (‘to originate’) affects the value, and the same is true of data, although typically less explicitly.

Lots of data looks straightforward enough. However in any data there will be a number of stages before it gets to be analysed and reported, where things might go awry to a greater or lesser extent, but may not be particularly noticeable. Therein is the danger.

So here’s the broad journey. This is intended as a simple 12 step guide to think about where in this journey there may be issues which might impact on that analysis, reporting and messages that emerge. In the very simplest of terms those 12 steps combine into three broad parts: a planning part, an implementation part, and a data management part.

get the graphic

PART A: Planning

1. Purpose.

This is the point at which the headline purpose of the data collection is determined. There may be multiple purposes, but at the very least a question to be answered. The clarity here is the benchmark against which all further stages in this journey are tested and aligned. Often data collected for one purpose may be used for another purpose further down the line. This may be fine but equally may be inappropriate depending on the effect of other decisions in the process. Age data cannot be used as flexibly as date of birth data for example.

2. Requirements.

If ever there was a case for a “start with the end in mind” approach this is it. At this stage there is a question to be answered, with a specific degree of granularity and confidence. That granularity might be geographical (national, regional, local authority), or socio-economic (male/female, age group and so on) or any other dimension, and any multiple combination of those dimensions.

3. Constraints

Worthy of consideration in their own right, but broadly an extension of requirements. These constraints often include financial, quality and time. Constraints might not always be obvious until after the event, so really worth considering up front. A cautionary tale… All police forces measured the level of satisfaction of members of the public involved in road traffic collisions (née accidents). A once (and only once) overlooked constraint was to not seek the satisfaction level of those known to be fatally involved. Enough said.

4. Design

So with the question to answer then there’s the overall methodology to determine. Ask everyone (census) or just some (sample). May be mainstream quantitative or more qualitative approaches. This will also include setting acceptable tolerances for data quality to be applied further down the line.

PART B: Implementation

5. Definitions

This is about making sure there is a clear definition of measures or questions to ensure that responses can be differentiated. Great policing example…. When is a crime not a crime? Front door shows significant marks around the lock….attempted burglary using a jemmy, just some (criminal) damage, or the home of an unsteady hand?

6. Specifications

This might be the units of measurement, or the number of decimal places. Needs to be consistent to ensure quality further down the line.

7. Collection

This might be ask, measure, count, read, hear, feel, smell, taste.

8. Recording

Having collected some data we then need to record it somehow. That might be to write it down, data entry to a computer, audio recording, and in some cases it might even be ‘remember’.

PART C: Data Management

9. Enter

Having recorded something in one form this may need entering into a separate system for consistent handling further down the line… such as storage. Crime details transposed from a notebook to a computer crime system for example.

10. Validation

Ensuring data is valid, that it conforms to specific criteria. In the simplest of terms this might be ensuring that data is technically correct, so that a date is not the 32nd day of the 13th month. This can then extend to logical validity, where technically correct data items are cross-referenced with each other, for example checking the date of birth is before the date of marriage. These might be thought of as “hard” validations, where the tests are quite certain, as opposed to “soft” validations, where the more unusual events are highlighted to be queried. So the more subjective soft validation might be to check that there is a minimum of, say, 16 years between date of birth and date of marriage.
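The hard and soft checks described above might be sketched in Python as follows. The function names and messages are hypothetical, and the 16 year threshold is simply the “say 16 years” from the text rather than any official rule.

```python
from datetime import date

MIN_YEARS_AT_MARRIAGE = 16  # soft threshold, taken from the example in the text


def parse_date(year, month, day):
    """Hard check: return a date, or None if it is not a real calendar date."""
    try:
        return date(year, month, day)
    except ValueError:
        return None


def whole_years_between(earlier, later):
    """Number of completed years from `earlier` to `later`."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def validate(birth, marriage):
    """Return (hard_errors, soft_queries) for one record."""
    hard, soft = [], []
    if birth >= marriage:
        # Logically impossible, so reject.
        hard.append("date of birth is not before date of marriage")
    elif whole_years_between(birth, marriage) < MIN_YEARS_AT_MARRIAGE:
        # Possible but unusual, so flag to be queried rather than rejected.
        soft.append("fewer than 16 years between birth and marriage")
    return hard, soft


print(parse_date(2011, 13, 32))                          # not a real date
print(validate(date(1980, 5, 1), date(1975, 1, 1)))      # hard failure
print(validate(date(1980, 5, 1), date(1990, 6, 1)))      # soft query
print(validate(date(1980, 5, 1), date(2005, 6, 1)))      # passes both
```

Keeping the two lists separate reflects the distinction in the text: hard failures are certain errors, while soft queries are prompts for someone to check.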

11. Processing

There may well be some processing of the data, perhaps to organise this more efficiently for data storage. Or indeed to split the data into various time categories, perhaps monthly. One of my favourite examples I’ve seen was the processing of car crime data in order to do some hotspot mapping. Car crime is predominantly a mixture of theft of motor vehicle and theft from motor vehicle. In this case to do the mapping required the postcodes.

12. Storage

The data gets stored, new data is added and backups are typically made, whether automatically or manually. Hopefully no data gets lost or over-written.

And then the analysis starts….and that’s another story.

This all looks straightforward enough, and mostly can be. The point here is that before there is any data analysis all these stages will have to have been addressed to a greater or lesser degree. Any inadequacies earlier in the journey cannot be compensated for further down the line, so in short the data ‘is what it is’ when it gets to the analysis. Knowing enough about 'what it is' and indeed 'what it is not' is the foundation for a genuinely insightful analysis.