Speaking Data: Precisely Wrong?

Ahh the lure of decimal places. This is the fallacy of precision. Express something in a precise way – make it look very exact – and somehow it seems so much more convincing.

Precisely accurate data can be quite an elusive thing outside the scientific lab. Accuracy and precision are two important, and different, dimensions to at least get a feel for in any analysis; together they act as a sort of proxy for statistical confidence. It doesn't help that these terms are often used interchangeably, even to some extent in their dictionary definitions of each other.

So accuracy is about being a “faithful measurement or representation of the truth”. In a class of young children (with actual ages between 5 and 7 years) an accurate measure of the average age might be 6 years old. An inaccurate measure would be 9 years old, which does not faithfully represent the truth.

Precision is about the “exactness of measurement”. In that class of children, a precise average of their ages would be 6.25 years.

So accurate and precise 6.25 years…precisely right

Accurate and imprecise 6 years…generally right

Inaccurate and imprecise 9 years…generally wrong

Inaccurate and precise 9.75 years…precisely wrong

One of my favourite examples can be found in a footnote in the formal UK Economic Accounts. “Estimates are given to the nearest million but cannot be regarded as accurate to that degree”. (Quarter 2 2010 Edition 71. Table A1. Note 1). So those final units of millions are in fact of no value at all.

Here’s an alternative overview of accuracy and precision, in a dartboard sort of way, aiming for the bullseye.


Accurate…… On target. Correct. Close to what you’re aiming for or trying to measure.

Precise…. Tightly clustered. We can consider this as having a small deviation.

So ideally we want something both accurate and precise. This gives us four simple outcomes from an analysis.

Precise and accurate… Exactly right.

Precise and inaccurate… Exactly wrong.

Imprecise and accurate… Generally right

Imprecise and inaccurate… Generally wrong.
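To make the dartboard picture a bit more concrete, here is a minimal sketch in Python. It treats accuracy as the distance from the centre of a group of shots to the target, and precision as how far the shots sit, on average, from their own centre. The function name `verdict`, the `tolerance` threshold and the shot coordinates are all made up for illustration; they are assumptions, not a standard definition.

```python
import math
import statistics

# (precise, accurate) -> the four outcomes listed above
OUTCOMES = {
    (True, True): "exactly right",
    (True, False): "exactly wrong",
    (False, True): "generally right",
    (False, False): "generally wrong",
}


def verdict(shots, target=(0.0, 0.0), tolerance=1.0):
    """Label a group of (x, y) shots with one of the four outcomes.

    `tolerance` is an arbitrary threshold picked for this illustration.
    """
    centre = (
        statistics.mean(x for x, _ in shots),
        statistics.mean(y for _, y in shots),
    )
    # Accuracy: distance from the centre of the group to the target.
    offset = math.dist(centre, target)
    # Precision: average distance of each shot from the group's own centre.
    spread = statistics.mean(math.dist(shot, centre) for shot in shots)
    return OUTCOMES[(spread <= tolerance, offset <= tolerance)]


# Invented shot patterns, one per outcome.
groups = {
    "tight, on target": [(0.1, 0.0), (-0.1, 0.1), (0.0, -0.1)],
    "tight, off target": [(3.1, 3.0), (2.9, 3.1), (3.0, 2.9)],
    "loose, on target": [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)],
    "loose, off target": [(5.0, 3.0), (1.0, 3.0), (3.0, 5.0), (3.0, 1.0)],
}

for name, shots in groups.items():
    print(f"{name}: {verdict(shots)}")
```

The point of separating `offset` from `spread` is that they answer different questions: one asks whether the group is in the right place, the other asks whether it is a group at all.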

It is worth recognising that in the world of laboratory science there are rules about what happens to the precision in measurement when different data are combined. In the simplest of terms, when we add and subtract data we add their absolute errors, and when we multiply and divide data we add their relative (percentage) errors. There’s also something called significance arithmetic, which tries to simplify this sort of thinking so that it is applicable to simple calculations.
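As a rough sketch of how errors carry through a calculation, the Python below uses the common worst-case rule of thumb: absolute errors add under addition and subtraction, and relative (percentage) errors add under multiplication and division. The function names and the measurements (a 10.0 by 5.0 rectangle, each side good to ±0.1) are invented for illustration.

```python
def add(a, da, b, db):
    """Return (value, absolute error) for a + b: absolute errors add."""
    return a + b, da + db


def subtract(a, da, b, db):
    """Return (value, absolute error) for a - b: absolute errors still add."""
    return a - b, da + db


def multiply(a, da, b, db):
    """Return (value, absolute error) for a * b: relative errors add."""
    value = a * b
    relative = da / abs(a) + db / abs(b)
    return value, abs(value) * relative


def divide(a, da, b, db):
    """Return (value, absolute error) for a / b: relative errors add."""
    value = a / b
    relative = da / abs(a) + db / abs(b)
    return value, abs(value) * relative


# Hypothetical measurements: each side measured to within 0.1.
length, d_length = 10.0, 0.1
width, d_width = 5.0, 0.1

half_perimeter, d_half_perimeter = add(length, d_length, width, d_width)
area, d_area = multiply(length, d_length, width, d_width)

print(f"length + width = {half_perimeter:.1f} ± {d_half_perimeter:.1f}")
print(f"length × width = {area:.1f} ± {d_area:.1f}")
```

Two sides each good to ±0.1 give an area that is only good to about ±1.5, so quoting that area to several decimal places would be precision the measurements cannot support. Where the errors are independent and random, they are often combined in quadrature instead, which gives a somewhat smaller figure than this worst-case version.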

This all provides a healthy reminder to be thinking about the levels of accuracy and precision when looking at data. It is worth pausing to consider whether the data is sufficiently precise and accurate for the job in hand, and certainly not rushing to decisions based on data that is insufficient or of unknown quality, especially where it might be generally wrong or even exactly wrong.

There can be some interesting debate about whether precision or accuracy takes priority, but in practice these need to be considered together. Intuitively, something which is generally right (accurate but imprecise) seems preferable to exactly wrong (precise but inaccurate). In shooting, a high degree of precision or clustering is desired. This is about consistency: if the cluster is slightly off target, this can be compensated for by adjusting the sights to make the cluster more accurate. The same works for manufacturing, where consistency is important and variation is minimised. A whole different debate opens up here about Statistical Process Control, which measures variation in processes to help understand and reduce the different sorts of variation.

So it is worth having a conscious prod at the accuracy and real precision of the data, rather than taking this at face value.
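The sight-adjusting idea can be sketched in a few lines of Python. The assumption here is that a consistent miss shows up as a fixed offset between the centre of the group and the target, so shifting every shot by that offset moves the group onto the target without changing how tight it is. The function names and shot coordinates are invented for illustration.

```python
import math
import statistics


def centre(shots):
    """Average position of a group of (x, y) shots."""
    return (
        statistics.mean(x for x, _ in shots),
        statistics.mean(y for _, y in shots),
    )


def spread(shots):
    """Average distance of each shot from the group's own centre."""
    c = centre(shots)
    return statistics.mean(math.dist(shot, c) for shot in shots)


def adjust_sights(shots, target=(0.0, 0.0)):
    """Shift every shot by the group's average miss from the target."""
    cx, cy = centre(shots)
    dx, dy = target[0] - cx, target[1] - cy
    return [(x + dx, y + dy) for x, y in shots]


# A tight cluster that is consistently off target (made-up numbers).
tight_but_off = [(3.1, 3.0), (2.9, 3.1), (3.0, 2.9)]
corrected = adjust_sights(tight_but_off)

print("before:", centre(tight_but_off), round(spread(tight_but_off), 3))
print("after: ", centre(corrected), round(spread(corrected), 3))
```

The correction moves the centre but leaves the spread untouched, which is why consistency is worth having first: a steady bias can be measured and removed, whereas scatter cannot be fixed by a single adjustment.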