Friday, December 09, 2011

Beware of outliers

"Increasingly, I observe statistical sophisticates indulging in analytic advocacy — that is, the numbers are deployed to influence and win arguments rather than identify underlying dynamics and generate insight. This is particularly disturbing because while the analytics — in the strictest technical sense — accurately portray a situation, they do so in a way that discourages useful inquiry.
I always insist that analytics presentations and presenters explicitly identify the outliers, how they were defined and dealt with, and — most importantly — what the analytics would look like if they didn't exist. It's astonishing what you find when you make the outliers as important as the aggregates and averages in understanding the analytics.
Always ask for the outliers. Always make the analysts display what their data look like with the outliers removed. There are other equally important ways to wring greater utility from aggregated analytics, but start from the outliers in. Because analytics that mishandle outliers are "outliars.""
Excerpts taken from "Do Your Analytics Cheat the Truth?"
This is in line with one of the first lessons I learned in the world of Quality: never trust averages. If someone shows me an average... I ask for the standard deviation as well.
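To make the point concrete, here is a minimal sketch in Python of what "show me the data with and without the outliers" could look like. The delivery times are made-up numbers for illustration only, the helper names are hypothetical, and the 1.5 x IQR rule used to flag outliers is just one common convention, not something the quoted article prescribes.

```python
import statistics


def summarize(values):
    """Return the count, mean and sample standard deviation of a list."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }


def split_outliers(values, k=1.5):
    """Split values into (kept, outliers) using the k * IQR fence rule.

    Assumption: a point is an outlier when it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    outliers = [v for v in values if v < low or v > high]
    return kept, outliers


# Hypothetical delivery times in days; one order went badly wrong.
delivery_days = [4, 5, 5, 6, 5, 4, 6, 5, 5, 48]

kept, outliers = split_outliers(delivery_days)
with_outliers = summarize(delivery_days)
without_outliers = summarize(kept)

print("Outliers:", outliers)
print("With outliers:   ", with_outliers)
print("Without outliers:", without_outliers)
```

With these numbers the average alone says "about nine days", while almost every order actually took around five. The large standard deviation is the warning sign that the average is hiding something, and listing the outlier explicitly shows where to start asking questions.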
