Friday, September 27, 2013

About "Big Data"

In these times when "Big Data" is talked about more and more:
"Uncertainty is caused by not having the information we need. Therefore, adding more information will reduce uncertainty. That certainly seems simple enough.
.
Claim 4: We can reduce uncertainty by gathering more information.
.
The people in our sample agreed with this statement. Their average rating was 5.57. Of the 164 people who responded to this statement, eight indicated some degree of disagreement.
.
Disclaimer
There are different types of uncertainty. Sometimes we are uncertain because we don’t have the information we need. That’s the type of uncertainty that claim 4 covers. Sometimes we have the information but we don’t know if we can trust it. Sometimes we trust it but it conflicts with other information we also believe. And sometimes we believe it but we can’t figure out what it means. Claim 4 covers only the first type of uncertainty, which stems from missing information.
.
When we are faced with the other types of uncertainty, adding more information may not help at all. If I’m going to doubt the accuracy of any information I receive, adding more information just gives me more to doubt. If I believe in a data point but it conflicts with others, then adding more data may add to the conflicts instead of resolving them. And if the situation is too complex to sort out, adding more data may increase the complexity, not decrease it.
A useful way to think about uncertainty is to distinguish between puzzles and mysteries. A puzzle is easily solved with the addition of a critical data point. For example, as I write this (in 2008) we don’t know exactly where Osama bin Laden is hiding. That is knowable. He is somewhere. We just don’t know where he is, or even if he is alive. But if an informer were to provide bin Laden’s current location, the puzzle would be solved.
.
A mystery isn’t solved by critical data. It requires more analysis, not more data. If we want to know what the future will bring to Iraq, no data point will give us the answer. No amount of data will eliminate our uncertainties about whether China is a potential business partner of the United States or an inevitable military, political, and commercial threat.
Claim 4 aims to solve puzzles, not mysteries. Mysteries emerge from ambiguous and complex situations. Even if we have the data we need, and know what data points to trust, and they aren’t inconsistent with each other, we still aren’t sure how to explain past events or anticipate future ones. Mysteries require sensemaking. Adding more data doesn’t necessarily improve success in resolving mysteries.
...
Too much information can make things worse. As we add more and more information, the value of each successive data point gets smaller and smaller while the strain of sorting out all the information keeps increasing. Eventually, we may reach a point where the additional information gets in our way. We would do better to stop gathering more information before this point, but most of us keep seeking more data. We can’t stop ourselves. We have become data junkies.
...
If we adopt claim 4 as our creed, we will know just what to do when we feel uncertain. We will gather more data. The more the uncertainty, the more strenuous the data gathering. We won’t stop until our uncertainty disappears or we become exhausted, whichever comes first."
Excerpts taken from "Streetlights and Shadows: Searching for the Keys to Adaptive Decision Making" by Gary Klein.
