Big Data

Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz

Grade: B+

In Everybody Lies, the author uses Google Trends, Google AdWords, and other sources of Big Data, and he finds some interesting things. For one, the data show that people lie on surveys and in polls. If people used as many condoms as they claim to on surveys, the total would far exceed the number of condoms sold. Likewise, when you ask people whether they’re registered to vote and plan to vote in the next election, most will say yes, but some of them are lying, as the actual registration and voting numbers show.

Google searches are supposedly a better predictor of voter turnout than surveys are. People who are really interested in voting tend to look for information about it. I don’t doubt that. Personally, I did at least two web searches before the last election. First, I double-checked my polling location, because it changes from time to time. Then I looked for a sample ballot.

So, to some degree, people are more honest with Google than they are with studies and surveys. This makes sense. People go to Google because they want an answer, and they know they won’t get the right one unless they ask the right question. That means being honest. Google doesn’t care what you say anyway. It’s just a machine. People, on the other hand, tend to judge us.

Other potentially interesting things mentioned in the book:

  • Horses with large left ventricles are more likely to be good racehorses.
  • Areas with a lot of racist search queries also tended to be areas where Trump had strong support in 2016 (and where Obama had had little in previous elections).
  • Since the crackdown on abortion began in certain places earlier this decade, the number of Google searches for self-induced abortions has gone up. When all the numbers are examined, places where abortions are hard to get have fewer live births than one would expect, suggesting that a lot of women do indeed figure out how to self-induce an abortion.

But I had some issues with the information and how it was collected. First, as a person who uses Google in strange ways, I’m leery of reading too much into anonymous search query data. More importantly, I’m not keen on how Google and other big tech companies mine our search history and web activity for more ways to exploit us. These companies are not beneficent. They do not collect the data for our benefit. Any good that comes from it is unlikely to be enough to compensate for the harm it can do. So I don’t share the author’s enthusiasm for Big Data, and reading about it made me cranky. I recommend the book, though, because everyone ought to know more about the data that they’re surrendering through their apps and web services.
