Wednesday, December 17, 2014
Dataclysm: Who We Are*
Christian Rudder’s “Dataclysm: Who We Are (When We Think No One’s Looking)” claims that, in order to understand racism, sexism and bias, we must look to the patterns in large numbers of individual instances. In this sense, the subtitle’s “We” is almost literal. For the first time, data about how individuals’ biases play out in spontaneous interactions is available for analysis on a massive scale. OkCupid, the online dating site of which Mr. Rudder is co-founder and president, alone has some five million users. This opens up all sorts of possibilities. Instead of asking people survey questions, he goes and looks at what actually happens when 100,000 white men and 100,000 black women interact in private.
Mr. Rudder used this evidence to explore the mathematics of human attraction, publishing the results on the site’s often provocative OkTrends blog. How many years does the camera flash add? Seven: compared with natural lighting, flash-lit photos cause the same drop in attractiveness rating as being seven years older. Do OkCupid users with higher average attractiveness ratings get more dates than users with lower average ratings but with more variance in the individual ratings they receive? OkCupid users whose photos got wildly different attractiveness ratings from different suitors went on just as many dates as those judged more uniformly appealing. In fact, Mr. Rudder found, the best strategy for getting dates was to play up one’s most polarizing feature (tattoos, odd hair), which produces more enthusiastic responses.
“Dataclysm” emerges from the OkTrends blog as a more comprehensive discussion of the provocative results that Mr. Rudder and like-minded researchers in the social sciences and in tech are producing from this sort of data. The book is divided into three broad topics: sex and relationships; culture and politics; and the ways in which individuals identify themselves.
Tidy questions about some of the most hotly debated topics are given straightforward answers that range from amusing to unsurprising to unpleasant. As a researcher, Mr. Rudder clearly possesses the statistical acumen to answer the questions he has so deftly posed. As a writer, he keeps the book moving while fully exploring each topic, revealing his graphs and charts with both explanatory and narrative skill. He offers explanations of what the data can and cannot tell us, why it is sufficient or insufficient to answer some question we may have and, if the latter is the case, what sufficient data would look like. He shows you, in short, how to think about data.
He closes with a section reflecting on the risks and rewards that come with companies having free and open access to their users’ data. While “Dataclysm” aligns itself with user research, Mr. Rudder himself is associated, in the eyes of many users, with a more worrisome kind of user experimentation and commercialism. OkCupid is an ad-supported business, after all. “Dataclysm” may make an excellent case for the necessity of user data in social-science research, but it does little to justify experimentation on users. Interestingly, many questions addressed in the book didn’t require such experimentation to answer. The answers were already in the data; for better or worse, it was just a matter of looking.