NY Times: Most of our time is spent on data cleaning, not data analysis

Yet far too much handcrafted work — what data scientists call “data wrangling,” “data munging” and “data janitor work” — is still required. Data scientists, according to interviews and expert estimates, spend from 50 percent to 80 percent of their time mired in this more mundane labor of collecting and preparing unruly digital data, before…

When discrimination is baked into algorithms

A recent ProPublica analysis of The Princeton Review’s prices for online SAT tutoring shows that customers in areas with a high density of Asian residents are often charged more. When presented with this finding, The Princeton Review called it an “incidental” result of its geographic pricing scheme. The case illustrates how even a seemingly neutral…

Malcolm Gladwell is skeptical of big data

Malcolm Gladwell had some bad news for mobile marketers: Just because you have more data doesn’t mean you’re going to make better decisions. At Tune’s 2015 Postback conference on Thursday, Gladwell outlined the gap between what we may think we know about audience and the truth. In normal Gladwellian-style, the best-selling author of The Tipping Point…

The Supreme Court, big data, and citizenship

The case, Evenwel v. Abbott, poses a question: whether the Constitution’s long-standing “one person, one vote” principle requires equal numbers of voters per district instead of equal numbers of people, as is current practice. Most commentary on the case has focused on its implications for political parties and racial groups. But focusing on the politics, or even on the…

Some schools understand the value of data

In this small suburb outside Milwaukee, no one in the Menomonee Falls School District escapes the rigorous demands of data. Custodians monitor dirt under bathroom sinks, while the high school cafeteria supervisor tracks parent and student surveys of lunchroom food preferences. Administrators record monthly tallies of student disciplinary actions, and teachers post scatter plot diagrams…

Artist helps scientists visualize data

For the past year or so genetic scientists at the Albert Einstein College of Medicine in New York have been collaborating with a specialist from another universe: Daniel Kohn, a Brooklyn-based painter and conceptual artist. Mr. Kohn has no training in computers or genetics, and he’s not there to conduct art therapy classes. His role…

Innovator’s Dilemma sounds as valuable as big data

The theory of disruption is meant to be predictive. On March 10, 2000, Christensen [author of The Innovator’s Dilemma] launched a $3.8-million Disruptive Growth Fund, which he managed with Neil Eisner, a broker in St. Louis. Christensen drew on his theory to select stocks. Less than a year later, the fund was quietly liquidated: during…

How to get into an Ivy League college, guaranteed

A former hedge fund analyst uses data on student admissions to craft contracts guaranteeing admission into certain types of schools, proving once again that quants rule the world. Note the somewhat hysterical comments from admissions personnel, who refuse to admit that he can find general patterns in his data about their behavior. http://www.businessweek.com/articles/2014-09-03/college-consultant-thinktank-guarantees-admission-for-hefty-price#p1