Many of our analyses are probably based on crappy data

This thought crossed my mind once again when I read this passage from India: A Portrait (p. 118):

The politician Harold Cox, who had taught mathematics at Aligarh Muslim University, was once advised by a judge in India: “Cox, when you are a bit older, you will not quote Indian statistics with that assurance. The Government are very keen on amassing statistics – they collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But what you must never forget is that every one of those figures comes in the first instance from the chowty dar [chowkidar], who just puts down what he damn pleases.”

From Wiktionary:

chowkidar – watchman, caretaker, gatekeeper; one who inhabits a “chowki”, police station or guard house.

This quote pretty much describes IPEDS, the Common Data Set, and rankings data.