Pretty damn random:
The NIPS consistency experiment was an amazing, courageous move by the organizers this year to quantify the randomness in the review process. They split the program committee down the middle, effectively forming two independent program committees. Most submitted papers were assigned to a single side, but 10% of submissions (166) were reviewed by both halves of the committee. This let them observe how consistent the two committees were on which papers to accept. (For fairness, they ultimately accepted any paper that was accepted by either committee.)
In particular, about 57% of the papers accepted by the first committee were rejected by the second one and vice versa. In other words, most papers at NIPS would be rejected if one reran the conference review process (with a 95% confidence interval of 40-75%)
This result was surprisingly large to most people I’ve talked to; they generally expected something like 30% instead of 57%. Relative to what people expected, 57% is actually closer to a purely random committee, which would only disagree on 77.5% of the accepted papers on average [given the targeted acceptance rate both committees were supposed to meet].