As the nineteenth-century British Prime Minister George Canning said, “I can make statistics tell me anything except the truth.”
This is a top-class analysis of the biases that can occur unintentionally in the production of statistics, but there are also biases that occur intentionally. An example of this was the so-called “shy Trumper” effect in the 2016 US election (which spawned lots of adolescent jokes in Britain, as trumping is a euphemism for farting). The explanation for Trump’s victory having turned the polling data upside down was that many people who supported Trump were reluctant to state their position. The same thing may have occurred in our Brexit referendum (without the cheap jokes), though in both cases I believe much of the error was due to Trump or Brexit supporters simply declining requests to take part in information-gathering exercises. Whatever their reasons, I recall reading on the UK Polling Report blog that in the 1980s polling organisations would get positive responses from about 50% of the people they approached; in the past few years this had dropped to just above 20%.
Now, whether this reluctance is spread evenly across the population or applies more to some groups than others is anybody’s guess. Would it be older, more conservative people, the group most of us might assume to be more reticent, or young people who would simply rather listen to music or browse the internet? Though neither group would be likely to think of it in such terms, both would be intentionally skewing the sample.
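To see how much damage uneven refusal rates can do, here is a minimal sketch in Python. The numbers are entirely invented for illustration (not taken from any real poll): a population split 52/48 on some question, where supporters of the majority position are assumed to answer pollsters less often.

```python
import random

random.seed(42)

# Hypothetical population: 52% "leave", 48% "remain" (invented figures).
population = ["leave"] * 52000 + ["remain"] * 48000

# Assumed differential response rates: leave supporters are
# less willing to take part when approached.
response_rate = {"leave": 0.15, "remain": 0.25}

# Only those who agree to respond end up in the sample.
sample = [p for p in population if random.random() < response_rate[p]]

leave_share = sum(1 for p in sample if p == "leave") / len(sample)
print("True leave share:   52.0%")
print(f"Polled leave share: {leave_share:.1%}")
```

Even though the poll faithfully tallies everyone who answers, the 52% majority comes out looking like a clear minority, simply because one group hangs up the phone more often.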
The biggest problem with statistical analysis is this: how do you get a grip on data that is slipperier than an eel?