You’ve heard the expression good things come to those who wait. But what about those who weight?
How good is your online survey?
That depends. We’d have to look at a number of factors and how we measure what “good” is. But how well your survey represents the people you’re studying is a pretty strong indicator of data quality.
What Does Representation Really Mean?
The basic idea is that your sample (the people you interview) should accurately reflect the population (the group you’re studying). The people who answer your survey should be as similar as possible to the people you’re trying to understand.
This can be difficult to achieve. Respondents are rarely exactly the same as the entire group. Often, surveys garner too many responses from some types of people and not enough responses from others. We call this over- or undersampling.
The different groups within your population are defined by the characteristics of your respondents. (In the data world, we call these groups strata.)
Different groups tend to vary in personal preferences, lifestyle choices, and social behaviours. For example, labourers often have a different perspective than their managers. Youth lead very different lives than their grandparents. Maintaining your population’s mix of characteristics is the key to getting results you can rely on.
With me so far? Good.
How Does Weighting Help Online Surveys?
While you can’t control who responds to your online survey, you can control how much impact each respondent’s answers have on your results. This is called weighting. Weighting makes your online survey results represent the actual population you are gathering information on.
I’ve already talked about how to weight your online survey results manually. In this post, I’m going to focus on how Veracio uses weighting in online surveys. (Short answer: automatically and without any extra work for you!)
Veracio uses something called post stratification to weight your online survey results.
- Post (after), because the calculations are done after the data is collected
- Stratification, because we use known strata (group characteristics) to correct for the fact that your sample doesn’t effectively represent your population
What Exactly Does Weighting Do?
Let’s imagine we’re doing a survey of the entire US. Using US Census data, we find an estimate of how much of the population is men vs. women. (Let’s imagine it’s 50/50.) These are the population parameters.
Now say 100 people responded to our survey — 28 women and 72 men. Right away, we can see our sample data is not representative of the whole population across the gender strata. We have oversampled men and undersampled women.
To correct this, we create a new variable: a weighting variable. This variable has higher values for undersampled groups. It pushes the statistical importance of oversampled groups (in this case, men) down and the statistical importance of undersampled groups (women) up. Each individual gets their own weight based on their characteristics. Veracio applies that variable in the analysis to make final results more accurate.
Each weight is the group’s share of the population divided by its share of the sample. Each man gets a weight of 0.69 (50 ÷ 72); each woman gets a weight of 1.79 (50 ÷ 28). When we create tables or charts with the data, Veracio weights each individual’s answer with these values.
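The arithmetic behind those two numbers can be sketched in a few lines of Python. This is a minimal illustration, not Veracio’s actual code; the function name `poststratification_weights` and the dictionary layout are assumptions made for the example.

```python
def poststratification_weights(population_shares, sample_counts):
    """Return one weight per stratum: population share / sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for stratum, count in sample_counts.items():
        sample_share = count / total
        weights[stratum] = population_shares[stratum] / sample_share
    return weights


# The example from the text: a 50/50 population, 72 men and 28 women responding.
weights = poststratification_weights(
    {"men": 0.5, "women": 0.5},
    {"men": 72, "women": 28},
)
print({group: round(w, 2) for group, w in weights.items()})
# prints {'men': 0.69, 'women': 1.79}
```

Multiply each weight by its group’s head count and you get 50 and 50, so the weighted sample matches the population split.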
What Kind of Magic Is This?
It might seem like magic (math is kinda magical, if you look at it right), but it’s not, really. To apply weighting to our online survey results, Veracio figures out:
- How many people there are in each stratum of the population
- How many people there are in each stratum of the sample
- What the level of non-response might have been for the population
- The probability that each person might have been selected for the survey
An individual’s weight is higher:
- If there is a high level of non-response in their stratum
- If they had a low probability of being selected for the survey
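One common textbook way to combine these two factors is to take the inverse of the chance that a person was both selected and responded. The post doesn’t spell out Veracio’s own formula, so treat this as an assumption; the function name `base_weight` and all the numbers are hypothetical.

```python
def base_weight(selection_probability, response_rate):
    """Inverse of the chance that a person was both selected and responded."""
    if not 0 < selection_probability <= 1 or not 0 < response_rate <= 1:
        raise ValueError("probabilities must be greater than 0 and at most 1")
    return 1.0 / (selection_probability * response_rate)


# Same selection probability, different response rates (made-up numbers).
high_response = base_weight(selection_probability=0.01, response_rate=0.8)
low_response = base_weight(selection_probability=0.01, response_rate=0.2)

# The stratum with more non-response ends up with the larger weight.
print(low_response > high_response)
```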
If only a few people from a specific group respond to the survey, they are assigned a very large weight. That means if 3 men and 97 women in our 50/50 population responded, each man would get a weight of about 16.7 (50 ÷ 3), and the male answers would count very heavily in the survey results.
This poses a new problem.
With only three responses, we don’t really know if those responses are reliable. Giving them increased weight means a few unusual answers could distort the results. When we have worryingly large weights attached to very small pieces of questionable information, it’s natural to want to limit the high weights, even at the risk of introducing bias, to prevent any single piece of data from having too great an impact on the final results. This is called trimming.
Trimming sets the maximum weight allowed in your survey results. While there is no hard and fast rule as to the appropriate limit for weighting, best practices currently dictate that three is a reasonable cutoff. (Three is the number we use, meaning no individual can be weighted higher than three.)
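Trimming itself is simple to sketch. The snippet below uses the example of 3 men and 97 women with a cap of three. The function name `trim_weights` is hypothetical, and this post doesn’t say whether Veracio rescales the weights after capping, so the sketch only applies the cap.

```python
def trim_weights(weights, cap=3.0):
    """Cap every weight at `cap`; weights at or below the cap are unchanged."""
    return [min(w, cap) for w in weights]


# 50/50 population, 3 men and 97 women responded.
men_weight = 0.5 / (3 / 100)      # about 16.67
women_weight = 0.5 / (97 / 100)   # about 0.52

raw = [men_weight] * 3 + [women_weight] * 97
trimmed = trim_weights(raw, cap=3.0)

print(round(max(raw), 2), max(trimmed))
```

After trimming, the three men together count for 9 weighted responses instead of 50. That gap is the bias we accept in exchange for more stable results.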
Now You’ve Got It!
You now know the basics of how weighting improves your online survey — pretty awesome, right? But what’s even better is that to use Veracio you don’t really need to know any of this because the tool takes care of all the work for you.
If your newfound weighting expertise has inspired you to create a new survey (or even recreate an old one to see what kind of difference weighting makes!), why not get started right now?
Still have more questions about weighting and online surveys? No problem. Next week we’ll dig even deeper and learn about using more than one weighting variable at a time — but if you just can’t weight, erm, wait, get in touch with us now.