How representative is online research?

Representativeness is like the El Dorado of market research: a mystical place whose existence is passed down by word of mouth, even though it is hardly documented in the literature. In this article, we assess the representativeness of online research and scrutinize four significant biases that can affect the outcomes of online studies.

What does representative mean?

Generally, the term representative refers to something that accurately reflects or corresponds to a larger group or whole. In market research, we are referring to sample representativeness. A representative sample is intended to mirror the characteristics of the target population. In this setting, a population means a defined group of people sharing a set of similar characteristics (e.g. demographics such as age or gender, or other variables like currently owning a dog). The idea is that the sample should allow valid inferences about the larger group.

Therefore we say that a sample is representative if the results from analysing the sample can also be deemed true for the whole corresponding population.

If you analyse this definition of representativeness, you will note that there is no representativeness as such. The term representative always requires a “for”, e.g.

  • representative for the general population (a.k.a. nationally representative)
  • representative for the general population between 18 and 65 years
  • representative for the internet population between 18 and 65 years

Statisticians claim that only random samples can be representative. By their logic, quota samples cannot be representative because it is impossible to know the right quotas before conducting a representative study. However, this is a very theoretical argument. In practice, researchers have a lot of knowledge about how variables are distributed among their target groups and can set the quotas accordingly. In addition, census data from the national statistical offices allow researchers to validate and adjust their models continuously.
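To make the quota idea concrete, here is a minimal sketch of how known population shares could be turned into interview targets per group. The function name quota_targets and the age shares are assumptions made up for illustration; they are not real census figures and not a description of any particular provider's tooling.

```python
def quota_targets(population_shares, sample_size):
    """Turn known population shares into per-group interview targets.

    Uses largest-remainder rounding so the targets add up to sample_size.
    Assumes the shares sum to 1.
    """
    raw = {group: share * sample_size for group, share in population_shares.items()}
    targets = {group: int(value) for group, value in raw.items()}
    leftover = sample_size - sum(targets.values())
    # Hand the remaining interviews to the groups with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda g: raw[g] - targets[g], reverse=True)
    for group in by_remainder[:leftover]:
        targets[group] += 1
    return targets


# Hypothetical age distribution, for illustration only (not census data).
shares = {"18-29": 0.22, "30-49": 0.40, "50-65": 0.38}
targets = quota_targets(shares, 1000)
small_targets = quota_targets(shares, 10)
```

In practice the shares would come from an official source such as the national statistical office, and quotas are often interlocked across several variables (for example age by gender) rather than set on one variable alone.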

At the same time, it is getting harder to draw a perfect random sample. Telephone and personal interviews, for example, have to deal with a growing proportion of people who refuse to participate in research projects. So even these methods have their limitations. At the end of the day, random sampling techniques are much more expensive and, for many research questions, provide little added value compared to quota sampling.

Is online research biased?

Four steps need to be taken before a group of people from the general population can leave their answers in a data file. The following section will analyse these four steps and stress the corresponding limitations for representativeness.

Coverage bias

If you want to assess representativeness for the general population, you should start by looking at internet penetration. In Europe, an average of 84% of people use the internet daily. In Western and Northern Europe in particular, internet penetration is significantly higher, with excellent values in Norway, Denmark and the Netherlands (96% each). In these countries, online research misses out on less than 5% of the overall population.

Of course, the overall internet penetration is only relevant if you want to draw conclusions that are representative for the general population. In many cases, though, the results just need to be representative for the online population, especially when the research topic is e-commerce, online advertising or something similar. In this case you don’t have to worry about missing out on the so-called “nonliners”.

And who are these nonliners exactly? In the past, the online population was considerably younger, better educated and more interested in technology than the general population. These differences have shrunk in recent years and continue to disappear as overall internet penetration increases. The advent of mobile internet usage in particular has brought less educated and older people online.

All in all, the coverage bias is only relevant if you want to draw conclusions for the general population and not just the online population. In this case you should try to figure out whether your research topic is correlated with education or technical affinity in order to gauge the possible bias. Unfortunately, there is nothing you can do about the coverage gap itself as a researcher. If you are seriously worried about the data quality, you have to choose a different method of data collection. However, the coverage bias is not much of an issue for most topics and in most countries, and thankfully it is even declining.
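A rough way to gauge how much coverage bias could matter is a small back-of-the-envelope calculation. The sketch below uses the standard decomposition of the population mean into an online and an offline part; the function name coverage_bias and the agreement rates for onliners and nonliners are hypothetical values chosen purely for illustration.

```python
def coverage_bias(coverage, mean_online, mean_offline):
    """Bias of an online-only estimate relative to the full-population value.

    coverage     : share of the population that is online (0..1)
    mean_online  : value of the measure among people who are online
    mean_offline : value of the measure among nonliners
    Algebraically this equals (1 - coverage) * (mean_online - mean_offline).
    """
    population_mean = coverage * mean_online + (1 - coverage) * mean_offline
    return mean_online - population_mean


# Hypothetical example: 96% coverage, 60% of onliners agree with a statement,
# but only 30% of nonliners do.
bias_high_coverage = coverage_bias(0.96, 0.60, 0.30)  # about 1.2 percentage points

# Same opinion gap with 84% coverage.
bias_lower_coverage = coverage_bias(0.84, 0.60, 0.30)  # about 4.8 percentage points

# If onliners and nonliners do not differ on the topic, the bias is zero.
bias_no_difference = coverage_bias(0.84, 0.50, 0.50)
```

The calculation shows why both factors matter: the bias only becomes large when the uncovered share is sizeable and the topic really differs between onliners and nonliners.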

Selection bias

Before digging deeper into panel recruitment, we have to introduce a very important distinction. Whenever you are actively selecting and inviting eligible panel members, you are in perfect control of who is joining the panel. This is what we call active recruitment. In contrast, whenever you allow people to subscribe to the panel themselves, you are not in control of who is joining the panel. This is what we call open recruitment.

Most panels operate with open recruitment, because it is a very cost-effective and uncomplicated way of growing a panel. However, there are clear downsides to it. First of all, you will mainly attract people who have a self-interest in taking surveys, and these people are not representative for the general population. Secondly, it is nearly impossible to keep people out of a panel who do not meet the required demographics or do not comply with your quality standards in some other way.

At Norstat, we believe in the quality of active recruitment and, therefore, don’t have a public registration page for our panels. In countries where we have call centers, we randomly select people from the general population and invite them by phone to join our panels. This makes our panels as representative as possible. In all other countries we use a very broad mix of recruitment channels so that no single source can have a major impact on the overall quality. Registrants from these sources are directed to a hidden subscription page that can be shut off at any moment. In fact, this happens every time we identify quality issues or fraudulent behaviour coming from a specific recruitment source. Finally, we can boost any demographic group with targeted recruitment if we feel that the panel structure is not balanced.

In a nutshell, if panel providers invite the wrong people to join their panels or miss out on a specific target group, the panel is not representative. All the efforts at this stage may not be visible to many buyers, but they definitely make the difference between a high-quality research panel and a non-representative mailing list. Therefore, you should always compare panel providers by how they recruit into their panels. And by the way, our registration forms are 100% mobile friendly to make sure that we don’t exclude mobile users from joining our panel. You would be surprised by how many panels do not yet offer this as standard. We’ll come back to this topic later on.

Non-response bias

Let’s now assume we have a representative online panel. If we drew a random sample of panel members, this sample would also be representative. Unfortunately, not all of the members would click on the link in the invitation email. Some groups would respond very quickly, others would require more time and some would hardly respond at all. At the end of the field period, the data set would not be representative for the target population, although it started out as a perfect random sample.
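The effect is easy to see with a little arithmetic. The following sketch computes the expected composition of the completed interviews when groups respond at different rates; the group labels and response rates are hypothetical and only serve to illustrate the mechanism.

```python
def completed_shares(invited, response_rates):
    """Expected share of each group among completed interviews."""
    completes = {group: invited[group] * response_rates[group] for group in invited}
    total = sum(completes.values())
    return {group: count / total for group, count in completes.items()}


# A perfectly balanced invitation sample (hypothetical numbers).
invited = {"under 40": 500, "40 and over": 500}

# Hypothetical response rates: one group answers three times as often.
rates = {"under 40": 0.20, "40 and over": 0.60}

shares = completed_shares(invited, rates)
# The invitations were split 50/50, but the completes are split roughly 25/75.
```

Although both groups received the same number of invitations, the group with the higher response rate ends up dominating the final data set.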

Non-response is a serious issue – not only for online research but for all methods of data collection. Interestingly, online panels in particular achieve satisfactory response rates, because their members expect to receive survey invitations on a regular basis. In addition, they have learned over time that they can trust the panel provider with regard to privacy and data protection. This is probably the biggest difference from the calls of telephone interviewers, which are unsolicited. Furthermore, in online research you can simply extend the field period and send reminders to increase the response rate.

To avoid skewed distributions in your final data, you should set quotas. These can be either soft quotas (meaning the project manager will try to achieve the desired distribution on a best-effort basis) or hard quotas (meaning that respondents won’t be able to complete the interview once you have collected enough responses in their target group). A third alternative is working with weighting factors to adjust for over- or under-represented subgroups.
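As a rough illustration of hard quotas and weighting factors, consider the sketch below. It uses simple cell weighting (population share divided by achieved sample share), which is one common approach among several; the function names and all figures are assumptions made up for this example.

```python
def hard_quota_full(group, completes, quota):
    """True once a group has reached its hard quota and is closed."""
    return completes.get(group, 0) >= quota[group]


def cell_weights(sample_counts, population_shares):
    """Weight per group = population share / achieved sample share.

    Assumes every group has at least one completed interview.
    """
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / n)
        for group, count in sample_counts.items()
    }


# Hypothetical fieldwork result: women are over-represented.
completes = {"men": 300, "women": 700}
population = {"men": 0.5, "women": 0.5}

# With a hard quota of 500 per group, women would already be closed.
quota = {"men": 500, "women": 500}
women_closed = hard_quota_full("women", completes, quota)
men_closed = hard_quota_full("men", completes, quota)

# Without hard quotas, weights can rebalance the data afterwards.
weights = cell_weights(completes, population)
weighted_counts = {g: completes[g] * weights[g] for g in completes}
```

Here each man counts for about 1.67 interviews and each woman for about 0.71, so both groups contribute equally to the weighted total. Large weights like these reduce the effective sample size, which is one reason quotas are usually the first line of defence.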

All in all, the non-response bias can distort your results dramatically. Fortunately, Norstat achieves response rates above average due to our rigorous focus on our members’ motivation.

Exclusion bias

Last but not least, we have to talk about mobile devices. If your survey template does not allow mobile users to answer your questionnaire, you will systematically miss out on a large proportion of the population. Depending on the country, about two thirds of the population frequently use the internet with a smartphone or tablet. Hence, excluding them from a survey may have a massive impact on the overall quality of your study.

Please note that the exclusion bias is very often forgotten when talking about representativeness, especially by those who are not serious about data quality in the previous steps. If the subscription page of your panel is not mobile friendly, you systematically exclude mobile users at an earlier stage. And if you don’t use responsive email templates or a mobile app when inviting your panel members, the response rate may be lower, because you also miss out on mobile users.

To keep it short, if you strive for representative results, you should make sure that your survey is suitable for all devices. At Norstat, we have many years of experience in questionnaire design and will be happy to script your survey in the best possible way and for all devices.

The conclusion

So how representative is online research? We believe it can be very representative, as long as you don’t compromise on quality. At Norstat, we do our best to deliver the premium quality our clients expect from us. And we try to be as transparent as possible in this process, because we have nothing to hide.

But you can also see that a lot of things can go wrong when conducting online research with panels. This doesn’t mean that other methods are better; their challenges are just different. In any case, online research requires professional care in order to generate reliable and representative results.

Feel free to drop us a line, if you want to learn more about our quality standards. We’re looking forward to hearing from you!
