What sample size is representative?

What makes a sample representative?

Let’s start with a story that has become the founding myth of market research. About a century ago, the American magazine The Literary Digest started conducting opinion polls among its ten million readers to predict the results of presidential elections. In five successive elections its predictions were right, until it failed massively in 1936, even though it had conducted about 2.4 million interviews among its readers. To the magazine’s surprise, George Gallup predicted the result of this election correctly with “only” 50,000 interviews.

So, what happened? The Literary Digest’s sample failed because its readers weren’t representative of the general population. They had a different age structure, a different average income and, apparently, different political preferences. Gallup, in contrast, understood that representativeness depends less on the size of the sample than on its composition. He simply used quotas to make sure that every group of people was correctly represented in his sample. This breakthrough was the starting point for market and opinion research as we know it today.

For representativeness, it is not the size that matters but the right composition. But is that plausible? In the 1960s, A.C. Nielsen Jr. gave an interesting answer to those who believed that a larger sample would be more representative.

“If you don’t believe in random sampling, the next time you have a blood test, tell the doctor to take it all.” – A.C. Nielsen Jr.

Despite its undeniable sarcasm, this quote provides us with a very comprehensible analogy. It doesn’t matter if you analyse a drop of blood or if you take a whole liter of it: the analysis findings will always be the same. One drop of blood perfectly represents all of it.

Why does sample size matter?

Obviously, sample size is still important. But why exactly does it matter? Even when your sample is representative of a population, some of the target variables may be over- or underrepresented in it purely by chance. Unfortunately, “by chance” means there is really nothing you can do about it when collecting the data.

Statistical calculations can at least help you estimate the likelihood that your error stays within a certain margin, e.g. that deviations from the real value are less than x percentage points at a confidence level of 95%.

For opinion researchers, a confidence level of 95% is the most common option. Here, your risk is no more than 5% that the real value is outside the corresponding margin of error. In other disciplines, however, a confidence level of 99% might be the standard (e.g. in the pharmaceutical industry, as statistical errors can be a question of life and death).

Given the confidence level, you can calculate the margin of error for each value of a distribution. Let’s say your survey gives you a market share of 50% and the corresponding margin of error is 3 percentage points (at a 95% level). Then your risk is no more than 5% that the real market share is lower than 47% or higher than 53%.

If you want to reduce the margin of error (given a certain confidence level), you basically have only one choice: you have to increase the sample size.
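To make this relationship concrete, here is a minimal Python sketch. It assumes the usual normal-approximation formula for a share in a simple random sample; the article does not say which formula it relies on, and the function names are purely illustrative.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(share: float, n: int, confidence: float = 0.95) -> float:
    """Margin of error for an observed share in a simple random sample of size n.

    Assumption: normal approximation, no finite-population correction.
    """
    return z_score(confidence) * math.sqrt(share * (1 - share) / n)


def required_sample_size(margin: float, share: float = 0.5,
                         confidence: float = 0.95) -> int:
    """Smallest n whose margin of error does not exceed `margin`."""
    z = z_score(confidence)
    return math.ceil(z ** 2 * share * (1 - share) / margin ** 2)


# A 50% share with a 3-point margin at the 95% level needs roughly
# a thousand interviews.
print(required_sample_size(0.03))                    # 1068
# The same margin at the 99% level needs a noticeably larger sample.
print(required_sample_size(0.03, confidence=0.99))
```

Because the margin shrinks with the square root of the sample size, halving the margin of error requires roughly four times as many interviews.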

How to decide your sample size?

To determine your sample size, you often have to start from the end and work your way backwards to the beginning. For the sake of clarity, however, we will briefly walk you through the interviewing process in the right order and explain the final statuses a respondent can get.

Figure: The final statuses a respondent can get, from invitation to non-response, screen out, quota fail, break off or complete.

It all starts with sending invitations to our panel members. Out of all those invited, only a portion will actually click on the link and start the survey. That’s what the response rate describes (the percentage of responses relative to the total number of invitations to participate). At the beginning of a survey, we typically have some screening questions to identify the desired target group. The percentage of eligible respondents at this stage is reflected in the incidence rate (the percentage of individuals in a target population who meet the specific criteria required for a study).

Once we have made sure we have the right target group, we check the quotas and end the interview for those respondents whose quotas have already been filled. Quotas are usually assessed after the screener, so that we can measure the incidence rate without interference from quotas. Respondents who fit into an open quota can participate in the main survey. Nonetheless, some may break off during the interview and never reach the end page. Finally, those reaching the end of the survey are counted as completed interviews.

Break offs

As mentioned earlier, the process of determining feasibility begins with the required number of completed interviews and then works backwards to calculate the necessary number of invitations. So, let’s say we are conducting a study requiring a total of 1,000 interviews. The first step is estimating the number of break offs during the main interview (also referred to as “drop outs”, “partials” or “abandonments”).

So, what’s a reasonable assumption for the break off rate? It mainly depends on the survey itself. If the questionnaire is lengthy, repetitive or about a topic that is not very relevant to the respondents, more break offs can be expected. Technology also plays an important role. If the survey relies on outdated technology (e.g. Flash) or is not mobile-friendly (e.g. not responsive), users may have a hard time completing the survey. Our experienced project managers will be happy to help you optimise your questionnaire to keep the number of break offs as low as possible!

Now, let’s assume a break off rate of 2% in our example. That means we’ll need about 1,020 respondents starting the main interview (1,000 / 0.98 ≈ 1,020.4).

Quota fails

The next step involves estimating the number of quota fails, which is often the most challenging task and requires an experienced project manager.

Quota definitions can be quite complex. They can include numerous variables, they can be interlocking or non-interlocking and sometimes respondents even get assigned to them by chance (think of monadic tests). In theory, the available variables of our panel members’ profiles should help us to invite only the right participants and avoid any quota fails. However, this isn’t always possible in practice. We may not always have access to all the required profiles, and if the field period is too short, we might not have the opportunity to gradually and meticulously meet the different quotas.

In summary, quota fails are almost inevitable in the majority of cases. Their extent depends a lot on the specifications of the study (i.e. quota plan, field period), but also on the project manager’s experience. Meeting all quotas within the time frame while maintaining the panel can be a significant challenge, and it distinguishes experienced samplers from inexperienced ones.

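Let’s assume 20% quota fails in our example. Dividing the required main-interview starters by the 80% who fit an open quota, we’ll need 1,276 respondents to pass the screener; this figure already accounts for the break offs.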

Screen outs

Estimating the number of screen outs is relatively easy, as the incidence rate is usually part of the proposal. This incidence rate should ideally equal the proportion of respondents that make it through the screener and is typically independent of any other factors.

Let’s assume an incidence rate of 50% for our example. That gives us a required number of 2,552 starters.

Response rate

The last step in our calculation answers the question of how many members we’ll have to invite in order to get 2,552 starters. The response rate depends to some degree on external factors (such as time of day, weekday, weather, holiday season, etc.). The quality of the panel also plays a role, as do the parameters of the study itself: if the survey is suitable for mobile devices, we can push the invitation to our panel app and thereby boost the response rate.

If we say it is 45% for our example, we would need to invite a total of 5,669 panel members. That’s the minimum number required to meet the specifications of this exemplary study. But as you would see in our panel book, even our smallest online panel is big enough to carry out this kind of study.
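The backwards calculation can be sketched in a few lines of Python. This is only an illustration using the example rates from this article; the function and key names are made up for the sketch. Unrounded values are carried through the chain and each stage is rounded up to whole respondents, so the first step shows 1,021 rather than 1,020.

```python
import math


def required_invitations(completes: int,
                         break_off_rate: float,
                         quota_fail_rate: float,
                         incidence_rate: float,
                         response_rate: float) -> dict:
    """Work backwards from completed interviews to invitations.

    Each stage divides by the share of respondents who survive it.
    """
    main_starters = completes / (1 - break_off_rate)    # start the main interview
    screened_in = main_starters / (1 - quota_fail_rate)  # pass the screener
    starters = screened_in / incidence_rate              # start the survey
    invitations = starters / response_rate               # get invited
    return {
        "main_starters": math.ceil(main_starters),
        "screened_in": math.ceil(screened_in),
        "starters": math.ceil(starters),
        "invitations": math.ceil(invitations),
    }


funnel = required_invitations(
    completes=1000,
    break_off_rate=0.02,
    quota_fail_rate=0.20,
    incidence_rate=0.50,
    response_rate=0.45,
)
for stage, count in funnel.items():
    print(f"{stage:>14}: {count:,}")
```

Changing a single rate shows how sensitive the invitation count is: with these assumptions, every completed interview costs between five and six invitations.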

How good is good enough?

And this leads us to a very important business question: How good is good enough? There is definitely no general answer to it, but we’d like to discuss three scenarios to illustrate possible ways of thinking about it:

Concept test: Let’s assume that a company has two alternatives for an advertising campaign. Which one works better? You just need to identify the winner and go with it! Assuming that the outcome is not too tight, about 500 interviews can be sufficient (which corresponds to a margin of error of about 4.4 percentage points at a 95% level, so the best option should lead by at least 9 points).
Election research: When forecasting the popularity of political parties in elections, you’re probably interested in more than individual ratings. You will wonder which parties could form a coalition to gain a majority. If you have two parties with a 3-point margin of error each, this becomes quite hard to predict, especially if the outcome is expected to be tight. In this case you should increase the sample size to reduce the margin of error.
Subgroups: Very often, in addition to overall statistics, you want to analyse subgroups of your sample: Who exactly are the heavy users? How do men differ from women? What kind of products do readers of a certain magazine prefer? If you only use a subset of your main sample, the number of interviews available for this analysis is reduced as well, and the margin of error grows accordingly. In this case you should work with an increased sample size, too.
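The subgroup scenario in particular lends itself to a quick calculation. The sketch below again assumes the normal approximation for a 50% share in a simple random sample, and it assumes the subgroup turns up in the sample in proportion to its share; both function names are hypothetical.

```python
import math
from statistics import NormalDist


def subgroup_margin(total_n: int, subgroup_share: float,
                    confidence: float = 0.95) -> float:
    """Margin of error (for a 50% share) inside a subgroup of the sample."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    subgroup_n = total_n * subgroup_share
    return z * math.sqrt(0.25 / subgroup_n)


def total_for_subgroup_margin(margin: float, subgroup_share: float,
                              confidence: float = 0.95) -> int:
    """Total sample size needed so the subgroup alone reaches `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    subgroup_n = math.ceil(z ** 2 * 0.25 / margin ** 2)
    return math.ceil(subgroup_n / subgroup_share)


# A subgroup that makes up a quarter of 1,000 interviews is measured
# far less precisely than the sample as a whole.
print(f"{subgroup_margin(1000, 0.25):.1%}")
# Total interviews needed for a 5-point margin within that subgroup.
print(total_for_subgroup_margin(0.05, 0.25))   # 1540
```

If the subgroup is recruited through a dedicated quota or boost sample instead, the total can be smaller than this proportional estimate suggests.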

At the end of the day, the art consists in having enough interviews to draw dependable conclusions while keeping the overall cost of fieldwork reasonable.

Summary

So how many interviews are recommended to obtain representative results? This question simply cannot be answered. You can have small samples that are very representative and large samples that are not representative at all (very often: “Big Data”).

Representativeness is about the right composition of your sample. It indicates whether your sample gives you the right picture of reality. Even if that picture is a bit blurry, it will still allow you to get the big picture right.
The size of a sample defines how clearly you can see. If your sample is not representative, a large size will enable you to see very clearly, but it will be a false picture, a misrepresentation of the truth.
