SEO Marketing Research

Sampling

Once the researcher has decided how primary data is to be collected, the next task is to obtain a sample of respondents that is representative of the target population.

The main sampling techniques can be divided into probability and non-probability methods.

In probability sampling each element of the population has a known, non-zero chance of being selected. In such cases it is possible to compute sampling variation and project the results to the entire population.

In the case of non-probability sampling, the chance of selection of a particular population element is unknown and, strictly speaking, results cannot be projected to the entire population.

Although sampling can be technically rigorous, the need to be so does depend on the particular application.

There are two general reasons why a sample is more desirable than a census. First, there are practical considerations such as cost and population size that make a sample preferable.

Taking a census is expensive as consumer populations may number in the millions. Second, typical research firms or the typical researcher cannot analyze the huge amounts of data generated by a census.

Although statistical software can handle thousands of observations with ease, programs slow down appreciably with tens of thousands, and most are unable to accommodate hundreds of thousands of observations.

In fact, even before a researcher considers the size of the computer or tabulation equipment to be used, he or she must consider the various data preparation procedures involved in just handling the questionnaires or responses and transferring these into computer files.

 

The sheer physical volume places limitations on the researcher’s staff and equipment.

Following a step-by-step sampling process makes sampling less complex:

Step 1: determine the target population
The target population (or universe) is the total group to be studied.

 

It is the grand total of what is being measured: consumers, stores, households or whatever.

If the purpose of the study has been well-defined, the population is also well-delineated.

 

This is crucial if the study is to be significant and practical for the guidance of marketing management.

“Target” refers to the conditions that separate those who are of interest to a research project from those who are not. For example, common boundary conditions in marketing research could include:

• whether a person has bought the product in question within some qualifying time frame;
• whether that person intends to buy within some time frame;
• whether that person is in the geographic market; and
• whether that person is an adult.

Finally, population boundaries may be set by cost. For example, a telephone survey to measure opinions about a supermarket might be limited to certain area codes that are expected to account for most customers, even though customers who come from outside the area will be missed by this definition.
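Boundary conditions like these amount to a screening rule applied to each candidate respondent. The following is a minimal sketch, assuming a hypothetical `Person` record and illustrative thresholds (a 12-month purchase window and two area codes); it models only the purchase, geography and age conditions, and none of the names come from the original text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    name: str
    age: int
    area_code: str
    months_since_purchase: Optional[int]  # None means the person never bought


def in_target_population(person, max_months=12, area_codes=("212", "718")):
    """Return True only if the person meets every boundary condition."""
    bought_recently = (
        person.months_since_purchase is not None
        and person.months_since_purchase <= max_months
    )
    in_market = person.area_code in area_codes
    is_adult = person.age >= 18
    return bought_recently and in_market and is_adult


# Hypothetical candidates used only to exercise the rule.
people = [
    Person("Ann", 34, "212", 3),
    Person("Bob", 17, "212", 1),     # not an adult
    Person("Cy", 45, "305", 2),      # outside the geographic market
    Person("Dee", 52, "718", None),  # never bought the product
]
target = [p.name for p in people if in_target_population(p)]
print(target)
```

Restricting `area_codes` is how the cost-based boundary in the supermarket example would show up: customers outside the listed codes are excluded by definition.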

Step 2: identify the sampling frame
After defining the target population, a frame of the population must be obtained before sampling can begin.

A sampling frame is a list or system that identifies every member of the target population so that a sample can be drawn without the necessity of physically contacting every member of the population.

It can be a list of names and telephone numbers, as in telephone surveys, an area map of housing or a list of addresses purchased from a mailing list supplier.

It could also be a database. The frame defines the sampling unit, the unit used in the design of the sample.

The frame, and therefore the sampling unit, may take the form of households, students, retail stores of a particular defined type (nature and size, for instance), businesses or transactions.

However, lists are not always available. In such a situation, some sort of counting system must be used to keep track of population members and identify the selections; for example every fourth shopper could be selected.

Step 3: choose the sampling method
There are two main types of sampling methods: probability sampling and non-probability sampling.

Probability samples comprise samples in which the elements being included have a known chance of being selected.

A probability sample enables sampling error to be estimated. This, in simple terms, is the difference between the sample value and the true value of the population being surveyed.

A sampling error can be stated in mathematical terms: usually plus or minus a certain percentage. A larger sample usually implies a smaller sampling error.

Non-probability samples are ones in which participants are selected in a purposeful way. The selection may require certain percentages of the sample to be women or men, housewives under thirty or a similar criterion.

This type of selection is an effort to reach a cross-section of the elements being sampled. However, because the sample is not rigorously chosen it is statistically impossible to state a true sampling error.

Types of Probability Sampling
Most samples chosen for applied research are non-probability samples.

 

A true probability sample, because of the stringent requirements, is likely to be far too expensive and too time-consuming for most uses.

The sampling method chosen for any particular study, therefore, must be explained carefully, with the reasons for its acceptability and likelihood of supplying accurate data.

The research plan may not require that a whole country be sampled. Cost and time factors may lead to the decision to cover only part of a country.

Probability sampling
Simple random sampling is a technique in which each element of the population has an equal chance of being selected.

Simple random sampling is carried out by assigning each element of the sampling frame a number.

Then a series of random numbers is generated, using either a computer or random number table.

The sample becomes the elements whose numbers appear on the list of random numbers. This method is appealing because it produces an unbiased estimate of the population’s characteristics.

It guarantees that every member of the population has a known and equal chance of being selected; therefore, the resulting sample, no matter what its size, is free of selection bias, although a small sample will still carry a large sampling error.
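The numbering-and-drawing procedure can be sketched in a few lines. This is an illustration only, assuming the sampling frame is already available as a Python list; the function name, the frame contents and the seed are hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements, each with the same chance of selection."""
    rng = random.Random(seed)
    # Number every element of the frame 0..N-1, then draw n distinct numbers.
    numbers = rng.sample(range(len(frame)), n)
    # The sample is made up of the elements whose numbers were drawn.
    return [frame[i] for i in numbers]


# Hypothetical frame of 1,000 households.
frame = [f"household_{i}" for i in range(1, 1001)]
sample = simple_random_sample(frame, 50, seed=42)
print(len(sample), sample[:3])
```

Fixing `seed` makes the draw reproducible, which helps when the selection has to be documented for a client.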

To obtain a simple random sample is not easy or practical in many circumstances. It may be time-consuming or costly and sometimes is theoretically impossible.

For instance, taking a simple random sample from a large finite population of one million families is possible, but assigning a number to each family and then drawing a sample at random from those numbers is not a simple task.

 

When a population is infinite, numbering each element is impossible. Therefore, simple random sampling needs modifications.

The most common types of modified probability samples are systematic, stratified, and cluster samples.

Systematic sampling is a technique in which a sample is drawn by choosing a beginning point in a list and then sequentially selecting every kth element from the list.

For a systematic sample, the items in the population must be ordered. The selection procedure depends on the number of items in the population and the size of the sample.

The number of items in the population is first divided by the number desired in the sample.

The quotient is k, indicating whether every tenth, eleventh, or perhaps hundredth element in the population is to be selected.

The first item of the sample is selected at random.

 

The rest of the sample is chosen by selecting every kth element from the ordered list until the sample size is reached.
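The skip-interval procedure described above can be sketched as follows. It assumes the ordered list is held in memory and uses integer division for k, so any remainder at the end of the list is simply never reached; the function and variable names are illustrative.

```python
import random


def systematic_sample(ordered_frame, n, seed=None):
    """Pick a random start, then take every k-th element until n are chosen."""
    population_size = len(ordered_frame)
    if not 0 < n <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    k = population_size // n          # skip interval
    rng = random.Random(seed)
    start = rng.randrange(k)          # random starting point in the first interval
    return [ordered_frame[start + i * k] for i in range(n)]


# Hypothetical membership directory with 1,000 numbered entries.
directory = list(range(1, 1001))
members = systematic_sample(directory, 100, seed=1)
print(members[:5])
```

With 1,000 entries and a sample of 100, k is 10, so every tenth entry after the random start is selected.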

The popularity of systematic sampling has fallen because computerized databases now have a random number selection capability.

However, in the special case of a physical listing of the population, such as a membership directory or a telephone book, systematic sampling is often chosen over random sampling because of its economic efficiency.

In this instance, systematic sampling can be applied with less difficulty and accomplished in a shorter time period than can simple random sampling.

Furthermore, systematic sampling has the potential to create a sample that is almost identical in quality to samples created from simple random sampling.

The essential difference between systematic sampling and simple random sampling is apparent in the use of the words “systematic” and “random.”

The system used in systematic sampling is the skip interval, whereas the randomness in simple random sampling is determined through the use of successive random draws.

In cluster sampling, a random sample of subgroups is chosen and all members of the subgroups become part of the sample.

It is done by first dividing the population into subgroups.

 

Then a random sample of subgroups is chosen, and all members of the chosen subgroups are included in the study.

Notice here that not all subgroups are selected, but those that are selected compose the sample.

If the researcher samples all of the members of the selected subgroups, it is a one-stage cluster sample.

If a sample of members of the selected subgroups is randomly selected, it is a two-stage cluster sample.
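The difference between the one-stage and two-stage designs can be sketched as below. The sketch assumes clusters are given as a dictionary mapping a cluster name to its list of members; the area names, household labels and the `per_cluster` parameter are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """One-stage if per_cluster is None, otherwise two-stage."""
    rng = random.Random(seed)
    # Stage 1: choose a random sample of subgroups.
    chosen = rng.sample(sorted(clusters), n_clusters)
    selected = []
    for name in chosen:
        members = clusters[name]
        if per_cluster is None:
            # One-stage: every member of a chosen cluster is included.
            selected.extend(members)
        else:
            # Two-stage: draw a random sample within each chosen cluster.
            selected.extend(rng.sample(members, min(per_cluster, len(members))))
    return chosen, selected


# Hypothetical population: 10 areas with 20 households each.
clusters = {
    f"area_{i}": [f"area_{i}_hh_{j}" for j in range(20)] for i in range(10)
}
areas_one, one_stage = cluster_sample(clusters, 3, seed=7)
areas_two, two_stage = cluster_sample(clusters, 3, per_cluster=5, seed=7)
print(len(one_stage), len(two_stage))
```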

One of the most popular ways of forming a cluster is by geographic areas. For the first step, the researcher could select a random sample of areas, and then for the second step pick a probability method to sample individuals within the chosen areas.

The two-step area sample approach is preferable to the one-step approach because there is always the possibility that a single cluster may be less representative than the researcher believes. But the two-step method is more costly because more areas and time are involved.

The greatest danger in one-stage cluster sampling is cluster specification error, which occurs when the clusters are not similar to one another, so that the few clusters selected may misrepresent the population.

In stratified sampling, the researcher first divides the population into natural sub-groups that are more homogeneous than the population as a whole.

Then items are selected for the sample at random or by a systematic method from each subgroup.

This method is usually used when a large variation exists within a population and the researcher has some prior knowledge about natural subgroups within the population.

Estimates of the population based on the stratified sample usually have greater precision (or smaller sampling error) than if the whole population were sampled by simple random sampling.

The number of items selected from each stratum may be proportionate or disproportionate to the size of the stratum in relation to the population.

Under the proportionate method, for example, if the size of stratum A is 30 percent of the population, then 30 percent of the sample will come from stratum A. So, if the sample has 300 items, 30 percent of the sample size, or 90 items, are to be selected from stratum A.

When the selection is disproportionate, it is relatively difficult to weight the results from individual strata properly.
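The proportionate allocation in the stratum A example (30 percent of a 300-item sample, or 90 items) can be sketched as follows. Strata B and C and all function names are assumptions added for illustration; simple rounding is used, so with awkward stratum sizes the allocations may not add up exactly to the requested sample size.

```python
import random


def proportionate_allocation(stratum_sizes, n):
    """Give each stratum the same share of the sample as it has of the population."""
    population_size = sum(stratum_sizes.values())
    return {
        name: round(n * size / population_size)
        for name, size in stratum_sizes.items()
    }


def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample of the allocated size from each stratum."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportionate_allocation(sizes, n)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}


# Hypothetical population of 10,000 items; stratum A holds 30 percent of it.
strata = {
    "A": list(range(0, 3000)),
    "B": list(range(3000, 8000)),
    "C": list(range(8000, 10000)),
}
allocation = proportionate_allocation({k: len(v) for k, v in strata.items()}, 300)
drawn = stratified_sample(strata, 300, seed=0)
print(allocation)
```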

The main benefit of stratified sampling is that the sample will include items from each stratum.

The above table shows that if the stratified sample of companies is proportionate, the large companies are only represented with one company in the sample of 111 companies.

Taking the importance of the large companies into consideration, the proportion of large companies should be higher.

Therefore, in this case represented in the above table we would prefer a disproportionate sample.
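When a disproportionate sample is used, each stratum's result has to be weighted back to its share of the population before an overall estimate is formed. The sketch below uses invented stratum sizes and sample means, not the figures from the table referred to above, and the function name is illustrative.

```python
def weighted_mean(stratum_sizes, stratum_sample_means):
    """Combine stratum means using population shares as weights."""
    population_size = sum(stratum_sizes.values())
    return sum(
        stratum_sizes[name] / population_size * stratum_sample_means[name]
        for name in stratum_sizes
    )


# Hypothetical company population and mean spend observed in each stratum's sample.
sizes = {"small": 900, "medium": 90, "large": 10}
means = {"small": 2.0, "medium": 20.0, "large": 200.0}
overall = weighted_mean(sizes, means)
print(round(overall, 2))
```

Because the weights depend only on population shares, large companies can be oversampled to get a stable stratum mean without inflating their influence on the overall estimate.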

There are times when stratified sampling is used in marketing research because skewed populations are encountered.

Prior knowledge of populations under study, augmented by research objectives sensitive to subgroupings, sometimes reveals that the population is not normally distributed.

Under these circumstances, it is advantageous to apply stratified sampling to preserve the diversity of the subgroups.

Usually, a surrogate measure, which is some observable or easily determined characteristic of each population member, is used to help partition or separate the population members into their various subgroups.

Non-probability sampling methods
All of the sampling methods described so far embody probability sampling assumptions. In each case, the probability of any unit being selected from the population into the sample is known, even though it cannot be calculated precisely.

Convenience samples are samples drawn at the convenience of the interviewer. The selection of place and, consequently, prospective respondents, is subjective rather than objective.

Certain members of the population are automatically eliminated from the sampling process.

When researchers have little time or money available for an elaborate study, they may do convenience sampling, selecting sample items that are easy to obtain.

In fact, there may be no other way to gather data in some cases than to sample a group of individuals who are available.

Often, for example, college professors will use their students as a sample, because students are a captive audience and are convenient for the study.

The problem here lies with the subjective selection of the sample and the lack of generalizability of the results (Wansink, 2002).

Judgement samples differ from convenience samples because they require a judgement or an “educated guess” as to who should represent the population.

Often the researcher, or someone helping the researcher who has considerable knowledge of the population, will choose the individuals they feel should constitute the sample.

 

It should be apparent that such samples are highly subjective and, therefore, prone to error. Focus group studies often use judgement sampling rather than probability sampling.

Quota samples establish a special quota for various types of individuals to be interviewed.

Researchers may want to ensure that their sample includes a sufficient number of individuals with a particular characteristic that affects the study.

In these cases the researchers determine the percentage of the target population that possesses the characteristics of interest and specify the number of these individuals to be included in the sample to reflect their proportion in the population.

Quota samples are often used by companies that have a firm grasp on the features characterizing the individuals they wish to study.

A large bank, for example, might stipulate that the final sample have an equal number of males and females because in the bank’s understanding of its market, the customer base is equally divided between the two sexes.

When done conscientiously and with a firm understanding of quota characteristics, such sampling can rival probability sampling in the minds of researchers.
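Filling quotas in the field can be sketched as accepting respondents in the order they are encountered until each quota is met. The record layout, the `key` function and the quota of 10 per group are assumptions for illustration, loosely following the bank example.

```python
def quota_sample(respondents, quotas, key):
    """Accept respondents in arrival order until every quota is filled."""
    remaining = dict(quotas)
    accepted = []
    for respondent in respondents:
        group = key(respondent)
        if remaining.get(group, 0) > 0:
            accepted.append(respondent)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas are filled
    return accepted


# Hypothetical stream of bank customers, in the order they are intercepted.
stream = [{"id": i, "sex": "M" if i % 3 == 0 else "F"} for i in range(100)]
chosen = quota_sample(stream, {"F": 10, "M": 10}, key=lambda r: r["sex"])
counts = {
    sex: sum(1 for r in chosen if r["sex"] == sex) for sex in ("F", "M")
}
print(counts)
```

Note that the selection within each quota group is still by convenience, which is why this remains a non-probability method.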

Snowball samples require respondents to provide the names of additional respondents. Such lists begin when the researcher compiles a short list of sample units that is smaller than the total sample for study.

After each respondent is interviewed, they are asked to name other possible respondents. In this manner, additional respondents are referred by previous respondents.

Or, as the name implies, the sample grows just as a snowball grows when it is rolled downhill.
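The referral process can be sketched as a queue that starts with the short initial list and grows as each interviewed respondent names others. The referral data below is invented, and the dictionary stands in for what would in practice be answers collected during interviews.

```python
from collections import deque


def snowball_sample(seeds, referrals, target_size):
    """Interview seeds first, then the people they name, until the target is met."""
    interviewed = []
    known = set(seeds)
    queue = deque(seeds)
    while queue and len(interviewed) < target_size:
        person = queue.popleft()
        interviewed.append(person)
        # Ask the respondent to name other possible respondents.
        for name in referrals.get(person, []):
            if name not in known:
                known.add(name)
                queue.append(name)
    return interviewed


# Hypothetical referrals given by each respondent.
referrals = {
    "ann": ["bob", "cy"],
    "bob": ["dee", "ann"],
    "cy": ["eve"],
    "dee": [],
    "eve": ["fay"],
}
respondents = snowball_sample(["ann"], referrals, 5)
print(respondents)
```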

Snowball samples are most appropriate when there is a limited and disappointingly short sample frame and when respondents can provide the names of others who would qualify for the survey.

The non-probability aspects of snowball sampling come from the selectivity used throughout.

The initial list may also be special in some way, and the primary means of adding people to the sample is by tapping the memories of those on the original list. Referral samples are often useful in industrial marketing research situations.

Step 4: determine the sample size
Having chosen the sampling method, the next step is to determine the appropriate sample size.

If the sample size is too large, more money and time will be spent than is necessary, but the result obtained from the large sample may not be more accurate than that from a smaller sample.

On the other hand, if the sample size is too small, the study may not reach a valid conclusion.

It is important to realize that the more elements that are properly sampled from the population, the smaller the sampling error.

This error exists because the whole population is not examined, ultimately leaving something out of the investigation.

The most correct method of determining sample size is the confidence interval approach, which applies the concepts of accuracy (sample error), variability, and confidence interval to create a “correct” sample size.

Because it is, theoretically, the most correct method, it is the one used by national opinion polling companies.

To describe the confidence interval approach to sample size determination, we first describe an underlying concept.

The larger a probability sample is, the more accurate it is (less sample error), indicating that there is a relationship between sample size and the accuracy of that sample.

However, the relationship between sample size and accuracy is not linear: doubling the sample size does not halve the sample error.

[Table: 95% confidence intervals (= sample error = E) obtained around estimates of proportions, given various sample sizes]

In fact, the sampling error diminishes in accordance with the square root of the growth in sample size.

So if the sample doubles, the sampling error decreases by a little less than 30 percent. Because of this statistical relationship you have to quadruple a sample in order to halve the sample error. The above table depicts this principle.
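The square-root relationship can be checked numerically. The sketch below assumes the standard large-sample formula for the 95% margin of error around a proportion, E = z * sqrt(p(1 - p) / n), with z = 1.96 and the most conservative case p = 0.5; the function name is illustrative and not taken from the original text.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval around a sample proportion."""
    return z * math.sqrt(p * (1 - p) / n)


# Sample error E for a sample that is doubled and then doubled again.
for n in (100, 200, 400):
    print(n, round(margin_of_error(n) * 100, 1), "percentage points")

# Ratio of errors after doubling and after quadrupling the sample.
doubling_ratio = margin_of_error(200) / margin_of_error(100)
quadrupling_ratio = margin_of_error(400) / margin_of_error(100)
```

Doubling the sample multiplies E by 1 / sqrt(2), roughly 0.71, while quadrupling it multiplies E by 0.5.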

 

Step 5: gather the data
Gathering data is a two-stage process. First, the sample unit must be selected. Second, information must be gained from that unit.

Simply put, you need to choose a person and ask him or her some questions. However, not everyone will agree to answer.

So there comes the question of substitutions. Substitutions occur whenever an individual who was qualified to be in the sample proves to be unavailable, unwilling to respond, or unsuitable.

The final activity in the sampling process is the assessment stage. Sample assessment can take a number of forms, one of which is to compare the sample’s demographic profile with a known profile such as the census.

With quota sample validation, the researcher must use a demographic characteristic other than those used to set up the quota system.

The essence of sample validation is to assure the client that the sample is, in fact, a representative sample of the population about which someone wishes to make decisions.
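One simple form of sample assessment is to line up the sample's demographic shares against a known profile and inspect the gaps. The age bands, counts, census shares and the 5-point tolerance below are all invented for illustration; a real validation would use the published census figures and a tolerance agreed with the client.

```python
def profile_gaps(sample_counts, known_shares):
    """Difference between each group's sample share and its known share."""
    n = sum(sample_counts.values())
    return {
        group: sample_counts.get(group, 0) / n - share
        for group, share in known_shares.items()
    }


# Hypothetical sample of 400 respondents and hypothetical census shares.
sample_counts = {"18-34": 120, "35-54": 200, "55+": 80}
census_shares = {"18-34": 0.30, "35-54": 0.40, "55+": 0.30}

gaps = profile_gaps(sample_counts, census_shares)
# Flag groups that are off by more than 5 percentage points (assumed tolerance).
flagged = sorted(g for g, gap in gaps.items() if abs(gap) > 0.05)
print(flagged)
```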

Summing up
Sampling design begins by defining the target population in terms of elements, sampling units, extent and time.

Then the sampling frame should be determined. A sampling frame is a representation of the elements of the target population.

At this stage, it is important to recognize any sampling frame errors. The next step involves selecting a sampling technique and determining the sample size.

In addition to quantitative analysis, several qualitative considerations should be taken into account in determining the sample size.

Execution of the sampling process requires detailed specifications for each step. Finally, the selected sample should be validated by comparing characteristics of the sample with known characteristics of the target population.

Sampling techniques may be classified as either non-probability or probability techniques.

When conducting international marketing research it is desirable to achieve comparability in sample composition and representativeness even though this may require the use of different sampling techniques in different countries.

Non-probability sampling is based on researchers’ subjective judgement rather than on scientific principles.

However, this does not mean the results are useless. On the contrary, a researcher may do a good job in portraying the target population, but without scientifically determined samples, there is no way to determine how precise the results are.

But the ease of obtaining the sample and the low cost associated with drawing non-probability samples often compensate for their lack of statistical support.

Judgement sampling, convenience sampling, quota sampling, and snowball sampling are popular non-probability sampling methods.

Probability sampling is any sampling plan in which the chance of being selected is known and non-zero for every sampling unit in the population; the chances need not be equal, as disproportionate stratified sampling shows.

Statisticians prefer these methods since sampling selection is objective and the sampling error may be measured.

Simple random sampling, systematic sampling, stratified sampling, and cluster sampling are types of probability sampling methods.

For a sample to be statistically useful, it must be representative of the target population. While industry rules of thumb, affordability, and statistical methods can all be used to determine sample size, the statistical method is preferred, because it is supported by scientific principles.

Using this method, researchers need three pieces of information: desired precision, desired confidence level, and an estimation of the population standard deviation or parameter.
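These three inputs map directly onto the textbook sample size formulas, sketched below as n = z² p(1 - p) / E² for a proportion and n = (z s / E)² for a mean. The sketch assumes a large population (no finite population correction) and z = 1.96 for 95% confidence; the function names and example values are illustrative.

```python
import math


def sample_size_for_proportion(precision, p=0.5, z=1.96):
    """Sample size needed to estimate a proportion within +/- precision."""
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)


def sample_size_for_mean(precision, std_dev, z=1.96):
    """Sample size needed to estimate a mean within +/- precision."""
    return math.ceil((z * std_dev / precision) ** 2)


# Desired precision of 5 and 3 percentage points at 95% confidence.
n_five_points = sample_size_for_proportion(0.05)
n_three_points = sample_size_for_proportion(0.03)
# Mean within +/- 1 unit, assuming an estimated standard deviation of 10.
n_mean = sample_size_for_mean(1.0, 10.0)
print(n_five_points, n_three_points, n_mean)
```

Using p = 0.5 gives the largest required sample for a given precision, which is why it is a common default when no prior estimate of the proportion is available.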

Describe the sampling design process.

 

Distinguish between probability and non-probability samples. What are the advantages and disadvantages of each?

 

Why are non-probability samples popular in marketing research?

Describe snowball sampling. Give an example of a situation in which you might use this type of sample. What are the dangers associated with this type of sample?

What are the differences between proportionate and disproportionate stratified sampling?

What is the least expensive and least time-consuming of all sampling techniques?

What is meant by a “skewed” population? Illustrate what you think is a skewed population distribution variable and what it looks like.

Differentiate one-step from two-step area sampling, and indicate when each one is preferred.

Discuss the factors that determine sample size.

 

 

Keywords: sampling, probability sampling, non-probability sampling, target population, sampling frame, sampling methods, sampling error, simple random sampling, systematic sampling, cluster sampling, convenience samples, judgement samples, quota samples, snowball samples, sample size

 
