**SAMPLING**

**MODULE I:**

*Sampling*
i) Concept of population and sample in
Qualitative, Quantitative and Mixed research

ii) Techniques of sampling: probability and non-probability
sampling; different types.

**(8 hours)**

*Introduction*
The
quality of a piece of research stands or falls not only by the appropriateness
of methodology and instrumentation but also by the suitability of the sampling
strategy that has been adopted (see also Morrison 1993: 112–17). Researchers must take sampling decisions
early in the overall planning of a piece of research. Factors such as expense,
time, and accessibility frequently prevent researchers from gaining information
from the whole population. Therefore they often need to be able to obtain data
from a smaller group or subset of the total population in such a way that the
knowledge gained is representative of the total population (however defined)
under study. This smaller group or subset is the *sample*.

Experienced
researchers start with the total population and work down to the sample. By
contrast, less experienced researchers often work from the bottom up, that is,
they determine the minimum number of respondents needed to conduct the research
(Bailey 1978). However, unless they identify the total population in advance,
it is virtually impossible for them to assess how representative the sample is
that they have drawn.

Decisions
and problems face researchers in deciding the sampling strategy to be used.
Judgements have to be made about **four** key factors in sampling:

1. the sample size
2. representativeness and parameters of the sample
3. access to the sample
4. the sampling strategy to be used.

The
decisions here will determine the sampling strategy to be used. This assumes
that a sample is actually required; there may be occasions on which the
researcher can access the whole population rather than a sample.

*The sample size*
A
question that often plagues novice researchers is just how large their samples
for the research should be. There is no clear-cut answer, for the correct
sample size depends on the purpose of the study and the nature of the
population under scrutiny. However, it is possible to give some advice on this
matter. Generally speaking, the larger the sample the better, as this not only
gives greater reliability but also enables more sophisticated statistics to be
used. Thus, a sample size of **thirty** is held by many to be the minimum number of cases if researchers plan to use some form of statistical analysis on their data, though this is a very small number and we would advise very considerably more. Researchers need to think out in advance of any data collection the sorts of relationships that they wish to explore within subgroups of their eventual sample. The number of variables researchers set out to control in their analysis and the types of statistical tests that they wish to make must inform their decisions about sample size prior to the actual research undertaking. Typically an anticipated **minimum of thirty cases per variable** should be used as a ‘rule of thumb’, i.e. one must be assured of having a minimum of thirty cases for each variable (of course, the thirty cases for variable one could also be the same thirty as for variable two), though this is a very low estimate indeed. This number rises rapidly if different subgroups of the population are included in the sample.

Further, depending on the kind of analysis to be performed, some statistical tests will require larger samples. For example, let us imagine that one wished to calculate the chi-square statistic, with cross-tabulated data, for example looking at two subgroups of stakeholders in a primary school containing sixty 10-year-old pupils and twenty teachers and their responses to a question on a 5-point scale. Here one can notice that the sample size is eighty cases, an apparently reasonably sized sample. However, six of the ten cells of responses (60 per cent) contain fewer than five cases.
Our original sample size
of 278 has now increased, very quickly, to 428. The message is very clear: the
greater the number of strata (subgroups), the larger the sample will be. Much
educational research concerns itself with strata rather than whole samples, so
the issue is significant. One can rapidly generate the need for a very large
sample.

If subgroups are
required then the same rules for calculating overall sample size apply to
each of the subgroups.

Further,
determining the size of the sample will also have to take account of
non-response, attrition and respondent mortality, i.e. some participants will
fail to return questionnaires, leave the research, and return incomplete or
spoiled questionnaires (e.g. missing out items, putting two ticks in a row of
choices instead of only one). Hence it is advisable to overestimate rather than
to underestimate the size of the sample required, to build in redundancy
(Gorard 2003: 60). Unless one has guarantees of access, response and, perhaps,
the researcher’s own presence at the time of conducting the research (e.g.
presence when questionnaires are being completed), then it might be advisable
to estimate up to double the size of required sample in order to allow for such
loss of clean and complete copies of questionnaires or responses.
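
The advice to overestimate can be turned into a small calculation. The sketch below is illustrative only: the function name and the idea of an "expected usable rate" (the share of approached participants who return clean, complete responses) are assumptions introduced here, not part of the source. A rate of 0.5 corresponds to the suggestion of estimating up to double the required sample.

```python
import math


def inflated_sample_size(required: int, expected_usable_rate: float) -> int:
    """Number of people to approach so that roughly `required` usable
    responses remain after non-response, attrition and spoiled returns.

    `expected_usable_rate` is a hypothetical planning figure between 0 and 1.
    """
    if not 0 < expected_usable_rate <= 1:
        raise ValueError("expected_usable_rate must be in (0, 1]")
    # Round up: approaching a fraction of a person is not possible.
    return math.ceil(required / expected_usable_rate)


# Doubling the sample corresponds to expecting half of the returns to be usable.
print(inflated_sample_size(300, 0.5))
# A more optimistic planning assumption needs a smaller margin.
print(inflated_sample_size(300, 0.8))
```

The usable rate is a judgement made before data collection; the calculation only makes the consequence of that judgement explicit.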

In some circumstances, meeting the
requirements of sample size can be done on an evolutionary basis. For example,
let us imagine that you wish to sample 300 teachers, randomly selected. You
succeed in gaining positive responses from 250 teachers to, for example, a
telephone survey or a questionnaire survey, but you are 50 short of the
required number. The matter can be resolved simply by adding another 50 to the
random sample, and, if not all of these are successful, then adding some more
until the required number is reached.

Borg and Gall (1979: 195) suggest that, as a
general rule, sample sizes should be large where

- there are many variables
- only small differences or small relationships are expected or predicted
- the sample will be broken down into subgroups
- the sample is heterogeneous in terms of the variables under study
- reliable measures of the dependent variable are unavailable.

Oppenheim (1992: 44) adds to this the view
that the nature of the scales to be used also exerts an influence on the sample
size. For **nominal data the sample sizes may well have to be larger than for interval and ratio data** (i.e. a variant of the issue of the number of subgroups to be addressed: the greater the number of subgroups or possible categories, the larger the sample will have to be). Borg and Gall (1979) set out a formula-driven approach to determining sample size (see also Moser and Kalton 1977; Ross and Rust 1997: 427–38), and they also suggest using correlational tables for correlational studies – available in most texts on statistics – as it were ‘in reverse’ to determine sample size (Borg and Gall 1979: 201), i.e. looking at the significance levels of correlation coefficients and then reading off the sample sizes usually required to demonstrate that level of significance. For example, a correlational significance level of 0.01 would require a sample size of 10 if the estimated coefficient of correlation is 0.65, or a sample size of 20 if the estimated correlation coefficient is 0.45, and a sample size of 100 if the estimated correlation coefficient is 0.20. Again, an inverse proportion can be seen – the larger the sample, the smaller the estimated correlation coefficient can be to be deemed significant.

With both qualitative and quantitative data,
the essential requirement is that the sample is representative of the
population from which it is drawn. In a dissertation concerned with a life
history (i.e. n= 1), the sample is the population!

**Qualitative data**

In a qualitative study of thirty highly able girls

of similar socio-economic background following

an A level Biology course, a sample of five or

six may suffice the researcher who is prepared to

obtain additional corroborative data by way of

validation.

Where there is heterogeneity in the population,

then a larger sample must be selected on

some basis that respects that heterogeneity. Thus,

from a staff of sixty secondary school teachers

differentiated by gender, age, subject specialism,

management or classroom responsibility, etc., it


would be insufficient to construct a sample consisting

of ten female classroom teachers of Arts

and Humanities subjects.

**Quantitative data**

For quantitative data, a precise sample number

can be calculated according to the *level of accuracy* and the

*level of probability* that researchers require in their work. They can then report in their

study the rationale and the basis of their research

decisions (Blalock 1979).

By way of example, suppose a teacher/researcher

wishes to sample opinions among 1,000 secondary

school students. She intends to use a 10-point

scale ranging from 1 = totally unsatisfactory to

10 = absolutely fabulous. She already has data

from her own class of thirty students and suspects

that the responses of other students will be

broadly similar. Her own students rated the

activity (an extracurricular event) as follows: mean

score = 7.27;
standard deviation = 1.98. In other

words, her students were pretty much ‘bunched’

about a warm, positive appraisal on the 10-point

scale. How many of the 1,000 students does she

need to sample in order to gain an accurate (i.e.

reliable) assessment of what the whole school

(n = 1,000) thinks of the extracurricular event?

*It all depends on what degree of accuracy and what level*

*of probability she is willing to accept*.

A simple calculation from a formula by Blalock

(1979: 215–18) shows that:

- if she is happy to be within ±0.5 of a scale point and accurate 19 times out of 20, then she requires a sample of 60 out of the 1,000;
- if she is happy to be within ±0.5 of a scale point and accurate 99 times out of 100, then she requires a sample of 104 out of the 1,000;
- if she is happy to be within ±0.5 of a scale point and accurate 999 times out of 1,000, then she requires a sample of 170 out of the 1,000;
- if she is a perfectionist and wishes to be within ±0.25 of a scale point and accurate 999 times out of 1,000, then she requires a sample of 679 out of the 1,000.
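
The four figures above can be reproduced with the standard normal-approximation formula n = (z × SD / margin)², where z is the two-sided critical value for the chosen level of probability. The sketch below assumes that this is the calculation behind the Blalock figures; the function name is hypothetical, no finite population correction is applied, and the result is rounded to the nearest whole number because that is what matches the text (rounding up would be the more cautious choice).

```python
from statistics import NormalDist


def sample_size_for_mean(sd: float, margin: float, confidence: float) -> int:
    """Sample size needed to estimate a mean to within +/- `margin`
    at the given two-sided confidence level (e.g. 0.95 for 19 in 20)."""
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return round((z * sd / margin) ** 2)


sd = 1.98  # standard deviation from the teacher's own class of thirty
for margin, confidence in [(0.5, 0.95), (0.5, 0.99), (0.5, 0.999), (0.25, 0.999)]:
    n = sample_size_for_mean(sd, margin, confidence)
    print(f"within ±{margin} at {confidence:.1%} confidence: n = {n}")
```

Halving the margin from 0.5 to 0.25 roughly quadruples the required sample, which is why the perfectionist's requirement jumps from 170 to 679.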

It is clear that sample size is a matter of

judgement as well as mathematical precision; even

formula-driven approaches make it clear that there

are elements of prediction, standard error and

human judgement involved in determining sample

size.

*Sampling error*
If many samples are taken from the same

population, it is unlikely that they will all have

characteristics identical with each other or with

the population; their means will be different. In

brief, there will be sampling error (see Cohen

and Holliday 1979, 1996). Sampling error is often

taken to be the difference between the sample

mean and the population mean. Sampling error

is not necessarily the result of mistakes made

in sampling procedures. Rather, variations may

occur due to the chance selection of different

individuals. For example, if we take a large

number of samples from the population and

measure the mean value of each sample, then

the sample means will not be identical. Some

will be relatively high, some relatively low, and

many will cluster around an average or mean value

of the samples. We show this diagrammatically in

Box 4.2 (see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.4.ppt).

Why should this occur? We can explain the

phenomenon by reference to the Central Limit

Theorem which is derived from the laws of

probability. This states that if random large

samples of equal size are repeatedly drawn from

any population, then the mean of those samples

will be approximately normally distributed. The

distribution of sample means approaches the

normal distribution as the size of the sample

increases, regardless of the shape – normal or

otherwise – of the parent population (Hopkins

*et al*. 1996: 159, 388). Moreover, the average or

mean of the sample means will be approximately

the same as the population mean. Hopkins

*et al*.
(1996: 159–62) demonstrate this by reporting

the use of computer simulation to examine the

sampling distribution of means when computed

10,000 times (a method that we discuss in Chapter 10).

[Box 4.2: Distribution of sample means showing the spread of a selection of sample means (*Ms*) around the population mean (*Mpop*). *Source:* Cohen and Holliday 1979]

Rose and Sullivan (1993: 144)

remind us that 95 per cent of all sample means fall within plus or minus 1.96 standard errors of the population mean, i.e. that we have a 95 per cent chance that any single sample mean lies within these limits.
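
A small simulation can make the Central Limit Theorem concrete. The sketch below is an illustration under stated assumptions, not the simulation reported by Hopkins et al.: the parent population is taken to be exponential with mean 1 and standard deviation 1 (deliberately non-normal), each sample has thirty cases, and 5,000 samples are drawn with a fixed seed. The function name is hypothetical.

```python
import random
import statistics


def simulate_sample_means(n_samples: int = 5000, sample_size: int = 30,
                          seed: int = 0) -> list[float]:
    """Draw repeated random samples from a skewed (exponential) population
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


pop_mean, pop_sd, sample_size = 1.0, 1.0, 30
means = simulate_sample_means(sample_size=sample_size)

# Theoretical standard error of the mean: SD of the population / sqrt(N).
se = pop_sd / sample_size ** 0.5
share_within = sum(abs(m - pop_mean) <= 1.96 * se for m in means) / len(means)

print(f"mean of sample means: {statistics.fmean(means):.3f} (population mean {pop_mean})")
print(f"SD of sample means:   {statistics.stdev(means):.3f} (theoretical SE {se:.3f})")
print(f"share within ±1.96 SE: {share_within:.3f}")
```

Even though the parent population is strongly skewed, the sample means cluster around the population mean and roughly 95 per cent of them fall within 1.96 standard errors of it.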

By drawing a large number of samples of equal

size from a population, we create a sampling

distribution. We can calculate the error involved

in such sampling (see http://www.routledge.com/textbooks/9780415368780 – Chapter 4, file 4.5.ppt). The standard deviation of the theoretical

distribution of sample means is a measure of

sampling error and is called the standard error

of the mean ($SE_M$). Thus,

$$SE = \frac{SD_s}{\sqrt{N}}$$

where $SD_s$ = the standard deviation of the sample and $N$ = the number in the sample.

Strictly speaking, the formula for the standard error of the mean is:

$$SE = \frac{SD_{pop}}{\sqrt{N}}$$

where $SD_{pop}$ = the standard deviation of the population.

However, as we are usually unable to ascertain the $SD$ of the total population, the standard deviation of the sample is used instead. The standard error of the mean provides the best estimate of the sampling error. Clearly, the sampling error depends on the variability (i.e. the heterogeneity) in the population as measured by $SD_{pop}$ as well as the sample size ($N$) (Rose and Sullivan 1993: 143). The smaller the $SD_{pop}$ the smaller the sampling error; the larger the $N$, the smaller the sampling error. Where the $SD_{pop}$ is very large, then $N$ needs to be very large to counteract it. Where $SD_{pop}$ is very small, then $N$, too, can be small and still give a reasonably small sampling error.

As the sample size increases the sampling error

decreases. Hopkins

*et al*. (1996: 159) suggest that,
unless there are some very unusual distributions,

samples of twenty-five or greater usually yield a

normal sampling distribution of the mean. For

further analysis of steps that can be taken to cope

with the estimation of sampling in surveys we refer

the reader to Ross and Wilson (1997).
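
The relationship between sample size and the standard error of the mean can be checked directly. The sketch below applies the formula SE = SD / √N; the standard deviation of 1.98 is borrowed from the teacher's class in the earlier example purely for illustration, and the function name is hypothetical.

```python
import math


def standard_error_of_mean(sd: float, n: int) -> float:
    """Standard error of the mean: sample SD divided by the square root of N."""
    return sd / math.sqrt(n)


sd = 1.98  # illustrative sample standard deviation
for n in (25, 100, 400):
    print(f"N = {n:>3}: SE = {standard_error_of_mean(sd, n):.3f}")
```

Because N sits under a square root, the sample has to be quadrupled to halve the standard error, so each additional case buys a little less precision than the one before.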

**The standard error of proportions**

We said earlier that one answer to ‘How big a

sample must I obtain?’ is ‘How accurate do I want

my results to be?’ This is well illustrated in the

following example:

A school principal finds that the 25 students she talks

to at random are reasonably in favour of a proposed

change in the lunch break hours, 66 per cent being in

favour and 34 per cent being against. How can she be

sure that these proportions are truly representative of

the whole school of 1,000 students?

A simple calculation of the standard error of

proportions provides the principal with her answer.

$$SE = \sqrt{\frac{P \times Q}{N}}$$

where

$P$ = the percentage in favour

$Q$ = 100 per cent − $P$

$N$ = the sample size


The formula assumes that each sample is drawn

on a simple random basis. A small correction factor

called the finite population correction (fpc) is

generally applied as follows:

$$SE \text{ of proportions} = \sqrt{\frac{(1 - f)\,P \times Q}{N}}$$

where $f$ is the proportion included in the sample. Where, for example, a sample is 100 out of 1,000, $f$ is 0.1.

$$SE \text{ of proportions} = \sqrt{\frac{(1 - 0.1)(66 \times 34)}{100}} = 4.49$$

With a sample of twenty-five, the SE = 9.4. In

other words, the favourable vote can vary between

56.6 per cent and 75.4 per cent; likewise, the unfavourable

vote can vary between 43.4 per cent

and 24.6 per cent. Clearly, a voting possibility

ranging from 56.6 per cent in favour to 43.4 per

cent against is less decisive than 66 per cent as opposed

to 34 per cent. Should the school principal

enlarge her sample to include 100 students, then

the SE becomes 4.5 and the variation in the range

is reduced to 61.5–70.5 per cent in favour

and 38.5–29.5 per cent against. Sampling

the whole school’s opinion (n = 1,000) reduces

the SE to 1.5 and the ranges to 64.5 per cent−67.5

per cent in favour and 35.5 per cent−32.5 per cent

against. It is easy to see why political opinion surveys

are often based upon sample sizes of 1,000 to

1,500 (Gardner 1978).
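
The principal's figures can be recomputed as a check. The sketch below implements the standard error of proportions with an optional finite population correction; the function and parameter names are hypothetical. One observation from running it: the SE of 1.5 quoted for n = 1,000 corresponds to the uncorrected formula, since applying the correction to a sample that covers the whole school (f = 1) would give an SE of zero.

```python
import math
from typing import Optional


def se_of_proportion(p_percent: float, n: int,
                     population: Optional[int] = None) -> float:
    """Standard error of a percentage P from a simple random sample of size n.

    If `population` is given, the finite population correction (1 - f),
    with f = n / population, is applied.
    """
    q_percent = 100 - p_percent
    variance = p_percent * q_percent / n
    if population is not None:
        variance *= 1 - n / population
    return math.sqrt(variance)


# 66 per cent in favour, school of 1,000 students.
se_25 = se_of_proportion(66, 25, population=1000)
se_100 = se_of_proportion(66, 100, population=1000)
se_1000 = se_of_proportion(66, 1000)  # uncorrected, as in the quoted figure

for n, se in ((25, se_25), (100, se_100), (1000, se_1000)):
    print(f"n = {n:>4}: SE = {se:.2f}, in favour between {66 - se:.1f} and {66 + se:.1f} per cent")
```

The ranges printed are plus or minus one standard error around 66 per cent, matching the way the text reports them.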

What is being suggested here generally is that,

in order to overcome problems of sampling error,

in order to ensure that one can separate random

effects and variation from non-random effects,

and in order for the power of a statistic to be

felt, one should opt for as large a sample as

possible. As Gorard (2003: 62) says, ‘power is an

estimate of the ability of the test you are using

to separate the effect size from random variation’,

and a large sample helps the researcher to achieve

statistical power. Samples of fewer than thirty are dangerously small, as they allow the possibility of considerable standard error, and, beyond around eighty cases, any increases to the sample size have little effect on the standard error.

*The representativeness of the sample*
The researcher will need to consider the extent

to which it is important that the sample in fact

represents the whole population in question (in

the example above, the 1,000 students), if it is

to be a valid sample. The researcher will need

to be clear what it is that is being represented,

i.e. to set the parameter characteristics of the

wider population – the sampling frame – clearly

and correctly. There is a popular example of

how poor sampling may be unrepresentative and

unhelpful for a researcher. A national newspaper

reports that one person in every two suffers

from backache; this headline stirs alarm in every

doctor’s surgery throughout the land. However,

the newspaper fails to make clear the parameters

of the study which gave rise to the headline.

It turns out that the research took place in a

damp part of the country where the incidence

of backache might be expected to be higher

than elsewhere, in a part of the country which

contained a disproportionate number of elderly

people, again who might be expected to have more

backaches than a younger population, in an area

of heavy industry where the working population

might be expected to have more backache than

in an area of lighter industry or service industries,

and used only two doctors’ records, overlooking

the fact that many backache sufferers went to

those doctors’ surgeries because the two doctors

concerned were known to be overly sympathetic

to backache sufferers rather than responsibly

suspicious.

These four variables – climate, age group,

occupation and reported incidence – were seen

to exert a disproportionate effect on the study,

i.e. if the study were to have been carried

out in an area where the climate, age group,

occupation and reporting were to have been

different, then the results might have been

different. The newspaper report sensationally

generalized beyond the parameters of the data,

thereby overlooking the limited representativeness

of the study.

It is important to consider adjusting the

weightings of subgroups in the sample once the


data have been collected. For example, in a

secondary school where half of the students are

male and half are female, consider pupils’ responses

to the question ‘How far does your liking of the

form teacher affect your attitude to work?’

Variable: How far does your liking of the form teacher affect your attitude to school work?

| | Very little | A little | Somewhat | Quite a lot | A very great deal |
|---|---|---|---|---|---|
| Male | 10 | 20 | 30 | 25 | 15 |
| Female | 50 | 80 | 30 | 25 | 15 |
| Total | 60 | 100 | 60 | 50 | 30 |

Let us say that we are interested in the attitudes

according to the gender of the respondents, as well

as overall. In this example one could surmise that

generally the results indicate that the liking of the

form teacher has only a small to moderate effect

on the students’ attitude to work. However, we

have to observe that twice as many girls as boys

are included in the sample, and this is an unfair

representation of the population of the school,

which comprises 50 per cent girls and 50 per cent

boys, i.e. girls are over-represented and boys are

under-represented. If one equalizes the two sets

of scores by gender to be closer to the school

population (either by doubling the number of boys

or halving the number of girls) then the results

look very different.

Variable: How far does your liking of the form teacher affect your attitude to school work?

| | Very little | A little | Somewhat | Quite a lot | A very great deal |
|---|---|---|---|---|---|
| Male | 20 | 40 | 60 | 50 | 30 |
| Female | 50 | 80 | 30 | 25 | 15 |
| Total | 70 | 120 | 90 | 75 | 45 |

In this latter case a much more positive picture is

painted, indicating that the students regard their

liking of the form teacher as a quite important

feature in their attitude to school work. Here

equalizing the sample to represent more fairly

the population by weighting yields a different

picture. Weighting the results is an important

consideration.
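
The weighting step in the form-teacher example can be sketched in a few lines. The code below is one possible way of doing it, not a prescribed method: each group's counts are scaled so that group totals match the known population shares, with no group scaled below its observed count (which, for a 50/50 school, amounts to doubling the boys). The function and variable names are hypothetical.

```python
def weight_to_population(counts: dict[str, list[int]],
                         population_share: dict[str, float]) -> dict[str, list[int]]:
    """Scale each group's response counts so that the group totals are in
    proportion to `population_share`, never shrinking a group."""
    group_totals = {group: sum(row) for group, row in counts.items()}
    # Implied overall size at which every group is scaled up or left as is.
    base = max(group_totals[g] / population_share[g] for g in counts)
    weighted = {}
    for group, row in counts.items():
        factor = base * population_share[group] / group_totals[group]
        weighted[group] = [round(c * factor) for c in row]
    return weighted


# Columns: very little, a little, somewhat, quite a lot, a very great deal.
observed = {"Male": [10, 20, 30, 25, 15], "Female": [50, 80, 30, 25, 15]}
weighted = weight_to_population(observed, {"Male": 0.5, "Female": 0.5})
totals = [sum(column) for column in zip(*weighted.values())]

print(weighted)
print("Total:", totals)
```

Halving the girls' counts instead would give the same proportions; what matters is the relative weight of the two groups, not the absolute totals.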

*The access to the sample*
Access is a key issue and is an early factor that must

be decided in research. Researchers will need to

ensure that access is not only permitted but also, in

fact, practicable. For example, if a researcher were

to conduct research into truancy and unauthorized

absence from school, and decided to interview

a sample of truants, the research might never

commence as the truants, by definition, would not

be present! Similarly access to sensitive areas might

be not only difficult but also problematical both

legally and administratively, for example, access

to child abuse victims, child abusers, disaffected

students, drug addicts, school refusers, bullies and

victims of bullying. In some sensitive areas access

to a sample might be denied by the potential

sample participants themselves, for example AIDS

counsellors might be so seriously distressed by their

work that they simply cannot face discussing with

a researcher the subject matter of their traumatic

work; it is distressing enough to do the job without

living through it again with a researcher.

Access might also be denied by the potential

sample participants themselves for very practical

reasons, for example a doctor or a teacher

simply might not have the time to spend with

the researcher. Further, access might be denied

by people who have something to protect, for

example a school which has recently received

a very poor inspection result or poor results on

external examinations, or people who have made

an important discovery or a new invention and

who do not wish to disclose the secret of their

success; the trade in intellectual property has

rendered this a live issue for many researchers.

There are very many reasons that might prevent

access to the sample, and researchers cannot afford

to neglect this potential source of difficulty in

planning research.

In many cases access is guarded by ‘gatekeepers’

– people who can control researchers’ access to

those whom they really want to target. For school

staff this might be, for example, headteachers,


school governors, school secretaries, form teachers;

for pupils this might be friends, gang members,

parents, social workers and so on. It is critical

for researchers to consider not only whether

access is possible but also how access will be

undertaken – to whom does one have to go, both

formally and informally, to gain access to the target

group.

Not only might access be difficult but also

its corollary – release of information – might be

problematic. For example, a researcher might gain

access to a wealth of sensitive information and

appropriate people, but there might be a restriction on the release of the data collected; in the field of education in the UK reports have been known to be suppressed, delayed or ‘doctored’. It is not always enough to be able to ‘get to’ the sample; the problem might be to ‘get the information out’ to

the wider public, particularly if it could be critical

of powerful people.

*The sampling strategy to be used*
There are two main methods of sampling (Cohen

and Holliday 1979; 1982; 1996; Schofield 1996).

The researcher must decide whether to opt for a probability sample (also known as a random sample) or a non-probability sample (also known as a purposive sample). The difference between them is this: in a probability sample the chances of members of the wider population being selected for the sample are known, whereas in a non-probability sample the chances of members of the wider population being selected for the sample are unknown. In the former (probability sample)

every member of the wider population has an

equal chance of being included in the sample;

inclusion or exclusion from the sample is a matter

of chance and nothing else. In the latter (non-probability sample) some members of the wider population definitely will be excluded and others definitely included (i.e. not every member of the wider population has an equal chance of being included in the sample). In this latter type the

researcher has deliberately – purposely – selected

a particular section of the wider population to

include in or exclude from the sample.

*Probability samples*
A probability sample, because it draws randomly

from the wider population, will be useful if the

researcher wishes to be able to make generalizations,

because it seeks representativeness of the

wider population. It also permits two-tailed tests

to be administered in statistical analysis of quantitative

data. Probability sampling is popular in

randomized controlled trials. On the other hand,

a non-probability sample deliberately avoids representing

the wider population; it seeks only to

represent a particular group, a particular named

section of the wider population, such as a class

of students, a group of students who are taking

a particular examination, a group of teachers

(see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.6.ppt).

A probability sample will have less risk of

bias than a non-probability sample, whereas,

by contrast, a non-probability sample, being

unrepresentative of the whole population, may

demonstrate skewness or bias. (For this type of

sample a one-tailed test will be used in processing

statistical data.) This is not to say that the former is

bias free; there is still likely to be sampling error in a

probability sample (discussed below), a feature that

has to be acknowledged, for example opinion polls

usually declare their error factors, e.g. ±3 per cent.

There are several types of probability samples:

simple random samples; systematic samples; stratified

samples; cluster samples; stage samples, and

multi-phase samples. They all have a measure of

randomness built into them and therefore have a

degree of generalizability.

**Simple random sampling**

In simple random sampling, each member of the

population under study has an equal chance of

being selected and the probability of a member

of the population being selected is unaffected

by the selection of other members of

the population, i.e. each selection is entirely

independent of the next. The method involves

selecting at random from a list of the population

(a sampling frame) the required number of


subjects for the sample. This can be done by

drawing names out of a container until the required

number is reached, or by using a table of

random numbers set out in matrix form (these

are reproduced in many books on quantitative

research methods and statistics), and allocating

these random numbers to participants or cases

(e.g. Hopkins

*et al*. 1996: 148–9). Because of
probability and chance, the sample should contain

subjects with characteristics similar to the

population as a whole; some old, some young,

some tall, some short, some fit, some unfit,

some rich, some poor etc. One problem associated

with this particular sampling method

is that a complete list of the population is

needed and this is not always readily available

(see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.7.ppt).

**Systematic sampling**

This method is a modified form of simple random

sampling. It involves selecting subjects from a

population list in a systematic rather than a

random fashion. For example, if from a population

of, say, 2,000, a sample of 100 is required,

then every twentieth person can be selected.

The starting point for the selection is chosen at

random (see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.8.ppt).

One can decide how frequently to make

systematic sampling by a simple statistic – the total

number of the wider population being represented

divided by the sample size required:

$$f = \frac{N}{sn}$$

where

$f$ = frequency interval

$N$ = the total number of the wider population

$sn$ = the required number in the sample.

Let us say that the researcher is working with a

school of 1,400 students; by looking at the table

of sample size (Box 4.1) required for a random

sample of these 1,400 students we see that 302

students are required to be in the sample. Hence

the frequency interval (f) is:

$$f = \frac{1{,}400}{302} \approx 4.636 \text{ (which rounds up to 5)}$$

Hence the researcher would pick out every fifth

name on the list of cases.

Such a process, of course, assumes that the

names on the list themselves have been listed in a

random order. A list of females and males might

list all the females first, before listing all the males;

if there were 200 females on the list, the researcher

might have reached the desired sample size before

reaching that stage of the list which contained

males, thereby distorting (skewing) the sample.

Another example might be where the researcher

decides to select every thirtieth person identified

from a list of school students, but it happens that:

(a) the school has just over thirty students in each

class; (b) each class is listed from high ability to

low ability students; (c) the school listing identifies

the students by class.

In this case, although the sample is drawn

from each class, it is not fairly representing the

whole school population since it is drawing almost

exclusively on the lower ability students. This is

the issue of *periodicity* (Calder 1979). Not only is
there the question of the order in which names

are listed in systematic sampling, but also there

is the issue that this process may violate one of

the fundamental premises of probability sampling,

namely that every person has an equal chance

of being included in the sample. In the example

above where every fifth name is selected, this

guarantees that names 1–4, 6–9 etc. will be

excluded, i.e. not everybody has an equal

chance to be chosen. The ways to minimize this

problem are to ensure that the initial listing is

selected randomly and that the starting point for

systematic sampling is similarly selected randomly.

**Stratified sampling**

Stratified sampling involves dividing the population

into homogeneous groups, each group

containing subjects with similar characteristics.

For example, group A might contain males and

group B, females. In order to obtain a sample

representative of the whole population in

terms of sex, a random selection of subjects

from group A and group B must be taken. If

needed, the exact proportion of males to females

in the whole population can be reflected

in the sample. The researcher will have to identify

those characteristics of the wider population

which must be included in the sample, i.e. to

identify the parameters of the wider population.

This is the essence of establishing the sampling

frame (see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.9.ppt).

To organize a stratified random sample is a

simple two-stage process. First, identify those

characteristics that appear in the wider population

that must also appear in the sample, i.e. divide

the wider population into homogenous and, if

possible, discrete groups (strata), for example

males and females. Second, randomly sample

within these groups, the size of each group

being determined either by the judgement of

the researcher or by reference to Boxes 4.1

or 4.2.
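As a rough sketch of the two-stage process, the Python below divides a population into strata and then samples randomly within each one. It assumes proportional allocation (each stratum's share of the sample matches its share of the population), which is only one of the options mentioned above; the function name and the 600/400 population are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, sample_size, rng=None):
    """Randomly sample within each stratum, in proportion to its population share."""
    rng = rng if rng is not None else random.Random()
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)    # stage 1: divide into strata
    sample = []
    for members in strata.values():
        # Proportional allocation; rounding may shift the total by a unit or two.
        quota = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, quota))    # stage 2: random selection
    return sample

# Hypothetical population: 600 females and 400 males.
population = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(population, lambda p: p[0], 100, random.Random(7))
counts = {sex: sum(1 for s, _ in sample if s == sex) for sex in ("F", "M")}
print(counts)   # {'F': 60, 'M': 40}
```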

The decision on which characteristics to include

should strive for simplicity as far as possible, as

the more factors there are, not only the more

complicated the sampling becomes, but often the

larger the sample will have to be to include

representatives of all strata of the wider population.

A stratified random sample is, therefore, a

useful blend of randomization and categorization,

thereby enabling both a quantitative and

qualitative piece of research to be undertaken.

A quantitative piece of research will be able

to use analytical and inferential statistics, while

a qualitative piece of research will be able to

target those groups in institutions or clusters of

participants who will be able to be approached to

participate in the research.

**Cluster sampling**

When the population is large and widely dispersed,

gathering a simple random sample poses

administrative problems. Suppose we want to survey

students’ fitness levels in a particularly large

community or across a country. It would be completely

impractical to select students randomly

and spend an inordinate amount of time travelling

about in order to test them. By cluster sampling,

the researcher can select a specific number of

schools and test all the students in those selected

schools, i.e. a geographically close cluster is sampled

(see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.10.ppt).
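A minimal sketch of this idea in Python, assuming a hypothetical mapping from school names to student lists: a fixed number of schools is chosen at random and every student in the chosen schools is included.

```python
import random

def cluster_sample(clusters, n_clusters, rng=None):
    """Pick `n_clusters` clusters at random and keep every member of each."""
    rng = rng if rng is not None else random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)   # random choice of schools
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical community: ten schools with thirty students each.
schools = {f"School {i}": [f"S{i}-{j}" for j in range(30)] for i in range(1, 11)}

selected = cluster_sample(schools, 3, random.Random(3))
tested = [student for pupils in selected.values() for student in pupils]
print(len(selected), len(tested))   # 3 schools, 90 students
```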

One would have to be careful to ensure that

cluster sampling does not build in bias. For

example, let us imagine that we take a cluster

sample of a city in an area of heavy industry or

great poverty; this may not represent all kinds of

cities or socio-economic groups, i.e. there may be

similarities within the sample that do not catch

the variability of the wider population. The issue

here is one of representativeness; hence it might be

safer to take several clusters and to sample lightly

within each cluster, rather than to take fewer clusters

and sample heavily within each.

Cluster samples are widely used in small-scale

research. In a cluster sample the parameters of the

wider population are often drawn very sharply; a

researcher, therefore, would have to comment on

the generalizability of the findings. The researcher

may also need to stratify within this cluster sample

if useful data, i.e. those which are focused and

which demonstrate discriminability, are to be

acquired.

**Stage sampling**

Stage sampling is an extension of cluster sampling.

It involves selecting the sample in stages, that

is, taking samples from samples. Using the large

community example in cluster sampling, one type

of stage sampling might be to select a number of

schools at random, and from within each of these

schools, select a number of classes at random,

and from within those classes select a number of

students.

Morrison (1993: 121–2) provides an example

of how to address stage sampling in practice. Let

us say that a researcher wants to administer a

questionnaire to all 16-year-old pupils in each

of eleven secondary schools in one region. By

contacting the eleven schools she finds that there

are 2,000 16-year-olds on roll. Because of questions

of confidentiality she is unable to find out the

names of all the students so it is impossible to

draw their names out of a container to achieve

randomness (and even if she had the names, it

would be a mind-numbing activity to write out

2,000 names to draw out of a container!). From

looking at Box 4.1 she finds that, for a random

sample of the 2,000 students, the sample size is

322 students. How can she proceed?

The first stage is to list the eleven schools on

a piece of paper and then to write the names of

the eleven schools on to small cards and place

each card in a container. She draws out the first

name of the school, puts a tally mark by the

appropriate school on her list and returns the

card to the container. The process is repeated 321

times, bringing the total to 322. The final totals

might appear thus:

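| School | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Required no. of students | 22 | 31 | 32 | 24 | 29 | 20 | 35 | 28 | 32 | 38 | 31 | 322 |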

For the second stage the researcher then

approaches the eleven schools and asks each of

them to select randomly the required number of

students for each school. Randomness has been

maintained in two stages and a large number

(2,000) has been rendered manageable. The

process at work here is to go from the general to

the specific, the wide to the focused, the large to

the small. Caution has to be exercised here, as the

assumption is that the schools are of the same size

and are large; that may not be the case in practice,

in which case this strategy may be inadvisable.
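The card-drawing procedure in this example can be imitated in Python. The sketch below is an illustration under stated assumptions: the school rolls are invented (2,000 students split almost evenly over eleven schools), and the function names `draw_allocation` and `select_students` are hypothetical.

```python
import random
from collections import Counter

def draw_allocation(schools, total_draws, rng):
    """Stage 1: draw a school name, tally it, return the card, and repeat."""
    return Counter(rng.choice(schools) for _ in range(total_draws))

def select_students(rolls, allocation, rng):
    """Stage 2: each school randomly selects its tallied number of students."""
    # rng.sample raises ValueError if a tally exceeds the school's roll,
    # which is the situation the caution about school size warns against.
    return {school: rng.sample(roll, allocation[school])
            for school, roll in rolls.items()}

# Hypothetical rolls: 2,000 sixteen-year-olds spread over eleven schools.
sizes = [182] * 9 + [181] * 2
rolls = {f"School {i}": [f"S{i}-{j}" for j in range(n)]
         for i, n in enumerate(sizes, start=1)}

rng = random.Random(2024)            # fixed seed only for repeatability
allocation = draw_allocation(list(rolls), 322, rng)
chosen = select_students(rolls, allocation, rng)
print(sum(allocation.values()), sum(len(v) for v in chosen.values()))
```

Because every school has the same chance on each draw, small schools receive roughly the same tally as large ones, which is why the approach presumes schools of similar size.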

**Multi-phase sampling**

In stage sampling there is a single unifying purpose

throughout the sampling. In the previous example

the purpose was to reach a particular group of

students from a particular region. In a multi-phase

sample the purposes change at each phase, for

example, at phase one the selection of the sample

might be based on the criterion of geography

(e.g. students living in a particular region); phase

two might be based on an economic criterion

(e.g. schools whose budgets are administered in

markedly different ways); phase three might be

based on a political criterion (e.g. schools whose

students are drawn from areas with a tradition

of support for a particular political party), and

so on. What is evident here is that the sample

population will change at each phase of the research

(see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.11.ppt).

*Non-probability samples*
The selectivity which is built into a non-probability

sample derives from the researcher

targeting a particular group, in the full knowledge

that it does not represent the wider population; it

simply represents itself. This is frequently the case

in small-scale research, for example, as with one

or two schools, two or three groups of students, or

a particular group of teachers, where no attempt

to generalize is desired; this is frequently the case

for some ethnographic research, action research

or case study research (see http://www.routledge.com/textbooks/9780415368780 –

Chapter 4, file 4.12.ppt). Small-scale research often uses non-probability

samples because, despite the disadvantages

that arise from their non-representativeness,

they are far less complicated to set up, are considerably

less expensive, and can prove perfectly

adequate where researchers do not intend to generalize

their findings beyond the sample in question,

or where they are simply piloting a questionnaire

as a prelude to the main study.

Just as there are several types of probability sample,

so there are several types of non-probability

sample: convenience sampling, quota sampling,

dimensional sampling, purposive sampling and

snowball sampling. Each type of sample seeks only

to represent itself or instances of itself in a similar

population, rather than attempting to represent

the whole, undifferentiated population.

**Convenience sampling**

Convenience sampling – or, as it is sometimes

called, accidental or opportunity sampling –

involves choosing the nearest individuals to serve

as respondents and continuing that process until

the required sample size has been obtained, or

choosing those who happen to be available and accessible

at the time. Captive audiences such as students or

student teachers often serve as respondents based

on convenience sampling. Researchers simply

choose the sample from those to whom they

have easy access. As it does not represent any

group apart from itself, it does not seek to

generalize about the wider population; for a

convenience sample that is an irrelevance. The

researcher, of course, must take pains to report

this point – that the parameters of generalizability

in this type of sample are negligible. A

convenience sample may be the sampling strategy

selected for a case study or a series of case

studies (see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.13.ppt).

**Quota sampling**

Quota sampling has been described as the

non-probability equivalent of stratified sampling

(Bailey 1978). Like a stratified sample, a

quota sample strives to represent significant characteristics

(strata) of the wider population; unlike

stratified sampling it sets out to represent these

in the proportions in which they can be found

in the wider population. For example, suppose

that the wider population (however defined) were

composed of 55 per cent females and 45 per cent

males, then the sample would have to contain 55

per cent females and 45 per cent males; if the

population of a school contained 80 per cent of

students up to and including the age of 16 and

20 per cent of students aged 17 and over, then

the sample would have to contain 80 per cent of

students up to the age of 16 and 20 per cent of students

aged 17 and above. A quota sample, then,

seeks to give proportional weighting to selected

factors (strata) which reflects the weighting with

which they are found in the wider population

(see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.14.ppt). The

researcher wishing to devise a quota sample can

proceed in three stages:

1 Identify those characteristics (factors) which

appear in the wider population which must

also appear in the sample, i.e. divide the wider

population into homogenous and, if possible,

discrete groups (strata), for example, males

and females, Asian, Chinese and African

Caribbean.

2 Identify the proportions in which the selected

characteristics appear in the wider population,

expressed as a percentage.

3 Ensure that the percentaged proportions of

the characteristics selected from the wider

population appear in the sample.

Ensuring correct proportions in the sample may

be difficult to achieve if the proportions in the

wider community are unknown or if access to the

sample is difficult; sometimes a pilot survey might

be necessary in order to establish those proportions

(and even then sampling error or a poor response

rate might render the pilot data problematical).

It is straightforward to determine the minimum

number required in a quota sample. Let us say that

the total number of students in a school is 1,700,

made up thus:

Performing arts 300 students

Natural sciences 300 students

Humanities 600 students

Business and Social Sciences 500 students

The proportions being 3:3:6:5, a minimum of 17

students might be required (3 + 3 + 6 + 5) for

the sample. Of course this would be a minimum

only, and it might be desirable to go higher than

this. The price of having too many characteristics

(strata) in quota sampling is that the minimum

number in the sample very rapidly could become

very large, hence in quota sampling it is advisable

to keep the numbers of strata to a minimum. The

larger the number of strata, the larger the number

in the sample will become, usually at a geometric

rather than an arithmetic rate of progression.
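The arithmetic behind the minimum of 17 can be checked by reducing the stratum sizes to their simplest whole-number ratio. The Python below is a small sketch of that calculation; the function name `minimum_quota_sample` is an assumption made for illustration.

```python
from functools import reduce
from math import gcd

def minimum_quota_sample(stratum_sizes):
    """Reduce stratum sizes to their simplest ratio and sum the parts."""
    divisor = reduce(gcd, stratum_sizes.values())   # greatest common divisor of all sizes
    quotas = {name: size // divisor for name, size in stratum_sizes.items()}
    return quotas, sum(quotas.values())

school = {
    "Performing arts": 300,
    "Natural sciences": 300,
    "Humanities": 600,
    "Business and Social Sciences": 500,
}
quotas, minimum = minimum_quota_sample(school)
print(quotas)    # ratio 3:3:6:5
print(minimum)   # 17
```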

**Purposive sampling**

In purposive sampling, often (but by no means

exclusively) a feature of qualitative research,

researchers handpick the cases to be included

in the sample on the basis of their judgement

of their typicality or possession of the particular

characteristics being sought. In this way, they build

up a sample that is satisfactory to their specific

needs. As its name suggests, the sample has been

chosen for a specific purpose, for example: a group

of principals and senior managers of secondary

schools is chosen as the research is studying the

incidence of stress among senior managers; a group

of disaffected students has been chosen because

they might indicate most distinctly the factors

which contribute to students’ disaffection (they

are *critical cases*, akin to ‘critical events’ discussed

in Chapter 18, or *deviant cases* – those cases which

go against the norm: Anderson and Arsenault

1998: 124); one class of students has been selected

to be tracked throughout a week in order to report

on the curricular and pedagogic diet which is

offered to them so that other teachers in the

school might compare their own teaching to that

reported. While it may satisfy the researcher’s

needs to take this type of sample, it does not

pretend to represent the wider population; it

is deliberately and unashamedly selective and

biased (see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.15.ppt).

In many cases purposive sampling is used in

order to access ‘knowledgeable people’, i.e. those

who have in-depth knowledge about particular

issues, maybe by virtue of their professional

role, power, access to networks, expertise or

experience (Ball 1990). There is little benefit

in seeking a random sample when most of

the random sample may be largely ignorant of

particular issues and unable to comment on

matters of interest to the researcher, in which

case a purposive sample is vital. Though they may

not be representative and their comments may not

be generalizable, this is not the primary concern

in such sampling; rather the concern is to acquire

in-depth information from those who are in a

position to give it.

Another variant of purposive sampling is the

*boosted* sample. Gorard (2003: 71) comments on

the need to use a boosted sample in order to include

those who may otherwise be excluded from, or

under-represented in, a sample because there are

so few of them. For example, one might have a very

small number of special needs teachers or pupils in

a primary school or nursery, or one might have a

very small number of children from certain ethnic

minorities in a school, such that they may not

feature in a sample. In this case the researcher will

deliberately seek to include a sufficient number of

them to ensure appropriate statistical analysis or

representation in the sample, adjusting any results

from them, through weighting, to ensure that they

are not over-represented in the final results. This

is an endeavour, perhaps, to reach and meet the

demands of social inclusion.
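One common way to carry out the weighting adjustment is to give each group a weight equal to its share of the population divided by its share of the sample. The Python sketch below assumes that approach; the counts, scores and function names are hypothetical, and other weighting schemes are possible.

```python
def boost_weights(population_counts, sample_counts):
    """Weight each group by (population share) / (sample share)."""
    pop_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    return {
        group: (population_counts[group] / pop_total)
               / (sample_counts[group] / sample_total)
        for group in sample_counts
    }

def weighted_mean(scores_by_group, weights):
    """Mean of all scores, with each score weighted by its group's weight."""
    total = sum(weights[g] * sum(scores) for g, scores in scores_by_group.items())
    weight_sum = sum(weights[g] * len(scores) for g, scores in scores_by_group.items())
    return total / weight_sum

# Hypothetical school: 20 minority pupils out of 1,000, all 20 included (boosted).
population_counts = {"minority": 20, "majority": 980}
sample_counts = {"minority": 20, "majority": 80}
weights = boost_weights(population_counts, sample_counts)

# Hypothetical scores: 1.0 for every minority pupil, 0.0 for everyone else.
scores = {"minority": [1.0] * 20, "majority": [0.0] * 80}
print(round(weighted_mean(scores, weights), 4))   # close to the 2% population share
```

Without the weights the boosted group would make up 20 per cent of the result; with them it counts for its true 2 per cent.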

A further variant of purposive sampling is *negative case sampling*. Here the researcher
deliberately seeks those people who might

disconfirm the theories being advanced (the

Popperian equivalent of falsifiability), thereby

strengthening the theory if it survives such

disconfirming cases. A softer version of negative

case sampling is *maximum variation* sampling,
selecting cases from as diverse a population as

possible (Anderson and Arsenault 1998: 124) in

order to ensure strength and richness to the data,

their applicability and their interpretation. In this

latter case, it is almost inevitable that the sample

size will increase or be large.

**Dimensional sampling**

One way of reducing the problem of sample size in

quota sampling is to opt for dimensional sampling.

Dimensional sampling is a further refinement of

quota sampling. It involves identifying various

factors of interest in a population and obtaining

at least one respondent of every combination of

those factors. Thus, in a study of race relations,

for example, researchers may wish to distinguish

first, second and third generation immigrants.

Their sampling plan might take the form of a

multidimensional table with ‘ethnic group’ across

the top and ‘generation’ down the side. A second

example might be of a researcher who may be interested

in studying disaffected students, girls and

secondary-aged students and who may find a single

disaffected secondary female student, i.e. a respondent

who is the bearer of all of the sought characteristics

(see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.16.ppt).
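The multidimensional table can be sketched as the set of all combinations of the chosen factors, with one respondent sought for each cell. In the Python below the group labels, the respondent records and both function names are hypothetical.

```python
from itertools import product

def dimensional_cells(dimensions):
    """List every combination of factor levels as a dict (one per table cell)."""
    names = list(dimensions)
    return [dict(zip(names, combo)) for combo in product(*dimensions.values())]

def fill_cells(cells, respondents):
    """Pair each cell with the first matching respondent, or None if none exists."""
    filled = []
    for cell in cells:
        match = next(
            (r for r in respondents if all(r.get(k) == v for k, v in cell.items())),
            None,
        )
        filled.append((cell, match))
    return filled

dimensions = {
    "ethnic group": ["Group A", "Group B"],          # placeholder labels
    "generation": ["first", "second", "third"],
}
cells = dimensional_cells(dimensions)
print(len(cells))   # 2 groups x 3 generations = 6 cells

respondents = [{"name": "R1", "ethnic group": "Group A", "generation": "second"}]
empty = [cell for cell, match in fill_cells(cells, respondents) if match is None]
print(len(empty))   # 5 cells still need a respondent
```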

**Snowball sampling**

In snowball sampling researchers identify a small

number of individuals who have the characteristics

in which they are interested. These people are

then used as informants to identify, or put the

researchers in touch with, others who qualify

for inclusion and these, in turn, identify yet

others – hence the term snowball sampling. This

method is useful for sampling a population where

access is difficult, maybe because it is a sensitive

topic (e.g. teenage solvent abusers) or where

communication networks are undeveloped (e.g.

where a researcher wishes to interview stand-in

‘supply’ teachers – teachers who are brought in

on an *ad hoc* basis to cover for absent regular
members of a school’s teaching staff – but finds

it difficult to acquire a list of these stand-in

teachers), or where an outside researcher has

difficulty in gaining access to schools (going

through informal networks of friends/acquaintances

and their friends and acquaintances and so on

rather than through formal channels). The task for

the researcher is to establish who are the critical or

key informants with whom initial contact must be

made (see http://www.routledge.com/textbooks/

9780415368780 – Chapter 4, file 4.17.ppt).
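The referral chain can be modelled as a walk outward from the initial informants. The sketch below assumes a hypothetical `referrals` mapping (who can put the researcher in touch with whom) and an upper limit on sample size; both are illustrative assumptions.

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Start from key informants and follow referrals, wave by wave."""
    queue = deque(dict.fromkeys(seeds))   # initial informants, duplicates removed
    seen = set(queue)
    sample = []
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:       # never recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network among supply teachers.
referrals = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Cy", "Dee"],
    "Cy": [],
    "Dee": ["Ana"],
}
print(snowball_sample(["Ana"], referrals, max_size=10))
# ['Ana', 'Ben', 'Cy', 'Dee']
```

The result depends entirely on the starting informants, which is why identifying the key informants is the critical task.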

**Volunteer sampling**

In cases where access is difficult, the researcher may

have to rely on volunteers, for example, personal

friends, or friends of friends, or participants who

reply to a newspaper advertisement, or those who

happen to be interested from a particular school,

or those attending courses. Sometimes this is

inevitable (Morrison 2006), as it is the only kind

of sampling that is possible, and it may be better

to have this kind of sampling than no research

at all.

In these cases one has to be very cautious

in making any claims for generalizability or

representativeness, as volunteers may have a range

of different motives for volunteering, e.g. wanting

to help a friend, interest in the research, wanting

to benefit society, an opportunity for revenge on a

particular school or headteacher. Volunteers may

be well intentioned, but they do not necessarily

represent the wider population, and this would

have to be made clear.

**Theoretical sampling**

This is a feature of grounded theory. In grounded

theory the sample size is relatively immaterial, as

one works with the data that one has. Indeed

grounded theory would argue that the sample size

could be infinitely large, or, as a fall-back position,

large enough to saturate the categories and issues,

such that new data will not cause the theory that

has been generated to be modified.

Theoretical sampling requires the researcher

to have sufficient data to be able to generate

and ‘ground’ the theory in the research context,

however defined, i.e. to create a theoretical

explanation of what is happening in the situation,

without having any data that do not fit the theory.

Since the researcher will not know in advance

how much, or what range of data will be required,

it is difficult, to the point of either impossibility,

exhaustion or time limitations, to know in advance

the sample size required. The researcher proceeds

in gathering more and more data until the theory

remains unchanged or until the boundaries of

the context of the study have been reached,

until no modifications to the grounded theory are

made in light of the constant comparison method.

Theoretical saturation (Glaser and Strauss 1967:

61) occurs when no additional data are found that

advance, modify, qualify, extend or add to the

theory developed.
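The stopping rule can be caricatured in code as "keep gathering data until a round of data adds nothing new". The Python below is a deliberately simplified sketch: it treats saturation as the absence of new categories, whereas in practice saturation is a matter of the researcher's judgement. The names `sample_until_saturated`, `code_batch` and `patience` are assumptions made for illustration.

```python
def sample_until_saturated(batches, code_batch, patience=1):
    """Collect batches until `patience` consecutive batches add no new category."""
    categories = set()
    batches_used = 0
    quiet_rounds = 0
    for batch in batches:
        batches_used += 1
        new = code_batch(batch) - categories     # categories not seen before
        categories |= new
        quiet_rounds = 0 if new else quiet_rounds + 1
        if quiet_rounds >= patience:             # nothing new: treat as saturated
            break
    return categories, batches_used

# Hypothetical coded interviews: each batch is the set of categories it yields.
batches = [{"workload", "pay"}, {"pay", "status"}, {"status"}, {"isolation"}]
categories, used = sample_until_saturated(batches, code_batch=set)
print(sorted(categories), used)   # ['pay', 'status', 'workload'] 3
```

Note that the fourth batch would have added a category; a low `patience` value risks stopping too early, which mirrors the difficulty of knowing the sample size in advance.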

Glaser and Strauss (1967) write that

theoretical sampling is the process of data collection

for generating theory whereby the analyst jointly

collects, codes, and analyzes his [*sic*] data and decides
what data to collect next and where to find them, in

order to develop his theory as it emerges.

(Glaser and Strauss 1967: 45)

The two key questions for the grounded theorist

using theoretical sampling are, first, to which

groups does one turn next for data? Second, for

what theoretical purposes does one seek further

data? In response to the first, Glaser and Strauss

(1967: 49) suggest that the decision is based on

theoretical relevance, i.e. those groups that will

assist in the generation of as many properties and

categories as possible.

Hence the size of the data set may be fixed by the

number of participants in the organization, or the

number of people to whom one has access, but

the researcher has to consider that the door may

have to be left open for him/her to seek further

data in order to ensure theoretical adequacy and to

check what has been found so far with further data

(Flick *et al*. 2004: 170). In this case it is not always
possible to predict at the start of the research just

how many, and who, the research will need for the

sampling; it becomes an iterative process.

Non-probability samples also reflect the issue

that sampling can be of *people* but it can also

be of *issues*. Samples of people might be selected
because the researcher is concerned to address

specific issues, for example, those students who

misbehave, those who are reluctant to go to school,

those with a history of drug dealing, those who

prefer extra-curricular to curricular activities. Here

it is the issue that drives the sampling, and so the

question becomes not only ‘whom should I sample’

but also ‘what should I sample’ (Mason 2002:

127–32). In turn this suggests that it is not only

people who may be sampled, but texts, documents,

records, settings, environments, events, objects,

organizations, occurrences, activities and so on.

*Planning a sampling strategy*
There are several steps in planning the sampling

strategy:

1 Decide whether you need a sample, or whether

it is possible to have the whole population.

2 Identify the population, its important features

(the sampling frame) and its size.

3 Identify the kind of sampling strategy you

require (e.g. which variant of probability and

non-probability sample you require).

4 Ensure that access to the sample is guaranteed.

If not, be prepared to modify the sampling

strategy (step 2).

5 For probability sampling, identify the confidence

level and confidence intervals that you

require.

For non-probability sampling, identify the

people whom you require in the sample.

6 Calculate the numbers required in the sample,

allowing for non-response, incomplete or

spoiled responses, attrition and sample

mortality, i.e. build in redundancy.

7 Decide how to gain and manage access

and contact (e.g. advertisement, letter,

telephone, email, personal visit, personal

contacts/friends).

8 Be prepared to weight (adjust) the data, once

collected.
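Step 6 (building in redundancy) can be illustrated with a simple calculation: divide the number of usable responses required by the proportion expected to respond, and round up. The Python below is a sketch under that assumption; the 70 and 80 per cent response rates are invented figures, and the function name is hypothetical.

```python
def invitations_needed(required_responses, expected_usable_percent):
    """Number of people to approach so that enough usable responses come back."""
    if not 0 < expected_usable_percent <= 100:
        raise ValueError("expected_usable_percent must be in the range 1 to 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (required_responses * 100 + expected_usable_percent - 1) // expected_usable_percent

# Hypothetical case: 322 usable responses are required.
print(invitations_needed(322, 70))   # 460
print(invitations_needed(322, 80))   # 403
```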

*Conclusion*
The message from this chapter is the same as for

many of the others – that every element of the

research should not be arbitrary but planned and

deliberate, and that, as before, the criterion of

planning must be *fitness for purpose*. The selection
of a sampling strategy must be governed by the

criterion of suitability. The choice of which

strategy to adopt must be mindful of the purposes

of the research, the time scales and constraints on

the research, the methods of data collection, and

the methodology of the research. The sampling

chosen must be appropriate for all of these factors

if validity is to be served.

To the question ‘how large should my sample

be?’, the answer is complicated. This chapter has

suggested that it all depends on:

population size

confidence level and confidence interval required

accuracy required (the smallest sampling error sought)

number of strata required

number of variables included in the study

variability of the factor under study

the kind of sample (different kinds of sample within probability and non-probability sampling)

representativeness of the sample

allowances to be made for attrition and non-response

need to keep proportionality in a proportionate sample.

That said, this chapter has urged researchers to

use large rather than small samples, particularly in

quantitative research.
