
Sampling: Sampling Design and Its Types


Definition of different terms

What is sampling?

Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population. The methodology used to sample from a larger population depends on the type of analysis being performed.


Put another way, sampling is the act, process, or technique of selecting a representative part of a population in order to estimate parameters of the whole population.

  • Population:

A group of individuals having similar characteristics is called a population.

  • Sample:

A small group drawn from a population that is supposed to carry all the characteristics of the population, that is, to represent the whole population.

A sample should have three major qualities:

  • Representativeness
  • Reasonable size
  • Unbiasedness

  • Sample frame:

It is the list of all the sampling units in the population from which the sample is drawn.

  • Target Population:

The population to be studied, that is, the one to which the investigator wants to generalize the results.

  • Sample unit:

The individual member of the sample frame is called the sample unit.

  • Sample size:

It is the number of individuals who are included in the sample.

  • Census study:

Sometimes, the entire population will be sufficiently small, and the researcher can include the entire population in the study. This type of research is called a census study because data is gathered on every member of the population.

A population frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled and may include individuals, households or institutions.



Probability sampling

  • It is done from a homogeneous population.
  • It is a sampling technique in which each member of the population has a known chance (in the simplest case, an equal chance) of being selected as a sample unit.
  • The various ways of probability sampling have two things in common:
      • Every element has a known nonzero probability of being sampled.
      • Random selection is involved at some point.


Advantages

  • Ensures minimum bias


Disadvantages

  • Time-consuming
  • Costly
  • Requires a population frame


Types of probability sampling

  • Simple random sampling
  • Stratified random sampling
  • Systematic sampling
  • Cluster sampling
  • Multistage sampling


Simple random sampling

In this method, each sampling unit of the population has an equal chance of being selected in the sample.


Advantages

  • Easy to apply
  • Randomness ensured


Disadvantages

  • Tedious (too long)
  • Cumbersome (difficult to use)
  • Expensive
  • A complete population frame may not be available

How to draw a simple random sample?

The simple random sampling procedure is as follows:

  1. Make a numbered list of all units in the reference population from which you will select the sample (for example, a list of all the health centers in the country).
  2. Decide on the size of the sample (the WHO Drug Use Indicators method requires a minimum of 20 facilities).
  3. Choose the facilities to include using a lottery method. (For example, the numbers of all the facilities can be placed in a box and drawn, a random number table can be used, or random numbers can be generated using a spreadsheet or calculator.)
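
The lottery step above can be sketched in a few lines of Python. This is only an illustration: the list of 46 health centers and the function name `simple_random_sample` are hypothetical, and the standard library's `random.sample` stands in for the box, random number table, or spreadsheet.

```python
import random


def simple_random_sample(units, sample_size, seed=None):
    """Draw sample_size units without replacement, each with an equal chance."""
    if sample_size > len(units):
        raise ValueError("sample size cannot exceed the size of the list")
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(units, sample_size)


# Hypothetical numbered list of all units in the reference population
health_centers = [f"Health center {i}" for i in range(1, 47)]

# Draw the minimum of 20 facilities mentioned above
chosen = simple_random_sample(health_centers, 20, seed=1)
print(chosen)
```

Passing a seed is optional; it is useful only when the same draw has to be reproduced later, for example for an audit of the selection.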


Systematic sampling

Every kth case is chosen for the study from a list of cases.

Sampling interval

The sampling interval (K) is determined by dividing the size of the population by the desired sample size:

K = N/n

N = size of population

n = desired sample size

Sampling fraction

The ratio between sample size and population size (n/N).


Advantages

  • Convenient
  • Adequately represents all sections of the population
  • Simpler
  • Unbiased


  • The need for preparing a population frame can be avoided.

How to draw a systematic random sample?

To create a systematic random sample, there are seven steps:

(a) Defining the population

(b) Choosing your sample size

(c) Listing the population

(d) Assigning numbers to cases

(e) Calculating the sampling fraction

(f) Selecting the first unit

(g) Selecting the sample

To find the sampling interval, divide the size of the list by the desired sample size. For example, if we want to select 20 health centers from a list of 46 in our sampling frame, our sampling interval would be 46/20 = 2.3.

The first facility chosen in this case can be either 1, 2 or 3, which are all the possible sampling units within the first sampling interval.

The procedure is as follows:

  1. Choose a random number between 0 and 1 (with at least 3 digits after the decimal point).
  2. Multiply this random number by the sampling interval.
  3. Round this result upward to get the number of the first facility. For example, if the random number chosen is 0.183, the first unit for the sample is 0.183 x 2.3 = 0.421, which rounds upward to 1; thus, the first facility on the list is chosen for the sample.

Later facilities are selected by adding the sampling interval to the previous result. If the first result was 0.421, then the facilities selected would be:

Facility 1

0.421 + 2.3 = 2.721 so Facility 3 (Remember: always round upward)

2.721 + 2.3 = 5.021 so Facility 6

5.021 + 2.3 = 7.321 so Facility 8

And so forth.

If the first result had been 1.749, then the first facility would be Facility 2, and the next facilities selected would be:

Facility 2

1.749 + 2.3 = 4.049 so Facility 5

4.049 + 2.3 = 6.349 so Facility 7

6.349 + 2.3 = 8.649 so Facility 9

And so forth
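
The procedure above (fractional interval, random start, always round upward) can be written as a short Python sketch. The function name `systematic_sample` and its `start` parameter are hypothetical; `start` lets the two worked examples be reproduced instead of drawing a new random number.

```python
import math
import random


def systematic_sample(population_size, sample_size, start=None):
    """Return 1-based list positions selected with a fractional interval."""
    interval = population_size / sample_size  # sampling interval K = N / n
    if start is None:
        # random number in [0, 1) multiplied by the sampling interval
        start = random.random() * interval
    positions = []
    for i in range(sample_size):
        position = math.ceil(start + i * interval)  # always round upward
        # guard against a start of exactly 0 and floating-point overshoot
        positions.append(min(max(position, 1), population_size))
    return positions


# First worked example: first result 0.421, interval 46 / 20 = 2.3
print(systematic_sample(46, 20, start=0.421)[:4])

# Second worked example: first result 1.749
print(systematic_sample(46, 20, start=1.749)[:4])
```

Each position is computed as the start plus a multiple of the interval rather than by repeated addition, which gives the same facilities while limiting accumulated rounding error.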

The method just described gives every unit an equal chance of being selected. With minor modifications, it can also be used to select units in a way that takes their size into account.

Sometimes it is desirable for clinics serving larger populations to have a greater chance of being included in a sample. This method is called sampling with probability proportional to size.

Systematic sampling is also useful when sampling prescriptions from a patient register. If a register contains 100 pages, each with 25 lines of prescriptions, and you need to select 30 prescriptions, the sampling interval would be:

(100 x 25) / 30 = 83.3

Thus every 83rd prescription would be sampled. Multistage sampling, described as the fifth method below, could also be used to select a sample from a patient register.


Stratified random sampling

When the population is not homogeneous, we consider different sections of the population which are homogeneous within themselves. The population is divided into a number of sections called strata. A sample is drawn independently from each stratum using a simple random method.
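
A minimal sketch of drawing an independent simple random sample from each stratum is shown here. The strata ("urban", "rural"), their sizes, and the function name `stratified_sample` are hypothetical; the per-stratum sample sizes are assumed to have been calculated beforehand.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] units from each stratum."""
    rng = random.Random(seed)
    sample = {}
    for name, units in strata.items():
        # each stratum is sampled independently of the others
        sample[name] = rng.sample(units, sizes[name])
    return sample


# Hypothetical population of 100 units split into two strata
strata = {
    "urban": [f"U{i}" for i in range(1, 31)],  # 30 units
    "rural": [f"R{i}" for i in range(1, 71)],  # 70 units
}
# Assumed allocation for a total sample of 20, proportional to stratum size
sizes = {"urban": 6, "rural": 14}

picked = stratified_sample(strata, sizes, seed=7)
print(picked)
```

Proportional allocation is only one possible choice; the sizes dictionary can hold any allocation the study design calls for.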


Advantages

  • Can conduct an analysis of subgroups
  • Sampling variations are lower
  • Represents the population
  • Improves the accuracy/efficiency of estimation


Disadvantages

  • Calculation of sample size for each subgroup is a must
  • Time-consuming
  • Can be costly


Cluster sampling

This method is used when the whole population is made up of many natural groups. In this method, a group (cluster) is taken as the sampling unit.

  • Elements within a cluster should be heterogeneous.
  • There should be homogeneity between the clusters.
  • Random sampling is applied to the relevant clusters.
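
As a rough sketch of the idea that the group, not the individual, is the sampling unit: the code below picks whole clusters at random and then includes every unit inside them. The village names, household labels, and the function name `cluster_sample` are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and return every unit inside them."""
    rng = random.Random(seed)
    # the cluster itself is the sampling unit
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Hypothetical population: 5 villages (clusters) of 4 households each
clusters = {
    f"Village {v}": [f"V{v}-household {h}" for h in range(1, 5)]
    for v in range(1, 6)
}

chosen_clusters, households = cluster_sample(clusters, 2, seed=3)
print(chosen_clusters)
print(households)
```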


Advantages

  • Administrative convenience
  • A full sampling frame of the population is not needed
  • Full spectrum of persons is represented


Disadvantages

  • Vigilance (careful attention) is needed to ensure homogeneity between clusters and heterogeneity within each cluster.


Multistage sampling

  • It is a modification of the cluster sampling method rather than a direct application of it.
  • It is a probability sampling technique in which the sampling is carried out in several stages, such that the sample size is reduced at each stage.
  • In simple words, the large population is divided into stages to make sampling more practical.

  • It is a complex form of cluster sampling in which two or more levels of units are embedded one in the other.

For example

  • First stage: a random number of districts is chosen in all states.
  • Second stage: a random number of villages is chosen in the selected districts.
  • Third stage: the units are houses.
  • All ultimate units (houses, for instance) selected at the last step are surveyed.
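
The district, village, and house example can be sketched as nested random draws. Everything named here is hypothetical: the districts, villages, house labels, and the function name `multistage_sample`. The state level is left out to keep the sketch short.

```python
import random


def multistage_sample(districts, n_districts, n_villages, seed=None):
    """districts maps district -> {village -> [houses]}.

    Stage 1 draws districts, stage 2 draws villages inside each chosen
    district, and every house in the chosen villages is surveyed.
    """
    rng = random.Random(seed)
    surveyed = []
    for district in rng.sample(sorted(districts), n_districts):
        villages = districts[district]
        k = min(n_villages, len(villages))  # cannot draw more than exist
        for village in rng.sample(sorted(villages), k):
            surveyed.extend(villages[village])
    return surveyed


# Hypothetical frame: 4 districts x 3 villages x 5 houses
districts = {
    f"District {d}": {
        f"Village {d}.{v}": [f"House {d}.{v}.{h}" for h in range(1, 6)]
        for v in range(1, 4)
    }
    for d in range(1, 5)
}

houses = multistage_sample(districts, n_districts=2, n_villages=2, seed=11)
print(len(houses))
```

Only the units at each selected level need to be listed, which is why a full population list may not be required.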


Advantages

  • Economical
  • Population list may not be required


Disadvantages

  • Each sample may not be fully representative of the whole population.


Non-probability sampling

Any sampling method where some elements of the population have no chance of selection (these are sometimes referred to as ‘out of coverage’ or ‘undercovered’), or where the probability of selection can’t be accurately determined. The elements do not have a pre-determined chance of being selected, and samples are not picked randomly.


Advantages

  • Economical
  • Convenient
  • Used when time or other factors rather than generalizability become critical


Convenience sampling

It is also known as accidental, haphazard or accessibility sampling. The sample is selected from elements of a population that are easily accessible. A readily available group of individuals is used.


Advantages

  • Practical approach
  • Relies on readily available units
  • Administrative convenience
  • Ease of access


Disadvantages

  • Opportunistic and voluntary nature of participants
  • Lack of representativeness


Judgmental sampling

  • It is also known as purposive sampling.
  • It is a non-probability sampling technique where the researcher selects units to be sampled based on their knowledge and professional judgment.
  • This is usually an extension of convenience sampling.
  • Data collection is confined to the specific people who can provide the desired information.
  • This method is used when not enough of the eligible subjects are willing to cooperate.


Advantages

  • Less time-consuming, as a large number of interviewers are not needed
  • No statistical knowledge is required
  • Does not require vast knowledge about mathematics


Disadvantages

  • It is unscientific
  • It is solely dependent on a thorough knowledge of the population
  • There is no logic to the selection of the sample or its size

Snowball sampling (chain sampling)

It relies on referrals from initial subjects to generate additional subjects.

It is used when participants are hard to find, for example, in a study investigating cheating in exams.


Advantages

  • The ability to recruit hidden populations
  • Economical


Disadvantages

  • It can lead to bias
  • It’s not possible to determine the sampling error
  • There is no guarantee about the representativeness of samples


Quota sampling

It is a non-probability sampling technique where the assembled sample has the same proportions of individuals as the entire population with respect to known characteristics, traits or the phenomenon of focus.

Two types of quota

  1. Equitable quota

An equal number of sample units is selected from each group.

Example: 100 samples are selected from each of Punjab, Sindh, K.P.K., and Baluchistan.

  2. Ratio based quota

On the basis of ratio, sample units are selected from all areas of the population.

Example: If we have to select 200 samples from Pakistan, then we will select 45% from Punjab, 30% from Sindh, 15% from Baluchistan and 10% from K.P.K.
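
The two quota types reduce to simple arithmetic, sketched below with the figures from the examples. The function names `equitable_quota` and `ratio_quota` are hypothetical, and the equitable example is read as 100 units per province.

```python
def equitable_quota(groups, per_group):
    """Assign the same number of sample units to every group."""
    return {group: per_group for group in groups}


def ratio_quota(total, percentages):
    """Split a total sample across groups according to percentage shares."""
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must add up to 100")
    return {group: round(total * pct / 100) for group, pct in percentages.items()}


provinces = ["Punjab", "Sindh", "K.P.K.", "Baluchistan"]
print(equitable_quota(provinces, 100))

shares = {"Punjab": 45, "Sindh": 30, "Baluchistan": 15, "K.P.K.": 10}
print(ratio_quota(200, shares))
```

For 200 samples the ratio based quota works out to 90 from Punjab, 60 from Sindh, 30 from Baluchistan and 20 from K.P.K. How the units inside each quota are then recruited is not random, which is what keeps this a non-probability method.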


Advantages

  • Moderate cost
  • Very extensively used and understood
  • No need for the list of population elements


Disadvantages

  • Variability and bias cannot be measured and controlled
  • Projecting data beyond the sample is not justified


Sample size

The sample size is the number of patients or other experimental units included in a study, and one of the first practical steps in designing a trial is the choice of the sample size needed to answer the research question.

In practice, the sample size used in a study is determined based on the expense of data collection, and the need to have sufficient statistical power.

How large a sample do we need?

If the sample is too small:

  • Even a well-conducted study fails to answer the research questions.
  • It may fail to detect important effects and associations.
  • Accuracy is lower.

If the sample size is too large:

  • The study will become difficult to conduct.
  • It is costly.
  • It is time-consuming.

Sample size calculation

1) On the basis of population

2) On the basis of prevalence


Population based

Used for measurable data like height, weight, blood pressure, etc.

When the total population is known:


Where: N = total population, e = margin of error
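
A formula that uses only these two symbols is Yamane's formula (often called Slovin's formula), n = N / (1 + N e^2). That this is the formula intended here is an assumption based on the symbols N and e alone. A minimal Python sketch, with a hypothetical function name and illustrative inputs:

```python
import math


def sample_size_known_population(N, e):
    """Yamane's formula n = N / (1 + N * e^2), rounded up to a whole unit.

    Assumption: this is the formula matching the symbols N and e above.
    """
    return math.ceil(N / (1 + N * e ** 2))


# Illustrative values: a population of 1000 and a 5% margin of error
n = sample_size_known_population(1000, 0.05)
print(n)  # 1000 / 3.5 = 285.7..., rounded up to 286
```

The result is rounded upward because a fraction of a sampling unit cannot be observed.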

When the total population is unknown:



Zα is the z-table value for the assumed alpha (0.05)

δ is the variation (standard deviation)

e is the margin of error
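
With these three symbols, the usual formula for estimating a mean is n = (Zα δ / e)^2; treating it as the formula meant here is an assumption. A small sketch, in which the function name, the standard deviation of 15 and the margin of 2 are all illustrative:

```python
import math


def sample_size_for_mean(sd, e, z=1.96):
    """n = (z * sd / e)^2, rounded up.

    z = 1.96 is the two-sided z-table value for alpha = 0.05.
    Assumption: this is the formula matching the symbols defined above.
    """
    return math.ceil((z * sd / e) ** 2)


# Illustrative values: standard deviation 15, margin of error 2 (same units)
n_mean = sample_size_for_mean(sd=15, e=2)
print(n_mean)  # (1.96 * 15 / 2)^2 = 216.09, rounded up to 217
```

The standard deviation usually has to come from a pilot study or earlier literature, since it is not known before data collection.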

Prevalence based

Used for immeasurable data like intelligence, beauty or for binomial data like true/false, pass/fail, present/absent, etc.

  1. The sample size estimate for prevalence studies is a function of expected prevalence and precision for a given level of confidence expressed by the z statistic.



p is the probability (chance) that an event occurs, and q is the probability that it does not occur. q is also written as 1-p.

If p is 20% (0.2) then q is 80% (0.8).
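
The standard prevalence formula built from these quantities is n = Z^2 p q / e^2, where e is the desired precision; assuming that is the formula meant above, the p = 0.2 example can be worked through as follows. The function name and the 5% precision are illustrative.

```python
import math


def sample_size_for_prevalence(p, e, z=1.96):
    """n = z^2 * p * q / e^2 with q = 1 - p, rounded up.

    z = 1.96 corresponds to a 95% level of confidence.
    Assumption: this is the formula matching the description above.
    """
    q = 1 - p
    return math.ceil(z ** 2 * p * q / e ** 2)


# Expected prevalence 20% (so q is 80%), precision 5%
n_prev = sample_size_for_prevalence(p=0.2, e=0.05)
print(n_prev)  # 3.8416 * 0.16 / 0.0025 = 245.86, rounded up to 246
```

When nothing is known about the prevalence, p = 0.5 is a common conservative choice because it maximizes p times q and therefore the required sample size.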

  2. By using the t-distribution



n is the sample size and p is the prevalence

t is the t-distribution value for a confidence interval of 0.95

m is the margin of error, that is, 5%
