SAMPLING
Definition of different terms
What is sampling?
Sampling is a process used in statistical analysis in which a predetermined number of observations are taken from a larger population. The methodology used to sample from a larger population depends on the type of analysis being performed.
OR
The act, process, or technique of selecting a representative part of a population for the purpose of determining parameters of the whole population.

Population:
A group of individuals having similar characteristics is called a population.

Sample:
A sample is a small group drawn from a population that is expected to carry the characteristics of the population, that is, to represent the whole population.
A sample should have three major qualities:
 Representativeness
 Reasonable size
 Unbiased

Sample frame:
It is the list of all the sampling units in a population from which the sample is drawn.

Target Population:
The population to be studied, that is, the population to which the investigator wants to generalize the results.

Sample unit:
The individual member of the sample frame is called the sample unit.

Sample size:
It is the number of individuals who are included in the sample.

Census study:
Sometimes, the entire population will be sufficiently small, and the researcher can include the entire population in the study. This type of research is called a census study because data is gathered on every member of the population.
A population frame is the source material or device from which a sample is drawn. It is a list of all those within a population who can be sampled and may include individuals, households or institutions.
SAMPLING DESIGNING PROCESS
PROBABILITY SAMPLING
 It is done from a homogeneous population.
 It is a sampling technique in which each member of the population has an equal chance of being selected as a sample unit.
 The various ways of probability sampling have two things in common:
 Every element has a known nonzero probability of being sampled.
 Involves random selection at some point.
Merits
 Ensures minimum bias.
Demerits
 Time taking
 Costly
 Requires population frame
TYPES OF PROBABILITY SAMPLING:
Simple Random Sampling
Stratified Random Sampling
Systematic Sampling
Cluster Sampling
Multistage Sampling
SIMPLE RANDOM SAMPLING
In this method, each sampling unit of the population has an equal chance of being selected in the sample.
Merits
 Easy to apply
 Randomness ensured
Demerits
 Tedious (Too long)
 Cumbersome (Difficult to use)
 Expensive
 A complete population frame may not be available
How to draw a simple random sample?
The simple random sampling procedure is as follows:
 a) Make a numbered list of all units in the reference population from which you will select the sample (for example, a list of all the health centers in the country).
 b) Decide on the size of the sample (the WHO Drug Use Indicators method requires a minimum of 20 facilities).
 c) Choose the facilities to include using a lottery method. (For example, the numbers of all the facilities can be placed in a box and drawn, a random number table can be used, or random numbers can be generated using a spreadsheet or calculator.)
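The three steps above can be sketched in Python as follows. This is only an illustration: the list of 46 facilities, the function name simple_random_sample and the seed are assumptions made for the example, and the standard library's random module stands in for the lottery box or random number table.

```python
import random

def simple_random_sample(units, sample_size, seed=None):
    """Draw sample_size units without replacement; every unit is equally likely."""
    if sample_size > len(units):
        raise ValueError("sample size cannot exceed the population size")
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(units, sample_size)

# Step a: hypothetical numbered list of 46 health centers.
health_centers = [f"Facility {i}" for i in range(1, 47)]
# Step b: sample size of 20. Step c: the random draw.
chosen = simple_random_sample(health_centers, 20, seed=1)
print(sorted(chosen, key=lambda name: int(name.split()[1])))
```

Drawing without replacement means no facility can appear twice in the sample.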
SYSTEMATIC SAMPLING
Every Kth case is chosen for the study from a list of cases.
K is determined by dividing the size of the population by the desired sample size.
Sampling interval
The ratio between the population size and the sample size (its reciprocal, n/N, is the sampling fraction)
K = N/n
N = size of population
n = desired sample size
Merits
 Convenient
 Adequately represents all sections of the population
 Simpler
 Unbiased
Demerits
 The need for preparing a population frame (a numbered list of all cases) cannot be avoided.
How to draw a systematic random sample?
To create a systematic random sample, there are seven steps:
(a) Defining the population
(b) Choosing the sample size
(c) Listing the population
(d) Assigning numbers to cases
(e) Calculating the sampling interval
(f) Selecting the first unit
(g) Selecting the sample
To find the sampling interval, divide the size of the list by the desired sample size. For example, if we want to select 20 health centers from a list of 46 in our sampling frame, our sampling interval would be 46/20 = 2.3.
The first facility chosen in this case can be either 1, 2 or 3, which are all the possible sampling units within the first sampling interval.
The procedure is as follows:
 a) Choose a random number between 0 and 1 (with at least 3 digits after the decimal point).
 b) Multiply this random number by the sampling interval, and
 c) Round this result upward to get the number of the first facility. For example, if the random number chosen is 0.183, the first unit for the sample is 0.183 x 2.3 = 0.421, which rounds upward to 1; thus, the first facility on the list is chosen for the sample.
Later facilities are selected by adding the sampling interval to the previous result. If the first result was 0.421, then the facilities selected would be:
Facility 1
0.421 + 2.3 = 2.721 so Facility 3 (Remember: always round upward)
2.721 + 2.3 = 5.021 so Facility 6
5.021 + 2.3 = 7.321 so Facility 8
And so forth.
If the first result had been 1.749, then the first facility would be Facility 2, and the next facilities selected would be:
Facility 2
1.749 + 2.3 = 4.049 so Facility 5
4.049 + 2.3 = 6.349 so Facility 7
6.349 + 2.3 = 8.649 so Facility 9
And so forth
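The procedure above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation; the function name systematic_sample and its parameters are assumptions made for the example.

```python
import math

def systematic_sample(list_size, sample_size, random_start):
    """Return the 1-based list positions chosen by systematic sampling.

    random_start is a random number greater than 0 and at most 1. It is
    multiplied by the sampling interval, the interval is added repeatedly,
    and every running total is rounded upward.
    """
    interval = list_size / sample_size
    positions = []
    for i in range(sample_size):
        total = (random_start + i) * interval
        # round() removes floating-point noise before rounding upward
        positions.append(math.ceil(round(total, 9)))
    return positions

# The worked example: 20 facilities from a list of 46, random number 0.183.
facilities = systematic_sample(46, 20, 0.183)
print(facilities[:4])  # [1, 3, 6, 8]
```

Because the interval (2.3) is larger than 1, no facility can be selected twice, and the last position never exceeds the length of the list.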
The method just described gives every unit an equal chance of being selected. With minor modifications, it can also be used to select units in a way that allows for how large they are.
Sometimes it is desirable for clinics serving larger populations to have a greater chance of being included in a sample. Selecting units in this way is called sampling with probability proportional to size.
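One common way to do this is to apply the systematic method to the cumulative population served rather than to the facility numbers. The sketch below assumes that approach; the clinic sizes and the function name pps_systematic_sample are hypothetical and used only for illustration.

```python
import bisect
import itertools
import random

def pps_systematic_sample(sizes, sample_size, random_start):
    """Return 0-based indexes of units chosen with probability proportional to size.

    sizes: population served by each unit, in list order.
    random_start: a number greater than 0 and at most 1.
    """
    interval = sum(sizes) / sample_size  # interval counted in people, not units
    cumulative = list(itertools.accumulate(sizes))
    chosen = []
    for i in range(sample_size):
        point = (random_start + i) * interval
        # first unit whose cumulative size reaches the selection point
        index = bisect.bisect_left(cumulative, point)
        chosen.append(min(index, len(sizes) - 1))  # guard against float overshoot
    return chosen

# Hypothetical clinics and the populations they serve.
clinic_sizes = [100, 400, 250, 250]
print(pps_systematic_sample(clinic_sizes, 2, 0.5))  # [1, 2]

# In practice the start would be random; 1 - random.random() lies in (0, 1].
print(pps_systematic_sample(clinic_sizes, 2, 1 - random.random()))
```

A unit whose size exceeds the interval can be selected more than once under this scheme, which has to be handled separately in a real survey.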
Systematic sampling is also useful when sampling prescriptions from a patient register. If a register contains 100 pages, each with 25 lines of prescriptions, and you need to select 30 prescriptions, the sampling interval would be:
(100 x 25) / 30 = 83.3
Thus every 83rd prescription would be sampled. Multistage sampling, described as the fifth method below, could also be used to select a sample from a patient register.
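As a rough sketch of the register example, the Python below computes the interval and converts each selected prescription number into a page and line. The helper locate and the random start are assumptions made for illustration; the rounding-upward rule is the one used for facilities earlier.

```python
import math
import random

PAGES, LINES_PER_PAGE, SAMPLE_SIZE = 100, 25, 30
total = PAGES * LINES_PER_PAGE   # 2500 prescriptions in the register
interval = total / SAMPLE_SIZE   # 83.33...

def locate(position, lines_per_page=LINES_PER_PAGE):
    """Convert a 1-based prescription number to (page, line), both 1-based."""
    page, line = divmod(position - 1, lines_per_page)
    return page + 1, line + 1

start = 1 - random.random()      # random start in (0, 1]
positions = [
    max(1, math.ceil(round((start + i) * interval, 9)))
    for i in range(SAMPLE_SIZE)
]
for position in positions[:3]:
    print(position, locate(position))
```

For instance, prescription number 83 is the 8th line of page 4, because the first three pages hold 75 prescriptions.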
STRATIFIED RANDOM SAMPLING
When the population is not homogeneous, we consider different sections of the population which are homogeneous within themselves. The population is divided into a number of sections called strata. A sample is drawn independently from each stratum using simple random sampling.
Merits
 Can conduct an analysis of subgroups
 Sampling variations are lower
 Represents the population
 Improves the accuracy/efficiency of estimation.
Demerits
 Calculation of sample size for each subgroup is a must
 Time taking
 Can be costly
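Stratified random sampling can be sketched in Python as follows. The sketch assumes proportional allocation (each stratum contributes in proportion to its size), which is only one of several ways to set the sample size per subgroup; the strata and the function name stratified_sample are hypothetical.

```python
import random

def stratified_sample(strata, total_sample_size, seed=None):
    """Draw a simple random sample independently from each stratum.

    strata: dict mapping a stratum name to the list of its units.
    Assumes proportional allocation; rounding may shift the total slightly.
    """
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(total_sample_size * len(units) / population_size)
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical population: 60 urban and 40 rural households.
strata = {
    "urban": [f"U{i}" for i in range(1, 61)],
    "rural": [f"R{i}" for i in range(1, 41)],
}
sample = stratified_sample(strata, 10, seed=1)
print({name: len(units) for name, units in sample.items()})  # {'urban': 6, 'rural': 4}
```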
CLUSTER SAMPLING
This method is used when the whole population is made up of many natural groups. In this method, a group is taken as a sampling unit.
 Elements within a cluster should be heterogeneous
 There should be homogeneity between the clusters
 Random sampling is used on any relevant cluster
Merits
 Administrative convenience
 No need for a full sampling frame of the population
 Full spectrum of persons is represented
Demerits
 Vigilance (careful attention) is needed to maintain homogeneity between the clusters and heterogeneity within each cluster.
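A minimal Python sketch of cluster sampling follows: whole groups are drawn at random and every member of a chosen group is included. The villages and the function name cluster_sample are hypothetical, used only for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters; each cluster is one sampling unit.

    clusters: dict mapping a cluster name to the list of its members.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical natural groups: five villages of eight households each.
villages = {
    f"Village {v}": [f"Household {v}.{h}" for h in range(1, 9)]
    for v in range(1, 6)
}
selected = cluster_sample(villages, 2, seed=1)
print(sorted(selected))
```

Only the list of clusters is needed to draw the sample, not a list of every household.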
MULTISTAGE SAMPLING
 It is a modification of the cluster sampling method rather than a direct implementation of it.
 It is a probability sampling technique in which sampling is carried out in several stages such that the sample size is reduced at each stage.
 In simple words, a large population is sampled in stages to make sampling more practical.
For example
 It is a complex form of cluster sampling in which two or more levels of units are embedded one in the other.
 First stage: a random set of districts is chosen in all states.
 Second stage: a random set of villages is chosen within those districts.
 Third stage: the units are houses.
 All ultimate units (houses, for instance) selected at the last step are surveyed.
Merits
 Economical
 Population list may not be required
Demerits
 Each sample may not be fully representative of the whole population.
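The district, village and house example can be sketched in Python as below. The nested data, the stage sizes and the function name multistage_sample are assumptions made for illustration; each stage uses a simple random draw.

```python
import random

def multistage_sample(districts, n_districts, n_villages, n_houses, seed=None):
    """Three-stage sample: districts, then villages, then houses.

    districts: dict of district -> dict of village -> list of houses.
    """
    rng = random.Random(seed)
    selected = []
    for district in rng.sample(sorted(districts), n_districts):
        villages = districts[district]
        n_v = min(n_villages, len(villages))
        for village in rng.sample(sorted(villages), n_v):
            houses = villages[village]
            for house in rng.sample(houses, min(n_houses, len(houses))):
                selected.append((district, village, house))
    return selected

# Hypothetical frame: 5 districts, 4 villages each, 10 houses each.
districts = {
    f"District {d}": {
        f"Village {d}.{v}": [f"House {d}.{v}.{h}" for h in range(1, 11)]
        for v in range(1, 5)
    }
    for d in range(1, 6)
}
houses = multistage_sample(districts, 2, 2, 3, seed=1)
print(len(houses))  # 12
```

Note that a list of houses is needed only for the villages actually chosen, which is why a full population list may not be required.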
NONPROBABILITY SAMPLING TECHNIQUE
Nonprobability sampling is any sampling method where some elements of the population have no chance of selection (these are sometimes referred to as ‘out of coverage’ or ‘undercovered’), or where the probability of selection can’t be accurately determined. The elements do not have a predetermined chance of being selected. In this method, samples are not picked randomly.
Merits
 Economical
 Convenient
 Used when time or other factors rather than generalizability become critical
CONVENIENCE SAMPLING
It is also known as accidental, haphazard or accessibility sampling. The sample is selected from elements of a population that are easily accessible. A readily available group of individuals is used.
Merits
 Practical approach
 Relies on readily available units
 Administrative convenience
 Ease of access
Demerits
 Opportunistic and voluntary nature of participants
 Lack of representativeness
JUDGEMENTAL SAMPLING
 It is also known as purposive sampling.
 It is a nonprobability sampling technique where the researcher selects units to be sampled based on their knowledge and professional judgment.
 This is usually an extension of convenience sampling.
 Data collection is confined to specific types of people who can provide the desired information.
 This method is used when not enough of the eligible subjects are willing to cooperate.
Merits
 Less time consuming as a large number of interviewers are not needed
 No statistical knowledge is required
 Does not require vast knowledge about mathematics
Demerits
 It is unscientific
 It is solely dependent on a thorough knowledge of the population
 There is no logic to the selection of the sample or its size
SNOWBALL SAMPLING
It is also known as chain sampling. It relies on referrals from initial subjects to generate additional subjects.
It is used when participants are hard to find, for example, in a study investigating cheating in exams.
Merits
 The ability to recruit hidden populations.
 Economical
Demerits
 It can lead to bias
 It’s not possible to determine the sampling error
 There is no guarantee about the representativeness of samples
QUOTA SAMPLING
It is a nonprobability sampling technique where the assembled sample has the same proportions of individuals as the entire population with respect to known characteristics, traits or focused phenomenon.
Two types of quota

Equitable quota
An equal number of samples is selected from each group.
Example
100 samples are selected from each of Punjab, Sindh, K.P.K., and Baluchistan.

Ratio based quota
On the basis of ratio, sample units are selected from all areas of the population.
Example
If we have to select 200 samples from Pakistan then we will select 45% from Punjab, 30% from Sindh, 15% from Baluchistan and 10% from K.P.K.
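The two quota types can be worked out with a few lines of Python. This sketch only computes how many units each area should contribute; the function names are hypothetical, and the percentages are the ones given in the example above.

```python
def equitable_quota(per_area, areas):
    """Same number of sample units from every area."""
    return {area: per_area for area in areas}

def ratio_quota(total, percentages):
    """Split the total sample according to each area's percentage share."""
    return {area: round(total * pct / 100) for area, pct in percentages.items()}

areas = ["Punjab", "Sindh", "Baluchistan", "K.P.K."]
print(equitable_quota(100, areas))

shares = {"Punjab": 45, "Sindh": 30, "Baluchistan": 15, "K.P.K.": 10}
quotas = ratio_quota(200, shares)
print(quotas)  # {'Punjab': 90, 'Sindh': 60, 'Baluchistan': 30, 'K.P.K.': 20}
```

How the units are then found within each area is left to the interviewer, which is what makes quota sampling a nonprobability method.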
Merits
 Moderate cost
 Very extensively used and understood
 No need for the list of population elements
Demerits
 Variability and bias cannot be measured and controlled
 Projecting data beyond the sample is not justified
SAMPLE SIZE
The sample size is the number of patients or other experimental units included in a study, and one of the first practical steps in designing a trial is the choice of the sample size needed to answer the research question.
In practice, the sample size used in a study is determined based on the expense of data collection, and the need to have sufficient statistical power.
How large a sample do we need?
If the sample is too small:
 Even a well-conducted study fails to answer the research questions.
 It may fail to detect important effects and associations.
 Less accuracy.
If the sample size is too large:
 The study will become difficult to conduct.
 Costly
 Time-consuming
Sample size calculation
1) On the basis of population
2) On the basis of prevalence
Population-based:
Used for measurable data like height, weight, blood pressure, etc.
When the total population is known:
$n = \frac{N}{1 + N e^{2}}$
Where: N = total population, e = margin of error
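As a worked example, the sketch below assumes the commonly used formula n = N / (1 + N e^2), which involves only the total population N and the margin of error e. The function name and the population of 1000 are hypothetical.

```python
import math

def sample_size_known_population(N, e):
    """Sample size for a known population N and margin of error e.

    Assumes n = N / (1 + N * e**2); the result is rounded upward because
    a fraction of a person cannot be sampled.
    """
    return math.ceil(N / (1 + N * e ** 2))

# Hypothetical population of 1000 with a 5% margin of error.
print(sample_size_known_population(1000, 0.05))  # 286
```

Here 1 + 1000 x 0.0025 = 3.5, and 1000 / 3.5 = 285.7, which rounds upward to 286.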
When the total population is unknown:
$n = \frac{Z_{\alpha}^{2} \delta^{2}}{e^{2}}$
Where:
Zα is the z-table value for our assumed alpha (0.05)
δ is the variation or standard deviation
e is the margin of error
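A short Python sketch of this calculation follows, assuming the usual formula n = (Zα x δ / e)^2 with a two-sided alpha. The function name and the example values (a standard deviation of 15 and a margin of error of 3, in the same units) are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_unknown_population(sd, margin, alpha=0.05):
    """Sample size for a mean when the population size is unknown.

    Assumes n = (z * sd / margin) ** 2, where z is the two-sided
    z-table value for alpha (about 1.96 when alpha is 0.05).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z * sd / margin) ** 2)

# Hypothetical example: standard deviation 15, margin of error 3.
print(sample_size_unknown_population(15, 3))  # 97
```

Halving the margin of error roughly quadruples the required sample size, because e is squared in the denominator.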
Prevalence based
Used for immeasurable data like intelligence, beauty or for binomial data like true/false, pass/fail, present/absent, etc.
 The sample size estimate for prevalence studies is a function of expected prevalence and precision for a given level of confidence expressed by the z statistic.
$n = \frac{Z_{\alpha}^{2} p q}{e^{2}}$
Where:
p is the probability or chance of occurrence of an event, and q is the chance of the event NOT occurring. It is also written as 1 − p. Zα and e are as defined above.
If p is 20% (0.2) then q is 80% (0.8).
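Using the values above (p = 0.2, q = 0.8), the calculation can be sketched in Python. The sketch assumes the usual formula n = Zα^2 x p x q / e^2 and a 5% margin of error; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_prevalence(p, margin, alpha=0.05):
    """Sample size for a prevalence p, assuming n = z**2 * p * q / margin**2."""
    q = 1 - p                                 # chance that the event does not occur
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    return math.ceil(z ** 2 * p * q / margin ** 2)

print(sample_size_prevalence(0.2, 0.05))  # 246
```

With z close to 1.96, this is 3.84 x 0.16 / 0.0025, or about 245.9, which rounds upward to 246.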
 By using the t-distribution:
$n = \frac{t^{2} p (1 - p)}{m^{2}}$
Where:
n is the sample size
p is the prevalence
t is the t-distribution value for a confidence interval of 0.95 (1.96)
m is the margin of error, that is, 5% (0.05)
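To see how the expected prevalence affects the result, the sketch below assumes the formula n = t^2 x p x (1 - p) / m^2 with t = 1.96 and m = 0.05. The function name and the prevalence values tried are hypothetical.

```python
import math

def sample_size_t(p, t=1.96, m=0.05):
    """Sample size for prevalence p, assuming n = t**2 * p * (1 - p) / m**2."""
    return math.ceil(t ** 2 * p * (1 - p) / m ** 2)

for prevalence in (0.1, 0.2, 0.5):
    print(prevalence, sample_size_t(prevalence))
```

The required sample is largest when p is 0.5 (385 under these assumptions), so 0.5 is a cautious choice when the prevalence is not known in advance.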