Let's return to an example from an earlier article in this series: calculating the average height of the human population. I mentioned that we could sample data from different races, ethnic groups, continents, countries, and so on to get a well-balanced representation of the larger human population.
But this raises a question: how do we know which sub-groups and group characteristics to include in our samples? This leads us to a field called sampling theory. How we sample is a critical step in any statistical study, because it determines which statistical methods we can use and which statements we can make from our analysis.
The sampling approach is decided before the data collection process begins.
Sometimes you do not have to worry about how to sample from the population, because you are able to collect data on every member of it. For example, when conducting research on all local school students in a small town, you may be able to collect data on all of them, so there is nothing to sample. This is called a full or total survey.
In other situations, you do not have the access or the ability to collect data on the whole population. In such cases, you need some technique to draw random samples from the target population.
Total or Full Survey: You have access to data about the whole population.
Random Survey: You do not have access to data about the whole population, so you have to sample a section of it.
What Is A Population
The population is all the entities, bodies, people, events, or objects of interest in your research. In our human height study, the population would be all the people on Earth.
What Is A Sample
A sample is a subset of the population selected for study, ideally at random. In our human height study, the sample is the group of people we select to give a good representation of the larger population.
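To make the difference between a population and a sample concrete, here is a small Python sketch. The heights are synthetic numbers generated for illustration (a mean of 170 cm and a spread of 10 cm are assumptions, not real measurements), and the names `population` and `sample` are just labels for this example.

```python
import random
import statistics

# Hypothetical population: 10,000 synthetic heights in cm (made-up values).
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Total survey: we can measure everyone, so we compute the mean directly.
population_mean = statistics.mean(population)

# Sample: we can only measure 200 people, chosen at random without replacement.
sample = rng.sample(population, k=200)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm")
```

With a random sample of this size, the two means usually land close to each other, which is the whole point of sampling well.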
Types Of Sampling Techniques
There are several main types of sampling methods used in research and statistics:
Probability Sampling
These methods give every member of the population a known, non-zero chance of being selected:
Simple Random Sampling
Every member has an equal chance of selection, like drawing names from a hat or using a random number generator.
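As a rough sketch of the "names from a hat" idea, the snippet below uses Python's standard `random.sample`, which draws without replacement. The helper name `simple_random_sample` and the list of names are hypothetical, made up for this example.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Return k members chosen uniformly at random, without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, k)

# Hypothetical roster of 100 people.
names = [f"person_{i}" for i in range(100)]
picked = simple_random_sample(names, k=10, seed=1)
print(picked)
```

Passing a seed is optional; it is only there so the same draw can be repeated when you want to check your work.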
Systematic Sampling
Selecting every nth item from a list (e.g., every 10th person on a roster).
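Here is one possible way to sketch this in Python. It assumes a random starting point within the first `step` items, which is a common practice so that the first person on the list is not always included. The function name `systematic_sample` and the roster are hypothetical.

```python
import random

def systematic_sample(items, step, seed=None):
    """Pick every `step`-th item, starting from a random offset in [0, step)."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return items[start::step]

# Hypothetical roster of 100 people, numbered 1 to 100.
roster = list(range(1, 101))
picked = systematic_sample(roster, step=10, seed=0)
print(picked)
```

One thing to watch for: if the list has a repeating pattern that lines up with `step`, the sample can end up biased.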
Stratified Sampling
Dividing the population into subgroups (strata) based on shared characteristics, then randomly sampling from each stratum. This ensures representation across important categories.
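The sketch below shows one way this could look in Python, assuming proportional allocation (each stratum contributes the same fraction of its members). The function `stratified_sample`, the `continent` field, and the group sizes are all invented for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=None):
    """Randomly sample the same fraction from every stratum."""
    rng = random.Random(seed)
    # Group the records into strata by the chosen characteristic.
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = []
    for members in strata.values():
        # Take at least one member so that small strata are not left out.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical population of 100 people with made-up group sizes.
continents = ["Africa"] * 60 + ["Asia"] * 30 + ["Europe"] * 10
people = [{"id": i, "continent": c} for i, c in enumerate(continents)]
picked = stratified_sample(people, key=lambda p: p["continent"], fraction=0.1, seed=7)
print(len(picked))
```

Other allocation rules exist, such as taking an equal number from each stratum; proportional allocation is just the simplest one to show.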
Cluster Sampling
Dividing the population into clusters (often geographic), randomly selecting some clusters, then surveying all members within chosen clusters.
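A minimal sketch of this in Python might look like the following. It assumes the clusters are already known and stored in a dictionary; the function `cluster_sample` and the town names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters, then keep every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical towns (clusters) and their residents.
towns = {
    "town_a": ["a1", "a2", "a3"],
    "town_b": ["b1", "b2"],
    "town_c": ["c1", "c2", "c3", "c4"],
    "town_d": ["d1"],
}
chosen, surveyed = cluster_sample(towns, n_clusters=2, seed=3)
print(chosen, surveyed)
```

Notice that the randomness applies to the clusters, not to the individuals inside them.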
Non-Probability Sampling
These methods don’t give all members a known chance of selection:
Convenience Sampling
Selecting whoever is easiest to reach, like surveying people at a shopping mall.
Purposive/Judgmental Sampling
Deliberately choosing participants based on specific criteria or expertise.
Quota Sampling
Setting quotas for certain characteristics and filling them non-randomly, similar to stratified sampling but without random selection.
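To see how this differs from stratified sampling, here is a small Python sketch that fills each quota with whoever shows up first, with no random selection at all. The function `quota_sample`, the `age_group` field, and the arrival order are assumptions made for this example.

```python
def quota_sample(arrivals, key, quotas):
    """Accept people in arrival order until every quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:
        group = key(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is full
    return sample

# Hypothetical stream of 20 people; every third one is a minor.
arrivals = [{"id": i, "age_group": "adult" if i % 3 else "minor"} for i in range(20)]
picked = quota_sample(arrivals, key=lambda p: p["age_group"], quotas={"adult": 3, "minor": 2})
print([p["id"] for p in picked])  # [0, 1, 2, 3, 4]
```

Because the first arrivals always win, the result depends on who happens to be available, which is why this counts as a non-probability method.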
Snowball Sampling
Existing participants recruit future participants, useful for hard-to-reach populations.
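One way to picture the recruiting process is as a walk through a referral network. The Python sketch below assumes we already know who would refer whom (in practice you only learn this as you go), and the names, the `referrals` dictionary, and the function `snowball_sample` are hypothetical.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Start from seed participants and follow referrals breadth-first."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        # Each participant recruits the contacts we have not met yet.
        for contact in referrals.get(person, []):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network.
referrals = {
    "ana": ["ben", "caro"],
    "ben": ["dee"],
    "caro": ["dee", "eli"],
    "dee": [],
    "eli": ["fay"],
}
picked = snowball_sample(referrals, seeds=["ana"], max_size=5)
print(picked)  # ['ana', 'ben', 'caro', 'dee', 'eli']
```

The sample can only reach people connected to the starting participants, so anyone outside that network never gets a chance of being selected.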
The choice depends on your research goals, resources, and whether you need results generalizable to a larger population. Probability methods generally allow for stronger statistical inference, while non-probability methods can be more practical or appropriate for exploratory research.
Conclusion
Thanks for reading, see you in the next article!
Happy coding! And see you next time, the world keeps spinning.