- Last Updated: October 8, 2024
Primary data is collected specifically for a relevant purpose or research question. Learn more about the methods, examples, tools, and best practices.
What is data collection?
Data collection is the process of gathering information to answer a specific question or solve a specific problem. If you’re an academic researcher, you might collect data to answer a question about the prevalence of a certain disease in a specific region. An international development professional, on the other hand, might collect data to determine whether a program is effective.
There are two main types of data collection: primary and secondary. The type of data you use for your research or evaluation depends on factors like:
- How much time you have
- The expertise in the area of interest
- The availability of existing (secondary) data
- Your organization’s ability and expertise to collect primary data
In this article, we’ll introduce you to primary data, share specific examples of primary data collection, and show you the most common methods of collecting primary data. Then, we’ll walk you through a brief tutorial on how to use SurveyCTO for primary data collection.
What is primary data?
Primary data collection is the process of gathering data directly from a first-hand source. In other words, it’s data that’s collected by the organization that expects to use it. Methods include surveys, interviews, observation, and focus groups. For example, The World Bank tracked the impact of COVID-19 in Afghanistan through 14,000 phone surveys. The data collected through these surveys is primary data.
Secondary data, on the other hand, is data collected by someone other than the primary user and made available for other researchers to use.
What about third-party data? Third-party data is information collected by various sources and aggregated by a third-party provider, hence the name “third-party data.” These aggregators don’t have a direct relationship with the subjects.
Now, let’s look at some specific examples of collecting primary data.
The best way to understand primary data is to see specific examples of research or evaluation scenarios. Here are some examples from different fields.
Household composition surveys: Researchers often conduct household surveys to track the composition of a household over time. For example, IDinsight carries out large-scale household survey projects in India to inform social impact projects.
Phone interviews: A researcher, often called an enumerator, communicates with respondents over the phone and asks questions based on a predetermined questionnaire. Some projects use computers or tablets to record responses, a process known as computer-assisted telephone interviews (CATI).
Inspections and program evaluations: International development projects often use primary data collection to answer questions like:
- Is this program effective?
- Can the impact be quantified?
- What is the most impactful element of this development program?
For example, conducting a large survey to evaluate the effectiveness of an employee training program is primary data collection for the purpose of inspections and evaluations.
Go deeper: Stories of primary data collection in action.
One of the best ways to understand what primary data collection looks like is to read real case studies from researchers and data collection professionals at organizations that have carried out major primary data collection projects.
What are the two main categories of primary data?
The two main categories of primary data are quantitative and qualitative data.
Quantitative data is data that can be measured. Here are some concrete examples of quantitative data:
- The number of households in a community with one child or fewer
- The years of schooling study participants have
- The number of days a respondent is absent from work in a given period
Qualitative data, on the other hand, is descriptive data that can’t easily be expressed as numbers. For example:
- Opinions about a specific topic, like government policies
- Personal experiences in healthcare or justice systems
- Level of satisfaction with a local education initiative
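To make the distinction concrete, here is a minimal Python sketch of how the two categories might be handled once responses are recorded. The field names and the sample responses are hypothetical and purely illustrative. Quantitative fields are averaged or counted, while qualitative answers are kept verbatim for later reading or coding.

```python
from statistics import mean

# Hypothetical responses from a small household survey (illustrative data only).
responses = [
    {"children": 1, "years_schooling": 8, "policy_opinion": "Fees are too high"},
    {"children": 3, "years_schooling": 12, "policy_opinion": "The new clinic helps"},
    {"children": 0, "years_schooling": 10, "policy_opinion": "Roads need repair"},
]

# Assumed split of fields into the two categories described above.
QUANTITATIVE_FIELDS = ("children", "years_schooling")
QUALITATIVE_FIELDS = ("policy_opinion",)


def summarize(records):
    """Average each quantitative field; collect qualitative answers verbatim."""
    numeric = {f: mean(r[f] for r in records) for f in QUANTITATIVE_FIELDS}
    text = {f: [r[f] for r in records] for f in QUALITATIVE_FIELDS}
    return numeric, text


numeric_summary, text_answers = summarize(responses)

# Count households with one child or fewer, as in the first example above.
households_with_one_child_or_fewer = sum(
    1 for r in responses if r["children"] <= 1
)
```

Numbers can be aggregated directly, but the opinions need a person (or a separate coding scheme) to interpret them, which is the practical difference between the two categories.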
What are the most common methods of collecting primary data?
Surveys and questionnaires
Surveys are one of the most common forms of collecting large amounts of primary data. By definition, a survey is a series of questions used to investigate the experience or opinions of a group of people.
Surveys are great for high-frequency studies where a large number of respondents are needed. With tools like SurveyCTO, it’s possible to deploy a survey to thousands of respondents and get responses in a relatively short period.
Surveys are great for both quantitative and qualitative data. Most survey tools have a variety of form fields that you can customize to the specific question type (for example, a number, or a name, or an email).
Other places where surveys are a great method for data collection:
- Online data collection: Most survey tools have the ability to share a link for respondents to fill out online via a web browser on a computer or mobile device.
- Offline data collection: For settings without internet, some survey tools (like SurveyCTO) have functionality that allows respondents to answer questions without an internet connection.
- Deploying surveys via social media: Easily shareable survey links make it easy to share your survey with a wide audience via social media.
- Anonymous data collection: Many survey tools also have functionality that makes it easy to protect the identity of the respondent.
Surveys, especially those involving high volumes of respondents, can require deep expertise. Large organizations often hire survey firms to carry out data collection.
Interviews, both in-person and over the phone
An interview is a one-on-one conversation where an interviewer asks questions to the respondent. Interviews are best for qualitative data collection since they allow the interviewer to get a deeper understanding of the respondent’s experience by asking open-ended questions.
Interviews can take place over the phone or in person. Sometimes, an interview looks like a facilitator asking a series of survey questions on a tablet or mobile device. To record responses on a phone or tablet, interviewers can use CAPI (computer-assisted personal interviewing) or CATI (computer-assisted telephone interviewing).
Interviews are a great method of primary data collection about sensitive topics, where respondents might not be comfortable sharing information via an online or offline form. They can also be a useful tool for following up on answers from a preliminary survey or questionnaire.
Observation
Observation is a method for gathering primary data about behavior, events, or how individuals interact with their natural setting. For example, researchers in Haiti ran experiments in the field (instead of in a laboratory) so as not to disrupt the behavior of participants by bringing them into a lab.
Observation is instrumental when it’s not possible to get data from individuals in an interview or survey. It’s also a useful primary data collection tool when the physical setting matters for data collection — for example, if it’s essential to see how participants respond to elements in their natural environment.
Observation is also a useful primary data collection tool for:
- Evaluations and inspections
- Behavioral studies
- Understanding ongoing processes
- Gathering data on interactions between individuals
Focus groups
Focus groups are similar to interviews as a method of gathering primary data but involve discussion between participants instead of one-on-one conversations between interviewers and respondents.
If you’re considering focus groups as a method for primary data collection, keep in mind that you’ll need an experienced facilitator and participants who are available at the same time. You’ll also need to account for the impact of groupthink on your findings.
Focus groups can be great for gathering qualitative information for marketing purposes, or for gathering common impressions or shared experiences.
Tools used for primary data collection
There are a few tools that can be used to collect primary data. Paper and pen are one option. This is inexpensive and doesn’t require much training for enumerators, but creates a lot of work in manual data entry. It also introduces a high level of human error, resulting in poorer-quality data. Verifying the integrity of data collected in this way can also be challenging, since paper responses don’t allow for the kind of auditing that modern tech does, like quality checks or tracking survey completion time.
As organizations scale, many undergo digital transformation to take their primary data collection process online. There are various tools that you can use for this. You can collect data on a desktop or laptop computer using web surveys, or conduct video interviews. You can also use a mobile data collection platform to facilitate computer-assisted personal interviews (CAPI) or computer-assisted telephone interviews (CATI).
How to collect primary data with SurveyCTO
SurveyCTO is a data collection platform you can use to collect primary data on your mobile device, in-person, over the phone, online, or offline. Here’s how to get started collecting data with SurveyCTO:
Step 1: Easily design your form with SurveyCTO’s drag-and-drop form designer
Use our drag-and-drop form designer, or upload forms directly from Excel or Google Sheets. Choose from dozens of field types to design your form. Then it’s easy to rigorously test your forms before deploying them to mobile devices and the web.
Step 2: Collect data online or offline
You can collect data online with simple web forms. If you’re collecting data in a remote area or place with no internet connection, collect data on the SurveyCTO Collect app and upload it later. There are a variety of options for offline data collection workflows, depending on which devices you’re using and what kind of WiFi access you have.
Step 3: Monitor and check incoming data in real time
One of the best ways to ensure high-quality primary data is to monitor data as it’s being collected, instead of waiting for the data collection process to finish. SurveyCTO has a variety of features that allow you to set up quality checks to monitor your data throughout the process. These include:
- The review and corrections workflow, which you can configure to hold, review and correct submissions before they are approved for sharing or publishing.
- Automated quality checks, which trigger warnings based on parameters you set.
- Audio audits to randomly screen portions of interviews.
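To illustrate the general idea behind parameter-based quality checks, here is a minimal Python sketch. It is not SurveyCTO’s implementation or API. The thresholds, field names, and sample submissions are all hypothetical, chosen only to show how a warning can be triggered when a submission falls outside limits you define.

```python
# Hypothetical thresholds; tune them to your own questionnaire.
MIN_DURATION_SECONDS = 300  # interviews shorter than this look rushed
AGE_RANGE = (0, 120)        # plausible respondent ages


def check_submission(submission):
    """Return a list of warning strings for one submission (empty if clean)."""
    warnings = []
    if submission["duration_seconds"] < MIN_DURATION_SECONDS:
        warnings.append("completed too quickly")
    low, high = AGE_RANGE
    if not low <= submission["age"] <= high:
        warnings.append("age out of range")
    return warnings


# Illustrative submissions only; not real survey data.
submissions = [
    {"id": "a1", "duration_seconds": 900, "age": 34},
    {"id": "a2", "duration_seconds": 120, "age": 41},
    {"id": "a3", "duration_seconds": 600, "age": 230},
]

# Keep only the submissions that raised at least one warning.
flagged = {}
for submission in submissions:
    warnings = check_submission(submission)
    if warnings:
        flagged[submission["id"]] = warnings
```

Running checks like these while data is still coming in means a rushed interview or an implausible value can be followed up with the enumerator right away, rather than discovered during analysis.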
Step 4: Export, share, and visualize your data
Finally, you can download your data in multiple formats to use with other tools, including .csv, .xlsx, .sav, and .do files. Share data to Google Sheets or integrate with any other tools through Zapier, OpenFn, API, or webhooks. Create real-time dashboards, and seamlessly publish your data to the platform of your choice for visualization and analysis.
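As a rough illustration of working with a downloaded .csv file in another tool, here is a short Python sketch using only the standard library. The column names and rows are hypothetical, and an in-memory string stands in for the exported file so the example is self-contained.

```python
import csv
import io
from collections import Counter

# Stand-in for a downloaded .csv export; the column names are hypothetical.
exported_csv = io.StringIO(
    "respondent_id,region,days_absent\n"
    "1,North,2\n"
    "2,South,0\n"
    "3,North,5\n"
)


def load_rows(handle):
    """Parse CSV rows into dicts and convert the numeric column to int."""
    rows = []
    for row in csv.DictReader(handle):
        row["days_absent"] = int(row["days_absent"])
        rows.append(row)
    return rows


rows = load_rows(exported_csv)

# Simple summaries: responses per region and total days absent.
responses_per_region = Counter(row["region"] for row in rows)
total_days_absent = sum(row["days_absent"] for row in rows)
```

With a real export, you would replace the in-memory string with open("your_export.csv", newline="") and adjust the column names to match your form.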
Follow SurveyCTO’s quick start guide for more details on getting started with primary data collection.
Advantages and disadvantages of collecting primary data
Choosing whether or not to collect primary data for your research or evaluation depends on various factors including time, resources, and expertise available.
Advantages of collecting primary data
There are three main advantages of collecting primary data:
- Reliability: The main advantage of primary data collection is control. When you collect primary data, your organization has total control over exactly how the data was collected, cleaned, and checked for quality. It’s important to note, though, that secondary data can be highly reliable. For example, government organizations have stringent standards for data quality and extensive resources for quality checks, so in some cases, secondary data can actually be more reliable than primary data, depending on how it’s collected.
- Relevance: When you collect primary data, the data collected is relevant to your purposes. For example, if your study looks at the effect of employment training programs on household income in India, then the data you collect relates directly to the exact variables you want to study: employment training programs and household income. If you rely on secondary data, you might be able to find data around the populations you’re studying, but the data might not be as relevant to your exact area of research.
- Specificity: Secondary data may be broad enough to cover a large variety of variables for a large population, but it may not be specific enough.
Challenges of collecting primary data
These are the main challenges of collecting primary data.
- It can be more time-consuming (but there’s a lot you can do to make it more efficient): Unlike secondary data, which has already been collected, checked for quality, and organized, primary data collection can be a heavily involved process. You might need specialized knowledge of data collection methods and methods of data quality assurance. However, there are ways to reduce the time needed to collect, clean, and organize data. Thanks to modern tools, APIs, and automation, you can build workflows that significantly reduce the time required for primary data collection.
- It’s often more expensive: The cost of primary data collection depends on many factors. But overall, since it requires designing surveys or interviews, carrying out surveys or interviews, training enumerators, checking data quality, and processing data, it can be expensive.
- Access might be difficult to obtain: Access to certain populations can be challenging to obtain. Secondary data sets (especially government data) often contain information on populations or parameters that your organization might struggle to access.
- Your time frame might be too large: If you’re studying changes in a population over two decades, for example, then it would take you (at minimum) 20 years to collect that data. It can be helpful to find secondary data from longitudinal studies that provide data about populations over extensive periods.
Remember: primary and secondary data can complement each other
It’s important to remember that you don’t always have to choose between primary and secondary data. Often, secondary data can enrich existing primary data, or vice versa. For example, you can use secondary data as preliminary research to inform your primary data collection.
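As one hedged illustration of how the two kinds of data can be combined, the Python sketch below joins survey results with figures from an existing source on a shared key. The regions, income averages, and population numbers are hypothetical and stand in for your own survey data and a published data set.

```python
# Hypothetical primary data: average household income from your own survey.
primary = {"North": 410.0, "South": 365.0}

# Hypothetical secondary data: population figures from a published source.
secondary = {"North": 12000, "South": 8500, "East": 4300}


def combine(primary_by_region, secondary_by_region):
    """Join the two sources on region, keeping only regions you surveyed."""
    return {
        region: {
            "avg_income": income,
            # None if the secondary source has no entry for this region.
            "population": secondary_by_region.get(region),
        }
        for region, income in primary_by_region.items()
    }


combined = combine(primary, secondary)
```

Here the secondary figures add context (population size) that the survey did not collect, while the primary data supplies the variable specific to your study.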
Your next steps: Explore more resources
To keep learning about primary data collection methods and best practices, take advantage of these resources from SurveyCTO:
- Get emails about our monthly webinars, where researchers from organizations like The World Bank and IDinsight showcase primary data collection methods and best practices. Sign up here to get notified about those webinars.
- Check out previous webinars from SurveyCTO about primary data collection, like this one on data quality from the World Bank.
- Sign up for a free trial of SurveyCTO for your next primary data collection project.