How to Use Machine Learning in Qualitative Research?

Jun 12, 2020 - 8 min read
The qualitative method is crucial to understanding a subject in depth. While the quantitative approach focuses solely on numerical data, the qualitative method makes it possible to recognize much more than what the numbers say, and it yields insights that quantitative research cannot uncover. However, the high-volume, multi-class data obtained during the research process is not easy to analyze manually. This is where machine learning comes into play: it can process far more data than a person can in the same amount of time, which increases the efficiency and accuracy of qualitative research.

What is Qualitative Research?

Qualitative research is a method of collecting and classifying detailed data from smaller samples in order to reveal the thoughts, tendencies, feelings, and habits of a target audience or group of consumers regarding a particular event, concept, brand, product, or service. Unlike quantitative research, it does not focus on numerical data, so it can provide insights that numbers alone cannot reveal.

Let’s give a concrete example: quantitative research may be used to predict the number of votes a political party might win in the next election, but it can't explain why voters prefer that party. Likewise, quantitative research can tell us how prevalent smoking is among young people, but only qualitative research can reveal their motivations for smoking, which is what we need to know in order to take precautions.

Therefore, brands and even individuals who want to understand their target audience and develop a strategy accordingly use the qualitative research method.

Machine Learning and Qualitative Research

Two of the main problems we face when conducting qualitative research are the reliability of the data obtained and the workforce and time required to process it. For a study to be sound, we need accurate, up-to-date, and high-volume data.

Some sensitivities and concerns might prevent people from providing accurate information to researchers during face-to-face interviews. Therefore, in some cases, secondary data collected from platforms such as social media where people act more comfortably while expressing their ideas and motivations may be more useful.

On the other hand, activities like developing a marketing strategy need the data to be up to date. An advertising campaign based on the results from a two-year-old study can hardly produce desirable results.

These two problems, reliability and timeliness, can be resolved by using products that provide up-to-date and reliable secondary data on the audience's interests, demographic characteristics, news consumption habits, and brand loyalty.

However, when it comes to classifying the high volumes of data at hand, relying solely on human resources is either impossible or a massive waste of workforce and time. The most effective and practical solution to this problem is to use machine learning in data classification. You can check this article for more details on the subject.

How to Use Machine Learning in Qualitative Research?

The steps to take to use machine learning in qualitative research are as follows:

Step 1: Collecting data

As mentioned above, to conduct proper qualitative research, we need data that is as large, reliable, and up to date as possible. For this, you can try Kimola Analytics, which profiles thousands of people anonymously in real time through their social media activity and enables you to discover trends for the different target groups you have created. Of course, you can also use the primary data you have collected through observation, focus groups, in-depth interviews, and surveys.

Step 2: Determining the categories and labeling

Before classifying our data with machine learning technology, we need to determine which categories this data will be organized into. Then we need to classify a small portion of our data manually.

Let us assume that a bank wants to analyze customer experiences with money transfers and develop a strategy accordingly. The collected data includes sentences like “I wish the phone app did not ask for verification over SMS every time” and “I can’t deposit money through the machine on sunny days because there is too much glare on the screen!” Before analyzing and classifying this data, we need to manually establish some categories, such as platform-based tags (Mobile Application/ATM).
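
To make this step concrete, here is a minimal Python sketch of what a manually labeled sample might look like for the bank example. The category names, the record layout, and the helper functions are assumptions made for illustration; they are not the format of any particular product.

```python
from collections import Counter

# Hypothetical category scheme for the bank example (platform-based tags).
CATEGORIES = {"Mobile Application", "ATM"}

# A small, manually labeled sample. Both sentences come from the example
# above; the labels are the ones a human coder would assign.
labeled_sample = [
    {"text": "I wish the phone app did not ask for verification over SMS every time",
     "label": "Mobile Application"},
    {"text": "I can't deposit money through the machine on sunny days because "
             "there is too much glare on the screen!",
     "label": "ATM"},
]

def validate_labels(rows, categories):
    """Return the rows whose label is not one of the agreed categories."""
    return [row for row in rows if row["label"] not in categories]

def label_counts(rows):
    """Count how many manually labeled examples each category has."""
    return Counter(row["label"] for row in rows)

invalid = validate_labels(labeled_sample, CATEGORIES)
counts = label_counts(labeled_sample)
print("Rows with unknown labels:", invalid)
print("Examples per category:", dict(counts))
```

Checking labels against a fixed category list before training catches typos such as "Atm" or "Mobile app" early, and the per-category counts show whether any category still lacks examples.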

Step 3: Machine Learning

Once the user creates these categories, it is time for machine learning. For this step, the user manually classifies a small portion of the data so the machine can learn from it. The critical point here is that, in general, the more data is classified manually, the better the machine learns and the higher the accuracy rate becomes.

The user then uploads the data, some portion of which is already classified, to the system.
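
One common way to check the accuracy rate mentioned in this step is to hold back part of the manually classified data and compare the machine's labels with the human ones. The sketch below assumes the labeled data is a list of (text, label) pairs and that `classify` is any function that returns a label for a text; both function names are hypothetical, and the rows are placeholders rather than real research data.

```python
import random

def split_labeled_data(rows, holdout_fraction=0.25, seed=0):
    """Shuffle (text, label) pairs reproducibly and split them into
    a training part and a held-out part used only for checking accuracy."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_holdout = max(1, int(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

def accuracy(classify, holdout_rows):
    """Share of held-out rows for which classify(text) matches the manual label."""
    correct = sum(1 for text, label in holdout_rows if classify(text) == label)
    return correct / len(holdout_rows)

# Placeholder rows standing in for manually classified comments.
labeled = [
    (f"comment {i}", "ATM" if i % 2 else "Mobile Application")
    for i in range(8)
]
train_rows, holdout_rows = split_labeled_data(labeled)

# A deliberately naive baseline that labels everything "ATM"; a trained
# model should score higher than this on the held-out rows.
baseline_accuracy = accuracy(lambda text: "ATM", holdout_rows)
print(len(train_rows), "training rows,", len(holdout_rows), "held-out rows")
print("Baseline accuracy:", baseline_accuracy)
```

Repeating this check as more data is classified manually shows whether the extra labeling effort is still paying off.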

If the product used offers ready-to-use models, it is possible to skip manual classification at this step and proceed directly to the next one. For example, Kimola Cognitive comes with a set of ready-to-use machine learning models for the most common use cases, such as sentiment, emotion, complaint, and hate speech analysis. You can choose a model, then drag and drop your file with interview scripts or social media conversations to classify it automatically.

Step 4: Automatic Data Classification

The system now knows what to do with the unclassified data, based on the model created by the user by manually classifying a small portion of it or the ready-to-use models. Therefore, with the help of its smart algorithm, the machine uses this information and classifies the rest of the data.

Let's continue our previous example of the bank. The system has now been taught that the sentence “I can’t deposit money through the machine on sunny days because there is too much glare on the screen!” should be classified with the label "ATM," even though the word ATM doesn't appear in the sentence. It uses this information, analyzes other data classified as "ATM," and learns which new data should be classified as ATM from now on. Therefore, it can understand that sentences such as "I waited 15 minutes to deposit money!" and "Why is the nearest cash machine always so far?" are also ATM-related customer experiences. Moreover, it can make this classification far faster than a person reading each sentence by hand.
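
The article does not say which algorithm sits behind this step, so the following is only an illustrative sketch: a small multinomial naive Bayes classifier written in plain Python, one of the simplest techniques that learns labels from word statistics. The class name, the stop-word list, and the two extra training sentences (the out-of-service cash machine and the crashing app) are assumptions added for the example.

```python
import math
import re
from collections import Counter, defaultdict

# Small, hand-picked stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "at", "did", "for", "i", "in", "is", "it", "my", "not",
              "of", "on", "so", "the", "there", "to", "too", "when", "why"}

def tokenize(text):
    """Lowercase the text, keep word tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]

class NaiveBayesTextClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of examples
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_examples = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_examples in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior plus the smoothed log likelihood of each token.
            score = math.log(n_examples / total_examples)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Manually classified portion of the data.
train_texts = [
    "I can't deposit money through the machine on sunny days because "
    "there is too much glare on the screen!",
    "The cash machine near my office is always out of service",
    "I wish the phone app did not ask for verification over SMS every time",
    "The app crashes when I log in on my phone",
]
train_labels = ["ATM", "ATM", "Mobile Application", "Mobile Application"]
model = NaiveBayesTextClassifier().fit(train_texts, train_labels)

# Unclassified data: none of these sentences contains the word "ATM".
unlabeled = [
    "I waited 15 minutes to deposit money!",
    "Why is the nearest cash machine always so far?",
    "The app keeps asking for SMS verification",
]
predictions = {text: model.predict(text) for text in unlabeled}
for text, label in predictions.items():
    print(f"{label}: {text}")
```

With this tiny training set, the first two sentences are labeled "ATM" because they share words such as "deposit", "money", "cash", and "machine" with the ATM examples, and the third is labeled "Mobile Application". A real project would need far more labeled data than four sentences.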

Step 5: Transforming Analyzed Data into Insights

After the automatic data classification phase, we export our database. At this point, depending on the purpose of our research, it is time to draw conclusions from the data by filtering it using the categories.

Continuing from the example above, the bank can now filter the data based on categories such as "Platform" and "Sentiment" to learn whether its mobile application brings positive or negative customer experiences. Moreover, by filtering only negative comments about mobile app experiences, it can see what kinds of problems people have with the application. Thus, the professionals of the bank or the agency they work with can dedicate their workforce and time to drawing conclusions from the data instead of spending days just classifying it.
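
The filtering described here can be sketched in a few lines of Python. The rows below are a hypothetical export: the column names "Platform" and "Sentiment" follow the example, while the second and fifth comments are invented purely for illustration.

```python
from collections import Counter

# Hypothetical export of the classified data, one row per comment.
classified_rows = [
    {"text": "I wish the phone app did not ask for verification over SMS every time",
     "Platform": "Mobile Application", "Sentiment": "Negative"},
    {"text": "Transfers in the app take only a few seconds",
     "Platform": "Mobile Application", "Sentiment": "Positive"},
    {"text": "I can't deposit money through the machine on sunny days because "
             "there is too much glare on the screen!",
     "Platform": "ATM", "Sentiment": "Negative"},
    {"text": "I waited 15 minutes to deposit money!",
     "Platform": "ATM", "Sentiment": "Negative"},
    {"text": "The app logs me out in the middle of a transfer",
     "Platform": "Mobile Application", "Sentiment": "Negative"},
]

def filter_rows(rows, **criteria):
    """Keep the rows whose columns match every given column=value pair."""
    return [row for row in rows
            if all(row.get(column) == value for column, value in criteria.items())]

def sentiment_breakdown(rows, platform):
    """Count the sentiment labels for one platform."""
    return Counter(row["Sentiment"] for row in rows if row["Platform"] == platform)

mobile_sentiment = sentiment_breakdown(classified_rows, "Mobile Application")
mobile_complaints = filter_rows(
    classified_rows, Platform="Mobile Application", Sentiment="Negative"
)

print("Mobile app sentiment:", dict(mobile_sentiment))
for row in mobile_complaints:
    print("-", row["text"])
```

The breakdown answers the first question (is the mobile experience mostly positive or negative?), and the filtered list is what a researcher would then read closely to understand the specific problems.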

Optional Step: Providing Regular Data Flow to the System

For some studies, it is enough to follow the above steps and get the desired result, while in other cases, it may be necessary to provide new data to the system regularly. For example, a company that researches the website experiences of its customers in 2019 will not need to upload any data into the system after January 1, 2020. A company that conducts periodic research on brand loyalty, on the other hand, might want to repeat the above steps with new data periodically.

The most significant advantage of machine learning at this stage is that the accuracy rate can increase with each round of classification. If the system's ready-to-use models are regularly developed, as in the Cognitive product, their accuracy improves over time. Besides, if the product allows its users to create their own models, as the Cognitive product does, then the users can manually review the classifications that are not accurate, make the necessary changes, and define the updated database as a new model to get higher accuracy in the next classification.
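
As a rough illustration of this review loop, the sketch below applies a reviewer's corrections to the machine's labels and merges the result into the training data for the next model. The function names and the misclassified sentence are hypothetical; how a given product stores models and corrections will differ.

```python
def apply_corrections(predictions, corrections):
    """Return (text, label) pairs in which a reviewer's corrected label,
    if one exists for the text, replaces the machine's label."""
    return [(text, corrections.get(text, label)) for text, label in predictions]

def extend_training_set(training_set, reviewed_rows):
    """Merge reviewed rows into the training set. If a text appears in both,
    the reviewed label wins."""
    merged = dict(training_set)
    merged.update(dict(reviewed_rows))
    return list(merged.items())

training_set = [
    ("I wish the phone app did not ask for verification over SMS every time",
     "Mobile Application"),
]

# Output of the previous classification round; the second label is a
# hypothetical mistake made by the model.
predictions = [
    ("I waited 15 minutes to deposit money!", "ATM"),
    ("The cash machine told me to use the app instead", "Mobile Application"),
]

# Corrections entered by a human reviewer.
corrections = {"The cash machine told me to use the app instead": "ATM"}

reviewed = apply_corrections(predictions, corrections)
new_training_set = extend_training_set(training_set, reviewed)
print(len(new_training_set), "rows available for the next model")
```

Training the next model on `new_training_set` means the corrected example is learned from rather than repeated as a mistake.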

Try Kimola Cognitive for Free!

Cognitive, the newest product of Kimola, is a system that does not require any technical knowledge to use, is entirely web-based, and has an interface that allows the data to be uploaded by just dragging and dropping. Using machine learning technology, Cognitive classifies high-volume data quickly and with high accuracy. Doing so enables the creative employees of the communications industry who use qualitative research to focus on the other things they do best.


You can sign up here to try Kimola Cognitive and create your first machine learning model for free. If you’d like to get more information about Cognitive, you can send us your questions here, or you can request a meeting appointment via Calendly.
