What is Text Classification?

2 mins read - Updated on Dec 19, 2023

Text Classification (a.k.a. Text Categorization) is the task of assigning predefined tags to pieces of free text based on their content. The text can come from many sources, such as articles, emails, interview transcripts, or social media posts. Assigning predefined tags turns rows of raw data into structured documents. Text Classification is the backbone of any Text Analysis tool because it brings the necessary structure to otherwise unmanageable text data.

Using Text Classification in Qualitative Research

Although surveys with closed-ended questions are easy to analyze, the insights that turn into a business strategy are mostly hidden in free-text responses. This is the most exciting and the most challenging part of qualitative research. That's why marketers and consumer researchers continually deal with text data, whether it comes from focus group interviews, feedback forms, or social media.

In qualitative consumer research, the most common way of handling free-text data is to classify each piece of text. Each piece, which is usually a row in an Excel file, is labeled with a tag for every category that applies to its content. Suppose we have an Excel file in which each row holds a piece of consumer feedback on a bank's products and services. One of these rows says, "This was my easiest mobile banking experience." We can label this feedback under two different categories: Channel and Emotion. The Channel category tells us which of the bank's channels a piece of feedback is about. In our case, we label this feedback "Mobile Application" for the Channel category. Likewise, the Emotion category tells us which feeling the consumer expresses in each piece of feedback, which is satisfaction in our case. So we label the same feedback "Satisfaction" for the Emotion category.
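
The labeling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword lists, the extra tags ("Branch", "Call Center", "Frustration"), and the function names are hypothetical and do not come from any particular tool, and simple keyword matching stands in for whatever method (manual coding or a trained model) a real project would use.

```python
# Hypothetical keyword lists for two categories; a real project would
# define its own tags and would rarely rely on keywords alone.
CHANNEL_KEYWORDS = {
    "Mobile Application": ["mobile", "app"],
    "Branch": ["branch", "teller"],
    "Call Center": ["call center", "phone"],
}
EMOTION_KEYWORDS = {
    "Satisfaction": ["easiest", "easy", "great", "love"],
    "Frustration": ["slow", "difficult", "crash", "annoying"],
}


def label(text, keywords, default="Unlabeled"):
    """Return the first tag that has a keyword occurring in the text."""
    lowered = text.lower()
    for tag, words in keywords.items():
        # Plain substring matching: crude, but enough for a sketch.
        if any(word in lowered for word in words):
            return tag
    return default


def classify_feedback(text):
    """Assign one tag per category to a single piece of feedback."""
    return {
        "Channel": label(text, CHANNEL_KEYWORDS),
        "Emotion": label(text, EMOTION_KEYWORDS),
    }


feedback = "This was my easiest mobile banking experience."
print(classify_feedback(feedback))
# {'Channel': 'Mobile Application', 'Emotion': 'Satisfaction'}
```

Each row of the feedback file would go through the same function, which yields one tag per category for every row. Substring matching is deliberately naive here (for instance, "app" also matches "happy"), so this should be read as a picture of the input and output of classification rather than a recommended technique.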

Why Text Classification is Important in Qualitative Research

Humans are wired to classify new things based on what they have already learned. We do this automatically in daily life, for example when deciding whether an unfamiliar situation is dangerous. Likewise, when trying to understand a data set, classification helps us structure the research and break the whole into manageable pieces that we can easily understand, revealing the insights hidden in free text.