Machine learning makes our lives easier by combining two strengths: the ability of computers to process far more information than the human brain can, and the predictive power of artificial intelligence. We encounter machine learning so often in our daily lives that we don't even notice how much easier it makes them. Researchers, however, especially those dealing with data classification, use machine learning deliberately in their studies and are well aware of how magical it is. Let's take a closer look at what machine learning is, where in our lives it is used, and the enormous advantages it brings to text classification.
Machine Learning is the study of computer algorithms that improve automatically through experience, according to the definition of Tom Mitchell, who is the former Chair of the Machine Learning Department at Carnegie Mellon University and one of the most significant contributors to this technology.
Let's use a chess analogy to understand machine learning better. Imagine we are teaching chess to somebody (the machine in this case). We start by explaining the rules of the game, the names of the pieces, how they move, tactics such as the fork, the most frequent openings, and end-game strategies. This is the "teaching the machine" process. However, there is no way we can teach every possible move and variation in chess to someone, even if this "someone" is the most advanced computer ever created. We can only guide it up to a point. After that point, the machine learning process has to take over. Our student, that is, the machine, uses the information we provided and learns from it. It gets better with every game by applying this information and its past experience to the positions it encounters, even when it faces a position for the first time. The volume and accuracy of the information we provide at the beginning, and the number of games our student plays to gain experience, determine how good it will become at chess.
Machine learning is a technology that we encounter in almost every aspect of life, from marketing to banking, from economics to education, from software engineering to user behavior analysis. Thanks to the enormous benefits it brings, it has become so widespread that we don't even notice when we come across it. For example, when we browse an online bookshop, the system analyzes the books we viewed or bought in the past and tells us, "We think you will love these books as well!" Remember how surprised we were when we first realized how accurate the suggestions were? Not only have we stopped being surprised and amazed, but we have also stopped noticing that it is actually a machine giving us these suggestions!
One of the applications of machine learning that we often see in our daily lives is the Facebook feed. By learning how often we interact with each friend's posts, Facebook prioritizes them and shows first the posts we are most likely to interact with. Likewise, leading e-mail providers use machine learning so effectively that we seldom feel the need to check our spam folder to see if we missed anything. The facial recognition technology in our smartphones uses machine learning too. After learning what our face looks like, it decides whether to unlock by asking a simple question: "Is this person the owner of the phone?" Moreover, even if we were not wearing glasses or a hat while teaching our face to the phone, it can still predict what we would look like with them and unlock when we are wearing these accessories.
As these examples show, one of the most significant benefits of machine learning is that it doesn’t confine itself to the information we provide. It can classify new data with high accuracy by learning from previous data.
Data classification is the process of deciding which category a newly encountered piece of data falls into, in light of what has been learned from the training sets previously provided to the system. This definition may sound too technical, but it actually refers to an activity that is performed not only by machines but by all humans as well. Indeed, the urge to classify everything we see around us is an instinct we have inherited from our ancestors.
Our ancestors decided whether an animal they encountered for the first time posed a danger to them by analyzing what they learned from their previous experience. For example, because they already learned that carnivorous animals with sharp teeth and claws could be dangerous, our ancestors understood that they had to protect themselves when they encountered a species that they had never seen before if it also had sharp teeth and claws.
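The reasoning in this example can be sketched in a few lines of code. The snippet below is only an illustration under stated assumptions: the traits, the animals, and the `classify` helper are all hypothetical, and the technique used (picking the label of the most similar known example, often called nearest-neighbor classification) is just one simple way to classify new data from past experience.

```python
# Hypothetical toy data: each animal is described by four yes/no traits
# (sharp_teeth, claws, eats_meat, large) and a label learned from experience.
known_animals = [
    ((1, 1, 1, 1), "dangerous"),
    ((1, 1, 1, 0), "dangerous"),
    ((0, 0, 0, 1), "harmless"),
    ((0, 0, 0, 0), "harmless"),
]

def distance(a, b):
    """Number of traits on which two animals differ."""
    return sum(x != y for x, y in zip(a, b))

def classify(traits):
    """Return the label of the most similar known animal."""
    _, label = min(known_animals, key=lambda item: distance(item[0], traits))
    return label

# A species never seen before: sharp teeth and claws, large, diet unknown (0).
new_animal = (1, 1, 0, 1)
print(classify(new_animal))
```

The new animal does not match any known example exactly, yet it is still assigned the label of the example it resembles most.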
Today, the situation is no different. For example, when a person asks us for a loan, we classify them as “reliable” or “unreliable” in our minds by drawing on our previous experience. In doing so, we weigh many factors, such as body language, the size of the loan, and similar past experiences with them or with other people.
We always need to classify the newly acquired raw data in light of our previous experience to make sense of it.
The type of data, of course, determines the type of classification. If the data is text, the process is called text classification. Text classification is of great importance as it is one of the most common types of classification used by researchers.
Text classification is the process of assigning tags or categories to newly gathered text data according to its content. Such data includes messages posted by others on your company's social media accounts, messages written about your company in the digital world, transcriptions of calls from your call center, and any other text-based data in your database.
When done efficiently and effectively, text classification enables us to get the most accurate answers to the questions that are the source of our research in the most practical way.
Imagine an e-commerce company has decided to set a strategy for the next year based on data obtained from 120 thousand consumer complaints from the last year. Even if this company knows on which topics the complaints are concentrated, they still need to classify the data to identify its distribution throughout the year and its sub-categories, to compare each one within itself.
Let's give a concrete example. For a company to reach a conclusion like, “Customers who place their orders through our website don’t complain about delivery times as much as the customers who place their orders through the smartphone app,” they need to set categories such as the platform (website/smartphone app), the subject of the complaint (delivery time, product quality, etc.), and the sentiment (positive/negative), and then classify the data based on these categories.
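To make this concrete, here is a minimal sketch of how such a comparison could be computed once the complaints have been classified. The records below are made-up illustrative data, not real results, and the `share_by_platform` helper is a hypothetical name introduced only for this example.

```python
from collections import Counter

# Hypothetical, already-classified complaints: (platform, subject, sentiment).
complaints = [
    ("website", "delivery time", "negative"),
    ("app", "delivery time", "negative"),
    ("app", "delivery time", "negative"),
    ("app", "product quality", "negative"),
    ("website", "product quality", "negative"),
    ("app", "delivery time", "negative"),
]

def share_by_platform(records, subject):
    """For each platform, the share of its complaints about the given subject."""
    totals = Counter(platform for platform, _, _ in records)
    matching = Counter(
        platform for platform, subj, _ in records if subj == subject
    )
    return {platform: matching[platform] / totals[platform] for platform in totals}

shares = share_by_platform(complaints, "delivery time")
print(shares)
```

With these toy records, delivery time accounts for a larger share of app complaints than of website complaints, which is exactly the kind of comparison that becomes possible only after every record carries its categories.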
Using machine learning technology, we can automatically analyze and classify all of our text-based data, such as transcriptions of interviews used in qualitative consumer research or social media conversations, based on their content. To do this, it is enough to teach the categories to the system only once. After the teaching step, which is done by labeling only a fraction of the data we have, the system automatically classifies the rest of the data within a few minutes and with high accuracy.
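The "label a fraction, classify the rest" workflow described above can be illustrated with a small sketch. This is not how any particular product works; it assumes a basic multinomial naive Bayes classifier with add-one smoothing, written with the Python standard library only, and the sample complaints, category names, and function names are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def train(labeled_texts):
    """Count words per category from a small hand-labeled sample."""
    word_counts = defaultdict(Counter)
    category_counts = Counter()
    for text, category in labeled_texts:
        category_counts[category] += 1
        word_counts[category].update(tokenize(text))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, category_counts, vocabulary

def classify(text, model):
    """Return the category with the highest naive Bayes log-probability."""
    word_counts, category_counts, vocabulary = model
    total_docs = sum(category_counts.values())
    best_category, best_score = None, -math.inf
    for category, doc_count in category_counts.items():
        score = math.log(doc_count / total_docs)
        total_words = sum(word_counts[category].values())
        for word in tokenize(text):
            # Add-one smoothing so unseen words do not zero out a category.
            count = word_counts[category][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category

# Teaching step: a small labeled fraction of the data (hypothetical examples).
labeled = [
    ("my order arrived two weeks late", "delivery time"),
    ("the courier was late again", "delivery time"),
    ("the fabric tore after one wash", "product quality"),
    ("the zipper broke on the first day", "product quality"),
]
model = train(labeled)

# The rest of the data is then classified automatically.
unlabeled = ["delivery was late", "the strap broke after one week"]
predictions = [classify(text, model) for text in unlabeled]
print(predictions)
```

In practice the labeled sample would be far larger than four sentences, and a real system would likely use more robust text processing, but the division of labor is the same: people label a small part, and the model handles the remainder.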
For more detailed information on how machine learning is used in qualitative research, you can check this article.
The most significant benefit of using machine learning in data classification is that it enables both actors in human-machine collaboration to focus on the things they do better than their counterparts.
The machine side of this collaboration can classify very high volumes of data quickly, thanks to its enormous processing power and advanced algorithms. Continuing from the example given above, by using machine learning, the e-commerce company we have mentioned won’t need to analyze and classify its database of 120 thousand lines one by one. This is a massive gain in resources like workforce and time.
Thanks to machine learning, humans can focus on the areas where they do better than machines. For example, the competent professionals of that e-commerce company or of the agency it works with, who have gained years of experience in the marketing and research industry, can now focus on analyzing the data already classified by the machine and use their expertise to develop the most effective strategies, instead of spending their time and effort on the classification process.
Even though machine learning is used in every area of life today, the communication industry cannot yet fully utilize the potential of this technology. The main reason for this is that the sector rightly chooses to use its human resources for more critical areas.
This is where Kimola's Cognitive product comes into play.
Kimola is a technology company that develops products to help marketers, advertisers, and researchers get to know consumers better, understand them more deeply, and reach results more quickly. Besides being the first company in Turkey to measure ratings on social media, Kimola is also the founder of the Social Research Platform that guides the communication and field strategies of many global brands such as Pfizer, Japan Tobacco, Samsung, and Nike.
Cognitive, the newest product of Kimola, is a system that does not require any technical knowledge to use, is entirely web-based, and has an interface that allows the data to be uploaded by just dragging and dropping. Using machine learning technology, Cognitive classifies high-volume data quickly and with high accuracy, enabling the creative employees of the communications industry to focus on the other things they do best.
You can sign up here to try Kimola Cognitive and create your first machine learning model for free. If you’d like to get more information about Cognitive, you can send us your questions here, or you can request a meeting appointment via Calendly.