How To Collect Data For Customer Sentiment Analysis
Customer sentiment analysis involves collecting, analyzing, and leveraging data to understand customers' feelings. This article focuses on how to collect data for customer sentiment analysis.
Customer sentiment analysis is the process of using machine learning (ML) to discover customer intent and opinion about a brand from customer feedback given in reviews, forums, surveys, and so on. Sentiment analysis of customer experience data gives businesses deep insight into motivations behind purchase decisions, the patterns in changing brand sentiment based on timelines or events, and market-gap analysis that can help in product and service improvement.
Table of contents:
- What is customer sentiment analysis?
- How do you collect data for customer sentiment analysis?
- How sentiment scores are derived from customer feedback
- Conclusion
What is Customer Sentiment Analysis?
Sentiment analysis combs through customer feedback data to identify specific emotions or sentiments. Broadly, these are positive, negative, or neutral. Within these categories, however, an ML-driven sentiment analysis model that uses tasks such as natural language processing (NLP) and semantic analysis to capture the semantic and syntactic aspects of words can also distinguish between different types of negative sentiment.
For example, it can help give varying sentiment scores based on words that denote different negative emotions such as anxiety, disappointment, regret, anger, and so on. The same is the case with positive micro sentiments.
Such fine-grained emotion mining combined with aspect-based analysis of a customer’s experience with a brand can be of prime importance. For example, when you know sentiment based on aspects like price, convenience, ease-of-purchase, customer service, etc, you get actionable insights that you can depend on to make the right decisions when it comes to quality control and product improvement.
How do you Collect Data for Customer Sentiment Analysis?
A very important part of procuring targeted and insightful brand sentiment intelligence is having reliable customer feedback data. Here are five essential ways in which you can collect such data.
1. Social media comments and videos
Social media listening is one way to get current customer feedback about your brand, covering both your products and your service. A sentiment analysis model that can process and evaluate social media comments as well as video content is well suited to this data source.
With such a tool, you can harness data for customer sentiment analysis from text-heavy social media sites like Twitter as well as video-based ones like TikTok or Instagram. This is a great advantage because customers do not use every social media platform in the same way.
For example, while customers mainly use Twitter to interact directly with a brand, Facebook users are known to leave detailed remarks about a business they have dealt with. This contrast is due to factors such as the nature of the business and customers' age, geographic location, digital usage, and so on.
The examples below show how customers leave comments on the two different social media channels.
Another great advantage of social media sentiment analysis is that you can also find social media influencers who fit your brand and can be a valuable addition to your digital marketing strategy. Working with influencers can cost around half of what goes into hiring a PR agency or securing a celebrity endorsement.
Also, people trust product reviews and endorsements from influencers to whom they can relate. This is true whether you’re an intern looking for professional styling tips or a father of four in search of the best cell phone options for teens. Data science and ML can help find the right TikTok influencer for a business in this way.
2. Go Beyond Quantitative Surveys like NPS, CES, or CSAT
Customer feedback metrics like net promoter score (NPS), customer effort score (CES), or star ratings can tell you at a glance whether people are happy with your business or not. But this doesn’t really give you any actual business insight.
To get real customer sentiment insights you need to go beyond quantitative metrics. And for that, you need to analyze comments and open-ended survey responses that do not have any fixed response. This allows customers to write free-flowing comments, which can give you insight into aspects of your business that you were not even aware of.
In the above example, we can see that customers have given a 1-star rating to the business. But upon reading the comments we realize that the reasons behind the negative sentiments are entirely different.
While one customer is unhappy about the company’s online customer service, the other mentions that, even though they are a long-time customer, the fall in quality and the new pricing are why they might not buy from the company anymore.
These are actionable insights, where a business knows exactly where improvement must be made in order to maintain customer satisfaction and loyalty. Going beyond just numerical metrics can get you these insights.
3. Analyze reviews from customer forums and websites
Another excellent way to get diverse customer feedback data is by sifting through product review websites like Google My Business and forums such as Reddit. Importantly, drawing on different data sources gives you richer insights because each platform attracts a different type of audience.
For example, Reddit is mostly used by customers who are more passionate about a subject or product because the forum allows them to have lengthy discussions. Amazon reviews or Google reviews, on the other hand, are mostly used by casual customers who leave a review either at the nudge of the business or because of the experience, good or bad, that they had.
ML-driven insights drawn from customer comments about Disney World in Florida on Reddit and Google illustrate this point further.
4. Voice of customer (VoC) data from non-traditional sources
Non-traditional sources of customer feedback data such as chatbot histories, customer emails, customer support transcripts, and so on are brilliant sources to gain customer experience insights. An advantage of these sources is that all this data is already available in your customer relationship management (CRM) tools.
When you are able to gather and analyze this data you will be able to discover many underlying issues that even well-planned customer surveys or social media listening may not be able to highlight.
5. Analyze news and podcasts
News data, which includes articles as well as news videos and podcasts, can give you granular insights into brand performance and perception. Market feedback from news sources can help a business carry out effective public relations (PR) activities for brand reputation management.
It can also help in competitor analysis based on industry trends that a sentiment analysis model can extract from brand experience data in news articles or videos, and it can help the business understand consumer behavior.
How are Sentiment Scores Deduced from Customer Feedback?
To illustrate how sentiment is extracted and scores are calculated, let us take news sources as the vital source of customer feedback and see how an ML model will analyze such data.
1. Gathering the data
In order to get the most accurate results, we must use all publicly available news sources. This includes news from television channels, online magazines and other publications, radio broadcasts, podcasts, videos, etc.
There are two ways in which this can be done. We can either pull the data in directly through live news APIs like Google News API, ESPN Headlines API, BBC News API, and others like them, or download the comments and articles as a .csv file and manually upload that file to the ML model we are using.
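As a minimal sketch of the manual route, the rows of an exported file can be read with Python's standard `csv` module. The column names `source` and `text` are assumptions made for this illustration (a real export may use different ones), and an in-memory string stands in for the downloaded file:

```python
import csv
import io

# Stand-in for a downloaded .csv export. The column names "source" and
# "text" are assumptions for this illustration, not a fixed format.
SAMPLE_CSV = """source,text
Daily Herald,"Stadium goers remarked that the seats were good."
Daily Herald,"However, the tickets did seem too costly."
"""


def load_feedback(csv_file):
    """Return a list of {"source", "text"} dicts, skipping rows with no text."""
    reader = csv.DictReader(csv_file)
    records = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if text:
            source = (row.get("source") or "").strip()
            records.append({"source": source, "text": text})
    return records


records = load_feedback(io.StringIO(SAMPLE_CSV))
print(len(records), "records loaded")
```

For a file on disk, the same function could be called with `open(path, newline="", encoding="utf-8")` in place of the `io.StringIO` object.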
2. Processing Data With ML Tasks
The model now processes the data and identifies the different formats - text, video, or audio. In the case of text, the process is fairly simple. The model extracts all the text including emoticons and hashtags. In the case of podcasts, radio broadcasts, and videos, it will require audio transcription through speech-to-text software. This data too is then sent to the text analytics pipeline.
Once in the pipeline, natural language processing (NLP), named entity recognition (NER), semantic classification, and similar tasks make sure that key aspects, themes, and topics are extracted from the data and grouped so that they can be analyzed for sentiment.
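To make the grouping step concrete, here is a deliberately simplified sketch. It swaps the NLP, NER, and semantic classification models for a hand-written keyword lookup (the `ASPECT_KEYWORDS` table and the aspect names are hypothetical), so it only illustrates the idea of attaching pieces of text to aspects before sentiment is scored:

```python
import re
from collections import defaultdict

# Hypothetical keyword table standing in for trained NER and
# classification models. A real pipeline would not rely on fixed keywords.
ASPECT_KEYWORDS = {
    "seating": {"seat", "seats"},
    "pricing": {"tickets", "price", "passes", "costly"},
    "staff": {"staff"},
}


def group_by_aspect(text):
    """Split text into rough clauses and file each under the aspects it mentions."""
    grouped = defaultdict(list)
    # Splitting on punctuation gives rough clauses, not true sentences.
    clauses = [c.strip() for c in re.split(r"[.,;!?]", text) if c.strip()]
    for clause in clauses:
        words = set(re.findall(r"[a-z']+", clause.lower()))
        for aspect, keywords in ASPECT_KEYWORDS.items():
            if words & keywords:
                grouped[aspect].append(clause)
    return dict(grouped)


example = (
    "Stadium goers remarked that the seats were good. However, the tickets "
    "did seem too costly, given that there were no season passes available, "
    "and many even encountered rude staff at the ticket counter."
)
aspects = group_by_aspect(example)
for aspect, clauses in aspects.items():
    print(aspect, "->", clauses)
```

Each aspect now holds only the clauses that mention it, so a sentiment score can be computed per aspect rather than for the text as a whole.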
3. Analyzing sentiment
Now that the text has been segregated, each theme, aspect, and entity is analyzed for sentiment and the sentiment score is calculated. This can be done using any of three approaches: the word count method, the sentence-length method, and the ratio of positive to negative words.
Let us take this sentence as an example. “Stadium goers remarked that the seats were good. However, the tickets did seem too costly, given that there were no season passes available, and many even encountered rude staff at the ticket counter, according to the Daily Herald.”
Let us assume that after tokenization, text normalization (eliminating non-text data), word stemming (finding the root word), and stop word removal (removing redundant words), we get the following scores for negative and positive sentiment.
Positive (1 word): good (+0.07)
Negative (2 words): costly (-0.5), rude (-0.7)
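The preprocessing steps named above can be sketched in a few lines. This is a toy version under stated assumptions: the stop-word list is a tiny illustrative one, `naive_stem` only strips a trailing "s" and is not a real stemming algorithm, and the lexicon contains just the three weighted words from the example:

```python
import re

# Tiny illustrative stop-word list. Real pipelines use much larger ones.
STOP_WORDS = {"the", "that", "were", "did", "to", "at", "and", "there", "no", "too"}

# Word weights taken from the example in the text.
LEXICON = {"good": 0.07, "costly": -0.5, "rude": -0.7}


def naive_stem(token):
    """Toy stemmer: strip one trailing "s" from longer words."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def preprocess(text):
    """Lowercase, keep alphabetic tokens only, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOP_WORDS]


def split_by_polarity(tokens):
    """Return (positive_words, negative_words) according to the lexicon."""
    positive = [t for t in tokens if LEXICON.get(t, 0) > 0]
    negative = [t for t in tokens if LEXICON.get(t, 0) < 0]
    return positive, negative


sentence = (
    "Stadium goers remarked that the seats were good. However, the tickets "
    "did seem too costly, given that there were no season passes available, "
    "and many even encountered rude staff at the ticket counter, "
    "according to the Daily Herald."
)
positive, negative = split_by_polarity(preprocess(sentence))
print("positive:", positive)
print("negative:", negative)
```

The resulting word lists give the counts of one positive and two negative words used in the calculations that follow.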
Now let us calculate the sentiment scores using the three aforementioned methods.
Word count method
This is the simplest way in which the sentiment score can be calculated. In this method, we subtract the number of negative words from the number of positive words (1 - 2 = -1).
Thus, the sentiment score of the above example is -1.
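The word count method can be written as a one-line function. In this sketch the function name `word_count_score` is an illustrative choice, and the word lists are the ones from the example sentence:

```python
def word_count_score(positive_words, negative_words):
    """Number of positive words minus number of negative words."""
    return len(positive_words) - len(negative_words)


score = word_count_score(["good"], ["costly", "rude"])
print(score)  # -1
```

Note that this method ignores the individual word weights and counts every sentiment word equally.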
Sentence-length method
The number of negative words is subtracted from the number of positive words. The result is then divided by the total number of words in the text. Because the resulting score can be very small and run to many decimal places, it is often multiplied by a single-digit number. This is done so that the scores are bigger and thus easier to comprehend and compare. In the case of our example, before any such scaling, the score will be:
(1 - 2) / 42 ≈ -0.0238095
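A minimal sketch of the sentence-length method follows. The function name and the optional `scale` parameter are assumptions for illustration, and the word total of 42 is the figure used in the calculation above:

```python
def length_normalized_score(n_positive, n_negative, n_words, scale=1):
    """(positive count - negative count) / total words, optionally scaled."""
    if n_words <= 0:
        raise ValueError("n_words must be positive")
    return scale * (n_positive - n_negative) / n_words


score = length_normalized_score(1, 2, 42)
print(round(score, 7))  # -0.0238095
```

Passing a value such as `scale=5` multiplies the result by that factor, which keeps scores from different texts comparable as long as the same factor is used throughout.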
Negative-Positive word count ratio
The total number of positive words is divided by the total number of negative words plus 1. This is more balanced than other approaches, especially in the case of large amounts of data.
1 / (2 + 1) ≈ 0.33333
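The ratio calculation can be sketched as follows, reading the arithmetic above as positive words divided by negative words plus one. The function name `ratio_score` is an illustrative choice:

```python
def ratio_score(n_positive, n_negative):
    """Positive word count divided by (negative word count + 1)."""
    return n_positive / (n_negative + 1)


score = ratio_score(1, 2)
print(round(score, 5))  # 0.33333
```

Adding 1 to the denominator also means the score stays defined for texts that contain no negative words at all.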
4. Insights Visualization
Once the data is analyzed for sentiment, the insights are presented on a visualization dashboard so you can understand the intelligence that has been garnered from all the data. You can see timeline-based sentiment analysis, as well as those based on events such as product launches, stock market fluctuations, press releases, company statements, new pricing, etc.
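Behind a timeline view there is usually a simple aggregation step. The sketch below groups scored items by month and averages them. The dates and scores are made up purely for illustration, and `monthly_average` is a hypothetical helper rather than part of any particular dashboard tool:

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Made-up (publication date, sentiment score) pairs, for illustration only.
scored_items = [
    (date(2023, 1, 5), -1),
    (date(2023, 1, 20), 1),
    (date(2023, 2, 3), 2),
    (date(2023, 2, 17), 0),
    (date(2023, 2, 28), 1),
]


def monthly_average(items):
    """Average the sentiment scores per calendar month, in chronological order."""
    buckets = defaultdict(list)
    for day, score in items:
        buckets[day.strftime("%Y-%m")].append(score)
    return {month: mean(scores) for month, scores in sorted(buckets.items())}


timeline = monthly_average(scored_items)
for month, average in timeline.items():
    print(month, average)
```

The resulting month-to-score mapping is the kind of series a dashboard could plot as a line chart, with events such as product launches marked on the same axis.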
These aspect-based insights are what can be of incredible value to you as you plan your marketing and growth strategies.
Conclusion
AI and data science are of immense importance to marketing activities, especially in an era of constant innovation and shifting market dynamics. Customer sentiment analysis driven by feedback data gathered directly from customers can give you the leverage you need to build a sustainable marketing strategy for continued growth.
Martin Ostrovsky is the founder and CEO of Repustate. He is passionate about AI, ML, and NLP. He sets the strategy, roadmap, and feature definition for Repustate's Global Text Analytics API, Sentiment Analysis, Deep Search, and Named Entity Recognition solutions.