The Visible Approach to Assessing Social Media Sentiment
By Shawn Rutledge, Director of Core Technology
The explosive growth of social media content has resulted in a large quantity of online data that expresses opinion. The goal of sentiment analysis is to determine the attitude, opinion, emotional state, or intended emotional communication of a speaker or writer.
The Visible Approach
Fundamentally, the Visible® approach is document classification using supervised machine learning. This means we assign each post (document) a score as to how subjective it is (neutral vs. non-neutral sentiment) and how polar it is (positive, negative, or mixed sentiment). Scores for entities or selections are aggregated into an estimation model for quantification and are rank ordered for information retrieval. We chose a supervised approach because we find it yields the best results, and because we have access to a proprietary resource of tens of millions of human-annotated social media posts spanning thousands of topics, hundreds of enterprise marketing customers, and dozens of industries, accumulated over half a decade.
In the model-training phase, supervised learning takes past experience (as captured in data) and applies algorithms to derive a generalized mathematical model of that experience. The model can then be used to make accurate predictions or decisions at a future point or on different data (prediction/scoring phase). The quality of a model depends primarily on the data the model has to learn from and secondarily on the algorithm or algorithms used to derive the model.
Supervised learning requires training data tagged with the correct answer (a truth label). At Visible, we train our models as a four-class problem with Positive, Negative, Mixed, and Neutral labels.
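The relationship between the two scoring dimensions described above (subjectivity and polarity) and the four class labels can be sketched as follows. This is an illustrative helper, not Visible's actual labeling code:

```python
# The four class labels named in the text.
LABELS = ("Positive", "Negative", "Mixed", "Neutral")

def combine(subjective: bool, polarity: str) -> str:
    """Map a subjectivity decision plus a polarity decision onto the
    four-class scheme: non-subjective posts are Neutral; subjective
    posts take their polarity label (Positive, Negative, or Mixed)."""
    if not subjective:
        return "Neutral"
    assert polarity in ("Positive", "Negative", "Mixed")
    return polarity
```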
Training data must also be represented as features. (In machine learning, the term “features” is used for attributes, variables, or predictors from the data.) For sentiment models, a simple feature might be the existence of a word like “love” in a post, which would be statistical evidence in favor of a Positive label. We train our models using features derived from titles, bodies, threads, sites, permalinks, authors, publish dates, queries, customers, etc.
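A minimal sketch of this kind of feature extraction, using binary word-presence features plus a source-site feature. The field names (`title`, `body`, `site`) are assumptions for illustration, not Visible's schema:

```python
import re

def features(post: dict) -> dict:
    """Toy feature extractor: binary word-presence features drawn from the
    title and body, plus a feature for the source site. A feature such as
    word=love is statistical evidence toward a Positive label."""
    feats = {}
    for field in ("title", "body"):
        for tok in re.findall(r"[a-z']+", post.get(field, "").lower()):
            feats[f"word={tok}"] = 1
    if "site" in post:
        feats[f"site={post['site']}"] = 1
    return feats

features({"title": "I love this phone", "site": "twitter.com"})
# contains "word=love" and "site=twitter.com"
```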
Finally, there must be enough training instances that are representative of the data where the model will be applied. If the training data is too small or biased, a model learned from it will have poor generalization performance and not work well on new data. Important segments we sample and test include media type (e.g., micro-blogs, blogs, forums), industry (e.g., technology, finance, travel, home improvement), language (e.g., English, Spanish), key sites (e.g., Twitter, Facebook), and time (performance over months and years).
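One simple way to check segment representativeness before training is to measure how the labeled data is distributed across a segment. A hypothetical sketch (the `media` field name is an assumption):

```python
from collections import Counter

def segment_coverage(posts, key):
    """Return the fraction of training posts in each value of a segment
    (e.g. media type), to help spot under-represented or biased segments
    before a model is trained on the data."""
    counts = Counter(p[key] for p in posts)
    total = sum(counts.values())
    return {segment: n / total for segment, n in counts.items()}

posts = [{"media": "blog"}, {"media": "micro-blog"}, {"media": "blog"}]
segment_coverage(posts, "media")
# blogs dominate this tiny sample; a real check would compare against
# the segment mix of the data the model will actually score
```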
Feature engineering is the art and science of deriving features from the available training data for use in model-training algorithms. It is the glue between data and algorithms, and the approaches are often chosen in concert with the data available and algorithms used. There are two broad categories of features we use in our algorithms: those derived through natural language processing (NLP) and those derived from social context.
Natural Language Processing (NLP)
We use shallow (statistical) NLP techniques to extract features from posts and conversations. For the informal language of social media, these techniques generalize better than deeper linguistic and syntactical techniques. Negation can be seen as a simple proximity pattern of a certain term class, and ideas like sarcasm, humor, or irony can often be picked up in a similar manner (“sick” vs. “sikk”, “awesome – NOT!”, emoticons, etc.).
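The proximity pattern for negation described above can be sketched as follows. The negator and sentiment word lists are tiny illustrative stand-ins, not production lexicons:

```python
NEGATORS = {"not", "no", "never", "n't"}
SENTIMENT = {"good", "great", "love", "awesome"}

def negated(tokens, window=3):
    """Return the indices of sentiment words that fall within a small
    window after a negator -- the simple proximity pattern for negation
    described above."""
    flipped = set()
    for i, tok in enumerate(tokens):
        if tok in NEGATORS:
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                if tokens[j] in SENTIMENT:
                    flipped.add(j)
    return flipped

negated("this phone is not awesome".split())
# {4}: "awesome" is within the negation window of "not"
```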
The context of a social media post is also an important source of sentiment information. In social media, the first sentence of a post often contains sentiment (unlike reviews, for example, where the last sentence often captures the overall sentiment), and the further down in a thread you are, the more likely the conversation has gone negative. Certain authors tend to be detractors; others are fans of specific brands. The language and behavior of Twitter tends to be different than that of the blogosphere. Industries tend to have different language and patterns as well (“sick phone” vs. “sick child”).
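These context observations translate directly into features. A hypothetical sketch (field names like `site`, `author`, and `thread_depth` are assumptions for illustration):

```python
def context_features(post: dict) -> dict:
    """Toy social-context features motivated by the observations above:
    source site, author identity, and depth in the thread (conversations
    tend to drift negative the deeper you go)."""
    return {
        f"site={post.get('site', 'unknown')}": 1,
        f"author={post.get('author', 'unknown')}": 1,
        "thread_depth": post.get("thread_depth", 0),
    }

def first_sentence(text: str) -> str:
    """Isolate the opening sentence, which in social media often carries
    the sentiment, so it can be featurized separately from the rest."""
    return text.split(".")[0].strip()

first_sentence("Love it. Shipping was slow.")
# "Love it"
```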
There are too many algorithms in general use to cover here. Naïve Bayes, Maximum Entropy (MaxEnt), and Support Vector Machines (SVMs) are three of the algorithms most commonly used for text classification. Visible is currently using an algorithm based on gradient boosting [Friedman00], which outperforms all other techniques we’ve benchmarked on this problem.
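To make the text-classification setup concrete, here is a minimal multinomial Naïve Bayes classifier, the simplest of the baselines named above. This is an illustrative sketch in pure Python, not Visible's gradient-boosting model:

```python
import math
from collections import Counter, defaultdict

class TinyNaiveBayes:
    """Minimal multinomial Naive Bayes text classifier with Laplace
    smoothing, using whitespace tokens as features."""

    def fit(self, docs, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for doc, label in zip(docs, labels):
            for tok in doc.lower().split():
                self.word_counts[label][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, doc):
        total = sum(self.label_counts.values())
        best, best_lp = None, -math.inf
        for label, n in self.label_counts.items():
            lp = math.log(n / total)  # class prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in doc.lower().split():
                # add-one (Laplace) smoothing so unseen words don't zero out
                lp += math.log((self.word_counts[label][tok] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

clf = TinyNaiveBayes().fit(
    ["love this phone", "great battery", "hate the screen", "terrible battery"],
    ["Positive", "Positive", "Negative", "Negative"],
)
clf.predict("great phone")  # "Positive"
```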
Our solution also learns from feedback. We continue to gather both direct and indirect feedback, and we periodically update our sentiment models.
There has been a huge amount of research and publication in sentiment analysis, especially in the last decade. Google Scholar shows thousands to hundreds of thousands of papers related to the topic, and we’ve surveyed almost a hundred of the most relevant while doing our research. [Liu09] and [PangLee08] give a much broader overview.
Unsupervised (or semi-supervised) techniques represent a broad class of approaches worth mentioning. Visible has billions of unlabeled social media posts available to us, and we have tried some of the best of these techniques. In our experience, these techniques do work, but they don’t work as well as supervised learning techniques (given tens of millions of labeled examples).
Our researchers continue to push the state of the art in sentiment analysis. Our philosophy is extremely pragmatic: use whatever technique works the best. Consistent with this philosophy, we will continue to adapt and improve our solution.
For a more in-depth discussion of the complexities of assessing social media sentiment, see Measuring Social Sentiment: Assessing and Scoring Opinion in Social Media.
References
[Friedman00] Friedman, J., Hastie, T., and Tibshirani, R. 2000. Additive logistic regression: a statistical view of boosting. The Annals of Statistics, 28(2):337-407.
[Liu09] Liu, B. 2009. Sentiment analysis and subjectivity. Handbook of Natural Language Processing, Second Edition.
[PangLee08] Pang, B. and Lee, L. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval, 2(1-2):1-135.