Opinion Mining in NLP

Opinion mining, a key subfield of natural language processing (NLP), focuses on discerning the sentiment behind textual content. This process categorizes text as positive, negative, or neutral and has wide-ranging applications, from customer review analysis to market research and social media monitoring.

In this article, we will explore several opinion mining techniques and implement each of them in Python to gain practical experience.

Opinion Mining Techniques

Each opinion mining technique offers a unique approach to extracting sentiment from textual data.

Here’s an overview:

1. Sentiment Polarity Classification

Sentiment Polarity Classification provides a detailed sentiment score for a given text, ranging from negative to positive. This technique is invaluable for businesses analyzing customer reviews and social media posts. By assigning a sentiment score, businesses can make informed decisions based on the general sentiment of their customer base.

To implement this technique, we will first import the `pipeline` function from the `transformers` library. Next, we will initialize the sentiment analysis pipeline using the `nlptown/bert-base-multilingual-uncased-sentiment` model. Then, we will pass a sample text into this pipeline, which returns a list containing one dictionary per input text. Each dictionary has two keys. The first is `label`, which holds one of the model's five ratings, from 1 star to 5 stars (1 star being negative, 3 stars being neutral, and 5 stars being positive); we translate it into a readable category with `label_map`. The second is `score`, the model's confidence as a value between 0 and 1, which we convert to a percentage. Finally, we will format the output and print the results.

Python

from transformers import pipeline

sentiment_pipeline = pipeline('sentiment-analysis', model='nlptown/bert-base-multilingual-uncased-sentiment')

text = "The movie was absolutely fantastic! I loved every moment of it."

result = sentiment_pipeline(text)[0]

label_map = {
    '1 star': 'Very Negative',
    '2 stars': 'Negative',
    '3 stars': 'Neutral',
    '4 stars': 'Positive',
    '5 stars': 'Very Positive'
}

custom_label = label_map[result['label']]
confidence_percentage = result['score'] * 100

print(f"Sentiment: {custom_label}")
print(f"Confidence: {confidence_percentage:.2f}%")

Output:

Sentiment: Very Positive
Confidence: 95.22%
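
When many reviews are scored this way, the star labels can be combined into a single rating. The following is a minimal sketch, assuming each result has the `label` and `score` keys shown above. The `sample_results` values and the helpers `stars_from_label` and `average_stars` are hypothetical and are not part of the `transformers` library.

```python
# Hypothetical pipeline outputs, shaped like the result dictionary above.
sample_results = [
    {"label": "5 stars", "score": 0.95},
    {"label": "4 stars", "score": 0.61},
    {"label": "1 star", "score": 0.88},
]


def stars_from_label(label):
    """Extract the integer star count from a label such as '4 stars'."""
    return int(label.split()[0])


def average_stars(results):
    """Return the mean star rating over a list of pipeline results."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(stars_from_label(r["label"]) for r in results) / len(results)


print(f"Average rating: {average_stars(sample_results):.2f} stars")
```

An unweighted mean keeps the sketch simple; weighting each rating by its `score` would be a reasonable variation if low-confidence predictions should count for less.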

2. Emotion Detection

Emotion detection goes a step beyond polarity classification: instead of placing a text on a negative-to-positive scale, it identifies the specific emotion the text expresses, such as joy, anger, or sadness. This is particularly helpful for understanding the feelings and attitudes of the people behind the text, and it gives businesses a more detailed picture of customer feedback than a polarity score alone.

To implement this technique, we will first import the `pipeline` function from the `transformers` library, then initialize the emotion detection pipeline using the `j-hartmann/emotion-english-distilroberta-base` model. Next, we will call this pipeline with a sample text, which returns a dictionary with a `label` key naming the emotion and a `score` key holding the confidence as a value between 0 and 1. Finally, we will format the output and print it.

Python

from transformers import pipeline

emotion_pipeline = pipeline('text-classification', model='j-hartmann/emotion-english-distilroberta-base')

text = "I am so excited about the new project!"

result = emotion_pipeline(text)[0]

custom_label = result['label'].capitalize()
confidence_percentage = result['score'] * 100

print(f"Emotion: {custom_label}")
print(f"Confidence: {confidence_percentage:.2f}%")

Output:

Emotion: Joy
Confidence: 97.58%
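
For a batch of feedback messages, it is often more useful to see how frequently each emotion occurs than to read individual results. The sketch below assumes results in the same `label` and `score` shape as above. The `sample_results` values, the `count_emotions` helper, and the 0.5 confidence cut-off are illustrative assumptions.

```python
from collections import Counter

# Hypothetical results for several feedback messages.
sample_results = [
    {"label": "joy", "score": 0.97},
    {"label": "anger", "score": 0.81},
    {"label": "joy", "score": 0.42},   # below the cut-off, so it is ignored
    {"label": "sadness", "score": 0.90},
    {"label": "joy", "score": 0.88},
]


def count_emotions(results, min_score=0.5):
    """Count emotion labels whose confidence is at least min_score."""
    return Counter(r["label"] for r in results if r["score"] >= min_score)


emotion_counts = count_emotions(sample_results)
for emotion, count in emotion_counts.most_common():
    print(f"{emotion.capitalize()}: {count}")
```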

3. Aspect-Based Sentiment Analysis

Aspect-based sentiment analysis addresses a limitation shared by the other techniques in this article: they assign a single sentiment to the entire text. A review such as "The food was delicious, but the service was slow" contains two opinions, and a single label cannot represent both. Aspect-based sentiment analysis instead reports the sentiment for each specific aspect mentioned in the text, which allows businesses to pinpoint areas for improvement and understand customer preferences in greater detail.

To implement this technique, we will first import spaCy and NLTK, along with the `Matcher` class from spaCy and the `SentimentIntensityAnalyzer` class from NLTK. Next, we will download the VADER lexicon, load the spaCy model, and define a sample text. VADER scores whatever text it is given, and a neutral word such as "food" carries no sentiment on its own, so scoring the aspect word alone is not enough. Instead, we split the text into clauses, use the `Matcher` to find the aspects in each clause, and score the clause that contains the aspect. An aspect is classified as positive if the compound score is greater than 0 and as negative otherwise. Finally, we will print the list of aspects with their corresponding sentiment.

Python

import re

import nltk
import spacy
from nltk.sentiment import SentimentIntensityAnalyzer
from spacy.matcher import Matcher

nltk.download('vader_lexicon')

nlp = spacy.load("en_core_web_sm")

text = "The food was delicious, but the service was slow."

# Split the text into clauses at commas and at the conjunction "but"
clauses = [c.strip() for c in re.split(r",|\bbut\b", text) if c.strip()]

# Match the aspect terms we are interested in
matcher = Matcher(nlp.vocab)
pattern = [{"LOWER": {"IN": ["food", "service"]}}]
matcher.add("AspectMatcher", [pattern])

sia = SentimentIntensityAnalyzer()
aspects = []
for clause in clauses:
    clause_doc = nlp(clause)
    # Score the whole clause, not the aspect word alone
    scores = sia.polarity_scores(clause)
    sentiment = "positive" if scores["compound"] > 0 else "negative"
    for match_id, start, end in matcher(clause_doc):
        aspect = clause_doc[start:end].text
        aspects.append({"aspect": aspect, "sentiment": sentiment})

print(aspects)

Output:

[{'aspect': 'food', 'sentiment': 'positive'}, {'aspect': 'service', 'sentiment': 'negative'}]
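
Aspect-level results become most useful when they are summarized across many reviews. The sketch below assumes each review yields a list in the same shape as `aspects` above. The `reviews` data and the `summarize_aspects` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical aspect results for three reviews, one list per review.
reviews = [
    [
        {"aspect": "food", "sentiment": "positive"},
        {"aspect": "service", "sentiment": "negative"},
    ],
    [{"aspect": "food", "sentiment": "positive"}],
    [{"aspect": "service", "sentiment": "positive"}],
]


def summarize_aspects(review_results):
    """Count positive and negative mentions for each aspect."""
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for aspect_list in review_results:
        for item in aspect_list:
            summary[item["aspect"]][item["sentiment"]] += 1
    return dict(summary)


summary = summarize_aspects(reviews)
for aspect, counts in summary.items():
    print(f"{aspect}: {counts['positive']} positive, {counts['negative']} negative")
```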

4. Multilingual Sentiment Analysis

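Multilingual sentiment analysis determines the sentiment of text written in languages other than English without translating it first. This is particularly useful for global businesses that receive customer feedback in multiple languages, because all feedback can be scored in one pass. Language coverage depends on the model, however. The `nlptown/bert-base-multilingual-uncased-sentiment` model used here was fine-tuned on product reviews in English, Dutch, German, French, Spanish, and Italian, so predictions for other languages, such as the Hindi and Telugu samples in this section, should be treated with caution; the low confidence scores in the output reflect this.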

To implement this, we will first import the `pipeline` function from the `transformers` library. Next, we will create a pipeline using the `nlptown/bert-base-multilingual-uncased-sentiment` model. We will then create a list of texts in different languages and pass each one to the pipeline. Finally, we will print each text together with the raw result returned for it.

Python

from transformers import pipeline

multilingual_sentiment_pipeline = pipeline('sentiment-analysis', model='nlptown/bert-base-multilingual-uncased-sentiment')

texts = [
    "सेवा उत्कृष्ट थी।",  # Hindi: "The service was excellent."
    "సినిమా చాలా నిరాశపరిచింది."  # Telugu: "The movie was very disappointing."
]

results = [multilingual_sentiment_pipeline(text) for text in texts]
for text, result in zip(texts, results):
    print(f"Text: {text}\nSentiment: {result}\n")

Output:

Text: सेवा उत्कृष्ट थी।
Sentiment: [{'label': '5 stars', 'score': 0.4839303195476532}]

Text: సినిమా చాలా నిరాశపరిచింది.
Sentiment: [{'label': '3 stars', 'score': 0.3825204372406006}]
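
Both predictions above come with confidence scores below 0.5, so it can help to flag such results for manual review rather than accept the label as is. The sketch below reuses the two results from the output above (scores rounded to four decimals). The 0.6 threshold and the `needs_review` helper are illustrative assumptions, not values recommended by the model's authors.

```python
# Results copied from the output above, with scores rounded.
results = [
    ("सेवा उत्कृष्ट थी।", {"label": "5 stars", "score": 0.4839}),
    ("సినిమా చాలా నిరాశపరిచింది.", {"label": "3 stars", "score": 0.3825}),
]

CONFIDENCE_THRESHOLD = 0.6  # assumed cut-off; tune it for your own data


def needs_review(result, threshold=CONFIDENCE_THRESHOLD):
    """Return True when the model's confidence is below the threshold."""
    return result["score"] < threshold


flagged = [text for text, result in results if needs_review(result)]
print(f"{len(flagged)} of {len(results)} predictions flagged for review")
```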

Conclusion

In conclusion, opinion mining techniques such as sentiment polarity classification, emotion detection, aspect-based sentiment analysis, and multilingual sentiment analysis are powerful tools for extracting and understanding the sentiment of text. By using these techniques, businesses can gain a deeper understanding of public opinion about their products and make better-informed decisions. The next time you need to add sentiment analysis to your product, choose among these techniques based on your specific needs.

Reference: https://www.geeksforgeeks.org
