Understand Tweets Better with BERT Sentiment Analysis
Sentiment analysis is one of the key NLP tasks for understanding public opinion about a brand, celebrity, or politician. Thanks to pretrained BERT models, we can train simple yet powerful models. In this study, we will train a feedforward neural network in Keras on features extracted from Turkish BERT for Turkish tweets.
We will use the BOUN Twitter Data (2018), which contains 8,000 Turkish tweets in 3 classes: positive, neutral, and negative. The dataset was annotated by different judges.
Also, this dataset is imbalanced: 52% is neutral, 30% is positive, 18% is negative. We will take care of this problem.
The complete version of this code and the datasets are available in my GitHub repo.
1. Importing libraries and paths
Let’s start by importing the required libraries:
Set the path of the dataset, the path of the tweet JSON files for analysis, and the device.
A CPU is sufficient for training, but feature extraction with BERT can take some time.
2. Filtering and Feature Extraction with BERT
First, a simple filter function that removes links and mentions.
Now it is BERT time. We are using Turkish tweets, so we use Turkish BERT. If you want to train a model for another language, check out the community models on Hugging Face. If you cannot find a model for that language, you can use multilingual BERT.
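A minimal sketch of such a filter, assuming that links start with http://, https://, or www. and that a mention is an @ followed by word characters. The name filter_tweet and the exact patterns are illustrative and may differ from the ones in the repo.

```python
import re

# Assumed patterns: URLs beginning with http(s):// or www., and @mentions.
URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")
MENTION_PATTERN = re.compile(r"@\w+")


def filter_tweet(text: str) -> str:
    """Remove links and mentions, then collapse the leftover whitespace."""
    text = URL_PATTERN.sub(" ", text)
    text = MENTION_PATTERN.sub(" ", text)
    return " ".join(text.split())


print(filter_tweet("@kullanici bunu gördünüz mü? https://t.co/abc123 harika!"))
# bunu gördünüz mü? harika!
```

Hashtags and emojis are left untouched here, since the filter described above only targets links and mentions.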
This feature_extraction method:
- Takes a sentence.
- Filters it with our simple filter method.
- Tokenizes it with the BERT tokenizer.
- Produces a vector of length 768 to represent the sentence.
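The steps above could be sketched as follows. This is an illustration under several assumptions, not necessarily the code in the repo: the checkpoint name dbmdz/bert-base-turkish-cased, the use of the [CLS] vector of the last hidden layer as the sentence representation, the max_length of 128, and the stand-in simple_filter are all choices made here for the example.

```python
import re

# Assumed checkpoint; any BERT-base model yields 768-dimensional vectors.
MODEL_NAME = "dbmdz/bert-base-turkish-cased"


def simple_filter(text):
    """Stand-in for the link and mention filter (assumed patterns)."""
    return " ".join(re.sub(r"(?:https?://|www\.)\S+|@\w+", " ", text).split())


def load_bert(model_name=MODEL_NAME):
    """Load the tokenizer and the model once and reuse them for every tweet."""
    # Imported here so the text filter can be used without transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()  # disable dropout for deterministic features
    return tokenizer, model


def feature_extraction(text, tokenizer, model):
    """Return one vector (768 values for BERT-base) for a single sentence."""
    import torch

    inputs = tokenizer(
        simple_filter(text), return_tensors="pt", truncation=True, max_length=128
    )
    with torch.no_grad():  # no gradients needed for feature extraction
        outputs = model(**inputs)
    # Assumption: the [CLS] token (position 0) represents the whole sentence.
    return outputs.last_hidden_state[0, 0].tolist()
```

Typical usage would be tokenizer, model = load_bert() followed by feature_extraction("Bugün hava çok güzel", tokenizer, model). Mean pooling over all token vectors is a common alternative to the [CLS] vector.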
Let’s load our dataset and run it through our feature extraction method.
The data_prep function takes a portion of the dataset (train/val/test) and extracts features for each sentence. It also produces one-hot encodings of the sentiments ([1, 0, 0] for negative, [0, 1, 0] for neutral, [0, 0, 1] for positive).
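A rough sketch of what such a function might look like. It assumes that each dataset portion is a list of (sentence, sentiment) pairs and that the feature extractor is passed in as a callable; the actual file format and the signature in the repo may differ.

```python
# Class order taken from the one-hot encodings described above.
LABEL_TO_INDEX = {"negative": 0, "neutral": 1, "positive": 2}


def one_hot(label):
    """Encode a sentiment label as a one-hot list."""
    vector = [0] * len(LABEL_TO_INDEX)
    vector[LABEL_TO_INDEX[label]] = 1
    return vector


def data_prep(rows, extract):
    """Build features and one-hot labels from (sentence, sentiment) pairs.

    `extract` is any callable mapping a sentence to a feature vector.
    """
    features = [extract(sentence) for sentence, _ in rows]
    labels = [one_hot(sentiment) for _, sentiment in rows]
    return features, labels


def toy_extract(sentence):
    """Hypothetical extractor used only to demonstrate the data flow."""
    return [float(len(sentence))]


toy_rows = [("harika", "positive"), ("berbat", "negative")]
X, y = data_prep(toy_rows, toy_extract)
print(y)  # [[0, 0, 1], [1, 0, 0]]
```

Before training in Keras, both lists would be converted to NumPy arrays.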
3. Training a Model
Now it is time for the BEST part: training a model. Let’s define the model first.
- We use class_weights to handle our imbalanced dataset (this prevents the model from predicting neutral most of the time because of its high frequency).
- Dropout layers for regularization.
- Softmax activation function for the final layer and tanh for the others.
- Adagrad optimizer.
- Categorical crossentropy loss function which is suited for multiclass classification.
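The choices listed above could be combined as in the sketch below. The hidden layer sizes, the dropout rate, and the "balanced" weighting formula (n_samples / (n_classes * class_count)) are assumptions made for illustration; the article does not state the values actually used.

```python
from collections import Counter


def balanced_class_weights(class_indices):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(class_indices)
    total = len(class_indices)
    return {
        cls: total / (len(counts) * count) for cls, count in sorted(counts.items())
    }


def build_model(input_dim=768, hidden_units=(256, 64), dropout_rate=0.3):
    """Feedforward classifier on BERT features (layer sizes are assumed)."""
    # Imported here so the class-weight helper works without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential()
    model.add(keras.Input(shape=(input_dim,)))
    for units in hidden_units:
        model.add(keras.layers.Dense(units, activation="tanh"))
        model.add(keras.layers.Dropout(dropout_rate))  # regularization
    model.add(keras.layers.Dense(3, activation="softmax"))
    model.compile(
        optimizer="adagrad",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


# Class distribution from the dataset description: 18% / 52% / 30%.
labels = [0] * 18 + [1] * 52 + [2] * 30
weights = balanced_class_weights(labels)
print({cls: round(w, 2) for cls, w in weights.items()})
# {0: 1.85, 1: 0.64, 2: 1.11}
```

The resulting dictionary can be passed to training as model.fit(X_train, y_train, class_weight=weights, ...), so that mistakes on the rarer negative class cost more than mistakes on the frequent neutral class.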
Now, let’s start the training and check our scores on the test set.
Results on the test set:
For our final score we can use average recall, the metric used in SemEval-2017 Task 4. The average here is the macro average of the per-class scores, so the number of tweets in each class is not taken into account when averaging.
The results may differ slightly (by less than 1%) from run to run, because model.fit() shuffles the training data in each epoch, which introduces randomness into the results.
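To make the metric concrete, here is a small pure-Python sketch of macro-averaged recall. The function name and the toy labels are illustrative; in practice, scikit-learn's recall_score with average="macro" computes the same quantity.

```python
def macro_recall(y_true, y_pred, num_classes=3):
    """Unweighted mean of per-class recall over classes present in y_true."""
    recalls = []
    for cls in range(num_classes):
        support = sum(1 for t in y_true if t == cls)
        if support == 0:
            continue  # skip classes that never occur in the ground truth
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


# Toy labels: 0 = negative, 1 = neutral, 2 = positive.
y_true = [0, 0, 1, 1, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 1, 1, 2, 0]
# Per-class recall is 0.5, 1.0, and 0.5, so the macro average is 2/3.
print(round(macro_recall(y_true, y_pred), 4))  # 0.6667
```

In this toy example accuracy is 0.75, higher than the macro recall, because the well-predicted neutral class has the most samples. Macro averaging gives each class the same influence.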
Now we will analyze real tweets with our model!
Our first suspect is Cappy. Lately, I have seen tweets about Cappy juice in Turkey containing unidentified objects. People were talking about this, and I wanted to see the effect. The ugly tweets:
First, the tweets were collected with TweetScraper, using the keyword Cappy and the Turkish language, and saved to the cappy.json file. Then our model predicted the sentiment of each tweet, and we subtracted 1 from the predicted class to map the values to the range from -1 to 1.
After that, we simply plot a figure with a moving average to see the effect of these tweets.
The results are beautiful! There are big drops caused by the popularity of these two tweets.
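The mapping step can be sketched as follows, assuming the model returns one row of three class probabilities per tweet in the order negative, neutral, positive. The function name is illustrative.

```python
def to_sentiment_scores(probabilities):
    """Map each row of class probabilities to -1, 0, or 1."""
    scores = []
    for row in probabilities:
        # Index of the most probable class: 0, 1, or 2.
        predicted_class = max(range(len(row)), key=row.__getitem__)
        scores.append(predicted_class - 1)  # shift to -1, 0, 1
    return scores


example = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.2, 0.7]]
print(to_sentiment_scores(example))  # [-1, 0, 1]
```

With this convention, the mean score over a set of tweets is negative when negative tweets outnumber positive ones.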
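A simple moving average over the per-tweet scores, ordered by time, might look like this. The window size is an assumption for the example; the article does not state which window was used.

```python
def moving_average(values, window):
    """Mean of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


scores = [1, 0, -1, -1]  # sentiment scores ordered by tweet time
print(moving_average(scores, window=2))  # [0.5, -0.5, -1.0]
```

The smoothed series can then be plotted against time, for example with matplotlib. A larger window hides single noisy tweets and makes sustained drops easier to see.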
Lately, a big discussion about Netflix took place on Twitter after the release of their new Turkish series with LGBT content.
The drop in sentiment is clearly visible here too.
5.3 Berkcan Guven (Youtuber)
Our last suspect is one of the most famous YouTubers in Turkey, Berkcan Guven. He released a controversial video featuring an underage celebrity. He removed the video 7 hours after uploading it, but 700k people had already watched it. Let’s see the effect of this on Twitter.
We also see a week-long drop in sentiment about Berkcan Guven after the video.
We trained a feedforward neural network on BERT features for the sentiment analysis task. We also used this model to analyze popular topics on Twitter, and we observed a correlation between incidents and Twitter sentiment.
For the complete notebook and the datasets, check out my GitHub repo.