- Balkan Journal of Electrical and Computer Engineering
- Vol: 9 Issue: 4
A New Technique for Sentiment Analysis System Based on Deep Learning Using Chi-Square Feature Selection Methods
Authors : Mohammed Hussein, Fatih Özyurt
Pages : 320-326
Publication Date : 2021-10-30
Article Type : Research
Abstract : A sentiment analysis system uses natural language processing techniques and a sentiment vocabulary network. Sentiment analysis means discovering and recognizing people's positive or negative feelings about an issue or product in texts. The growing importance of sentiment analysis has coincided with the growth of social media, such as opinion polls, weblogs, Twitter and other social networks. One of the applications of deep learning in NLP is sentiment analysis. The most common and successful type of RNN is the LSTM network, and much research has used the ability of LSTM to analyze sentiment. However, large data volumes reduce the accuracy of LSTM network results on test data; in other words, over-fitting occurs. This problem arises when there is a high correlation between independent variables. The model may not have high validity despite a high correlation coefficient between the independent and dependent variables. In other words, although the model looks good, it does not have significant independent variables. Combining the LSTM network with feature selection methods, which retain only the effective features, can increase sentiment analysis accuracy and address this problem. In this study, we review the state of the art to determine how previous research has addressed these tasks. We also propose combining the Chi-Square feature selection method with LSTM, Bi-LSTM and GRU models; the performance of each is measured and compared in terms of accuracy, precision, recall, and F1 score on two benchmark datasets, Yelp and US Airline. The results show that feature selection significantly increases classification accuracy in all cases.
In the Yelp dataset, the maximum accuracy attained by Bi-LSTM is 100% using chi-square when the number of features is 500. In the US Airline dataset, the maximum accuracy achieved by GRU-LSTM is 97.9% using chi-square when the number of features is 20.
Keywords : deep learning, feature selection, chi-square, sentiment analysis
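As an illustration of the chi-square feature selection step named in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes bag-of-words term counts as features and uses a common chi-square formulation for count data, comparing the observed per-class term totals with the totals expected if terms were independent of the class. The function names `chi2_scores` and `select_top_k` and the toy reviews are hypothetical; the abstract does not specify the preprocessing or the exact scoring variant used.

```python
from collections import Counter


def chi2_scores(docs, labels):
    """Return a chi-square score per term for tokenized docs and class labels."""
    classes = sorted(set(labels))
    n_docs = len(docs)
    # Class prior: fraction of documents belonging to each class.
    class_prob = {c: labels.count(c) / n_docs for c in classes}
    # observed[c][t]: total count of term t over documents of class c.
    observed = {c: Counter() for c in classes}
    total = Counter()
    for tokens, y in zip(docs, labels):
        counts = Counter(tokens)
        observed[y].update(counts)
        total.update(counts)
    scores = {}
    for term, term_total in total.items():
        score = 0.0
        for c in classes:
            # Expected count if the term were independent of the class.
            expected = class_prob[c] * term_total
            score += (observed[c][term] - expected) ** 2 / expected
        scores[term] = score
    return scores


def select_top_k(scores, k):
    """Keep the k highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Toy data (hypothetical): 1 = positive review, 0 = negative review.
docs = [
    ["great", "food", "friendly"],
    ["terrible", "service", "food"],
    ["great", "service"],
    ["terrible", "delay"],
]
labels = [1, 0, 1, 0]

scores = chi2_scores(docs, labels)
selected = select_top_k(scores, 2)
print(selected)  # → ['great', 'terrible']
```

In a pipeline like the one the abstract describes, only the selected terms (for example the top 500 or top 20) would be kept before the text is passed to the LSTM, Bi-LSTM or GRU classifier. Terms such as "food" that occur equally often in both classes score 0 and are dropped first.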