Data Analytics

The data analysis is done with natural language processing (NLP) using the TextBlob toolkit in Python. TextBlob is built on top of NLTK and pattern, and provides the part-of-speech (POS) tagging and sentiment analysis features that our analysis relies on.

A list of reviews for a product is analyzed using NLP. The reviews are first split into sentences, and sentiment analysis is performed on each sentence to obtain a sentiment score ranging from -1 (most negative) to 1 (most positive). POS tagging is then used to identify the adjectives in each sentence, which serve as the relevant attributes of the product. Each adjective is stored in a list as a 3-tuple together with the sentiment score of the sentence it belongs to and a counter. When the same attribute appears again, its counter is incremented and its stored score is updated to the running average that includes the new sentence's score.
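The aggregation step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes each sentence has already been tagged and scored (with TextBlob these values would typically come from `sentence.tags` and `sentence.sentiment.polarity`), that adjectives are recognized by the Penn Treebank tags JJ, JJR and JJS, and that words are compared case-insensitively. The name `aggregate_attributes` is hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

# Penn Treebank tags for adjectives (base, comparative, superlative).
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}

TaggedSentence = Tuple[List[Tuple[str, str]], float]  # ((word, tag) pairs, polarity)


def aggregate_attributes(
    sentences: Iterable[TaggedSentence],
) -> List[Tuple[str, int, float]]:
    """Return (adjective, count, average_score) 3-tuples.

    Assumption: every occurrence of an adjective contributes the
    polarity of the sentence it appears in.
    """
    stats: Dict[str, Tuple[int, float]] = {}
    for tags, polarity in sentences:
        for word, tag in tags:
            if tag not in ADJECTIVE_TAGS:
                continue
            key = word.lower()
            count, mean = stats.get(key, (0, 0.0))
            count += 1
            # Incremental mean: fold the new score into the running average.
            mean += (polarity - mean) / count
            stats[key] = (count, mean)
    return [(word, count, mean) for word, (count, mean) in stats.items()]
```

Keeping only a count and a running mean per adjective means the individual sentence scores never need to be stored, which suits reviews that arrive over the course of a day or week.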

We then output the top three frequently occurring negative words and the top frequently occurring positive word from the list, along with their counts and average sentiment scores. The shopkeeper can use this output, at the end of the day or week, to see the most frequent positive and negative sentiment words or phrases about the product.
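One possible way to pick these words is sketched below. The text does not say how frequency and sentiment are combined, so this example assumes the list holds (word, count, average_score) tuples, ranks by count first, and breaks ties by the strength of the average score. The function name `top_attributes` and its parameters are hypothetical.

```python
from typing import List, Tuple

Attribute = Tuple[str, int, float]  # (word, count, average_score)


def top_attributes(
    stats: List[Attribute], n_negative: int = 3, n_positive: int = 1
) -> Tuple[List[Attribute], List[Attribute]]:
    """Return the most frequent negative and positive attributes.

    Assumption: "negative" means average_score < 0 and "positive"
    means average_score > 0; neutral words (score 0) are ignored.
    """
    negatives = [s for s in stats if s[2] < 0]
    positives = [s for s in stats if s[2] > 0]
    # Most frequent first; ties go to the more strongly negative word.
    negatives.sort(key=lambda s: (-s[1], s[2]))
    # Most frequent first; ties go to the more strongly positive word.
    positives.sort(key=lambda s: (-s[1], -s[2]))
    return negatives[:n_negative], positives[:n_positive]
```

Other rankings, such as weighting count by score, would fit the same description; ranking by frequency first simply keeps rare but extreme words from dominating the report.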