Unsupervised Sentiment Analysis of Plastic Surgery Social Media Posts (2307.02640v1)

Published 5 Jul 2023 in cs.CL

Abstract: The massive collection of user posts across social media platforms remains largely untapped for AI use cases because of the sheer volume and velocity of the textual data. Natural language processing (NLP) is a subfield of AI that leverages bodies of documents, known as corpora, to train computers in human-like language understanding. Using a word ranking method, term frequency-inverse document frequency (TF-IDF), to create features across documents, it is possible to perform unsupervised analytics: machine learning (ML) that groups documents without a human manually labeling the data. For large datasets with thousands of features, t-distributed stochastic neighbor embedding (t-SNE), k-means clustering, and latent Dirichlet allocation (LDA) are employed to learn top words and generate topics for a combined Reddit and Twitter corpus. Using extremely simple deep learning models, this study demonstrates that the applied results of unsupervised analysis allow a computer to predict negative, positive, or neutral user sentiment towards plastic surgery from a tweet or subreddit post with almost 90% accuracy. Furthermore, the model achieves higher accuracy on the unsupervised sentiment task than on a rudimentary supervised document classification task. Therefore, unsupervised learning may be considered a viable option for labeling social media documents for NLP tasks.
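To make the TF-IDF feature step mentioned in the abstract concrete, here is a minimal sketch in plain Python. It assumes the classic unsmoothed weighting (term frequency times the log of inverse document frequency) and whitespace tokenization; the abstract does not say which TF-IDF variant or tokenizer the study used, and the function name `tfidf` and the toy posts are hypothetical illustrations, not data from the paper.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document (unsmoothed TF-IDF)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            # tf = relative frequency in this document, idf = log(N / df)
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy posts, not taken from the study's corpus.
posts = [
    "surgery went great",
    "surgery was painful",
    "great surgery great surgeon",
]
weights = tfidf(posts)

# Highest-weighted term per post.
top_terms = [max(w, key=w.get) for w in weights]
```

In this toy corpus "surgery" occurs in every post, so its inverse document frequency is log(1) = 0 and it carries no weight, while rarer words rank higher. Vectors like these are the kind of features that clustering or topic-modeling methods such as k-means and LDA can then group without labels.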
