Bachelor's Thesis • Machine Learning • Natural Language Processing • Python
This Bachelor's thesis, titled "Machine Learning: A powerful tool for enhancing customer experience through sentiment analysis", explores how businesses can use sentiment analysis to interpret large volumes of Amazon customer reviews. It investigates how automatically classifying review text as positive, negative, or neutral helps companies quickly identify trends, strengths, and areas for product improvement, and so make data-driven decisions.
Amazon Customer Reviews • Dataset: 2015-2020 • 20 Product Categories
Text Cleaning • Star Rating Mapping • Data Normalization
TF-IDF Vectorization • CountVectorizer • Text to Numerical
8 ML Algorithms • 80% Training Data • Cross Validation
20% Test Data • Performance Metrics • Model Comparison
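The pipeline stages listed above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the thesis code: the toy reviews, the helper names clean_text and rating_to_label, the star-to-sentiment thresholds (1-2 negative, 3 neutral, 4-5 positive), and the choice of 4 cross-validation folds are all assumptions made for the example.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline


def clean_text(text):
    """Lowercase, drop HTML-like tags and non-letters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def rating_to_label(stars):
    """Assumed mapping: 1-2 stars negative, 3 neutral, 4-5 positive."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Hypothetical toy reviews (text, star rating), not taken from the dataset.
reviews = [
    ("Great product, works perfectly!", 5),
    ("Love it, excellent quality", 5),
    ("Very good value, happy with it", 4),
    ("Excellent sound and great battery", 5),
    ("Works great, highly recommend", 4),
    ("Perfect fit, love the design", 5),
    ("Good quality and fast delivery", 4),
    ("Happy with this purchase, great", 5),
    ("Amazing, excellent build quality", 5),
    ("Really good, would recommend", 4),
    ("Terrible, broke after one day", 1),
    ("Poor quality, very disappointed", 2),
    ("Waste of money, does not work", 1),
    ("Bad battery and awful sound", 1),
    ("Broke quickly, would not recommend", 2),
    ("Awful fit, poor design", 1),
    ("Disappointed, terrible quality", 2),
    ("Does not work, bad purchase", 1),
    ("Horrible, waste of money", 1),
    ("Very bad, stopped working", 2),
    ("It is okay, nothing special", 3),
    ("Average product, does the job", 3),
]

# Two-class setup: drop neutral (3-star) reviews.
labeled = [(clean_text(t), rating_to_label(s)) for t, s in reviews]
texts = [t for t, label in labeled if label != "neutral"]
labels = [label for _, label in labeled if label != "neutral"]

# 80% training data, 20% held-out test data, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=0
)

# TF-IDF turns text into numerical features; CountVectorizer could be
# swapped in here to compare raw counts against TF-IDF weights.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Cross-validation on the training portion only.
cv_scores = cross_val_score(model, X_train, y_train, cv=4)

# Final fit and evaluation on the untouched test split.
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))

print("Cross-validation accuracy per fold:", cv_scores)
print("Test accuracy:", test_accuracy)
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is re-fitted inside each cross-validation fold, so no information from the validation fold leaks into the features. Numbers from a toy set this small say nothing about real performance.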
Logistic Regression achieved the highest accuracy for two-class (positive vs. negative) sentiment analysis.
The VADER sentiment analyzer reached 90% accuracy on binary classification, showing that rule-based methods can be highly effective.
Training on diverse product categories yields more robust models, reducing category bias and improving generalization across domains.
Adding a neutral sentiment class reduces overall accuracy, highlighting how difficult it is to identify mixed or ambiguous sentiment.
Model performance varies significantly when tested across different product categories, emphasizing the importance of domain-specific training.
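One way to surface the category-to-category variation described above is to break accuracy down per product category. The sketch below uses only the standard library; the function name accuracy_by_category, the category names, and the toy predictions are hypothetical placeholders, not results from the thesis.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: accuracy} for (category, true, predicted) records."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for category, true_label, predicted_label in records:
        total[category] += 1
        correct[category] += int(true_label == predicted_label)
    return {category: correct[category] / total[category] for category in total}


# Hypothetical predictions, made up purely to exercise the function.
toy_records = [
    ("Books", "positive", "positive"),
    ("Books", "negative", "negative"),
    ("Books", "positive", "positive"),
    ("Books", "negative", "positive"),
    ("Electronics", "positive", "negative"),
    ("Electronics", "negative", "negative"),
]

per_category = accuracy_by_category(toy_records)
for category, accuracy in per_category.items():
    print(f"{category}: {accuracy:.2f}")
```

A single overall accuracy figure can hide a weak category, so reporting this breakdown next to it makes cross-category comparisons easier to read.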