Franchise Media Success Analysis
Overview
As part of a team project, I developed machine learning models to predict movie box office performance and identify key drivers of franchise success using a 1M-record TMDb dataset.
My Contributions
- Cleaned and preprocessed large-scale movie data, handling missing values and removing sparse features
- Engineered features including release date transformations and multi-label genre encoding
- Built and evaluated Linear Regression, Random Forest, and Logistic Regression models
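
The cleaning and feature engineering steps above could look roughly like the sketch below. It is an illustration, not the project's actual code: the toy rows, the column names (`release_date`, `genres`, `revenue`, `tagline`), the comma-separated genre format, the `preprocess` helper, and the 50% missing-value threshold are all assumptions made for the example.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer

# Toy rows standing in for the TMDb data; all column names are assumptions.
raw = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "release_date": ["2019-07-04", "2021-12-17", None, "2015-05-01"],
    "genres": ["Action, Adventure", "Drama", "Comedy, Drama", None],
    "revenue": [500.0, 120.0, 80.0, None],
    "tagline": [None, None, None, "Only one"],  # mostly missing, so sparse
})


def preprocess(df: pd.DataFrame, max_missing: float = 0.5) -> pd.DataFrame:
    """Hypothetical helper: drop sparse columns, then derive date and genre features."""
    # Keep only columns whose share of missing values is at most max_missing.
    out = df.loc[:, df.isna().mean() <= max_missing].copy()

    # Rows without a target value cannot be used for supervised training.
    out = out.dropna(subset=["revenue"])

    # Release date transformations: split the date into numeric parts.
    dates = pd.to_datetime(out["release_date"], errors="coerce")
    out["release_year"] = dates.dt.year
    out["release_month"] = dates.dt.month
    out["release_dayofweek"] = dates.dt.dayofweek
    out = out.drop(columns=["release_date"])

    # Multi-label genre encoding: one 0/1 column per genre.
    genre_lists = out["genres"].fillna("").apply(
        lambda s: [g.strip() for g in s.split(",") if g.strip()]
    )
    mlb = MultiLabelBinarizer()
    encoded = pd.DataFrame(
        mlb.fit_transform(genre_lists),
        columns=[f"genre_{g}" for g in mlb.classes_],
        index=out.index,
    )
    return pd.concat([out.drop(columns=["genres"]), encoded], axis=1)


features = preprocess(raw)
print(features)
```

A film with several genres gets a 1 in each matching `genre_*` column, which is why a multi-label encoder is used here rather than a single categorical code.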
Results
- Achieved an R² of up to 0.75 for revenue prediction
- Reached 72% accuracy in classifying box office success
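
For reference, the two metrics reported above can be computed with scikit-learn as in the short sketch below. The numbers are made up to show the calculation and are not the project's data or results.

```python
from sklearn.metrics import accuracy_score, r2_score

# Made-up values for illustration only; these are not the project's data.
actual_revenue = [100.0, 200.0, 300.0, 400.0]
predicted_revenue = [110.0, 190.0, 320.0, 380.0]

# R² = 1 - (sum of squared residuals) / (total sum of squares around the mean).
r2 = r2_score(actual_revenue, predicted_revenue)

# Binary labels: 1 = box office success, 0 = not (a hypothetical labeling).
actual_success = [1, 0, 1, 1, 0]
predicted_success = [1, 0, 0, 1, 0]

# Accuracy = share of predictions that match the true label.
accuracy = accuracy_score(actual_success, predicted_success)

print(f"R^2 = {r2:.3f}, accuracy = {accuracy:.2f}")
```

R² measures how much of the variance in revenue the regression explains, while accuracy is the fraction of films whose success label was predicted correctly.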
Tools & Technologies
- Python (pandas, scikit-learn)
- Data preprocessing & feature engineering
- Supervised machine learning models