Keywords: Exploratory Data Analysis (EDA), Principal Component Analysis (PCA), Linear Regression, Decision Trees, Random Forest, Neural Network, Feature Importance, R
Summary: This project aims to predict soccer players' market values by analyzing their physical attributes, skill metrics, and demographic data using a dataset of 17,000 players from SoFIFA.com.
Methods include data preprocessing, exploratory data analysis (EDA), principal component analysis (PCA), and predictive modeling with linear regression, decision trees, random forests, and neural networks.
Key findings highlight the influence of attacking skills such as crossing, finishing, and heading on market values, along with significant contributions from other attributes such as height and potential. PCA reduced dimensionality: the first eight components explained 90% of the variability in the skill metrics. Decision trees demonstrated the highest accuracy in categorizing players by market value.
Limitations include the subjectivity of FIFA's player skill ratings and dataset constraints. Future work could involve improving data accuracy, adjusting for inflation, and using larger datasets to improve predictive performance.
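The PCA finding above (eight components covering 90% of skill variability) rests on a cumulative explained-variance calculation. The sketch below shows that arithmetic in Python for illustration only; it is not the project's own code. The function name `components_needed` and the example variances are hypothetical, and the per-component variances are assumed to have been computed already by a PCA routine.

```python
from itertools import accumulate


def components_needed(variances, threshold=0.90):
    """Return the smallest k such that the k largest component variances
    account for at least `threshold` of the total variance."""
    ordered = sorted(variances, reverse=True)  # largest component first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total variance must be positive")
    # Walk the running sum until the cumulative share reaches the threshold.
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total >= threshold:
            return k
    return len(ordered)


# Hypothetical per-component variances (NOT the project's actual values).
example_variances = [60, 20, 15, 3, 2]
print(components_needed(example_variances))  # prints 3
```

With the example values the cumulative shares are 60%, 80%, and 95%, so three components are the fewest that reach the 90% threshold. The same rule applied to the project's skill metrics is what yields a cutoff such as eight components.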
Project Write-Up