Lending Club Data Analysis Practical Project (Open Source)
This is a personal open source project, completed independently.
- Project Description: Using the Lending Club dataset, this project explores how machine learning and deep learning methods can improve the prediction of whether to grant a loan.
- Responsible part: Completed independently. The project uses LightGBM as the baseline for predicting whether to grant a loan. Four sets of derived variables, built with CatBoostEncoder, clustering, exponential zone partitioning, and business logic analysis, were evaluated for their effect on performance. Several machine learning models were then trained and combined through voting fusion and Stacking with mixing, which raised accuracy from 91.56% to 91.84%. DNN models trained with the SGD and Adam optimizers still fell short, and TabNet also performed poorly. Stacking the machine learning models together with the DNN models increased accuracy by 0.3%.
- Completion: Released as open source
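
The voting fusion step mentioned above can be illustrated with a minimal sketch. It is not the project's actual code: the function names `hard_vote` and `soft_vote`, the 0.5 threshold, and the sample predictions are all assumptions made for illustration, and only the Python standard library is used. Hard voting takes the majority label across models, while soft voting averages the predicted probabilities of the positive class and then applies a threshold.

```python
from collections import Counter
from typing import List, Sequence


def hard_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Majority vote over class labels; one inner sequence per model."""
    fused = []
    for sample_votes in zip(*predictions):
        # most_common(1) yields the label with the highest count; on a tie,
        # the label encountered first (the earliest model) wins.
        fused.append(Counter(sample_votes).most_common(1)[0][0])
    return fused


def soft_vote(probabilities: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average positive-class probabilities across models, then threshold."""
    fused = []
    for sample_probs in zip(*probabilities):
        mean_prob = sum(sample_probs) / len(sample_probs)
        fused.append(1 if mean_prob >= threshold else 0)
    return fused


# Made-up outputs of three models on four loan applications (1 = grant).
model_labels = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
model_probs = [
    [0.9, 0.4, 0.6, 0.8],
    [0.7, 0.6, 0.3, 0.9],
    [0.4, 0.2, 0.7, 0.6],
]

hard_result = hard_vote(model_labels)
soft_result = soft_vote(model_probs)
print("hard voting:", hard_result)
print("soft voting:", soft_result)
```

In practice the per-model labels and probabilities would come from the trained models (for example LightGBM and the other learners), and an odd number of models avoids ties in hard voting for a binary target.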