Tech stack: scikit-learn, Matplotlib, Seaborn, Pandas, NumPy.
Dataset used: Telecommunications Dataset, link here.
Description: A logistic regression model from the scikit-learn library was used to predict customer churn on a telecommunications dataset. The correlation matrix was visualised to select the features used for training. The independent variables were normalized and split into training and testing sets. The model was trained and tested with every available solver, with the inverse regularization strength fixed at C = 0.01. The liblinear solver gave the best performance.
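The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. Everything here is an assumption: a synthetic dataset from make_classification stands in for the telecommunications data, the solver list covers the five solvers long available in scikit-learn, the 80/20 split and random seeds are arbitrary, and "best" is taken to mean lowest log loss, since the selection criterion is not stated.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import jaccard_score, log_loss
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary data stands in for the churn dataset.
X, y = make_classification(
    n_samples=400, n_features=7, n_informative=4, random_state=4
)

# Hold out 20% of the rows for testing (split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=4
)

# Standardize features; the scaler is fitted on the training rows only
# so that no test-set statistics leak into training.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# Assumption: these five solvers represent "all solvers".
solvers = ["liblinear", "lbfgs", "newton-cg", "sag", "saga"]

results = {}
for solver in solvers:
    # C = 0.01 means strong regularization (C is the inverse strength).
    model = LogisticRegression(C=0.01, solver=solver, max_iter=1000)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_prob = model.predict_proba(X_test)
    results[solver] = {
        "jaccard": jaccard_score(y_test, y_pred),
        "log_loss": log_loss(y_test, y_prob),
    }

# Assumption: the best solver is the one with the lowest log loss.
best_solver = min(results, key=lambda s: results[s]["log_loss"])
print(best_solver, results[best_solver])
```

On synthetic data the solvers tend to land very close to one another, so the winner here says nothing about the result reported for the real dataset. Fitting the scaler on the training split only is a design choice of this sketch; the original may have normalized before splitting.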
Resulting evaluation metrics:
Best solver = liblinear
Jaccard score = 0.36
Log loss = 0.62
Classification report:
Confusion Matrix:
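For reference, the metrics listed above can be computed with sklearn.metrics as in the sketch below. The labels and probabilities are made-up toy values chosen only to show the calls; they are not the project's predictions and do not reproduce the reported scores.

```python
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    jaccard_score,
    log_loss,
)

# Toy values (assumed for illustration): 1 = churn, 0 = no churn.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
# Predicted probability of the positive class (churn = 1).
y_prob = [0.8, 0.3, 0.4, 0.7, 0.2, 0.6, 0.9, 0.1]

# Jaccard score for the positive class: TP / (TP + FP + FN).
jaccard = jaccard_score(y_true, y_pred)

# Log loss penalizes confident wrong probabilities; lower is better.
loss = log_loss(y_true, y_prob)

# With labels=[1, 0], rows are actual churn / no churn and
# columns are predicted churn / no churn: [[TP, FN], [FP, TN]].
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])

# Per-class precision, recall, F1 and support as a text table.
report = classification_report(y_true, y_pred)

print(f"Jaccard score = {jaccard:.2f}")
print(f"Log loss = {loss:.2f}")
print(cm)
print(report)
```

The Jaccard score uses hard class predictions, while log loss uses predicted probabilities, so the two can rank models differently.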