
Understanding Classification Metrics in Scikit-Learn


In this blog post, we will walk through the classification metrics in Python's scikit-learn, using the Titanic dataset for demonstration. We will also write our own functions from scratch to understand the math and process behind those metrics.

One of the major areas of predictive modeling, especially in business settings, is classification. Classification is the problem of identifying to which of a set of categories (or groups) a new observation belongs, using previous knowledge (a training set of data) whose category membership is known. Popular examples include spam detection (classifying emails as spam or non-spam) and churn detection (classifying customers as likely to keep using a service or likely to stop soon).

For demonstration purposes, the Titanic dataset (available on Kaggle) will be used. The classification problem here is to predict who is likely to survive based on the given data (e.g. age, family size, passenger class). Hence, we need to classify passengers into survived and did-not-survive.
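
As a rough sketch of that setup, the data and the binary target might be loaded as below. The file name (Kaggle's train.csv in the working directory) and the particular columns picked are assumptions for illustration only.

```python
import pandas as pd

# Load the Kaggle Titanic training data (assumes train.csv is in the working directory).
df = pd.read_csv("train.csv")

# The binary target: 1 = survived, 0 = did not survive.
y = df["Survived"]

# A few candidate features of the kind mentioned above (age, family, passenger class).
X = df[["Age", "SibSp", "Parch", "Pclass"]]

print(y.value_counts())  # class balance: survivors vs. non-survivors
print(X.head())
```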


As you train a classification model, you will want to assess how good the classifier is. There are many different ways of evaluating classifier performance. The scikit-learn package, used by most data scientists, provides many built-in functions for analyzing model performance out of the box. In this tutorial, we will walk through a few key metrics and write our own functions from scratch to better understand the logic behind them.
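
To have concrete predictions to evaluate, a minimal baseline might look like the sketch below. The file name, the feature columns, the crude missing-value handling, and the choice of logistic regression are all assumptions for illustration, not part of this post's final model.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Assumes Kaggle's train.csv; feature choice and NA handling are illustrative only.
df = pd.read_csv("train.csv")
X = df[["Age", "SibSp", "Parch", "Pclass"]].fillna(0)
y = df["Survived"]

# Hold out a test set so the metrics reflect performance on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Any classifier would do here; logistic regression keeps the sketch simple.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# One of scikit-learn's built-in evaluation functions.
print(accuracy_score(y_test, y_pred))
```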

This tutorial will cover the following metrics from sklearn.metrics:

  • confusion matrix
  • accuracy score
  • recall score
  • precision score
  • f1 score
  • roc curve
  • roc auc score
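
Before re-implementing these from scratch, here is a minimal sketch of the built-in calls on a tiny hand-made example. Note that roc_curve and roc_auc_score expect predicted scores or probabilities rather than hard class labels; the toy arrays below are made up purely to show the call signatures.

```python
import numpy as np
from sklearn.metrics import (
    confusion_matrix, accuracy_score, recall_score,
    precision_score, f1_score, roc_curve, roc_auc_score,
)

# Tiny hand-made example: true labels, hard predictions, and predicted probabilities.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.8, 0.3, 0.6, 0.7, 0.1])

print(confusion_matrix(y_true, y_pred))   # counts of TN, FP, FN, TP
print(accuracy_score(y_true, y_pred))     # (TP + TN) / total
print(recall_score(y_true, y_pred))       # TP / (TP + FN)
print(precision_score(y_true, y_pred))    # TP / (TP + FP)
print(f1_score(y_true, y_pred))           # harmonic mean of precision and recall
print(roc_auc_score(y_true, y_prob))      # area under the ROC curve

fpr, tpr, thresholds = roc_curve(y_true, y_prob)  # ROC curve needs scores, not labels
```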

Getting Started