Understanding the Confusion Matrix

It is not as intimidating as the name suggests. It is so named because it shows the extent to which a model is confused about the class an observation belongs to. In this article, I will explain what the confusion matrix is and how its values are interpreted.

A confusion matrix is a contingency table that summarizes the performance of a machine learning classifier. In this article, we will consider the confusion matrix for binary classification.

The classes an observation can belong to in a binary classification problem are generally labelled the positive class and the negative class. The positive class is usually the class of interest, that is, the class we want to flag. For example, if we build a machine learning model that detects spam emails, the category of interest (the emails we want to flag) is spam. As such, spam would be the positive class, while ham (legitimate mail) would be the negative class.

Each prediction of a binary classifier falls into one of the following four categories; a short counting sketch follows the list.

  1. True Positives (TP): observations that actually belong to the positive class and were flagged as positive. These are correct positive predictions and are also known as hits.
  2. True Negatives (TN): observations that are actually negative and were flagged as negative. Like true positives, they are correct predictions, but correct negative predictions. They are also called correct rejections.
  3. False Positives (FP): observations that are actually negative but were flagged as positive. They are incorrect predictions and are errors of the first kind, hence they are called type I errors. They are also called false alarms because the model raises an alarm about them (flags them as positive) when they are in fact negative.
  4. False Negatives (FN): observations that are actually positive but were labelled as negative. They are incorrect predictions and errors of the second kind, hence they are called type II errors. They are also known as misses because the model misses them by not flagging them as positive.
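
As a minimal sketch in plain Python, the four counts can be tallied directly from paired actual and predicted labels. The two small label lists below are made up purely for illustration and are not part of the article's example:

    # Tally TP, TN, FP, FN from paired actual and predicted labels.
    # 1 = positive (e.g. spam), 0 = negative (e.g. ham); the lists are illustrative only.
    actual    = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]

    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # hits
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # correct rejections
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false alarms (type I)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # misses (type II)

    print(tp, tn, fp, fn)  # 3 3 1 1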

The confusion matrix is a contingency table that lays out these four types of predictions, with rows as actual labels and columns as predicted labels (some texts do the reverse). Although the arrangement of the categories varies, a typical confusion matrix is shown below:

                     Predicted Positive   Predicted Negative
    Actual Positive          TP                   FN
    Actual Negative          FP                   TN

For example, if, out of 100 observations, a model hits 40 (true positives), correctly rejects 43 (true negatives), raises 6 false alarms (false positives), and misses 11 (false negatives), we can represent these in a contingency table as shown below:

                     Predicted Positive   Predicted Negative
    Actual Positive          40                   11
    Actual Negative           6                   43


Reading the table above (the code sketch after this list reproduces it):

  1. Row 1, column 1 is the number of true positives: the row label is actual positive and the column label is predicted positive, so these are predicted positives that are actually positive.
  2. Row 1, column 2 is the number of false negatives: the row label is actual positive, while the column label is predicted negative, so these are predicted negatives that are actually positive.
  3. Row 2, column 1 is the number of false positives: the row label is actual negative, while the column label is predicted positive, so these are predicted positives that are actually negative.
  4. Row 2, column 2 is the number of true negatives: the row label is actual negative and the column label is predicted negative, so these are predicted negatives that are actually negative.
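
For readers who use Python, the same table can be reproduced with scikit-learn, assuming it is installed. This is only a sketch: the label arrays below are constructed to match the example counts (40, 11, 6, 43) rather than coming from a real model:

    # Reproduce the example confusion matrix with scikit-learn (assumed installed).
    from sklearn.metrics import confusion_matrix

    # 1 = positive, 0 = negative; arrays built to give 40 TP, 11 FN, 6 FP, 43 TN.
    y_true = [1] * 40 + [1] * 11 + [0] * 6 + [0] * 43
    y_pred = [1] * 40 + [0] * 11 + [1] * 6 + [0] * 43

    # labels=[1, 0] puts the positive class in the first row and column,
    # matching the layout used in this article.
    print(confusion_matrix(y_true, y_pred, labels=[1, 0]))
    # [[40 11]
    #  [ 6 43]]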

A Store of Metrics

Beyond being a contingency table that shows the numbers of true positives, true negatives, false positives and false negatives, the confusion matrix can be used to derive other metrics, which is why it is like a store of metrics. The popular metrics that can be computed from a confusion matrix are accuracy, recall, specificity, precision, negative predictive value, and the F1 score. We will look at these one after the other.

Accuracy

Accuracy is the ratio of correct predictions to the total number of predictions. It is given by:

    Accuracy = (TP + TN) / (TP + TN + FP + FN)

For our example, the accuracy would be:

    Accuracy = (40 + 43) / (40 + 43 + 6 + 11) = 83 / 100 = 0.83

Recall

Recall is the ratio of true positives to actual positives. It gives the fraction of the actual positives that were hit. Since it measures how sensitive a model is to the positive class, it is also called sensitivity or the True Positive Rate (TPR). It is given by:

    Recall = TP / (TP + FN)

For our example, the recall would be:

    Recall = 40 / (40 + 11) = 40 / 51 ≈ 0.78

Specificity

Specificity is also called the True Negative Rate (TNR). It is the ratio of true negatives to actual negatives. It measures how correct the model is in not flagging the negative class. It is given by:

    Specificity = TN / (TN + FP)

For our example, the specificity would be:

    Specificity = 43 / (43 + 6) = 43 / 49 ≈ 0.88

Precision

Precision measures how precise a model is when it classifies an observation as positive. It is the ratio of true positives to predicted positives. It is also called the Positive Predictive Value (PPV) and is given by:

    Precision = TP / (TP + FP)

For our example, the precision would be:

    Precision = 40 / (40 + 6) = 40 / 46 ≈ 0.87

Negative Predictive Value

NPV is short for Negative Predictive Value, and it measures how accurate a model is in its negative predictions. It is the ratio of true negatives to predicted negatives. It is given by:

    NPV = TN / (TN + FN)

For our example, the NPV would be:

    NPV = 43 / (43 + 11) = 43 / 54 ≈ 0.80

F1 Score

Accuracy may be misleading when the classes are heavily imbalanced because its value can be inflated by the majority class. The F1 score is more robust to heavily imbalanced data because it focuses on the positive class (it does not use true negatives) and balances precision against recall, so misclassifications of the positive class weigh more heavily. It is the harmonic mean of precision and recall and is given by:

    F1 = 2 × (Precision × Recall) / (Precision + Recall)

For our example, the F1 score would be:

    F1 = 2 × (0.87 × 0.78) / (0.87 + 0.78) ≈ 0.82
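
Putting everything together, the sketch below (plain Python, using the example counts above) computes each of the metrics discussed in this article directly from the four cells of the confusion matrix:

    # Compute the metrics discussed above from the example confusion matrix.
    tp, fn = 40, 11   # actual positives: hits and misses
    fp, tn = 6, 43    # actual negatives: false alarms and correct rejections
    total = tp + tn + fp + fn

    accuracy    = (tp + tn) / total                               # 0.83
    recall      = tp / (tp + fn)                                  # sensitivity / TPR, about 0.78
    specificity = tn / (tn + fp)                                  # TNR, about 0.88
    precision   = tp / (tp + fp)                                  # PPV, about 0.87
    npv         = tn / (tn + fn)                                  # about 0.80
    f1          = 2 * precision * recall / (precision + recall)   # about 0.82

    print(accuracy, recall, specificity, precision, npv, f1)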

Conclusion

We have seen that the confusion matrix is a contingency table that shows the numbers of true positives, true negatives, false positives and false negatives. The values it contains summarize the performance of a classification model, and they can be used to obtain other metrics such as accuracy, recall, specificity, precision, negative predictive value, and the F1 score.

See Also: Hypothesis Testing · Importance of Data Visualization · Linear Regression Simplified · Logistic Regression Explained · Regression Analysis: Interpreting Stata Output

 

