{"id":7393,"date":"2023-10-16T19:00:12","date_gmt":"2023-10-17T00:00:12","guid":{"rendered":"https:\/\/www.sharpsightlabs.com\/?p=7393"},"modified":"2023-12-17T16:18:05","modified_gmt":"2023-12-17T22:18:05","slug":"true-negative-explained","status":"publish","type":"post","link":"https:\/\/www.sharpsightlabs.com\/blog\/true-negative-explained\/","title":{"rendered":"True Negative, Explained"},"content":{"rendered":"
If you want to master building classification systems for machine learning, you need to understand how to evaluate<\/em> classifiers.<\/p>\n And in turn, that means you need to understand classification metrics.<\/p>\n In classification, there is a wide variety of metrics \u2013 precision, recall, sensitivity, accuracy, and many others \u2013 but most of these metrics are actually built from a few very simple counts: False Positives, False Negatives, True Positives, and True Negatives.<\/p>\n And with that in mind, in this post, I’m going to focus entirely on True Negatives.<\/p>\n I’m going to explain what True Negatives are, how they’re used in classification evaluation, and also introduce you to a few issues around measuring True Negatives.<\/p>\n But keep in mind that this is largely a conceptual article, and one part really builds on the next. 
It’s probably best if you read the whole thing.<\/p>\n With that said, let’s start learning about True Negatives.<\/p>\n <\/a><\/p>\n In simplest terms, a True Negative is when an example is actually a “negative” example, and a classification system correctly predicts the <\/p>\n It’s possible that you’re a little unfamiliar with some of these terms though, so let’s quickly review classification so I can explain True Negatives in slightly clearer and more concrete terms.<\/p>\n Let’s quickly review the basics of classification.<\/p>\n Classification systems \u2013 sometimes called “classifiers” \u2013 are machine learning prediction systems that predict categories<\/em>.<\/p>\n In contrast to regression systems, which produce numbers as predicted outputs …<\/p>\n Classifiers predict categorical labels.<\/p>\n Let’s take the simplest case and discuss binary classification.<\/p>\n In binary classification, the most common type of classification problem, there are only two possible outcomes.<\/p>\n The exact encoding of these outcomes depends on the task, but common binary encodings are things like:<\/p>\n But ultimately, we can simplify all of these task-specific encodings into a single generalized encoding that includes them all.<\/p>\n We can generalize all of the above encodings into the following scheme: To illustrate how this encoding works, let’s quickly think of a spam classification task.<\/p>\n Think of your email system, like Google’s Gmail or something similar.<\/p>\n Almost all of these email systems have a spam filter<\/em>.<\/p>\n The purpose of the spam filter<\/a> is to filter out junk emails … commonly referred to as “spam”.<\/p>\n These spam email detectors are literally classification systems. 
For example, Google has been using machine learning classification to detect spam for well over a decade.<\/p>\n In such a classification system, instead of using the encoding This is just one example, but it shows how we can encode binary labels for a binary classifier as Now that we’ve discussed our generalized binary encoding, let’s talk about the different types of correct and incorrect predictions that a classifier can make.<\/p>\n This is important, because it’s where we start getting to the concept of True Negatives.<\/p>\n Imagine a system that classifies images. The sole purpose of the system is to identify images of cats. We’ll call it The Cat Detector.<\/p>\n This system accepts images as inputs, and only produces one of two outputs. It will: <\/p>\n Pretty simple, right?<\/p>\n Not so fast.<\/p>\n Classification systems, like all predictive machine learning systems, make mistakes.<\/p>\n For example, our Cat Detector could take an actual image of a cat as input, but incorrectly predict In fact, if we consider the actual image types (cat or not cat) and the possible predicted labels ( True Negative is one of the 4 prediction types, and it’s one of the 2 correct predictions (the other being True Positive<\/a>).<\/p>\n A True Negative is simply when the classifier predicts <\/a><\/p>\n Now that we know what True Negatives are, let’s discuss why they’re important.<\/p>\n I think there are a few good reasons to understand TNs, but these are probably the main ones:<\/p>\n Let’s discuss each of these areas separately.<\/p>\n First, True Negatives directly measure the ability of a classification system to correctly identify negative examples.<\/p>\n This is often one of the main considerations for a classification system, and it may even be of primary<\/em> importance, depending on the problem we’re trying to solve.<\/p>\n Let’s imagine a scenario where there’s a rare disease. 
We’ll call it Aurelian Disease.<\/p>\n This disease is rare and treatable, but very costly to treat. Treatment would require a large amount of money, and the person would also need to be quarantined for several months. So a person diagnosed with this disease would need to leave their job and spend a large amount of money on treatment. Let’s also say that this hypothetical disease is contagious, so there would be a public interest in treating it.<\/p>\n In such a scenario, we would want to correctly identify the True Negatives.<\/p>\n Why?<\/p>\n If a person actually did not<\/em> have Aurelian Disease, we would want to correctly classify that case as Furthermore, there might be other, less tangible costs of misdiagnosis, like psychological suffering for the patient (who would need to be quarantined), and potentially a loss of public trust in health authorities if there were too many False Positives (a person who did not have Aurelian Disease, but was incorrectly classified as There would be a strong need to correctly classify negative cases as Having said that, there’s always a tradeoff between detection of True Negatives and avoidance of False Negatives.<\/p>\n We’ll discuss that more later in this blog post.<\/p>\n The number of True Negatives is not just useful by itself as a classification evaluation metric.<\/p>\n True Negatives are also involved in the calculation of several other classification metrics, such as:<\/p>\n Although you’re not going to be computing these metrics by hand yourself, you’ll need to know how they’re computed and what they represent. 
<\/p>\n And they all include True Negatives in one way or another.<\/p>\n Ultimately, we need to understand True Negatives in order to optimize our classification models.<\/p>\n There are instances where we will need to optimize a classifier strongly for detection of True Negatives.<\/p>\n But as I’ve suggested already, even if True Negatives aren’t the primary concern, almost all model optimization will take True Negatives into account to some degree.<\/p>\n So if you want to learn how to build great classification systems (and let’s be honest, you should, since there will be enormous amounts of money in it), then you need to understand True Negatives and how the metrics related to TNs enable you to optimize your classifiers.<\/p>\n <\/a><\/p>\n Although we often need to compute True Negatives and related metrics to evaluate the performance of a classification system, there are several potential pitfalls and other considerations to keep in mind as well.<\/p>\n Here are a few extra things to consider alongside True Negatives.<\/p>\n For most classification systems, True Negatives are not perfectly constant, but rather, they vary depending on how you configure your classifier \u2013 in particular, on the classification threshold you choose.<\/p>\n\n
What is a True Negative?<\/h2>\n
negative<\/code> class.<\/p>\n
Classification Review: Classification Systems Predict Categories<\/h3>\n
Binary Classification Labels<\/h4>\n
\n
1<\/code> and
0<\/code><\/li>\n
True<\/code> and
False<\/code><\/li>\n
Spam<\/code> and
Not Spam<\/code> (for example, in an email “spam” classification task)<\/li>\n
Cancer<\/code> and
Not Cancer<\/code> (for example, in a medical diagnostics task)<\/li>\n<\/ul>\n
Positive and Negative as a General Binary Encoding<\/h4>\n
positive<\/code> and
negative<\/code>.<\/p>\n
Example: Spam Classification<\/h5>\n
Spam<\/code> and
Not Spam<\/code>, we can encode the possible outcomes as follows:<\/p>\n
\n
Positive<\/code> if the system thinks that the email is “spam”.<\/li>\n
Negative<\/code> if the system thinks that the email is “not spam”.<\/li>\n<\/ul>\n
positive<\/code> and
negative<\/code><\/a>.<\/p>\n
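This relabeling is trivial to express in code. Here's a minimal sketch in Python (the function name and label strings are just for illustration):

```python
# Map task-specific binary labels to the generalized positive/negative encoding.
# Here "positive" is the class we're trying to detect (spam).
def to_general_encoding(label):
    return "positive" if label == "Spam" else "negative"

emails = ["Spam", "Not Spam", "Not Spam", "Spam"]
print([to_general_encoding(label) for label in emails])
# ['positive', 'negative', 'negative', 'positive']
```

The same one-line mapping works for any binary task: just swap in whichever label plays the role of "the thing we want to detect".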
Classifiers Make Different Types of Correct and Incorrect Predictions<\/h4>\n
\n
positive<\/code> if it thinks that the image is a cat.<\/li>\n
negative<\/code> if it thinks that the image is not<\/em> a cat.<\/li>\n<\/ul>\n
negative<\/code>.<\/p>\n
positive<\/code> and
negative<\/code>), then we can identify 4 possible types of correct and incorrect predictions:<\/p>\n
\n
<strong>True Positive<\/strong>: the classifier predicts <code>positive<\/code> when the actual value is positive (an image of a cat)<\/li>\n <strong>True Negative<\/strong>: the classifier predicts <code>negative<\/code> when the actual value is negative (not an image of a cat)<\/li>\n <strong>False Positive<\/strong>: the classifier predicts <code>positive<\/code> when the actual value is negative (not an image of a cat)<\/li>\n <strong>False Negative<\/strong>: the classifier predicts <code>negative<\/code> when the actual value is positive (an image of a cat)<\/li>\n<\/ul>\n
negative<\/code>, and the actual ground truth value is
negative<\/code>.<\/p>\n
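To make the four prediction types concrete, here's a minimal sketch in plain Python that tallies each type by comparing predicted labels against ground truth:

```python
# Count the 4 prediction types by comparing predictions to ground truth.
# A True Negative is counted only when the prediction AND the actual
# label are both "negative".
actual    = ["positive", "negative", "negative", "positive", "negative"]
predicted = ["positive", "negative", "positive", "negative", "negative"]

tp = tn = fp = fn = 0
for a, p in zip(actual, predicted):
    if a == "positive" and p == "positive":
        tp += 1    # True Positive: correct positive prediction
    elif a == "negative" and p == "negative":
        tn += 1    # True Negative: correct negative prediction
    elif a == "negative" and p == "positive":
        fp += 1    # False Positive: predicted positive, actually negative
    else:
        fn += 1    # False Negative: predicted negative, actually positive
print(tp, tn, fp, fn)  # 1 2 1 1
```

These four counts are exactly the cells of a 2x2 confusion matrix, which is why everything else in classifier evaluation can be built on top of them.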
Why are True Negatives Important?<\/h2>\n
\n True Negatives measure the ability of a classifier to correctly identify negative cases<\/li>\n They serve as the basis for other classification metrics and evaluation tools<\/li>\n They play a role in model optimization<\/li>\n<\/ul>\n
Measure the Ability of a Classifier to Identify Negative Cases<\/h3>\n
Example: Detecting a Rare but Costly-to-Treat Disease<\/h4>\n
negative<\/code>, because incorrectly diagnosing the case as
positive<\/code> would have high costs.<\/p>\n
positive<\/code>). <\/p>\n
negative<\/code>.<\/p>\n
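To see why this matters quantitatively, consider a rough back-of-the-envelope calculation. All of the numbers below (prevalence, false positive rate, treatment cost) are purely hypothetical, chosen only to illustrate the point:

```python
# Hypothetical cost of False Positives for a rare disease like our
# imaginary Aurelian Disease: every healthy person misclassified as
# "positive" incurs the (large) cost of unnecessary treatment.
n_patients          = 10_000
prevalence          = 0.001    # rare: ~0.1% actually have the disease
false_positive_rate = 0.02     # 2% of healthy people flagged as positive

n_healthy       = n_patients * (1 - prevalence)      # actual negatives
false_positives = n_healthy * false_positive_rate
true_negatives  = n_healthy - false_positives

cost_per_false_positive = 50_000   # hypothetical treatment + quarantine cost
print(f"True Negatives:  {true_negatives:.0f}")
print(f"False Positives: {false_positives:.0f}")
print(f"Cost of misdiagnosis: ${false_positives * cost_per_false_positive:,.0f}")
```

Because nearly everyone is an actual negative, even a small false positive rate produces a large absolute number of costly misdiagnoses, which is exactly why correctly classifying negatives matters so much here.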
But, there’s a tradeoff …<\/h5>\n
True Negatives are the Basis for Other Classification Metrics and Evaluation Tools<\/h3>\n
\n
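For example, specificity (the True Negative Rate), Negative Predictive Value, and accuracy all have TN in their formulas. Here's a minimal sketch, using made-up counts, of how TN enters each one:

```python
# A few standard metrics whose formulas include True Negatives (TN).
# Example confusion-matrix counts (hypothetical):
tn, fp, fn, tp = 50, 10, 5, 35

specificity = tn / (tn + fp)                   # True Negative Rate
npv         = tn / (tn + fn)                   # Negative Predictive Value
accuracy    = (tp + tn) / (tp + tn + fp + fn)  # overall correct fraction

print(round(specificity, 3))  # 0.833
print(round(npv, 3))          # 0.909
print(round(accuracy, 3))     # 0.85
```

Notice that TN appears in the numerator of all three, so a classifier that misses negative cases drags every one of these metrics down.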
Model Optimization<\/h3>\n
Pitfalls, Caveats and Other Considerations<\/h2>\n
True Negatives Often Depend on the Classification Threshold<\/h3>\n
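Most classifiers actually output a probability score, and the threshold you choose for converting scores into labels determines how many predictions end up negative, and therefore how many True Negatives you count. Here's a minimal sketch with made-up scores and labels:

```python
# The number of True Negatives depends on the classification threshold:
# a score below the threshold is predicted "negative".
scores = [0.10, 0.30, 0.45, 0.60, 0.80, 0.95]   # model's predicted P(positive)
actual = ["negative", "negative", "positive", "negative", "positive", "positive"]

def count_true_negatives(threshold):
    predictions = ["positive" if s >= threshold else "negative" for s in scores]
    return sum(1 for a, p in zip(actual, predictions)
               if a == "negative" and p == "negative")

print(count_true_negatives(0.5))   # 2 -> the 0.60-scored negative is a False Positive
print(count_true_negatives(0.7))   # 3 -> raising the threshold recovers it as a TN
```

Raising the threshold increases True Negatives, but it also risks turning actual positives into False Negatives, which is the tradeoff you need to manage when you tune a classifier.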