Binary Cross Entropy/Log Loss for Binary Classification

In the field of machine learning and data science, effectively evaluating the performance of classification models is crucial. Binary cross-entropy, also known as log loss, is one of the most widely used metrics in binary classification tasks. This metric plays a fundamental role in training models and ensuring they accurately distinguish between two classes. In this article, we’ll see what binary cross-entropy is, how it works, and why it’s important for binary classification.

What is binary cross-entropy?

Binary cross-entropy is a loss function used in binary classification problems, where the target variable has two possible outcomes, 0 and 1. It measures the performance of a classification model whose output is a probability value between 0 and 1. The goal of the model is to minimize this loss function during training to improve its predictive accuracy.

Mathematically, Binary Cross-Entropy (BCE) is defined as:

[Tex]\text{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right][/Tex]

where:

  • N is the number of observations.
  • y_i is the actual binary label (0 or 1) of the i-th observation.
  • p_i is the predicted probability that the i-th observation belongs to class 1.

How Does Binary Cross-Entropy Work?

Binary Cross-Entropy measures the distance between the true labels and the predicted probabilities. When the predicted probability p_i is close to the actual label y_i, the BCE value is low, indicating a good prediction. Conversely, when the predicted probability deviates significantly from the actual label, the BCE value is high, indicating a poor prediction. The logarithmic form of the BCE function penalizes confident wrong predictions especially heavily.

For example, if the true label is 1 and the predicted probability is close to 0, the loss is substantial. This characteristic makes BCE particularly effective in driving the model to improve its predictions during training.
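This asymmetry is easy to see numerically. Below is a quick sketch (with illustrative probabilities, not taken from the worked example later in the article) of the per-observation loss -log(p) when the true label is 1:

Python

import numpy as np

# When the true label is 1, the per-observation BCE reduces to -log(p).
for p in [0.99, 0.5, 0.01]:
    print(f"p = {p:<4}: loss = {-np.log(p):.4f}")

# p = 0.99: loss ≈ 0.0101  (confident and correct -> tiny loss)
# p = 0.5 : loss ≈ 0.6931  (uncertain -> moderate loss)
# p = 0.01: loss ≈ 4.6052  (confident but wrong -> large loss)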

Why is Binary Cross-Entropy Important?

  1. Training Deep Learning Models: Binary Cross-Entropy is commonly used as the loss function for training neural networks in binary classification tasks. It helps in adjusting the model’s weights to minimize the prediction error (a minimal usage sketch follows this list).
  2. Probabilistic Interpretation: BCE provides a probabilistic interpretation of the model’s predictions, making it suitable for applications where understanding the confidence of predictions is important, such as in medical diagnosis or fraud detection.
  3. Model Evaluation: BCE offers a clear and interpretable metric for evaluating the performance of binary classification models. Lower BCE values indicate better model performance.
  4. Handling Imbalanced Data: BCE can be particularly useful in scenarios with imbalanced datasets, where one class is significantly more frequent than the other. By focusing on probability predictions, it helps the model learn to make accurate predictions even in the presence of class imbalance.
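
As a minimal sketch of point 1, here is how BCE is typically plugged in as the training loss for a Keras binary classifier. The layer sizes, the optimizer, and the assumption of 10 input features are illustrative choices, not requirements:

Python

from keras import layers, models

# Any network that ends in a single sigmoid unit outputs a probability in (0, 1),
# which is exactly what binary cross-entropy expects.
model = models.Sequential([
    layers.Input(shape=(10,)),              # assumed: 10 input features
    layers.Dense(16, activation="relu"),    # illustrative hidden layer
    layers.Dense(1, activation="sigmoid"),  # probability of class 1
])

# 'binary_crossentropy' is the built-in Keras identifier for the BCE loss.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])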

Mathematical Example of Binary Cross-Entropy

To better understand how Binary Cross-Entropy (BCE) works, let’s walk through a detailed mathematical example.

Consider a binary classification problem where we have the following true labels (y) and predicted probabilities (p) for a set of observations:

Observation | True Label (y) | Predicted Probability (p)
1           | 1              | 0.9
2           | 0              | 0.2
3           | 1              | 0.8
4           | 0              | 0.4

We will calculate the Binary Cross-Entropy loss for this set of observations step-by-step.

The formula for Binary Cross-Entropy is:

[Tex]\text{BCE} = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right][/Tex]

Where:

  • N is the number of observations (in this example, N = 4).
  • y_i is the true label for the i-th observation.
  • p_i is the predicted probability for the i-th observation.

Step-by-Step Calculation:

1. Observation 1:

Here, the true label y_1 = 1 and the predicted probability p_1 = 0.9.

[Tex]\text{Loss}_1 = -\left( 1 \cdot \log(0.9) + (1 - 1) \cdot \log(1 - 0.9) \right) = -\log(0.9) \approx -(-0.1054) = 0.1054[/Tex]

Similarly, for the other observations:

  1. y_2 = 0, p_2 = 0.2, so Loss_2 = -log(1 - 0.2) ≈ 0.2231
  2. y_3 = 1, p_3 = 0.8, so Loss_3 = -log(0.8) ≈ 0.2231
  3. y_4 = 0, p_4 = 0.4, so Loss_4 = -log(1 - 0.4) ≈ 0.5108

Next, we sum the individual losses and calculate the average:

Total Loss = 0.1054 + 0.2231 + 0.2231 + 0.5108 = 1.0624

Average Loss (BCE) = 1.0624 / 4 = 0.2656

Therefore, the Binary Cross-Entropy loss for these observations is approximately 0.2656.
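
The hand calculation can be reproduced in a few lines of NumPy; this snippet simply re-checks the four observations above:

Python

import numpy as np

# Same four observations as the worked example above
y = np.array([1, 0, 1, 0])
p = np.array([0.9, 0.2, 0.8, 0.4])

losses = -(y * np.log(p) + (1 - y) * np.log(1 - p))
print(losses.round(4))          # [0.1054 0.2231 0.2231 0.5108]
print(round(losses.mean(), 4))  # 0.2656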

Implementation of Binary Cross Entropy in Python

  • Manual Calculation with NumPy: The function binary_cross_entropy manually calculates the BCE loss from the formula, averaging the individual losses for the true labels (y_true) and predicted probabilities (y_pred).
  • Keras Calculation: The binary_crossentropy function from Keras computes the BCE loss directly and efficiently, taking the same inputs (y_true and y_pred), with the result converted to NumPy format.
  • Verification: The close match between the manual result (bce_loss) and the Keras result (bce_loss_keras) validates the manual implementation, confirming that it computes the BCE loss correctly for binary classification models.
Python

import numpy as np
from keras.losses import binary_crossentropy

# Example true labels and predicted probabilities
y_true = np.array([0, 1, 1, 0, 1])
y_pred = np.array([0.1, 0.9, 0.8, 0.2, 0.7])

# Compute Binary Cross-Entropy manually using NumPy
def binary_cross_entropy(y_true, y_pred):
    bce = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    return bce

bce_loss = binary_cross_entropy(y_true, y_pred)
print(f"Binary Cross-Entropy Loss (manual calculation): {bce_loss}")

# Compute Binary Cross-Entropy using Keras
bce_loss_keras = binary_crossentropy(y_true, y_pred).numpy()
print(f"Binary Cross-Entropy Loss (Keras): {bce_loss_keras}")

Output:

Binary Cross-Entropy Loss (manual calculation): 0.20273661557656092
Binary Cross-Entropy Loss (Keras): 0.2027364925606956

The manual NumPy calculation may show slightly different floating-point precision or rounding behavior than the Keras implementation. Keras may use optimized backend operations and a different floating-point precision, leading to very slightly different results.
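
One practical caveat the manual NumPy version glosses over: if any predicted probability is exactly 0 or 1, np.log returns -inf and the loss blows up. A common safeguard, sketched below under the assumption that clipping the probabilities by a small epsilon is acceptable for your use case (deep learning frameworks typically apply a similar clip internally), is:

Python

import numpy as np

def binary_cross_entropy_stable(y_true, y_pred, eps=1e-7):
    # Clip predictions away from exactly 0 and 1 so log() stays finite.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# A prediction of exactly 0.0 for a positive label would otherwise give an infinite loss.
print(binary_cross_entropy_stable(np.array([1, 0]), np.array([0.0, 1.0])))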

Conclusion

Binary Cross-Entropy (BCE) is a crucial loss function for binary classification tasks, effectively measuring the performance of models by comparing true labels with predicted probabilities. Its logarithmic nature penalizes incorrect predictions more heavily, guiding the model to improve accuracy during training. Understanding and implementing BCE ensures robust evaluation and enhancement of binary classification models, especially in deep learning applications.




