A. Introduction to Model Evaluation
After we create an AI model, we need to check if it is performing well and giving correct results. This process is called Model Evaluation.
Without evaluation, we cannot trust the predictions made by the model.
B. Why is Evaluation Important?
- To check how accurate the model is.
- To identify errors or wrong predictions.
- To avoid using a model that gives biased or unfair results.
- To improve the model by finding where it performs poorly.
C. Key Terms in Model Evaluation
| Term | Meaning |
|---|---|
| Prediction | The output given by the AI model based on input data |
| Actual Value | The real or correct answer from the dataset |
| True Positive (TP) | Model predicted Yes, and it was Yes |
| True Negative (TN) | Model predicted No, and it was No |
| False Positive (FP) | Model predicted Yes, but it was No (wrongly positive) |
| False Negative (FN) | Model predicted No, but it was Yes (wrongly negative) |
D. Confusion Matrix
A confusion matrix is a table used to describe the performance of a model on a set of test data.
Structure of Confusion Matrix
| | Predicted: Yes | Predicted: No |
|---|---|---|
| Actual: Yes | True Positive (TP) | False Negative (FN) |
| Actual: No | False Positive (FP) | True Negative (TN) |
This table helps us to calculate different performance metrics.
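As an illustration, the four counts can be tallied with a few lines of Python. This is a minimal sketch, not part of the unit, and the `actual` and `predicted` lists below are made-up example labels.

```python
# Hypothetical example data: "Yes"/"No" labels for ten test cases
actual    = ["Yes", "Yes", "No", "No", "Yes", "No", "Yes", "No", "Yes", "No"]
predicted = ["Yes", "No",  "No", "Yes", "Yes", "No", "Yes", "No", "Yes", "No"]

# Count each cell of the confusion matrix by comparing pairs of labels
tp = sum(1 for a, p in zip(actual, predicted) if a == "Yes" and p == "Yes")
tn = sum(1 for a, p in zip(actual, predicted) if a == "No" and p == "No")
fp = sum(1 for a, p in zip(actual, predicted) if a == "No" and p == "Yes")
fn = sum(1 for a, p in zip(actual, predicted) if a == "Yes" and p == "No")

print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)  # TP: 4 TN: 4 FP: 1 FN: 1
```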
E. Accuracy
Accuracy tells us the percentage of predictions that are correct.
Formula:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Example:
If TP = 50, TN = 30, FP = 10, FN = 10, then:
$$\text{Accuracy} = \frac{50 + 30}{50 + 30 + 10 + 10} = \frac{80}{100} = 80\%$$
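The same calculation can be checked in Python; this short sketch simply assumes the counts from the example above.

```python
# Counts taken from the worked example (TP = 50, TN = 30, FP = 10, FN = 10)
tp, tn, fp, fn = 50, 30, 10, 10

accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"Accuracy: {accuracy:.0%}")  # Accuracy: 80%
```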
F. Precision
Precision tells us how many of the predicted “Yes” results were actually correct.
Formula:
$$\text{Precision} = \frac{TP}{TP + FP}$$
Example:
If TP = 50, FP = 10, then:
$$\text{Precision} = \frac{50}{50 + 10} = \frac{50}{60} \approx 83.3\%$$
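As a quick check, the same numbers in Python, assuming the counts from the example:

```python
# Counts taken from the worked example (TP = 50, FP = 10)
tp, fp = 50, 10

precision = tp / (tp + fp)
print(f"Precision: {precision:.1%}")  # Precision: 83.3%
```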
G. Recall (Sensitivity or True Positive Rate)
Recall tells us how many of the actual “Yes” cases were correctly predicted by the model.
Formula:
$$\text{Recall} = \frac{TP}{TP + FN}$$
Example:
If TP = 50, FN = 10, then:
$$\text{Recall} = \frac{50}{50 + 10} = \frac{50}{60} \approx 83.3\%$$
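And the corresponding check for Recall, again assuming the counts from the example:

```python
# Counts taken from the worked example (TP = 50, FN = 10)
tp, fn = 50, 10

recall = tp / (tp + fn)
print(f"Recall: {recall:.1%}")  # Recall: 83.3%
```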
H. F1 Score
The F1 Score combines both Precision and Recall into a single value using the harmonic mean.
Formula:
$$F1 = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
The F1 Score is useful when we want a balance between Precision and Recall.
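There is no worked F1 example in the text, so the short sketch below reuses the Precision and Recall values from the two examples above (both 50/60); because the two values are equal, their harmonic mean works out to the same value.

```python
# Values carried over from the Precision and Recall examples (both 50/60)
precision = 50 / 60
recall = 50 / 60

f1 = 2 * (precision * recall) / (precision + recall)
print(f"F1 Score: {f1:.1%}")  # F1 Score: 83.3%
```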
I. Example Case Study – Spam Email Classifier
| | Actual | Predicted |
|---|---|---|
| Email 1 | Spam | Spam |
| Email 2 | Not Spam | Spam |
| Email 3 | Spam | Not Spam |
| Email 4 | Not Spam | Not Spam |
From this data:
- TP = 1 (Spam predicted as Spam)
- FP = 1 (Not Spam predicted as Spam)
- FN = 1 (Spam predicted as Not Spam)
- TN = 1 (Not Spam predicted as Not Spam)
Now, calculate:
- Accuracy = (TP + TN) / Total = (1 + 1) / 4 = 50%
- Precision = TP / (TP + FP) = 1 / (1 + 1) = 50%
- Recall = TP / (TP + FN) = 1 / (1 + 1) = 50%
This shows the model is not very reliable.
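One way to verify these numbers programmatically is with scikit-learn's metric functions; the library and the label strings below are an assumption for illustration, not a requirement of the unit.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# The four emails from the case study, as actual and predicted labels
actual    = ["Spam", "Not Spam", "Spam", "Not Spam"]
predicted = ["Spam", "Spam", "Not Spam", "Not Spam"]

print(accuracy_score(actual, predicted))                     # 0.5
print(precision_score(actual, predicted, pos_label="Spam"))  # 0.5
print(recall_score(actual, predicted, pos_label="Spam"))     # 0.5
```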
J. When to Use Which Metric
| Metric | Use When |
|---|---|
| Accuracy | Data is balanced (equal Yes/No outcomes) |
| Precision | You want to avoid false positives (e.g., fraud detection) |
| Recall | You want to avoid false negatives (e.g., disease detection) |
| F1 Score | You want a balance between precision and recall |
K. Common Mistakes in Model Evaluation
- Only checking accuracy (not enough in real problems).
- Ignoring false positives and false negatives.
- Not checking for bias or fairness in predictions.
- Using small or unbalanced test data.
L. Activity Suggestion
Give students a mini dataset of predicted and actual values. Ask them to:
- Build a confusion matrix.
- Calculate Accuracy, Precision, and Recall.
- Interpret what the values tell us about the model.
M. Keywords to Remember
| Term | Description |
|---|---|
| Confusion Matrix | Table showing TP, TN, FP, FN |
| Accuracy | Percentage of correct predictions |
| Precision | Out of all predicted “Yes,” how many were actually “Yes” |
| Recall | Out of all actual “Yes,” how many were predicted correctly |
| F1 Score | A single score combining precision and recall |
| TP, TN, FP, FN | Counts of different correct and incorrect predictions |
N. Summary of the Unit
- Evaluation is a key step to check how well an AI model performs.
- A confusion matrix helps organize the outcomes.
- Important metrics include Accuracy, Precision, Recall, and F1 Score.
- The choice of metric depends on the type of problem you are solving.