A. Introduction to AI Models
In the previous unit, we learned that AI models are created by feeding data into algorithms to help machines learn patterns and make predictions.
In this unit, we will go deeper into modeling, understand different types of learning, and explore black box vs. transparent AI models.
B. What is a Model?
A model is the final outcome after training a machine learning algorithm on data.
It is used to make predictions or decisions based on new input data.
Example: A trained AI model can predict whether an email is spam or not.
C. Types of Learning in AI Models
There are two main types of learning used to build AI models:
1. Supervised Learning
- The data has both input and output.
- The model is trained with correct answers (labels).
- The model learns to map inputs to correct outputs.
Example:
If you have data about hours studied (input) and test scores (output), you can train a model to predict test scores.
Common Algorithms:
- Linear Regression
- Decision Tree
- Support Vector Machine (SVM)
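The hours-studied example above can be sketched in code. This is a minimal least-squares linear regression written from scratch; the hours and scores are invented numbers for illustration only.

```python
# Toy supervised learning: fit a straight line (linear regression)
# to hypothetical hours-studied vs. test-score data using least squares.

hours = [1, 2, 3, 4, 5]        # input (feature)
scores = [52, 58, 65, 71, 78]  # output (label)

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Least-squares slope and intercept: score ≈ slope * hours + intercept
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores)) \
        / sum((x - mean_x) ** 2 for x in hours)
intercept = mean_y - slope * mean_x

def predict(h):
    """Predict a test score for h hours of study."""
    return slope * h + intercept

print(f"score ≈ {slope:.1f} * hours + {intercept:.1f}")
print(f"predicted score for 6 hours: {predict(6):.1f}")
```

Because the model is just a slope and an intercept, anyone can read off exactly how a prediction was made, which also previews the "transparent model" idea in section D.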
2. Unsupervised Learning
- The data has only inputs, no output labels.
- The model finds hidden patterns or groups in the data.
- Used for clustering, pattern detection, etc.
Example:
Grouping customers based on buying habits without knowing their categories.
Common Algorithms:
- K-Means Clustering
- Hierarchical Clustering
- PCA (Principal Component Analysis)
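The customer-grouping example can be sketched with a tiny one-dimensional k-means, the first algorithm in the list above. The spending figures are invented for illustration; note that no labels are given, yet the algorithm still finds two groups.

```python
import random

# Toy unsupervised learning: 1-D k-means grouping customers by
# (hypothetical) monthly spend, with no labels given in advance.

spend = [12, 15, 14, 80, 85, 90, 11, 82]  # monthly spend per customer
k = 2

random.seed(0)
centers = random.sample(spend, k)  # start from two random customers

for _ in range(10):  # a few refinement rounds are enough here
    # Step 1: assign each customer to the nearest center.
    clusters = [[] for _ in range(k)]
    for s in spend:
        nearest = min(range(k), key=lambda i: abs(s - centers[i]))
        clusters[nearest].append(s)
    # Step 2: move each center to the mean of its cluster.
    centers = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]

print("cluster centers:", sorted(centers))  # low spenders vs. high spenders
```

The two centers settle near the "low spender" and "high spender" averages, even though the data never said which customer belongs to which group.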
D. Black Box vs. Transparent AI Models
1. Black Box Models
- The internal working of the model is not easily understood by humans.
- Difficult to explain how the model made a decision.
- Often more accurate but less explainable.
Examples: Neural Networks, Deep Learning Models
Problem: If an AI model rejects a loan application, the applicant deserves an explanation, but black box models cannot easily provide one.
2. Transparent (Explainable) Models
- The working of the model is clear and easy to understand.
- Decisions can be explained in simple terms.
- Easier to trust and correct.
Examples: Decision Trees, Linear Regression
Importance: In settings such as schools or medicine, where the reasons behind decisions must be known, transparent models are the better choice.
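A transparent model can be illustrated with plain if-else rules for the loan example above. The thresholds here are invented for illustration; the point is that every decision comes with a human-readable reason, which a black box model cannot easily give.

```python
# A transparent "model": hand-written rules whose decision can be
# explained in plain language. The thresholds are made up for illustration.

def loan_decision(income, credit_score):
    """Return (decision, reason) so every outcome is explainable."""
    if credit_score < 600:
        return "reject", "credit score below 600"
    if income < 25000:
        return "reject", "income below 25,000"
    return "approve", "credit score and income both meet the thresholds"

decision, reason = loan_decision(income=30000, credit_score=550)
print(decision, "-", reason)  # the model can always say *why*
```

A trained decision tree behaves much like this rule list, which is why decision trees are counted among the transparent models.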
E. Concept of Bias in AI Models
AI models learn from data. If the data is biased, the model also becomes biased.
Types of Bias:
- Data Bias: Data collected may not represent all types of users.
- Algorithmic Bias: The model may give more importance to certain features unfairly.
- Human Bias: Mistakes made while collecting or labeling data.
Example: A facial recognition system trained mostly on photos of light-skinned people may not work well for dark-skinned individuals.
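One way to expose data bias like this is to measure accuracy per group instead of only overall. The sketch below uses an invented set of predictions in which one group dominated the training data; the high overall accuracy hides the poor accuracy on the under-represented group.

```python
# Data bias sketch: a toy recognition "model" trained mostly on group A.
# Measuring accuracy per group exposes the bias that the overall
# accuracy hides. All data here is invented for illustration.

# (group, true_label, predicted_label)
results = [
    ("A", "match", "match"), ("A", "match", "match"),
    ("A", "no match", "no match"), ("A", "match", "match"),
    ("A", "no match", "no match"), ("A", "match", "match"),
    ("A", "match", "match"), ("A", "no match", "no match"),
    ("B", "match", "no match"), ("B", "match", "match"),
]

def accuracy(rows):
    return sum(true == pred for _, true, pred in rows) / len(rows)

overall = accuracy(results)
per_group = {g: accuracy([r for r in results if r[0] == g])
             for g in ("A", "B")}
print(f"overall: {overall:.0%}")      # looks good: 90%
print(f"per group: {per_group}")      # group B does much worse
```

The lesson: always check a model's performance on each group of users, not just its average performance.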
F. Training, Testing, and Validation
- Training Data: Used to teach the model.
- Testing Data: Used to evaluate the model’s performance.
- Validation Data: Used to tune the model's settings during training (optional).
Why split data?
To avoid overfitting, where a model merely memorizes the training data and then performs poorly on new data.
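Splitting the data can be sketched in a few lines. This is a minimal 80/20 split; the data is a stand-in list, and shuffling first keeps the split from depending on the original order.

```python
import random

# Sketch of splitting a dataset into training and testing portions
# (an 80/20 split), so the model is judged on data it never saw.

data = list(range(10))  # stand-in for 10 labelled examples

random.seed(42)          # fixed seed so the split is reproducible
random.shuffle(data)     # shuffle first so the split is not ordered

split = int(0.8 * len(data))
train, test = data[:split], data[split:]

print("train:", train)  # 8 examples to teach the model
print("test:", test)    # 2 held-out examples to evaluate it
```

The held-out test examples never touch the training step, which is exactly what makes the test score an honest estimate of performance on new data.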
G. Overfitting and Underfitting
Overfitting
- The model works well on training data but poorly on new data.
- It has memorized the data, not learned from it.
Underfitting
- The model is too simple to capture the patterns in the data.
- Gives poor results even on training data.
Goal: Create a balanced model that performs well on both training and testing data.
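Both failure modes can be shown in miniature with two deliberately bad "models" on invented hours-to-score data: a memorizer that stores every training pair exactly (overfitting) and a mean model that ignores the input entirely (underfitting).

```python
# Overfitting vs. underfitting in miniature.

train = {1: 50, 2: 60, 3: 70}   # hours -> score (toy training data)

mean_score = sum(train.values()) / len(train)

def memorizer(h):
    # Overfit: only knows inputs it has literally seen before.
    return train.get(h)  # returns None for unseen hours

def mean_model(h):
    # Underfit: ignores the input entirely.
    return mean_score

print(memorizer(2))    # perfect on training data
print(memorizer(4))    # None -- fails on new data (overfitting)
print(mean_model(4))   # always 60.0 -- too simple (underfitting)
```

A balanced model, such as the fitted line from the supervised learning example, sits between these extremes: it generalizes to unseen hours without memorizing the training pairs.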
H. Real-Life Example: Spam Email Classifier
| Type | Description |
|---|---|
| Input data | Email content (words, phrases) |
| Label (output) | Spam or Not Spam |
| Learning type | Supervised Learning |
| Model type | Transparent (if using Decision Tree), Black Box (if using Deep Learning) |
| Bias to avoid | Avoid training only on certain types of emails |
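A very simple transparent version of this classifier counts suspicious keywords and flags the email when the count crosses a threshold. The keyword list and threshold are invented for illustration and are far cruder than a real spam filter.

```python
# Toy transparent spam classifier: count suspicious keywords and flag
# the email if the count crosses a threshold. Keywords and threshold
# are invented for illustration, not drawn from any real spam filter.

SPAM_WORDS = {"free", "winner", "prize", "urgent", "click"}

def classify(email_text, threshold=2):
    words = email_text.lower().split()
    hits = sum(w.strip(".,!?") in SPAM_WORDS for w in words)
    label = "Spam" if hits >= threshold else "Not Spam"
    return label, hits  # returning hits makes the decision explainable

print(classify("Click now, you are a winner of a free prize!"))
print(classify("Meeting moved to 3pm, see agenda attached."))
```

Because the decision is just a keyword count, the model can always explain itself ("flagged: 4 spam words found"), matching the Transparent row in the table above.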
I. Activity Suggestion (for Class)
Ask students to collect a small dataset:
- Feature: Study time (in hours)
- Label: Marks in test
Using this data, create a simple supervised model and plot a line graph to show how marks increase with study time.
Then explain:
- What happens if data has errors?
- Can we explain how the model is making predictions?
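For the follow-up question "what happens if data has errors?", one possible classroom sketch is to fit the line twice, once on clean data and once with a single mis-recorded mark, and compare the slopes. All numbers are invented for illustration.

```python
# Sketch for "what happens if data has errors?": fit the line twice,
# once on clean data and once with one corrupted mark, and compare
# the slopes. Data values are invented for illustration.

def fit_line(xs, ys):
    """Least-squares slope and intercept for y ≈ slope*x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

hours = [1, 2, 3, 4, 5]
marks = [40, 50, 60, 70, 80]      # clean: marks rise 10 per hour
bad_marks = [40, 50, 60, 70, 20]  # last mark mis-recorded as 20

clean_slope, _ = fit_line(hours, marks)
noisy_slope, _ = fit_line(hours, bad_marks)

print(f"clean slope: {clean_slope:.1f}")  # 10.0
print(f"noisy slope: {noisy_slope:.1f}")  # the error even flips the trend
```

A single bad data point drags the fitted slope from +10 to a negative value, which makes the impact of data errors very concrete for students. The second question is easier: since the model is just a slope and an intercept, its predictions are fully explainable.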
J. Keywords to Remember
| Term | Meaning |
|---|---|
| Supervised Learning | Learning from data with input and correct output (label) |
| Unsupervised Learning | Learning from data with no labels; model finds patterns on its own |
| Black Box Model | A model whose working is difficult to understand |
| Transparent Model | A model whose working is easy to understand |
| Bias | Unfair preference or treatment built into the model |
| Overfitting | Model performs well on training data but poorly on new data |
| Underfitting | Model fails to capture patterns even in training data |
K. Summary of the Unit
- AI models learn from data through supervised or unsupervised learning.
- Transparent models help build trust and are easier to explain.
- Black box models are more complex and harder to interpret.
- Bias in data can lead to unfair AI decisions.
- We must aim to build fair, balanced, and explainable AI systems.