ML: Simple Model
A very simple model that learns a linear equation, y = mx + b
In the previous article, ML: Simple Math, we discussed the linear equation -
y = mx + b
where -
m = slope
b = y-intercept (point at which the line intersects the Y-axis)
By convention in ML, the equation is written as -
y = b + wx
where -
y = label (output)
x = feature (input)
w = weight (the slope, m)
b = bias (the y-intercept)
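To make the notation concrete, here is the same equation as a tiny Python function (a hypothetical helper for illustration only, not part of the model we build below) -
def predict(x, w, b):
    # y = b + wx
    return b + w * x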
Reconsider the previous example. We calculated the slope m and the y-intercept b as m = -1 and b = 5.
(0, 5)
(2, 3)
(6, -1)
(4, ?)
We were able to simply substitute a few given values and find m and b. For complex data with multiple features (inputs), however, it is very difficult to find the weights and biases by hand. A model needs to be trained extensively to compute them.
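As a quick refresher, here is that substitution in plain Python, picking the points (0, 5) and (2, 3) from the list above -
# Slope from two of the given points
x1, y1 = 0.0, 5.0
x2, y2 = 2.0, 3.0
m = (y2 - y1) / (x2 - x1)  # (3 - 5) / (2 - 0) = -1.0
# Y-intercept: since (0, 5) lies on the line, b = 5
b = y1 - m * x1            # 5.0
print(m, b)                # -1.0 5.0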
Time to look at the actual code.
Imports.
import torch
import torch.nn as nn
import torch.optim as optim
import copy
import numpy as np
Define training data based on given inputs.
# Training data i.e. given points (0, 5), (2, 3), (6, -1)
X_train = torch.tensor([0.0, 2.0, 6.0])
y_train = torch.tensor([5.0, 3.0, -1.0])
Get a few more points on the graph as test data to verify the model's predictions.
# Test data (5, 0), (1, 4), (3, 2)
X_test = torch.tensor([5.0, 1.0, 3.0])
y_test = torch.tensor([0.0, 4.0, 2.0])
Define a simple model.
# Define the linear equation y = b + wx,
# i.e. 1 input (feature x) and 1 output (label y)
model = nn.Linear(1, 1)
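Note that before training, the weight and bias of nn.Linear are randomly initialized, so the model knows nothing about our line yet. You can inspect them (the exact values will differ on every run) -
print(model.weight)  # random initial weight, e.g. ~0.34
print(model.bias)    # random initial bias, e.g. ~-0.88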
The next step is to find the slope m (weight w) and the y-intercept b (bias). You can ignore all the complexities in the function below for now; its job is simply to find the weight and bias. Here we train the model on the given input and compute the weight w and bias b.
def find_weight_and_bias():
    loss_fn = nn.MSELoss()  # mean squared error between predictions and labels
    optimizer = optim.Adam(model.parameters(), lr=0.02)
    n_epochs = 1000
    batch_size = 1
    batch_start = torch.arange(0, len(X_train), batch_size)
    best_mse = np.inf  # lowest test error seen so far
    best_weights = None
    history = []
    for epoch in range(n_epochs):
        model.train()
        for start in batch_start:
            # Take one mini-batch of training data
            X_batch = X_train[start:start+batch_size]
            y_batch = y_train[start:start+batch_size]
            # Forward pass: predict and measure the error
            y_pred = model(torch.reshape(X_batch, (-1, 1)))
            loss = loss_fn(y_pred, torch.reshape(y_batch, (-1, 1)))
            # Backward pass: update the weight and bias
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        # At the end of each epoch, evaluate on the test data
        model.eval()
        with torch.no_grad():
            y_pred = model(torch.reshape(X_test, (-1, 1)))
            mse = float(loss_fn(y_pred, torch.reshape(y_test, (-1, 1))))
        history.append(mse)
        # Keep a copy of the best-performing weights
        if mse < best_mse:
            best_mse = mse
            best_weights = copy.deepcopy(model.state_dict())
    return best_weights
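One piece worth demystifying is nn.MSELoss: it simply averages the squared differences between predictions and targets. A minimal, standalone check -
pred = torch.tensor([[1.0], [2.0]])
target = torch.tensor([[0.0], [2.0]])
# mean((pred - target) ** 2) = ((1 - 0)**2 + (2 - 2)**2) / 2 = 0.5
print(nn.MSELoss()(pred, target))  # tensor(0.5000)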
Now, let's find the weight and bias.
# Find weight and bias
weight_and_bias = find_weight_and_bias()
print(f'Weight and Bias: {weight_and_bias}')
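The returned state dict holds the learned parameters under the keys 'weight' and 'bias'. If you want them as plain numbers, an optional step shown here just for illustration -
w = weight_and_bias['weight'].item()  # approximately -1.0
b = weight_and_bias['bias'].item()    # approximately 5.0
print(f'w: {w}, b: {b}')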
Predicting the value for x = 4 -
# Substitute values of w and b in y = b + wx
model.load_state_dict(weight_and_bias)
# Predict the value for x = 4
y = model(torch.tensor([4.0]))
print(f'Predicting value, x = 4')
print(f'y: {y}')
Output -
Weight and Bias:
OrderedDict([('weight', tensor([[-1.0000]])), ('bias', tensor([5.0000]))])
Predicting value, x = 4
y: tensor([1.0000], grad_fn=<AddBackward0>)
Observe that w = -1, b = 5, and the value of y for x = 4 is 1.
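As a sanity check, substituting these values back into the equation by hand gives the same answer -
# y = b + wx with w = -1, b = 5, x = 4
w, b = -1.0, 5.0
print(b + w * 4.0)  # 1.0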
See all the code in the Colab notebook here.