Building Regression Models with Microsoft Cognitive Toolkit (CNTK)
The Microsoft Cognitive Toolkit (CNTK), also known as the Computational Network Toolkit, is an open-source, commercial-grade toolkit for deep learning. It allows developers to create models that mimic the learning processes of the human brain. Although CNTK is mainly used for deep learning, it can also be used to build models for regression analysis.
Setting Up CNTK for Regression Models
Regression models are essential for predicting continuous values from input data. CNTK provides a robust framework for building regression models, leveraging its deep learning capabilities to handle complex datasets.
To set up CNTK for regression, first ensure that a Python version supported by the CNTK wheels (such as Python 3.6.8) is installed on your local machine. Then follow these steps to install and configure CNTK on your system:
Install CNTK: install it via pip with the command:
pip install cntk
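To confirm the installation, you can import the package and print its version (a quick sanity check; the exact version string depends on the wheel installed):
Python
import cntk as C

# Print the installed CNTK version, e.g. '2.7' for the final release
print(C.__version__)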
Now, let's begin by importing the required libraries, such as CNTK, NumPy, and Matplotlib, for building the model and plotting results. Data preparation is an important step, since the model is trained on this data and we need to make sure it can learn to predict the continuous target well. Follow these steps to prepare the data:
- First collect the data and identify the independent and dependent variables.
- Handle missing values, either by removing them or by filling the NaN values with the mean, median, or mode.
- Use one-hot encoding or label encoding to convert categorical values to numerical values (a short sketch of this and the previous step follows this list).
- Scale the features using techniques like normalization and standardization to boost the performance of the model.
- Remove outliers from the data, if present.
- Ensure that all the features and targets are NumPy arrays so that CNTK can process them.
- Finally, split the data into training and test sets.
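The main example below uses a purely numeric dataset, so the missing-value and encoding steps are not needed there. As an illustration only, here is a minimal sketch of those two steps on a hypothetical pandas DataFrame with made-up columns age (numeric, with a missing value) and city (categorical):
Python
import pandas as pd

# Hypothetical raw data: a missing numeric value and a categorical column
df = pd.DataFrame({
    'age': [25, 32, None, 41],
    'city': ['Delhi', 'Mumbai', 'Delhi', 'Chennai'],
    'price': [10.5, 20.1, 14.3, 30.2]
})

# Fill missing numeric values with the column mean
df['age'] = df['age'].fillna(df['age'].mean())

# One-hot encode the categorical column
df = pd.get_dummies(df, columns=['city'])
print(df)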
Python
import cntk as C
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
# Load the California housing dataset
housing_data = fetch_california_housing()
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(housing_data.data, housing_data.target, test_size=0.2, random_state=42)
# Normalize features using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define input and output dimensions
input_dim = X_train.shape[1] # Number of features
output_dim = 1 # Regression output is a single value
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
In this example we identify the features and the target values, split them into training and test sets, apply feature scaling, and define the input and output placeholders so that CNTK can process the values.
Building Regression Models With CNTK
We can build many regression models using CNTK. Some of them are as follows:
- Linear Regression
- Polynomial Regression
- Lasso Regression
- Ridge Regression
In this article, we will cover each of these models, their theoretical background, and their implementation using CNTK.
1. Linear Regression
Linear Regression is the simplest of these models: it assumes a linear relationship between the input features and the continuous target. We initialize the weights and the bias and use them to define the regression model.
Python
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the linear regression model
model = C.times(X, W) + b
2. Polynomial Regression
In Polynomial Regression the relationship between the features and the target is modelled using polynomial terms of the features rather than just the raw features. To implement it, we convert the features to polynomial form using scikit-learn's PolynomialFeatures and then apply the same linear model, which now acts as a polynomial regression model.
Python
from sklearn.preprocessing import PolynomialFeatures

# Use PolynomialFeatures to generate polynomial terms (degree 2 for simplicity)
poly = PolynomialFeatures(degree=2)
X_train = poly.fit_transform(X_train)
X_test = poly.transform(X_test)
# Normalize features using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define input and output dimensions
input_dim = X_train.shape[1] # Number of polynomial features
output_dim = 1 # Single output for regression
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the polynomial regression model: y = W * X + b
model = C.times(X, W) + b
3. Lasso Regression
Lasso Regression is similar to Linear Regression, but it handles multicollinearity and overfitting by using L1 regularization, which adds a penalty proportional to the absolute values of the weights. We keep the same linear model and modify the loss function by introducing this penalty.
Python
# Define input and output dimensions
input_dim = X_train.shape[1] # Number of polynomial features
output_dim = 1 # Single output for regression
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the regression model: y = W * X + b
model = C.times(X, W) + b
# Define loss (MSE) with L1 regularization (Lasso)
loss = C.squared_error(model, y) + 0.01 * C.reduce_sum(C.abs(W)) # L1 regularization term
eval_error = C.squared_error(model, y)
# Use Adam optimizer with momentum
learning_rate = 0.01
momentum = C.momentum_schedule(0.9)
learner = C.adam(model.parameters, C.learning_rate_schedule(learning_rate, C.UnitType.minibatch), momentum)
# Create the trainer
trainer = C.Trainer(model, (loss, eval_error), [learner])
4. Ridge Regression
Ridge Regression is similar to Lasso Regression, but it adds the squares of the weights as the penalty, i.e. L2 regularization, to the loss function. Here too we use the linear regression model and only modify the loss function by introducing the L2 penalty.
Python
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the regression model: y = W * X + b
model = C.times(X, W) + b
# Define loss (MSE) with L2 regularization (Ridge)
loss = C.squared_error(model, y) + 0.01 * C.reduce_sum(C.square(W)) # L2 regularization term
eval_error = C.squared_error(model, y)
Regression Model Evaluation in CNTK
To train the model, we specify the batch size and the number of epochs. In each epoch, the model is trained mini-batch by mini-batch over the training set, and the loss and error are computed for each batch. Finally, the test error, usually the MSE, is calculated; the smaller the MSE, the better the predictive power of the model.
Python
batch_size = 32
num_epochs = 20
# Training loop
for epoch in range(num_epochs):
    for i in range(0, len(X_train_cntk), batch_size):
        X_batch = X_train_cntk[i:i + batch_size]
        y_batch = y_train_cntk[i:i + batch_size]
        trainer.train_minibatch({X: X_batch, y: y_batch})
    if epoch % 10 == 0:
        print(f'Epoch {epoch}, Loss: {trainer.previous_minibatch_loss_average}')
# Convert test data to CNTK format
X_test_cntk = np.array(X_test, dtype=np.float32)
y_test_cntk = np.array(y_test, dtype=np.float32).reshape(-1, 1)
# Get predictions
predictions = model.eval({X: X_test_cntk})
# Calculate test error (MSE)
mse = np.mean((predictions - y_test_cntk) ** 2)
print(f'Test MSE: {mse}')
We also need to optimize the model so that it trains faster and produces accurate predictions. Some techniques to optimize the model are as follows:
- Use hyperparameter tuning to tune parameters like the batch size, number of epochs, learning rate, etc.
- Introduce penalties (regularization) to reduce overfitting and multicollinearity.
- Scale the features to improve the performance of the model.
- Use cross-validation techniques to evaluate the model reliably and to optimize the hyperparameters (see the sketch after this list).
- Do not add too many layers to the model, as this can result in longer training times and problems such as the vanishing gradient problem or overfitting.
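To make the cross-validation point concrete, here is a minimal sketch of k-fold cross-validation wrapped around the same CNTK training pattern used throughout this article. The names X_all and y_all stand for the full scaled feature matrix and target vector as float32 NumPy arrays (placeholders for this example), and the fold count, learning rate, and epoch count are arbitrary illustrative choices, not recommendations:
Python
import numpy as np
import cntk as C
from sklearn.model_selection import KFold

def train_and_score(X_tr, y_tr, X_val, y_val, lr=0.01, num_epochs=20, batch_size=32):
    # Build a fresh linear model for this fold
    input_dim = X_tr.shape[1]
    X = C.input_variable(input_dim)
    y = C.input_variable(1)
    W = C.parameter((input_dim, 1))
    b = C.parameter(1)
    model = C.times(X, W) + b
    loss = C.squared_error(model, y)
    eval_error = C.squared_error(model, y)
    learner = C.adam(model.parameters,
                     C.learning_rate_schedule(lr, C.UnitType.minibatch),
                     C.momentum_schedule(0.9))
    trainer = C.Trainer(model, (loss, eval_error), [learner])

    y_tr = y_tr.reshape(-1, 1)
    for epoch in range(num_epochs):
        for i in range(0, len(X_tr), batch_size):
            trainer.train_minibatch({X: X_tr[i:i + batch_size],
                                     y: y_tr[i:i + batch_size]})

    # Validation MSE for this fold
    preds = model.eval({X: X_val})
    return np.mean((preds - y_val.reshape(-1, 1)) ** 2)

# 5-fold cross-validation over the full training data
kf = KFold(n_splits=5, shuffle=True, random_state=42)
fold_mse = [train_and_score(X_all[tr], y_all[tr], X_all[va], y_all[va])
            for tr, va in kf.split(X_all)]
print(f'Per-fold MSE: {fold_mse}')
print(f'Mean CV MSE: {np.mean(fold_mse):.4f}')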
Below, each model is put together end to end using the California Housing dataset from the sklearn library. 80% of the data is used for training and the remaining 20% for testing, and finally the Mean Squared Error is calculated on the test dataset.
Linear Regression
Python
import cntk as C
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
# Load the California housing dataset
housing_data = fetch_california_housing()
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(housing_data.data, housing_data.target, test_size=0.2, random_state=42)
# Normalize features using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define input and output dimensions
input_dim = X_train.shape[1] # Number of features
output_dim = 1 # Regression output is a single value
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the linear regression model: y = W * X + b
model = C.times(X, W) + b
# Define loss (MSE) and evaluation functions
loss = C.squared_error(model, y)
eval_error = C.squared_error(model, y)
# Use Adam optimizer (with momentum)
learning_rate = 0.01
momentum = C.momentum_schedule(0.9) # Define momentum
learner = C.adam(model.parameters,
                 C.learning_rate_schedule(learning_rate, C.UnitType.minibatch),
                 momentum)
# Create the trainer
trainer = C.Trainer(model, (loss, eval_error), [learner])
# Convert data to CNTK format
X_train_cntk = np.array(X_train, dtype=np.float32)
y_train_cntk = np.array(y_train, dtype=np.float32).reshape(-1, 1)
# Set batch size and epochs
batch_size = 32
num_epochs = 20
# Training loop
for epoch in range(num_epochs):
    for i in range(0, len(X_train_cntk), batch_size):
        X_batch = X_train_cntk[i:i + batch_size]
        y_batch = y_train_cntk[i:i + batch_size]
        trainer.train_minibatch({X: X_batch, y: y_batch})
    if epoch % 10 == 0:
        print(f'Epoch {epoch}, Loss: {trainer.previous_minibatch_loss_average}')
# Convert test data to CNTK format
X_test_cntk = np.array(X_test, dtype=np.float32)
y_test_cntk = np.array(y_test, dtype=np.float32).reshape(-1, 1)
# Get predictions
predictions = model.eval({X: X_test_cntk})
# Calculate test error (MSE)
mse = np.mean((predictions - y_test_cntk) ** 2)
print(f'Test MSE: {mse}')
Output:
Epoch 0, Loss: 5.107728481292725
Epoch 10, Loss: 1.2154287099838257
Test MSE: 0.6077708005905151
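Beyond the MSE, it can also be useful to report the R² score and to visualize predictions against actual values with Matplotlib (mentioned at the start of the article). An optional sketch that reuses predictions and y_test_cntk from the listing above:
Python
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score

# R^2 on the test set
print(f'Test R^2: {r2_score(y_test_cntk, predictions):.4f}')

# Predicted vs. actual house values
plt.scatter(y_test_cntk, predictions, s=5, alpha=0.5)
plt.plot([y_test_cntk.min(), y_test_cntk.max()],
         [y_test_cntk.min(), y_test_cntk.max()], 'r--')
plt.xlabel('Actual value')
plt.ylabel('Predicted value')
plt.title('Linear regression: predicted vs. actual')
plt.show()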
Polynomial Regression
Python
import cntk as C
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
import numpy as np
# Load the California housing dataset
housing_data = fetch_california_housing()
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(housing_data.data, housing_data.target, test_size=0.2, random_state=42)
# Use PolynomialFeatures to generate polynomial terms (degree 2 for simplicity)
poly = PolynomialFeatures(degree=2)
X_train = poly.fit_transform(X_train)
X_test = poly.transform(X_test)
# Normalize features using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define input and output dimensions
input_dim = X_train.shape[1] # Updated number of features after polynomial transformation
output_dim = 1 # Regression output is a single value
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the polynomial regression model: y = W * X + b
model = C.times(X, W) + b
# Define loss (MSE) and evaluation functions
loss = C.squared_error(model, y)
eval_error = C.squared_error(model, y)
# Use Adam optimizer (with momentum)
learning_rate = 0.01
momentum = C.momentum_schedule(0.9) # Define momentum
learner = C.adam(model.parameters,
                 C.learning_rate_schedule(learning_rate, C.UnitType.minibatch),
                 momentum)
# Create the trainer
trainer = C.Trainer(model, (loss, eval_error), [learner])
# Convert data to CNTK format
X_train_cntk = np.array(X_train, dtype=np.float32)
y_train_cntk = np.array(y_train, dtype=np.float32).reshape(-1, 1)
# Set batch size and epochs
batch_size = 32
num_epochs = 20
# Training loop
for epoch in range(num_epochs):
    for i in range(0, len(X_train_cntk), batch_size):
        X_batch = X_train_cntk[i:i + batch_size]
        y_batch = y_train_cntk[i:i + batch_size]
        trainer.train_minibatch({X: X_batch, y: y_batch})
    if epoch % 10 == 0:
        print(f'Epoch {epoch}, Loss: {trainer.previous_minibatch_loss_average}')
# Convert test data to CNTK format
X_test_cntk = np.array(X_test, dtype=np.float32)
y_test_cntk = np.array(y_test, dtype=np.float32).reshape(-1, 1)
# Get predictions
predictions = model.eval({X: X_test_cntk})
# Calculate test error (MSE)
mse = np.mean((predictions - y_test_cntk) ** 2)
print(f'Test MSE: {mse}')
Output:
Epoch 0, Loss: 4.65647554397583
Epoch 10, Loss: 1.1391972303390503
Test MSE: 0.6355475783348083
Lasso Regression
Python
import cntk as C
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
import numpy as np
# Load the California housing dataset
housing_data = fetch_california_housing()
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(housing_data.data, housing_data.target, test_size=0.2, random_state=42)
# Use PolynomialFeatures to generate polynomial terms (degree 2 for simplicity)
poly = PolynomialFeatures(degree=2)
X_train = poly.fit_transform(X_train)
X_test = poly.transform(X_test)
# Normalize features using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define input and output dimensions
input_dim = X_train.shape[1] # Number of polynomial features
output_dim = 1 # Single output for regression
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the regression model: y = W * X + b
model = C.times(X, W) + b
# Define loss (MSE) with L1 regularization (Lasso)
loss = C.squared_error(model, y) + 0.01 * C.reduce_sum(C.abs(W)) # L1 regularization term
eval_error = C.squared_error(model, y)
# Use Adam optimizer with momentum
learning_rate = 0.01
momentum = C.momentum_schedule(0.9)
learner = C.adam(model.parameters, C.learning_rate_schedule(learning_rate, C.UnitType.minibatch), momentum)
# Create the trainer
trainer = C.Trainer(model, (loss, eval_error), [learner])
# Convert data to CNTK format
X_train_cntk = np.array(X_train, dtype=np.float32)
y_train_cntk = np.array(y_train, dtype=np.float32).reshape(-1, 1)
# Set batch size and epochs
batch_size = 32
num_epochs = 20
# Training loop
for epoch in range(num_epochs):
    for i in range(0, len(X_train_cntk), batch_size):
        X_batch = X_train_cntk[i:i + batch_size]
        y_batch = y_train_cntk[i:i + batch_size]
        trainer.train_minibatch({X: X_batch, y: y_batch})
    if epoch % 10 == 0:
        print(f'Epoch {epoch}, Loss: {trainer.previous_minibatch_loss_average}')
# Convert test data to CNTK format
X_test_cntk = np.array(X_test, dtype=np.float32)
y_test_cntk = np.array(y_test, dtype=np.float32).reshape(-1, 1)
# Get predictions
predictions = model.eval({X: X_test_cntk})
# Calculate test error (MSE)
mse = np.mean((predictions - y_test_cntk) ** 2)
print(f'Test MSE: {mse}')
Output:
Epoch 0, Loss: 4.6795973777771
Epoch 10, Loss: 1.1734471321105957
Test MSE: 0.6340514421463013
Ridge Regression
Python
import cntk as C
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
import numpy as np
# Load the California housing dataset
housing_data = fetch_california_housing()
# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(housing_data.data, housing_data.target, test_size=0.2, random_state=42)
# Use PolynomialFeatures to generate polynomial terms (degree 2 for simplicity)
poly = PolynomialFeatures(degree=2)
X_train = poly.fit_transform(X_train)
X_test = poly.transform(X_test)
# Normalize features using StandardScaler
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
# Define input and output dimensions
input_dim = X_train.shape[1] # Number of polynomial features
output_dim = 1 # Single output for regression
# Define the input and output placeholders
X = C.input_variable(input_dim)
y = C.input_variable(output_dim)
# Initialize weights and bias
W = C.parameter((input_dim, output_dim))
b = C.parameter((output_dim))
# Define the regression model: y = W * X + b
model = C.times(X, W) + b
# Define loss (MSE) with L2 regularization (Ridge)
loss = C.squared_error(model, y) + 0.01 * C.reduce_sum(C.square(W)) # L2 regularization term
eval_error = C.squared_error(model, y)
# Use Adam optimizer with momentum
learning_rate = 0.01
momentum = C.momentum_schedule(0.9)
learner = C.adam(model.parameters, C.learning_rate_schedule(learning_rate, C.UnitType.minibatch), momentum)
# Create the trainer
trainer = C.Trainer(model, (loss, eval_error), [learner])
# Convert data to CNTK format
X_train_cntk = np.array(X_train, dtype=np.float32)
y_train_cntk = np.array(y_train, dtype=np.float32).reshape(-1, 1)
# Set batch size and epochs
batch_size = 32
num_epochs = 20
# Training loop
for epoch in range(num_epochs):
    for i in range(0, len(X_train_cntk), batch_size):
        X_batch = X_train_cntk[i:i + batch_size]
        y_batch = y_train_cntk[i:i + batch_size]
        trainer.train_minibatch({X: X_batch, y: y_batch})
    if epoch % 10 == 0:
        print(f'Epoch {epoch}, Loss: {trainer.previous_minibatch_loss_average}')
# Convert test data to CNTK format
X_test_cntk = np.array(X_test, dtype=np.float32)
y_test_cntk = np.array(y_test, dtype=np.float32).reshape(-1, 1)
# Get predictions
predictions = model.eval({X: X_test_cntk})
# Calculate test error (MSE)
mse = np.mean((predictions - y_test_cntk) ** 2)
print(f'Test MSE: {mse}')
Output:
Epoch 0, Loss: 4.6577277183532715
Epoch 10, Loss: 1.1457855701446533
Test MSE: 0.6365649700164795
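A trained CNTK model can also be saved to disk and reloaded later for inference, so it does not have to be retrained for every prediction. A minimal sketch, using an arbitrary file name and the test data from the listings above:
Python
# Save the trained model (graph structure plus learned parameters)
model.save('regression_model.cntk')

# Later: reload the model and run predictions on new, already-scaled data
loaded_model = C.load_model('regression_model.cntk')
new_predictions = loaded_model.eval({loaded_model.arguments[0]: X_test_cntk})
print(new_predictions[:5])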
Conclusion
Building regression models with Microsoft Cognitive Toolkit (CNTK) offers a powerful approach to handling complex datasets and achieving high accuracy in predictions. The toolkit's flexibility, performance optimizations, and support for advanced techniques make it an excellent choice for developing scalable regression models. By leveraging CNTK's capabilities, developers can create models that efficiently predict continuous values, driving insights and decision-making across various domains.