MLPC Group Assignment

Uploaded by

S U P R E M

TAYLOR’S UWE DUAL AWARDS PROGRAMMES

JANUARY 2024 SEMESTER

MACHINE LEARNING AND PARALLEL COMPUTING

(ITS66604)

Assignment 2 – Group (30%)

DUE DATE: 16th March 2024 via myTIMeS (8pm)

STUDENT DECLARATION
1. I confirm that I am aware of the University’s Regulation Governing Cheating in a University Test
and Assignment and of the guidance issued by the School of Computing and IT concerning
plagiarism and proper academic practice, and that the assessed work now submitted is in
accordance with this regulation and guidance.
2. I understand that, unless already agreed with the School of Computing and IT, assessed work may
not be submitted that has previously been submitted, either in whole or in part, at this or any other
institution.
3. I recognise that should evidence emerge that my work fails to comply with either of the above
declarations, then I may be liable to proceedings under Regulation.
No  Student Name      Student ID  Date  Signature  Score

1   Monish Shrestha   0362091

2   Praphul Shrestha  0362191

3   Khushi Thami      0362676

5
Part A: Machine Learning – A Case Study
  1. Describe your observation and understanding of the whole dataset by answering the
     following questions
    a. What are the available data types in this data set?
    b. What is the statistical summary of all the attributes?
    c. How to handle the missing values which are those presented as ‘?’ ?
    d. What are the independent and dependent variables?
      i. Independent Variables
      ii. Dependent Variables
  2. Find the correlation coefficient between independent and dependent variables?
  3. By using any algorithm of your choice, create a model which could predict the price of
     the laptop
  4. Is this a Supervised model or an unsupervised model? Why so? Explain in detail
  5. Continue with the same built model in No.3, but choose different independent
     variables and compare the results
Part B: Parallel Computing
  Article 1: Parallel computing method of deep belief networks and its application to traffic
  flow prediction
    Study Background
    Research and Development Methodologies
      1. Pre-training Phase Methodology
      2. Fine-tuning Phase Methodology
      3. Parallel Architecture
      4. Evaluation Indices
    Performance Analysis
    Results Analysis
    Conclusion
Part A: Machine Learning – A Case Study

1. Describe your observation and understanding of the whole dataset by answering
the following questions.
a. What are the available data types in this data set?
The dataset contains three data types: integer, float, and object (string).
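As a minimal sketch of how these types can be inspected, the snippet below builds a small hypothetical sample (the rows and the `laptop_ID` column are assumptions for illustration, not the actual dataset) and reads the type of each column with the pandas `dtypes` attribute.

```python
import pandas as pd

# Hypothetical three-row sample; the real dataset has more rows and columns.
df = pd.DataFrame({
    "laptop_ID": [1, 2, 3],
    "Company": ["Apple", "HP", "Acer"],
    "Inches": [13.3, 15.6, 15.6],
    "Weight": ["1.37kg", "1.86kg", "2.1kg"],
    "Price_euros": [1339.69, 575.00, 400.00],
})

# dtypes holds one type per column; text columns appear as object
# (or as a dedicated string type in newer pandas versions).
dtype_by_column = df.dtypes.astype(str).to_dict()
print(dtype_by_column)
```

Note that 'Weight' is reported as a text column here because of the "kg" suffix, which is why it needs cleaning before any numerical analysis.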

b. What is the statistical summary of all the attributes?

The statistical summary of all attributes includes Count, Mean, Standard Deviation,
Minimum, 25th Percentile, 50th Percentile (Median), 75th Percentile, and Maximum
for numerical attributes.
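A summary of this kind can be produced with the pandas `describe()` method. The sketch below uses a hypothetical four-row sample, so the printed numbers are illustrative only.

```python
import pandas as pd

# Hypothetical sample of two numerical attributes.
df = pd.DataFrame({
    "Inches": [13.3, 15.6, 15.6, 14.0],
    "Price_euros": [1339.69, 575.00, 400.00, 1495.00],
})

# describe() returns count, mean, std, min, quartiles and max
# for every numerical column.
summary = df.describe()
print(summary)
```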
c. How to handle the missing values which are those presented as ‘?’ ?
To handle missing values represented as '?' in the dataset, we replaced them with NaN
(Not a Number) using the replace() function.

Additionally, we identified the numerical and categorical columns in the dataset and
imputed their missing values accordingly. Numerical columns were imputed with the mean
value, while categorical columns were imputed with the mode value.
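The two steps described above can be sketched as follows. The sample rows and the 'OpSys' column name are assumptions for illustration; the original code may differ in detail.

```python
import numpy as np
import pandas as pd

# Hypothetical sample in which '?' marks a missing value.
df = pd.DataFrame({
    "Inches": ["13.3", "?", "15.6", "15.6"],
    "OpSys": ["macOS", "Windows 10", "?", "Windows 10"],
})

# Step 1: turn the '?' placeholder into a real missing value (NaN).
df = df.replace("?", np.nan)

# A column that contained '?' is read as text, so convert it back to numbers.
df["Inches"] = pd.to_numeric(df["Inches"])

# Step 2: impute numerical columns with the mean, categorical ones with the mode.
numeric_cols = df.select_dtypes(include="number").columns
categorical_cols = df.select_dtypes(exclude="number").columns
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())
for col in categorical_cols:
    df[col] = df[col].fillna(df[col].mode()[0])

print(df)
```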

d. What are the independent and dependent variables?

i. Independent Variables:
Independent variables are the inputs that a researcher selects, manipulates or
controls. In this dataset they are the attributes of the laptops: company,
product, type name, screen size (in inches), CPU, RAM, memory, GPU, operating
system, and weight.

ii. Dependent Variables:

This is the variable being studied and measured. In this case, it's the price of
the laptops in euros. The price depends on the values of the independent
variables.
2. Find the correlation coefficient between independent and dependent variables?
The aim of this section is to investigate the relationship between the independent variables
(features) and the dependent variable (price of the laptop) using the correlation coefficient,
which measures the strength and direction of the linear relationship between two
variables.

Firstly, we processed the dataset by converting 'Weight' to a string, removing
non-numeric characters such as "kg", and then converting it to a float type. Similarly, we
removed the non-numeric characters from the 'Ram' column and converted it to a float.

We then selected the numerical columns 'Inches', 'Ram', 'Weight', and
'Price_euros' for the correlation analysis. We used the pandas corr() function to obtain the
correlation matrix and, from it, the correlation coefficient between the Price_euros column
and each of the other numeric columns.
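The cleaning and correlation steps can be sketched as below. The five sample rows and the "GB" suffix on 'Ram' are assumptions for illustration, so the resulting coefficients will not match the values reported in this section.

```python
import pandas as pd

# Hypothetical sample; 'Ram' and 'Weight' carry unit suffixes as text.
df = pd.DataFrame({
    "Inches": [13.3, 15.6, 15.6, 14.0, 17.3],
    "Ram": ["8GB", "8GB", "4GB", "16GB", "16GB"],
    "Weight": ["1.37kg", "1.86kg", "2.1kg", "1.3kg", "3.0kg"],
    "Price_euros": [1339.69, 575.00, 400.00, 1495.00, 1800.00],
})

# Strip the unit suffix and convert the remaining digits to float.
df["Weight"] = df["Weight"].astype(str).str.replace("kg", "", regex=False).astype(float)
df["Ram"] = df["Ram"].astype(str).str.replace("GB", "", regex=False).astype(float)

# corr() computes the Pearson correlation matrix by default.
numeric_cols = ["Inches", "Ram", "Weight", "Price_euros"]
corr_matrix = df[numeric_cols].corr()

# Keep only the correlations of each feature with the price.
price_corr = corr_matrix["Price_euros"].drop("Price_euros")
print(price_corr)
```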

Next, we analyzed the correlation coefficients obtained:

i. Inches: The correlation coefficient between laptop screen size (Inches) and
price (Price_euros) is around 0.068, showing a very weak positive
correlation.
ii. Ram: The correlation coefficient between RAM capacity and price is about
0.743, indicating a fairly strong positive correlation. As the memory capacity
increases, the price of the laptop tends to rise as well.
iii. Weight: The correlation coefficient between the weight of the laptop and its
price is roughly 0.210, which suggests a weak positive correlation. Heavier
laptops tend to be slightly more expensive.

These correlation coefficients describe the linear relationships between the
independent variables (Inches, Ram and Weight) and the dependent variable
(Price_euros), helping us understand which of these factors is most closely
associated with the price of laptops.
3. By using any algorithm of your choice, create a model which could predict the
price of the laptop.

To develop a predictive model for estimating laptop prices from selected features, we
employed the Linear Regression algorithm. The process is explained step by step below:

i. Data Preparation:
We separated the dataset into input variables (features) and a dependent variable
(price). To keep the modeling straightforward, we used multiple linear regression
with 'Inches', 'Ram' and 'Weight' as the independent variables and 'Price_euros'
as the dependent variable.

ii. Dataset Splitting:

The dataset was divided into a training set and a testing set using the train_test_split
function of the sklearn library. 80% of the dataset was reserved for training, with the
remaining 20% for testing.

iii. Model Training:

We used a Linear Regression model and trained it using the training data. The model
learned the relationships between the independent variables and the target variable
(price) during this phase.

iv. Prediction:
We used the trained model to make predictions on the testing data, estimating the
prices of laptops from their features.

v. Model Evaluation:
We assessed the performance of the model by calculating the Mean Squared Error
(MSE) between the actual prices and the predicted prices on the test
dataset. The MSE measures how closely the model's predictions match
the real values, with a smaller value indicating better performance.

vi. Results:
After evaluation, the model produced a Mean Squared Error
(MSE) of around 2.25444.48. This metric is the average
squared difference between the predicted and actual prices. Because the MSE is
expressed in squared euros, its square root (RMSE) is easier to read on the price
scale. Although the MSE indicates the accuracy of the model, a comparison with other
models is also needed to judge the overall effectiveness of the chosen model.

vii. Conclusion:
The Linear Regression model showed a promising ability to predict
laptop prices from the chosen features. Further tuning of the model, trying other
algorithms, and optimization could improve its precision, predictive performance
and robustness. Such predictions can benefit consumers, manufacturers and
retailers, helping them make better decisions about laptop buying, pricing and
market competitiveness, respectively.
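The steps above can be sketched with scikit-learn as follows. The data is synthetic and generated only for illustration, and `random_state=42` is an assumed seed, so the printed error is not comparable with the MSE reported in the results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned laptop dataset (assumption).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Inches": rng.uniform(11, 18, n),
    "Ram": rng.choice([4, 8, 16, 32], n).astype(float),
    "Weight": rng.uniform(1.0, 3.5, n),
})
df["Price_euros"] = 300 + 70 * df["Ram"] + 50 * df["Weight"] + rng.normal(0, 100, n)

# i. Data preparation: features and target.
X = df[["Inches", "Ram", "Weight"]]
y = df["Price_euros"]

# ii. Dataset splitting: 80% training, 20% testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# iii. Model training.
model = LinearRegression()
model.fit(X_train, y_train)

# iv. Prediction on the unseen test rows.
y_pred = model.predict(X_test)

# v. Model evaluation: MSE, plus RMSE on the price scale.
mse = mean_squared_error(y_test, y_pred)
rmse = np.sqrt(mse)
print(f"MSE: {mse:.2f}  RMSE: {rmse:.2f}")
```

Fixing the seed makes the split reproducible, which matters when the same model is later compared across different feature sets.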

4. Is this a Supervised model or an unsupervised model? Why so? Explain in detail.

This is a supervised learning model, designed to predict laptop prices from
their specifications or features. Supervised learning is the most common
paradigm in machine learning: every example in the dataset is paired with a
label that gives the known output for that example. These labels guide the
model's learning process by showing the expected output for each given
input.

In this case, the dataset consists of labeled examples in which every laptop
is represented by a tuple of attributes, e.g. 'Inches', 'Ram' and 'Weight',
together with its price, labeled 'Price_euros'. Pairing the features with the
target variable lets the model learn the relationship between the input
features and the target variable, and so predict prices for
unseen data.

During training, a supervised learning model learns to map input features
to their corresponding output labels by minimizing a predefined loss
function. Here, the model makes the error between its predictions and the
actual prices of the laptops in the training data as small as possible. Such a
loss can be minimized iteratively with gradient descent; for linear regression,
a least-squares solver such as the one behind scikit-learn's LinearRegression
can also compute the best-fitting parameters directly.
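For the three features used in this report, the prediction and the loss function can be written out as below. The symbols w_0 to w_3 (model parameters), n (number of training examples) and y_i (actual price of laptop i) are notation introduced here for illustration.

```latex
\hat{y}_i = w_0 + w_1\,\mathrm{Inches}_i + w_2\,\mathrm{Ram}_i + w_3\,\mathrm{Weight}_i,
\qquad
\mathrm{MSE}(w) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - \hat{y}_i\bigr)^2
```

Training chooses the parameters w that make MSE(w) as small as possible on the training data.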
Supervised learning is well suited to tasks where the aim is to learn a mapping
from input features to output labels, such as pricing a laptop from its
specifications. By using the labeled data provided in the dataset, the model
can learn the patterns and relationships within the data, which helps it
generalize to unseen instances and make accurate price
predictions.

In summary, the provided model exemplifies supervised learning by using labeled
data to train a model that learns to predict laptop prices based on their specifications.
By minimizing a predefined loss function, the model improves its
ability to map input features to output labels, showing the effectiveness
of supervised learning in regression tasks such as price prediction.

5. Continue with the same built model in No.3, but choose different independent
variables and compare the results.

To compare the results of the model with different independent variables, we followed a
similar approach as in the previous steps. Here's how we proceeded:

i. Data preprocessing:
We extracted numeric values from the 'ScreenResolution' column to create two new
columns: 'ScreenResolution_Width' and 'ScreenResolution_Height'. These columns
represent the width and height of the screen resolution, respectively. We converted
these new columns to numeric data type and dropped rows where width or height
couldn't be extracted.
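One way to perform this extraction is with a regular expression, as in the sketch below. The sample strings are hypothetical, and the pattern assumes the resolution is written as width, the letter "x", then height.

```python
import pandas as pd

# Hypothetical 'ScreenResolution' values; the last one has no resolution.
df = pd.DataFrame({
    "ScreenResolution": [
        "IPS Panel Retina Display 2560x1600",
        "Full HD 1920x1080",
        "1366x768",
        "Touchscreen",
    ]
})

# Capture the digits on both sides of the 'x' as two groups.
extracted = df["ScreenResolution"].str.extract(r"(\d+)x(\d+)")

# Convert to numeric; rows without a match become NaN.
df["ScreenResolution_Width"] = pd.to_numeric(extracted[0], errors="coerce")
df["ScreenResolution_Height"] = pd.to_numeric(extracted[1], errors="coerce")

# Drop rows where the width or height could not be extracted.
df = df.dropna(subset=["ScreenResolution_Width", "ScreenResolution_Height"])
print(df)
```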

ii. Model Training with New Variables:

We defined a new set of independent variables, including 'Ram', 'Weight',
'ScreenResolution_Width', and 'ScreenResolution_Height'. Then, we split the data
into training and testing sets using these new independent variables. We created and
trained a Linear Regression model using the new variables.

iii. Model Evaluation:

We made predictions using the new variables and evaluated the model's performance
by calculating the Mean Squared Error (MSE) between the predicted and actual
values of the dependent variable.

iv. Results:
The Mean Squared Error with the new set of independent variables was
approximately 187697.57.

v. Conclusion:
Comparing the MSE values of the two models shows how the choice of independent
variables affects predictive performance; the set of variables with the lower MSE
is the better one for predicting the price of laptops.
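A comparison like this can be organised as a small helper that trains the same model on each feature set with the same split. In the sketch below, `mse_for_features` is a hypothetical helper, and the synthetic data is built so that price depends on screen width; the outcome therefore illustrates the mechanics only, not the findings of this report.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

def mse_for_features(df, features, target="Price_euros", seed=42):
    """Fit Linear Regression on the given features and return the test-set MSE."""
    X_train, X_test, y_train, y_test = train_test_split(
        df[features], df[target], test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    return mean_squared_error(y_test, model.predict(X_test))

# Synthetic stand-in data (assumption): price depends on Ram and screen width.
rng = np.random.default_rng(1)
n = 300
resolutions = np.array([(1366, 768), (1920, 1080), (2560, 1440), (3840, 2160)])
picked = resolutions[rng.integers(0, len(resolutions), n)]
df = pd.DataFrame({
    "Inches": rng.uniform(11, 18, n),
    "Ram": rng.choice([4, 8, 16, 32], n).astype(float),
    "Weight": rng.uniform(1.0, 3.5, n),
    "ScreenResolution_Width": picked[:, 0].astype(float),
    "ScreenResolution_Height": picked[:, 1].astype(float),
})
df["Price_euros"] = (
    200 + 60 * df["Ram"] + 0.3 * df["ScreenResolution_Width"] + rng.normal(0, 50, n)
)

feature_sets = {
    "original": ["Inches", "Ram", "Weight"],
    "with_resolution": [
        "Ram", "Weight", "ScreenResolution_Width", "ScreenResolution_Height",
    ],
}

# Same seed for both runs, so both models are tested on the same rows.
results = {name: mse_for_features(df, cols) for name, cols in feature_sets.items()}
best = min(results, key=results.get)
print(results, "-> lower MSE:", best)
```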
Part B: Parallel Computing

Article 1: Parallel computing method of deep belief networks and its application to
traffic flow prediction

Study Background:
The research belongs to the field of parallel computing applications and focuses on improving
Deep Belief Networks (DBNs) with parallel computing methods. The goal of the study is to
develop efficient machine learning techniques, especially for processing huge datasets in
real-time situations. The research centers on using parallel computing to shorten the two DBN
training phases, pre-training and fine-tuning, in order to reduce the computational time
necessary for model training.

Research and Development Methodologies:

The article presents specific research and development techniques that allow DBNs to take
full advantage of parallel computing. These techniques target DBN training, accelerating the
pre-training and fine-tuning stages to reduce the time needed for model development.

1. Pre-training Phase Methodology:

Data Partitioning: The dataset X is divided into Q smaller subsets Xq (q=1, 2, …, Q),
which are distributed across multiple computing nodes.

Master-Slave Structure: A master computing node coordinates the weight and bias updates
received from the slave nodes. The master aggregates (sums) the gradients, applies the update,
and broadcasts the new parameters to the worker (slave) nodes, which compute their local
changes and send them back to the master node.

Algorithm 4: Pre-training with the parallel computing method consists of partitioning the
dataset, initializing the DBN structure, transmitting the parameters, computing the
variations, updating the weights and biases, and repeating these steps for each epoch of
training.
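The master-slave pattern described above can be illustrated with a deliberately simplified sketch. It uses a linear model with a squared-error loss as a stand-in for a DBN layer (an assumption; the article's pre-training procedure is more involved), and the "workers" are ordinary function calls run one after another rather than separate computing nodes. The function names are hypothetical.

```python
import numpy as np

def local_gradient(w, X_q, y_q):
    """Worker step: gradient of the summed squared error on one partition."""
    residual = X_q @ w - y_q
    return 2.0 * X_q.T @ residual

def master_step(w, partitions, lr):
    """Master step: sum the worker gradients, update w, return the new w."""
    total = sum(local_gradient(w, X_q, y_q) for X_q, y_q in partitions)
    n = sum(len(y_q) for _, y_q in partitions)
    return w - lr * total / n

# Synthetic data with known weights (assumption, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

# Partition the dataset into Q sub-datasets, one per worker.
Q = 4
partitions = list(zip(np.array_split(X, Q), np.array_split(y, Q)))

w_parallel = np.zeros(3)
w_serial = np.zeros(3)
for epoch in range(200):
    # Parallel variant: Q partitions feed one aggregated update.
    w_parallel = master_step(w_parallel, partitions, lr=0.1)
    # Serial variant: a single node sees the whole dataset.
    w_serial = master_step(w_serial, [(X, y)], lr=0.1)

print(w_parallel, w_serial)
```

Because the summed partition gradients equal the full-dataset gradient, both variants follow the same update path, which is the sense in which a parallel scheme can match the serial result.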

2. Fine-tuning Phase Methodology:

Dataset Division: As in pre-training, the original dataset X is partitioned into Q parts for
fine-tuning.

Error Function Revision: The error function is revised by adding a factor m, so that it becomes
an average of the error functions computed on the sub-datasets Xq.

Algorithm 5: Parallel fine-tuning proceeds in several steps: dataset distribution, DBN
model initialization, model parameter broadcasting, variation computation, weight and bias
updating, and repetition over the training epochs.
3. Parallel Architecture:
Master-Slave Computing Structure: The master-slave computational structure is used in both
the pre-training and fine-tuning phases.

Comparison with Serial Computing: Using multiple nodes with multiple sub-datasets
is the parallel method, while using a single node with a single dataset is the
serial method. The parallel approach is designed to produce, both in theory and in
practice, the same result as the serial approach.

4. Evaluation Indices:
Acceleration Ratio and Efficiency: The impact of parallel computing is assessed with the
acceleration ratio and efficiency metrics, which are computed from runtimes and compare the
serial and parallel learning models in the pre-training and
fine-tuning phases.
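These two indices are commonly defined as below, where T_1 is the runtime of the serial method, T_p the runtime of the parallel method on p nodes, S_p the acceleration ratio (speedup) and E_p the efficiency. The notation is an assumption for illustration and may differ from that used in the article.

```latex
S_p = \frac{T_1}{T_p},
\qquad
E_p = \frac{S_p}{p} = \frac{T_1}{p\,T_p}
```

An efficiency close to 1 means the p nodes are used almost fully, while communication between master and workers typically pushes it below 1.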
These strategies are designed to reduce the training time of DBNs and make them
more resource-friendly by combining two approaches: parallel computing
and model reduction. The study stresses the importance of optimizing
pre-training and fine-tuning with parallel computing in order to speed up the training of
DBNs. This is especially important in real-time applications, where quick data processing
is necessary.

Performance Analysis:
The research concentrates on shortening the training time of Deep Belief Networks (DBNs) by
building parallel computing strategies into the training stages. Its main objective is to speed
up DBN training by sharing the workload across multiple nodes and running the training phases in
parallel, which should greatly reduce the total model training time. This approach is most
relevant where large datasets must be processed reliably and in a time-efficient manner for
machine learning algorithms to perform well.
The parallel computing method is adapted to the learning algorithms to increase
their efficiency and speed up the DBN training procedure. The study splits the
computational workload across several nodes with the goal of accelerating training
without sacrificing precision. Designed to handle the complexity of processing big
datasets, this approach aims to train DBNs that deliver fast and accurate real-time
predictions.
In essence, the research applies parallel computing to speed up DBN training and make it
more productive. It addresses the problems of working with bulky datasets by pooling the
processing power of several nodes at the same time. Parallelization takes place in all
training stages and distributes the workload across the network. This accomplishes two
goals: it improves the performance of the learning algorithms, and it indicates which parallel
computing techniques can help optimize machine learning models for real-time
applications that require high speed and accuracy.
Results Analysis:
The research evaluates the performance of the parallel computing approach by comparing the results
of DBNs trained with the parallel methods against those trained with traditional serial methods.
The comparison shows that the parallel training algorithms achieve prediction results
comparable to those of the serial training algorithms. This indicates that the parallel DBN
training techniques preserve prediction quality while improving training efficiency. Moreover,
the evaluation indices of acceleration ratio and efficiency are used to express the improvements in
the performance of pre-training and fine-tuning under parallel computing (Zhao et al., 2019).

Conclusion:
In conclusion, the research suggests that parallel computation makes it possible to speed up
the training of a Deep Belief Network while reproducing the results of serial training. The study
demonstrates how parallel pre-training and fine-tuning phases can lead to a substantial
improvement in computational efficiency and training speed, underscoring the significance of
parallel computing methodologies for improving the performance of machine learning algorithms
on large-scale datasets.
Article 2: QuantCloud: A Software with Automated Parallel Python for Quantitative
Finance Applications

Study Background:
The study describes the development of QuantCloud, a software suite for quantitative finance
applications that integrates a parallel Python system with a Big Data system written in C++.
Its main aim is to speed up both execution and the software development life cycle, which is
of high importance for trading companies operating in quantitative finance.
The parallel execution of Python codes is demonstrated by testing the software on Intel Xeon E5
processors and Intel Xeon Phi processors, using moving-window and autoregressive
moving-average (ARMA) algorithms. The tests show a nearly linear speedup, which suits modern
multicore processors. Combining C++ for the big data infrastructure with Python for the
user-facing analysis code enables fast strategy development and testing in the quantitative
investment industry, where speed is critical when competing with
others.

Research and Development Methodologies:

Specific Implementation Techniques:

Coprocess-Based Approach: The study utilizes a coprocess-based strategy for parallel execution of
Python codes, allowing for concurrent processing and efficient utilization of system resources.
Shared Memory System: Data communication between the main C++ program and embedded
Python scripts occurs through a shared memory system, facilitating seamless interaction and data
exchange.
Embedded Python Interface: An embedded Python interface is designed to enable effortless
integration with the big data infrastructure, ensuring smooth execution of Python scripts within the
system.

Optimization Strategies:
Intra-Node Parallelism: The system leverages multithreaded programming for intra-node parallelism
on shared memory, optimizing resource utilization within a single computing node.
Thread Pool Management: A thread pool is employed to manage threads efficiently, enhancing the
scalability and performance of parallel Python execution.
Asynchronous Execution Mechanism: The system implements an asynchronous execution
mechanism to overlap data serialization and analytics operations, reducing latency and improving
overall system efficiency.
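The thread pool idea can be illustrated with the Python standard library. The sketch below runs a simple moving-window average for several symbols through a pool; the symbols, prices and the `moving_average` helper are hypothetical, and it uses threads purely to show the pool pattern, whereas QuantCloud itself is described as running Python in coprocesses driven from C++.

```python
from concurrent.futures import ThreadPoolExecutor

def moving_average(prices, window):
    """Return the simple moving average over each full window of prices."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(prices))
    ]

# Hypothetical tick prices per symbol.
ticks_by_symbol = {
    "AAA": [10.0, 11.0, 12.0, 13.0],
    "BBB": [20.0, 20.0, 23.0, 26.0],
}

# The pool reuses a fixed number of threads instead of creating one per task.
with ThreadPoolExecutor(max_workers=2) as pool:
    # submit() returns immediately, so the tasks can overlap.
    futures = {
        symbol: pool.submit(moving_average, prices, 3)
        for symbol, prices in ticks_by_symbol.items()
    }
    # result() waits for each task to finish and returns its value.
    results = {symbol: future.result() for symbol, future in futures.items()}

print(results)
```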

Testing Procedures:
Performance Benchmarking: The study conducts performance benchmarking to evaluate the
speedup, parallel efficiency, and wallclock time for executing Python codes in parallel.
Real-World Market Data Testing: Testing procedures involve the application of the system to
real-world market data, assessing the system's performance in handling complex financial datasets.
Comparative Analysis: Extensive comparative studies are conducted between different processors,
such as Intel Xeon E5 and Xeon Phi, to analyze the system's performance under varying hardware
configurations.
Together, these implementation techniques, optimization strategies, and testing
procedures show how the QuantCloud software suite integrates Python and C++ for
parallel execution in quantitative finance applications, meeting the growing demands
of big data analysis in the financial sector.

Performance Analysis:
The performance analysis in the study delves into an in-depth comparative examination to evaluate
the effectiveness of the algorithms implemented in QuantCloud. By assessing various performance
metrics on real-world market data, the study provides valuable insights into the system's efficiency
and speed when executing time series analysis models coded in Python, especially on advanced
multicore processors like Intel Xeon Phi.

Performance Metrics Interpretation:

Wall Clock Time:

Overall Wallclock Time: This metric represents the total elapsed time from the initiation to the
completion of a process pipeline, encompassing data queries, preparation, Python script execution,
and result output. A decrease in overall wallclock time signifies improved efficiency in processing
financial data and executing analysis models.
Embedded-Python Wall Clock Time: This metric focuses on the time spent specifically in executing
Python codes within the coprocess-based parallel strategy. A reduction in embedded-Python wallclock
time indicates enhanced speed and efficiency in Python script execution.

Latency:
Microseconds per Tick: Latency, reported in microseconds per tick, reflects the average time taken
to process a single tick message. Lower latency values indicate quicker processing and improved
responsiveness of the system to market data, enhancing real-time decision-making capabilities.

Speedup and Parallel Efficiency:

Speedup Ratio: The speedup for Python codes is defined as the ratio of elapsed times with varying
numbers of coprocesses. A higher speedup ratio signifies a more efficient parallel execution of Python
scripts, leading to faster processing and analysis of financial data.
Parallel Efficiency: Parallel efficiency provides an estimate of how well the system utilizes
parallelism to enhance Python script execution. Higher parallel efficiency values indicate optimal
resource utilization and improved performance on multicore processor architectures.
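The metrics above can be computed from measured times as in the sketch below. The function names and the timing figures are hypothetical and chosen only to show the arithmetic; they are not results from the study.

```python
def latency_us_per_tick(elapsed_seconds, n_ticks):
    """Average processing time per tick message, in microseconds."""
    return elapsed_seconds * 1_000_000 / n_ticks

def speedup(t_base, t_parallel):
    """Ratio of the baseline elapsed time to the parallel elapsed time."""
    return t_base / t_parallel

def parallel_efficiency(t_base, t_parallel, n_coprocesses):
    """Speedup divided by the number of coprocesses (1.0 is ideal)."""
    return speedup(t_base, t_parallel) / n_coprocesses

# Hypothetical wallclock times (seconds) by number of coprocesses.
timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 12.5}
t_base = timings[1]

for n, t in timings.items():
    s = speedup(t_base, t)
    e = parallel_efficiency(t_base, t, n)
    print(f"{n} coprocesses: speedup {s:.2f}, efficiency {e:.2f}")
```

In this made-up example the speedup is linear up to 4 coprocesses and drops below linear at 8, which is the pattern that a falling parallel efficiency makes visible.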

Results Interpretation:

Scalability and Performance Improvement:

Consistent Performance Improvement: The study demonstrates consistent performance
improvement with an increase in coprocesses, reaffirming the scalability and efficiency of the system
in executing Python codes in parallel.
Linear Speedup: The nearly linear speedup observed in the performance tests indicates the system's
ability to scale effectively with additional coprocesses, resulting in improved processing speed and
performance.
Comparative Analysis:
Intel Xeon E5 vs. Xeon Phi: The comparison between Intel Xeon E5 and Xeon Phi processors
reveals significant performance differences, with the Xeon Phi processor consistently outperforming
the Xeon E5 in terms of overall wallclock time, latency, and speedup for Python codes.
Optimal Processor Selection: The study highlights the superiority of the Xeon Phi processor in
handling complex financial analysis models, showcasing its ability to reduce overall wallclock time
and improve parallel efficiency compared to the Xeon E5 processor.

Results Analysis:
The outcomes of the study show that the QuantCloud software suite can deliver large
speedups and good parallel efficiency when executing Python codes in parallel. The evaluation
shows that running production time-series quantitative analysis models on a hybrid system,
combining a parallel Python component with a high-speed C++-based data infrastructure,
streamlines the computation needed for the chosen analysis. The research also demonstrates
the marked advantage of Intel Xeon Phi processors over Xeon E5 processors in per-stock latency
and workload throughput, indicating the importance of selecting the right processor for
quantitative finance applications (Zhang et al., 2018).

Conclusion:
In conclusion, this investigation demonstrates how powerful tools such as QuantCloud are in
quantitative finance, where speed is a prerequisite for companies to
gain a competitive advantage. By combining Python with a C++-based big data backend, the
software suite offers a powerful environment tailor-made for strategy
development and testing in quantitative finance. The findings demonstrate that the system
delivers significant performance improvements and speedups, particularly on modern
multicore processors, indicating its potential for enhancing the performance of
quantitative finance applications.
References

Zhao, L., Zhou, Y., Lu, H., & Fujita, H. (2019). Parallel computing method of deep belief
networks and its application to traffic flow prediction. Knowledge-Based Systems, 163,
972–987. https://doi.org/10.1016/j.knosys.2018.10.025

Zhang, P., Gao, Y., & Shi, X. (2018). QuantCloud: a software with automated parallel python
for quantitative finance applications. Ieeexplore.ieee.org; IEEE.
https://ieeexplore.ieee.org/abstract/document/8424990

