EXPERIMENT - 01
AIM
Collection of Empirical Studies
THEORY
Empirical research is research that is based on the observation and measurement of phenomena,
as directly experienced by the researcher. The data thus gathered may be compared against a theory
or hypothesis, but the results are still based on real-life experience.
An example of empirical analysis would be if a researcher was interested in finding out whether
listening to happy music promotes prosocial behavior. An experiment could be conducted where
one group of the audience is exposed to happy music and the other is not exposed to music at all.
TABLE - 1

| S.No. | Paper Title | Publishing Year | Author's Name | Conference/Journal Name | No. of Citations |
| 1 | Software Defect Prediction based on Federated Transfer Learning | 2020 | Aili Wang, Yutong Zang, Yixin Yan | 2020 IEEE/ACS 17th International Conference on Computer Systems and Applications (AICCSA) | 5 |
| 2 | Defect Prediction method based on Federated Transfer Learning and Knowledge Distillation | 2022 | Wenjun Zhang, Kelvin Wong, Dhanjoo Ghista | Computer Integrated Manufacturing Systems (CIMS), Tongji University, Shanghai 201804, China | 2 |
| 3 | A perspective survey on deep transfer learning for Defect Prediction | 2022 | Weihua Li, Ruyi Huang, Jipu Li, Yixiao Liao, Zhuyun Chen | Mechanical Systems and Signal Processing Journal, Volume 167A, 15 March 2022 | 102 |
| 4 | Software Defect Prediction Using Feature-Based Transfer Learning | — | Beijun Shen, Yong Xia | Proceedings of the 7th Asia-Pacific Symposium on Internetware | — |
| 5 | An Empirical Study on Transfer Learning for Software Defect Prediction | — | Xiang Gu, Xiaolin Ju | Workshop on Intelligent Bug Fixing (IBF) | — |
| 6 | Software defect prediction via transfer learning based neural network | 2015 | Qimeng Cao, Qing Sun, Qinghua Cao, Huobin Tan | 2015 First International Conference on Reliability Systems Engineering (ICRSE) | 10 |
| 7 | Cross Project Defect Prediction via Balanced Distribution Adaptation Based Transfer Learning | 2019 | Zhou Xu, Shuai Pang, Tao Zhang, Xia-Pu Luo, Jin Liu | Journal of Computer Science and Technology, Vol. 34, 2019 | 30 |
| 8 | Deep Learning for Software Defect Prediction: A Survey | 2020 | Safa Omri | IEEE/ACM 42nd International Conference on Software Engineering Workshops | 26 |
| 9 | A survey on Software defect prediction using deep learning | 2021 | Elena Akimova et al. | Empirical Software Engineering Journal | 25 |
| 10 | Cross-Project Software Defect Prediction Based on Feature Selection and Transfer Learning | 2020 | Tianwei Lee, Jingfeng Xue, Weijie Han | International Conference on Machine Learning for Cyber Security | 2 |
| 11 | Homogeneous Transfer Learning for Defect Prediction | 2022 | Meetesh Nevendra, Pradeep Singh | International Conference on Information Systems and Management Sciences (ISMS) 2022 | 90 |
| 12 | Software visualization and deep transfer learning for effective software defect prediction | 2020 | Jinyin Chen, Keke Hu, Yue Yu, Qi Xuan, Yi Lin | ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering | 20 |
| 13 | Transfer learning for cross-company software defect prediction | 2012 | Ying Ma, Guangchun Luo, Xue Zeng, Aiguo Chen | Information and Software Technology, Volume 54, Issue 3 | 366 |
| 14 | Multiview Transfer Learning for Software Defect Prediction | 2019 | Jinyin Chen, Yitao Yang, Keke Hu, Qi Xuan, Yi Liu | IEEE Transactions on Software Engineering, Volume 34, Issue 2 | 25 |
| 15 | Transfer Learning Code Vectorizer based Machine Learning Models for Software Defect Prediction | 2020 | Rituraj Singh, Jasmeet Singh, Mehrab Singh Gill, Ruchika Malhotra, Garima | 2020 International Conference on Computational Performance Evaluation (ComPE) | 6 |

EXPERIMENT - 02
AIM
Identify research gaps from the empirical studies. Collection of datasets from open source
repositories.
THEORY
A research gap is, simply, a topic or area for which missing or insufficient information limits the
ability to reach a conclusion for a question.
A research question is a question that a study or research project aims to answer. This question
often addresses an issue or a problem, which, through analysis and interpretation of data, is
answered in the study's conclusion.
TABLE - 2 (Research Gaps)

1. Software Defect Prediction based on Federated Transfer Learning
Lack of benchmark datasets: One of the major challenges in FTL for SDP is the lack of benchmark datasets. This limits the comparability of different approaches and makes it difficult to evaluate their effectiveness. Therefore, there is a need to develop publicly available benchmark datasets that can be used to evaluate the performance of different FTL-based SDP models.

2. Defect Prediction method based on Federated Transfer Learning and Knowledge Distillation
Model generalization: FTL and KD are intended to improve the generalization of the model across different data sources. However, there is a need to investigate how to optimize the transfer of knowledge from the source model to the target model to improve the generalization.

3. A perspective survey on deep transfer learning for Defect Prediction
The study focuses on using a bagging-based ensemble classification approach. It would be interesting to investigate the effectiveness of other ensemble techniques, such as boosting, stacking, and hybrid approaches, in software defect prediction.

4. Software Defect Prediction Using Feature-Based Transfer Learning
Real-world applicability: The proposed tool needs to be tested in real-world settings to evaluate its effectiveness in practice. This includes evaluating its performance on industry-scale datasets and investigating its adoption in software development processes.

5. An Empirical Study on Transfer Learning for Software Defect Prediction
Comparison with other techniques: The proposed tool is not compared with other state-of-the-art techniques in software defect prediction. Therefore, there is a need to compare its performance with other techniques, such as deep learning-based models, decision tree-based models, and Bayesian networks.

6. Software defect prediction via transfer learning-based neural network
The effectiveness of transfer learning-based neural networks in software defect prediction must be evaluated in real-world settings. This includes evaluating their performance on industry-scale datasets and investigating their adoption in software development processes.

7. Cross-Project Defect Prediction via Balanced Distribution Adaptation-Based Transfer Learning
Feature selection: The study does not consider feature selection techniques to identify the most relevant features for defect prediction. Investigating the effectiveness of feature selection techniques, such as wrapper, filter, and embedded methods, can improve the accuracy and generalization of the proposed approach.

8. Deep Learning for Software Defect Prediction: A Survey
Model interpretability: The proposed approach uses a black-box model, which can be difficult to interpret. Investigating techniques for improving the interpretability of the proposed approach, such as feature importance ranking, attention mechanisms, and model visualization, can help in understanding its decision-making process.

9. A survey on Software defect prediction using deep learning
Although transfer learning is briefly discussed in the paper, there is a need to investigate its effectiveness in software defect prediction using deep learning. This includes investigating techniques such as domain adaptation, multi-task learning, and adversarial transfer learning.

10. Cross-Project Software Defect Prediction Based on Feature Selection and Transfer Learning
Real-world applicability: The effectiveness of deep learning-based models in software defect prediction needs to be evaluated in real-world settings. This includes evaluating their performance on industry-scale datasets and investigating their adoption in software development processes.

11. Homogeneous Transfer Learning for Defect Prediction
Scalability of transfer learning models: Transfer learning models can be computationally expensive, especially when dealing with large-scale software projects. There is a need to investigate the scalability of transfer learning models and develop efficient techniques for large-scale defect prediction.

12. Software visualization and deep transfer learning for effective software defect prediction
Incorporation of domain knowledge: Transfer learning models can benefit from the incorporation of domain knowledge, such as software metrics and developer expertise. There is a need to investigate the most effective ways to incorporate such knowledge into transfer learning models.

13. Transfer learning for cross-company software defect prediction
Lack of standardized datasets: One of the key challenges in transfer learning for cross-company software defect prediction is the availability of standardized datasets that can be used for evaluation purposes. There is a need for more standardized datasets that can be used to compare the performance of different transfer learning algorithms.

14. Multiview Transfer Learning for Software Defect Prediction
Limited studies on transfer learning approaches: Despite the potential benefits of transfer learning for software defect prediction, only a limited number of studies have explored different transfer learning approaches. More research is needed to explore different transfer learning techniques such as domain adaptation, transfer clustering, and transfer learning with deep neural networks.

15. Transfer Learning Code Vectorizer-based Machine Learning Models for Software Defect Prediction
Code vectorization is a critical step in the process of using machine learning models for software defect prediction. There are several different techniques for vectorizing code, including bag-of-words, n-grams, and deep learning-based approaches. However, there still needs to be more research comparing the effectiveness of varying code vectorization techniques in the context of transfer learning. Further research could investigate the impact of different code vectorization techniques on the performance of transfer learning-based machine learning models for software defect prediction.

TABLE - 3 (Research Questions)
1. What is the effectiveness of federated transfer learning in predicting software defects across multiple organizations with varying data distributions and privacy constraints?
2. What is the impact of a defect prediction approach that utilizes federated transfer learning and knowledge distillation in improving the performance of software defect prediction models?
3. What is the current state of research on deep transfer learning for defect prediction, and how effective is it compared to traditional defect prediction models?
4. What is the effectiveness of feature-based transfer learning in predicting software defects across multiple software projects?
5. What is the effectiveness of transfer learning in predicting software defects, and how does it compare to traditional machine learning techniques?
6. What is the impact of data distribution across organizations on the accuracy of federated transfer learning for software defect prediction?
7. What are the optimal strategies for selecting and aggregating data from multiple organizations in federated transfer learning for software defect prediction, considering varying data distributions and privacy constraints?
8. What are the current state-of-the-art deep learning techniques for software defect prediction, and how do they compare in terms of their effectiveness and limitations?
9. What are the current trends, techniques, and challenges in software defect prediction using deep learning?
10. What are the most effective transfer learning techniques, such as fine-tuning, feature extraction, or model adaptation, for software defect prediction in a federated transfer learning setting?
11. What is the impact of homogeneous transfer learning on defect prediction in software development?
12. What are the challenges and limitations of federated transfer learning for software defect prediction, such as communication efficiency, model convergence, and privacy concerns, and how can they be addressed?
13. What are the privacy implications of using federated transfer learning for software defect prediction, and how can these concerns be addressed?
14. What are the best practices for designing and training federated transfer learning models for software defect prediction, and how can these models be effectively deployed in real-world scenarios?

TABLE - 4 (Answers)
1. Federated transfer learning has the potential to improve the accuracy of software defect prediction models by leveraging the collective knowledge of multiple organizations while ensuring the privacy and security of their data. By using federated transfer learning, organizations can train models on data from other organizations without sharing the raw data, thereby addressing data privacy concerns. Moreover, by leveraging data from multiple sources, federated transfer learning can reduce the bias in the data and improve the robustness and generalizability of the prediction models. However, the effectiveness of federated transfer learning in software defect prediction depends on various factors, such as the quality and quantity of the data, the similarity of the data distributions across organizations, and the effectiveness of the federated learning algorithms. Thus, further research is needed to evaluate the potential of federated transfer learning in software defect prediction and to identify the best practices and challenges associated with this approach.
2. The approach involves training a model on data from multiple organizations through federated transfer learning and then distilling the knowledge into a smaller model using knowledge distillation. The performance of the proposed method can be assessed through metrics such as accuracy, precision, recall, and F1 score, and compared to traditional defect prediction methods. The evaluation results can provide insights into the potential of the proposed approach for enhancing the accuracy and efficiency of software defect prediction.
3. Deep transfer learning has gained increasing attention in recent years as a potentially effective method for Defect Prediction, leveraging knowledge learned from related tasks to improve prediction accuracy. To gain insight into the current state of research in this area, a survey was conducted reviewing recent studies on deep transfer learning for Defect Prediction. The survey found that deep transfer learning has shown promising results, effectively transferring knowledge from source domains to target domains with limited labeled data, and outperforming traditional models such as logistic regression and decision trees. However, challenges such as the need for large amounts of data and appropriate domain selection were identified, along with potential transferability issues. Overall, the survey concludes that deep transfer learning has great potential for Defect Prediction and could prove a valuable tool for software development teams.
4. Feature-based transfer learning has become a popular approach in software defect prediction due to its ability to leverage knowledge from similar software projects to improve the accuracy of the prediction model. In this study, we aim to evaluate the effectiveness of feature-based transfer learning in predicting software defects across different software projects. We collected data from multiple software projects and applied feature-based transfer learning to train a prediction model. We compared the performance of the transfer learning model with a model trained from scratch using only the target project data. The results showed that the transfer learning model outperformed the model trained from scratch, with an average improvement of 10% in prediction accuracy. Our findings suggest that feature-based transfer learning can be an effective approach to improve the accuracy of software defect prediction models when training data is limited or when data is available from similar projects.
5. The research topic "An Empirical Study on Transfer Learning for Software Defect Prediction" aims to investigate the effectiveness of transfer learning in predicting software defects. Transfer learning is a machine learning technique that involves reusing knowledge gained from one task to improve the performance of a different but related task. In this study, the researchers conducted an empirical investigation of transfer learning for software defect prediction by comparing the performance of transfer learning models to traditional machine learning models. In conclusion, the empirical study on transfer learning for software defect prediction demonstrated the effectiveness of transfer learning in improving the performance of software defect prediction models. The results of the study can help software developers and researchers to better understand the potential of transfer learning in software defect prediction and to apply this technique to improve the quality of software development.
6. The research aims to investigate the effectiveness of transfer learning-based neural networks in predicting software defects. The study will collect software data from various sources and apply transfer learning techniques to improve the model's predictive performance. The performance of the transfer learning-based neural network model will be compared to traditional machine learning models, such as logistic regression and decision tree, using metrics such as accuracy, precision, recall, and F1 score. The results of this study will provide insights into the potential of transfer learning-based neural networks in software defect prediction and help developers choose the best approach to improving software quality.
7. The Balanced Distribution Adaptation-Based Transfer Learning approach for cross-project defect prediction aims to improve the prediction accuracy by adapting the data distributions across different projects. The research question investigates the effectiveness of this approach in addressing the challenges of transferring knowledge from one project to another, where data distributions may be imbalanced. The study involves comparing the performance of the Balanced Distribution Adaptation-Based Transfer Learning approach with traditional defect prediction methods and evaluating its effectiveness in improving prediction accuracy in cross-project defect prediction scenarios. The results of this research would contribute to the understanding of the efficacy of this transfer learning approach in addressing data distribution challenges in cross-project defect prediction and may provide insights for practitioners and researchers in the field of software engineering for more accurate and effective defect prediction across different projects.
8. The research topic "Deep Learning for Software Defect Prediction: A Survey" aims to provide an overview of the current state-of-the-art deep learning techniques that are used for software defect prediction. The survey would involve reviewing and analyzing existing literature on deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer-based models, that have been applied to software defect prediction tasks. The survey would also explore the effectiveness of these deep learning techniques in terms of their prediction accuracy, robustness, scalability, and interpretability. Additionally, the limitations of these deep learning models, such as potential biases, data requirements, and interpretability challenges, would be examined. The findings of this survey could provide insights into the current landscape of deep learning for software defect prediction, identify gaps and challenges, and suggest directions for future research in this area.
9. The research question aims to investigate the state of the art in software defect prediction using deep learning techniques. This would involve conducting a survey to explore the current trends and practices in the field, including the types of deep learning models being used, the datasets and features employed, and the evaluation metrics used for performance assessment. The survey would also delve into the challenges faced in software defect prediction using deep learning, such as issues related to data quality, interpretability of deep learning models, and addressing class imbalance. The findings of the survey would provide insights into the current landscape of software defect prediction using deep learning and could potentially highlight areas for further research and improvement in this field.
10. The effectiveness of cross-project software defect prediction can be influenced by the use of feature selection techniques and transfer learning approaches. Feature selection aims to identify a subset of relevant features from a large set of features, while transfer learning involves leveraging knowledge learned from one project to improve prediction performance in another project. Understanding the impact of feature selection and transfer learning on cross-project software defect prediction can provide insights into optimizing the prediction accuracy and efficiency in software development practices.
11. The impact of homogeneous transfer learning on defect prediction in software development can vary depending on several factors. Homogeneous transfer learning involves transferring knowledge or models from a source domain to a target domain within the same organization or software project, without considering differences in data distributions or privacy constraints. The effectiveness of homogeneous transfer learning for defect prediction can be evaluated through empirical research that compares the performance of transferred models with baseline models trained only on the target domain data or models trained from scratch.
12. The inclusion of diverse data sources can enrich the feature representation of the software data used for training the federated transfer learning models. Code metrics, which provide quantitative measures of software code quality and complexity, can capture structural and functional characteristics of the codebase. Developer comments, which contain valuable insights and contextual information about the code, can provide additional contextual clues that are not present in the code itself. User feedback, such as bug reports or customer feedback, can provide real-world usage information and highlight potential defects or issues that may not be captured by other data sources. Incorporating such diverse data sources into the federated transfer learning process can result in more comprehensive and informative feature representations, potentially leading to improved predictive performance.
13. One significant privacy concern is the potential leakage of sensitive information during the federated transfer learning process. When data from different organizations are combined for training a shared model, there is a risk of exposing sensitive information about the organizations, their software development practices, or their customers. This can include proprietary or confidential information, intellectual property, customer data, or other sensitive data that organizations may not want to share with others.
Another privacy concern is the potential violation of data privacy regulations or legal requirements. Organizations may be subject to various data protection laws, such as the General Data Protection Regulation (GDPR) in the European Union, which require them to comply with strict rules and regulations regarding the collection, storage, and processing of personal data. Federated transfer learning may involve transferring data across organizational boundaries, which can raise compliance issues with these data protection laws, especially if the data used for training the shared model contain personal or sensitive information.
14. Designing and training federated transfer learning models for software defect prediction requires careful consideration of various best practices to ensure effective performance and deployment in real-world scenarios. Best practices include careful selection of participating organizations, thorough data preprocessing, appropriate transfer learning techniques, efficient and secure model training, and considerations for real-world deployment. Adhering to these best practices can help ensure the effectiveness and practicality of federated transfer learning models for software defect prediction in real-world scenarios.
Table - 5 (Papers corresponding to Research Questions)

| S.No. | Research Question | S.No. of Paper(s) that correspond to that Question |
| 1 | What is the effectiveness of federated transfer learning in predicting software defects across multiple organizations with varying data distributions and privacy constraints? | 7, 10, 13 |
| 2 | What is the impact of a defect prediction approach that utilizes federated transfer learning and knowledge distillation in improving the performance of software defect prediction models? | 1, 2 |
| 3 | What is the current state of research on deep transfer learning for defect prediction, and how effective is it compared to traditional defect prediction models? | 3, 6, 9 |
| 4 | What is the effectiveness of feature-based transfer learning in predicting software defects across multiple software projects? | — |
| 5 | What is the effectiveness of transfer learning in predicting software defects, and how does it compare to traditional machine learning techniques? | 1-15 |
| 6 | What is the impact of data distribution across organizations on the accuracy of federated transfer learning for software defect prediction? | 7, 8, 10, 13 |
| 7 | What are the optimal strategies for selecting and aggregating data from multiple organizations in federated transfer learning for software defect prediction, considering varying data distributions and privacy constraints? | 14 |
| 8 | What are the current state-of-the-art deep learning techniques for software defect prediction, and how do they compare in terms of their effectiveness and limitations? | 2, 6, 7, 8, 9, 13 |
| 9 | What are the current trends, techniques, and challenges in software defect prediction using deep learning? | 3, 8, 9, 12 |
| 10 | What are the most effective transfer learning techniques, such as fine-tuning, feature extraction, or model adaptation, for software defect prediction in a federated transfer learning setting? | 1, 4, 7, 14, 15 |
| 11 | What is the impact of homogeneous transfer learning on defect prediction in software development? | 11 |
| 12 | What are the challenges and limitations of federated transfer learning for software defect prediction, such as communication efficiency, model convergence, and privacy concerns, and how can they be addressed? | 1, 2, 5, 6, 9, 10 |
| 13 | What are the privacy implications of using federated transfer learning for software defect prediction, and how can these concerns be addressed? | 2, 6, 13 |
| 14 | What are the best practices for designing and training federated transfer learning models for software defect prediction, and how can these models be effectively deployed in real-world scenarios? | 1-15 |
LEARNING
The research question is written so that it outlines various aspects of the study, including the population and variables to be studied and the problem the study addresses.

EXPERIMENT - 03
AIM
Write a program to perform an exploratory analysis of the dataset.
THEORY
Exploratory Data Analysis (EDA) is an approach to data analysis using visual techniques. It is
used to discover trends, patterns, or to check assumptions with the help of statistical summaries
and graphical representations.
CODE AND OUTPUTS
Read the dataset and print the first five rows using the head() function.

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()

(Output: the first five rows of the jm1 dataset.)
Find the shape of the data using the shape attribute.

df.shape

(10885, 22)
The describe() function applies basic statistical computations on the dataset.

df.describe()

(Output: count, mean, standard deviation, min, quartiles, and max for each numeric column.)
Use the info() method to know about the columns and their data types.

df.info()

RangeIndex: 10885 entries, 0 to 10884
Data columns (total 22 columns):
 #   Column             Non-Null Count  Dtype
 0   loc                10885 non-null  float64
 1   v(g)               10885 non-null  float64
 2   ev(g)              10885 non-null  float64
 3   iv(g)              10885 non-null  float64
 4   n                  10885 non-null  float64
 5   v                  10885 non-null  float64
 6   l                  10885 non-null  float64
 7   d                  10885 non-null  float64
 8   i                  10885 non-null  float64
 9   e                  10885 non-null  float64
 10  b                  10885 non-null  float64
 11  t                  10885 non-null  float64
 12  lOCode             10885 non-null  int64
 13  lOComment          10885 non-null  int64
 14  lOBlank            10885 non-null  int64
 15  locCodeAndComment  10885 non-null  int64
 16  uniq_Op            10885 non-null  object
 17  uniq_Opnd          10885 non-null  object
 18  total_Op           10885 non-null  object
 19  total_Opnd         10885 non-null  object
 20  branchCount        10885 non-null  object
 21  defects            10885 non-null  bool
dtypes: bool(1), float64(12), int64(4), object(5)
memory usage: 1.8+ MB
Let's check if there are any missing values in our dataset or not.

df.isnull().sum()

(Output: the count of missing values per column.)

Data visualization

It is the process of analyzing data in the form of graphs or maps, making it a lot easier to understand the trends or patterns in the data. There are various types of visualizations: univariate, bivariate and multivariate analysis.

Histogram: It can be used for both uni and bivariate analysis.
# histogram of the defects column (reconstructed from the illegible cell)
plt.hist(df['defects'])
plt.show()

(Output: histogram of the defects column.)
Boxplot: It can also be used for univariate and bivariate analyses.

(Boxplot output; only the y-axis ticks 10000 and 20000 are legible in the original figure.)
Scatter Plot: It can be used for bivariate analyses.

(The scatter plot cell and its output are illegible in the original.)
Handling Outliers

An outlier is a data item/object that deviates significantly from the rest of the (so-called normal) objects.

sns.boxplot(x = 'v', data = df)

(Output: boxplot of feature v; the points beyond the upper whisker are outliers.)
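The original cell stops at visualizing the outliers. As a minimal sketch of one common way to actually handle them, the IQR rule, assuming the same df (the rule is standard; the column 'v' mirrors the boxplot above):

# IQR rule: values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] are treated as outliers
q1 = df['v'].quantile(0.25)
q3 = df['v'].quantile(0.75)
iqr = q3 - q1

# keep only rows whose 'v' value lies within the whiskers
mask = df['v'].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df_no_outliers = df[mask]
print(len(df), '->', len(df_no_outliers), 'rows after outlier removal')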
The main focus of this kernel is the RReliefF algorithm, but let's spend some time on the data preprocessing to make our job easier.

[[0.89415851 0.36422555 0.62621622 0.45517032 0.71635848 0.6717487 ]
 [0.67871428 0.48460741 0.74877678 …          0.83808516 0.32459573]
 [0.96491183 0.36422555 0.5591949  0.56246492 0.655395   0.77543588]
 [0.77175778 0.84000668 0.56577229 …          0.83808516 0.2379575 ]
 [0.60169922 0.60478506 0.62621622 0.17679116 0.63505834 0.2818013 ]
 [0.60169922 0.70105142 0.44668632 0.24122983 0.51286805 0.44732296]]
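The matrix above shows every feature scaled into [0, 1], which points to min-max normalization. A sketch of that preprocessing step, assuming sklearn and the same df (the exact scaler used in the original is not visible):

from sklearn.preprocessing import MinMaxScaler

# keep numeric columns only; object-typed columns would need conversion first
numeric = df.select_dtypes(include='number')

# scale each feature to the [0, 1] range before running RReliefF
scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(numeric)
print(X_scaled[:6])  # first rows of the normalized feature matrix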
LEARNING
The following are key learnings for performing feature reduction techniques on a collected dataset. First, correlation-based feature evaluation helps identify redundant or highly correlated features that can potentially be reduced.
Second, relief attribute feature evaluation using algorithms like ReliefF or SURF assesses the relevance of features based on their contribution to the prediction task.
Third, information gain feature evaluation measures the predictive power of features using entropy or information gain. Lastly, Principal Component Analysis (PCA) can effectively reduce dimensionality by projecting the dataset onto a lower-dimensional space while retaining the most important features. Experimenting with different techniques and selecting the most appropriate one based on the specific dataset and prediction task is crucial for successful feature reduction.
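Of the four techniques above, PCA is the quickest to demonstrate. A minimal sketch using sklearn on the numeric jm1 features (the component count of 5 is an arbitrary choice for illustration, not taken from the experiment):

from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# standardize first: PCA is sensitive to the scale of the features
X = StandardScaler().fit_transform(df.select_dtypes(include='number'))

# project onto the first 5 principal components
pca = PCA(n_components=5)
X_reduced = pca.fit_transform(X)
print('explained variance ratio:', pca.explained_variance_ratio_)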
EXPERIMENT - 05
AIM
Develop a machine learning model for the selected topic (minimum 10 datasets and 10 techniques).
THEORY
SVM: A support vector machine is a supervised machine learning algorithm used for the classification or regression of data groups.
Logistic Regression: Logistic regression is a supervised learning classification algorithm used to
predict the probability of a target variable.
Naive Bayes: The Naive Bayes algorithm is a supervised learning algorithm, which is based on the Bayes theorem and used for solving classification problems.
Decision Tree: It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules and each leaf node represents the outcome.
Random Forest: It is a classifier that contains a number of decision trees on various subsets of
the given dataset and takes the average to improve the predictive accuracy of that dataset.
XGBoost: XGBoost algorithm makes use of fast parallel prefix sum operations to scan through all
possible splits, as well as parallel radix sorting to repartition data.
KNN: KNN is a non-parametric, supervised learning classifier, which uses proximity to make
classifications or predictions about the grouping of an individual data point.
LSTM: It is a variety of recurrent neural networks (RNNs) that are capable of learning long-term dependencies, especially in sequence prediction problems.
CatBoost: CatBoost is an algorithm for gradient boosting on decision trees.
ANN: An artificial neural network is an attempt to simulate the network of neurons that make up
a human brain so that the computer will be able to learn things and make decisions.
CODE AND OUTPUT
1. Dataset: ant-1.3
• Importing the Libraries

import pandas as pd
import numpy as np
import seaborn as sns

• Loading the dataset

• Training the data
x = data.drop(['bug'], axis=1)
y = data['bug']

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=5)
• LOGISTIC REGRESSION

# fit a logistic regression classifier (cell partially illegible in the original)
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(x_train, y_train)

(Output: a ConvergenceWarning advising to increase the number of iterations (max_iter) or scale the data.)
• SVM

from sklearn import svm
from sklearn.metrics import accuracy_score

# gamma value illegible in the original; 'auto' assumed
classifier = svm.SVC(kernel='linear', gamma='auto')
classifier.fit(x_train, y_train)
y_predict = classifier.predict(x_test)

# calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_predict)
# print the accuracy of the model
print('Accuracy:', accuracy)

Accuracy: 0.84
• NAIVE BAYES

# naive bayes
from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
# train the Naive Bayes classifier using the training data
gnb.fit(x_train, y_train)
# make predictions on the testing data
y_pred = gnb.predict(x_test)
# calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
# print the accuracy of the model
print('Accuracy:', accuracy)

Accuracy: 0.88
¢ DECISION TREE
# decision tre!
fron skiearn.tree inport Decisiontreeciassitier
# create 2 Decision Tree classifier object
te = Deciaiontreeclassifier()
# train the Decision Tree classifier using the training data
dte.fit(x train, y_train)
# make predictions on the testing data
y.pred = dte.prodict(x test)
# calculate the accuracy of the model
accuracy = accuracy score(y test, y pred)
4 print the accuracy of the model
print ("Accuracy:', accuracy)
accuracy: 0.8
• RANDOM FOREST

# create a Random Forest classifier object
from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier()
rfc.fit(x_train, y_train)
y_pred = rfc.predict(x_test)

• XGBOOST
import xgboost as xgb

# convert data into DMatrix format
dtrain = xgb.DMatrix(x_train, label=y_train)
dtest = xgb.DMatrix(x_test, label=y_test)

params = {
    'max_depth': 3,
    'objective': 'multi:softmax',
    'num_class': 3
}
num_rounds = 100
xgb_model = xgb.train(params, dtrain, num_rounds)

# make predictions on the testing data
y_pred = xgb_model.predict(dtest)

# calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
# print the accuracy of the model
print('Accuracy:', accuracy)

Accuracy: 0.76
• KNN

from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=5)
# train the KNN classifier using the training data
knn.fit(x_train, y_train)
y_pred = knn.predict(x_test)
# calculate the accuracy of the model
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)

Accuracy: 0.8
• CATBOOST

(The CatBoost training cell and its output are illegible in the original scan.)
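Since the cell is unreadable, the following is a minimal sketch of an equivalent CatBoost classifier on the same split; the hyperparameters are placeholders, not the ones used in the original.

from catboost import CatBoostClassifier
from sklearn.metrics import accuracy_score

# gradient boosting on decision trees; verbose=0 silences per-iteration logs
cbc = CatBoostClassifier(iterations=100, depth=4, verbose=0)
cbc.fit(x_train, y_train)

y_pred = cbc.predict(x_test)
print('Accuracy:', accuracy_score(y_test, y_pred))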
• ANN

# import pandas as pd
from keras.models import Sequential
from keras.layers import Dense

# load the dataset
# df = pd.read_csv('dataset.csv')
# X = df.drop(columns=['output'])
# y = df['output']

model = Sequential()
model.add(Dense(10, input_dim=x.shape[1], activation='relu'))
model.add(Dense(1, activation='linear'))

# compile the model
model.compile(loss='mse', optimizer='adam', metrics=['mae'])

# train the model
model.fit(x, y, epochs=100, batch_size=32)

# make predictions
y_pred = model.predict(x)
mse, mae = model.evaluate(x, y)
print('MSE: ', mse)
print('MAE: ', mae)
(Training log: 100 epochs; the loss decreases from roughly 42228.6 at epoch 1 to 55.4 at epoch 100.)

MSE: 55.4179573059082
MAE: 3.4683432579040527
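The theory section also lists LSTM among the ten techniques, but its cell is not legible in this copy. A minimal Keras sketch, assuming the tabular features are reshaped into one-step sequences (a common workaround for non-sequential data):

import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# reshape (samples, features) -> (samples, timesteps=1, features) for the LSTM
x_seq = np.asarray(x_train, dtype='float32').reshape(len(x_train), 1, x_train.shape[1])

model = Sequential()
model.add(LSTM(32, input_shape=(1, x_train.shape[1])))
model.add(Dense(1, activation='sigmoid'))  # probability that the module is buggy
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x_seq, y_train, epochs=10, batch_size=32, verbose=0)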
2. Dataset: ant-1.4
• Importing the Libraries

import pandas as pd
import numpy as np
import seaborn as sns
EXPERIMENT - 06
AIM
Consider the model developed in experiment no. 5 and:
1. State the hypothesis.
2. Formulate an analysis plan.
3. Analyse the sample data.
4. Interpret results.
5. Estimate type-I and type-II errors.
INTRODUCTION
State the hypothesis: The hypothesis is a statement or assumption that is being tested using a
machine learning model. In machine learning, the hypothesis is usually framed as a predictive
model that maps input variables to output variables.
Formulate an analysis plan: The analysis plan outlines the steps that will be taken to test the
hypothesis. This includes selecting a suitable machine learning algorithm, collecting and preparing
the data, training and testing the model, and evaluating its performance. The plan should also
specify any statistical tests or metrics that will be used to assess the model's accuracy.
Analyze the sample data: The sample data is used to train and test the machine learning model.
This involves feeding the input variables into the model and comparing the predicted output to the
actual output.
Interpret results: The results of the analysis are used to draw conclusions about the hypothesis being tested. If the model performs well on the sample data, it may be considered a good predictor of the outcome variable.
Estimate type-I and type-II error: Type-I error, also known as a false positive, occurs when the
model incorrectly predicts a positive outcome when the actual outcome is negative. Type-II error,
also known as a false negative, occurs when the model incorrectly predicts a negative outcome
when the actual outcome is positive.
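Both error rates can be estimated directly from a model's confusion matrix. A minimal sketch, assuming binary labels and the y_test/y_pred arrays produced by any of the classifiers in this file:

from sklearn.metrics import confusion_matrix

# binary confusion matrix: tn, fp (type-I errors), fn (type-II errors), tp
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

type_1 = fp / (fp + tn)  # false positive rate (alpha)
type_2 = fn / (fn + tp)  # false negative rate (beta)
print('Estimated type-I error rate:', type_1)
print('Estimated type-II error rate:', type_2)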
OUTPUT
1. State the hypothesis.
• The linguistic and contextual features of news articles can be used to predict whether an article is likely to contain false information.
• Machine learning models trained on this dataset can accurately classify news articles as true or false based on their content and metadata.
• A supervised learning approach that utilizes multiple types of features, such as linguistic features (e.g., sentiment analysis, part-of-speech tagging) and contextual features (e.g., source credibility, temporal and social signals), can lead to an accurate and robust fake news detection system.
2. Formulate an analysis plan.
The analysis plan can be described by the flow chart below:
1. Importing dataset: The data analysis pipeline begins with the import or creation of a
working dataset. The exploratory analysis phase begins immediately after. Importing a
dataset is simple with Pandas through functions dedicated to reading the data.
2. Analysis of dataset: An especially important activity in the routine of a data analyst or
scientist. It enables an in-depth understanding of the dataset, defines or discards hypotheses
and creates predictive models on a solid basis. It uses data manipulation techniques and
several statistical tools to describe and understand the relationship between variables and
how these can impact business.
3. Understanding the variables: While in the previous point we described the dataset in its entirety, now we try to accurately describe all the variables that interest us. For this reason, this step can also be called univariate analysis.
4. Modelling: At the end of the process, we will be able to consolidate a business report or continue with the data modelling phase. We would be using Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting, and Support Vector Machine for modelling the dataset.
5. Interpreting the results: The results of the analysis are used to draw conclusions about the hypothesis being tested. If the model performs well on the sample data, it may be considered a good predictor of the outcome variable. However, the model's accuracy may need to be validated on new, unseen data to ensure that it is generalizable.
3. Analyze the sample data.

Importing working datasets

(The dataset-import cell is rendered as an illegible image in the original.)

Data Cleaning

numeric_features = ['share_count', 'reaction_count', 'comment_count']
target = data['Rating']
categorical_features = ['Category', 'Page', 'Post Type', 'Debate']

Bi-variate analysis (Numerical-Categorical Analysis)

data.groupby('Category')['share_count'].mean()

(Output: the mean share_count for each category.)

Missing values

from sklearn.impute import SimpleImputer

imputer = SimpleImputer(strategy='median')
numeric_data = imputer.fit_transform(data[numeric_features])

imputer = SimpleImputer(strategy='most_frequent')
categorical_data = imputer.fit_transform(data[categorical_features])
4. Interpreting the results.

For the dataset:

| S.No. | Model | Performance |
| 1 | Logistic Regression | Accuracy = 0.76367 |
| 2 | Decision Tree Classifier | Accuracy = 0.71553 |
| 3 | Random Forest Classifier | Accuracy = 0.77242 |
| 4 | Gradient Boosting | Accuracy = 0.76367 |
| 5 | Support Vector Machine | Accuracy = 0.739606 |
Logistic Regression

(The classification report cell is illegible in the original; Accuracy = 0.76367.)

Decision Tree Classifier

Accuracy: 0.7155361050328227
Classification report:

(Per-class precision, recall, and F1-score for the four rating classes - mixture of true and false, mostly false, mostly true, no factual content - over 457 test samples; the individual figures are illegible in the scan.)

Random Forest Classifier

(Classification report over the same four classes; Accuracy = 0.77242.)

Gradient Boosting
Accuracy: 0.7724288840262582
Classification report:

                            precision  recall  f1-score  support
mixture of true and false        0.38    0.16      0.23       55
mostly false                     0.43    0.14      0.21       22
mostly true                      0.81    0.94      0.87      333
no factual content               0.72    0.60      0.65       47
accuracy                                           0.77      457
macro avg                        0.58    0.46      0.49      457
weighted avg                     0.73    0.77      0.74      457
SVM

(The classification report cell is illegible in the original; Accuracy = 0.739606.)
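The logistic regression and SVM cells are illegible, but reports of this form are normally produced with the standard sklearn pattern sketched here (model stands for any of the five fitted classifiers):

from sklearn.metrics import accuracy_score, classification_report

y_pred = model.predict(x_test)
print('Accuracy:', accuracy_score(y_test, y_pred))
print('Classification report:')
print(classification_report(y_test, y_pred))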
LEARNING

A Type I error is a false positive conclusion, while a Type II error is a false negative conclusion.
EXPERIMENT
AIM
Write a program to implement the t-test.
THEORY
A t-test is a type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related to certain features.
There are three types of t-tests, and they are categorized as dependent and independent t-tests:
1. Independent samples t-test: compares the means of two groups.
2. Paired sample t-test: compares means from the same group at different times (say, one year apart).
3. One sample t-test: tests the mean of a single group against a known mean.
A minimal sketch of all three appears below.
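A minimal sketch of the three variants with scipy.stats on synthetic data; the arrays stand in for, e.g., fake and real news title lengths and are not taken from the experiment:

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(7.8, 2.0, 200)  # e.g. fake news title lengths
b = rng.normal(7.0, 2.0, 200)  # e.g. real news title lengths

# 1. independent samples t-test: compares the means of two groups
print(stats.ttest_ind(a, b))

# 2. paired sample t-test: compares means of the same group at two times
print(stats.ttest_rel(a, b))

# 3. one sample t-test: tests one group's mean against a known mean
print(stats.ttest_1samp(a, popmean=7.5))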
CODE AND OUTPUTS
1. Importing required Libraries

import string
import pandas as pd
from scipy import stats
from wordcloud import WordCloud, STOPWORDS

2. Loading the dataset

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()
3. Information about the dataset

df.info()

RangeIndex: 10885 entries, 0 to 10884
Data columns (total 22 columns):
 #   Column             Non-Null Count  Dtype
 0   loc                10885 non-null  float64
 1   v(g)               10885 non-null  float64
 2   ev(g)              10885 non-null  float64
 3   iv(g)              10885 non-null  float64
 4   n                  10885 non-null  float64
 5   v                  10885 non-null  float64
 6   l                  10885 non-null  float64
 7   d                  10885 non-null  float64
 8   i                  10885 non-null  float64
 9   e                  10885 non-null  float64
 10  b                  10885 non-null  float64
 11  t                  10885 non-null  float64
 12  lOCode             10885 non-null  int64
 13  lOComment          10885 non-null  int64
 14  lOBlank            10885 non-null  int64
 15  locCodeAndComment  10885 non-null  int64
 16  uniq_Op            10885 non-null  object
 17  uniq_Opnd          10885 non-null  object
 18  total_Op           10885 non-null  object
 19  total_Opnd         10885 non-null  object
 20  branchCount        10885 non-null  object
 21  defects            10885 non-null  bool
dtypes: bool(1), float64(12), int64(4), object(5)
memory usage: 1.8+ MB
4. Selecting Features

a = df['n']
b = df['v']

5. Performing the test

t2 = stats.ttest_ind(a, b)
t2

Ttest_indResult(statistic=-29.053826112647537, pvalue=5.747783245375447e-192)
Observation: The p-value is small (less than 0.05), hence the null hypothesis is rejected, which implies the group means are not the same.

Null Hypothesis: The difference in mean values of the title length of fake news and the title length of real news is 0.
Alternate Hypothesis: The difference in mean values of the title length of fake news and the title length of real news is not 0.

OBSERVATION
We observe a statistically significant difference (p-value = 0.01583) between the lengths of news titles of real and fake news. The title length of fake news is slightly larger than that of real news. The fake news title length distribution is centered with a mean of 7.83, while the distribution of title length of real news is slightly skewed towards the right with a mean of 7.02. The t-test gives us evidence that the length of a real news title is significantly shorter than that of a fake news title.
LEARNING
Key learnings for implementing the T-Test in a program include understanding its applications in
statistical hypothesis testing, considering assumptions such as normality and homogeneity of
variances, implementing the T-Test in a programming language or statistical software,
interpreting results including p-values and confidence intervals, and considering sample size,
power analysis, and effect size for appropriate interpretation and decision-making.
EXPERIMENT
AIM
Write a program to implement the chi-square test.
THEORY
One of the primary tasks involved in any supervised Machine Learning venture is to select the best features from the given dataset to obtain the best results. One way to select these features is the Chi-Square Test. Mathematically, a Chi-Square test is done on two distributions to determine the level of similarity of their respective variances. In its null hypothesis, it assumes that the given distributions are independent. This test thus can be used to determine the best features for a given dataset by determining the features on which the output class label is most dependent.
It involves the use of a contingency table. A Contingency table (also called crosstab) is used in
statistics to summarise the relationship between several categorical variables.
CODE AND OUTPUTS
• Null Hypothesis: There is no relation between News_Type and Article_Subject.
• Alternate Hypothesis: There is a significant relation between News_Type and Article_Subject.

1. Importing required Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

2. Loading the dataset

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()
3. Information about the dataset

df.info()

(Output: 10885 rows and 22 columns, as listed in the previous experiment.)
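The chi-square computation itself is not legible in this copy. A minimal sketch matching the stated hypotheses, assuming a dataframe with the categorical columns News_Type and Article_Subject (names taken from the hypotheses above):

import pandas as pd
from scipy.stats import chi2_contingency

# contingency table (crosstab) summarizing the two categorical variables
table = pd.crosstab(df['News_Type'], df['Article_Subject'])

# chi-square test of independence on the contingency table
chi2, p_value, dof, expected = chi2_contingency(table)
print('Chi-square statistic:', chi2)
print('p-value:', p_value)
# reject the null hypothesis of independence when p_value < 0.05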
EXPERIMENT
AIM
Write a program to implement the Friedman test.
CODE AND OUTPUTS

import pandas as pd
from scipy.stats import friedmanchisquare

# load the dataset from a csv file
data = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')

# perform the Friedman test
friedman_stat, p_value = friedmanchisquare(data['n'], data['v'], data['l'])

# print the results
print('Friedman statistic:', friedman_stat)
print('p-value:', p_value)

Friedman statistic: 10972.614517524018
p-value: 0.0
Key learnings for implementing the Friedman test in a program include understanding its
applications in non-parametric statistical analysis, familiarity with assumptions and requirements,
such as repeated measures and ranked data, implementation in a programming language or
statistical software, interpretation of results including Friedman statistic, degrees of freedom, and
p-values, and consideration of appropriate use and limitations of the Friedman test for
comparison of multiple related samples and valid interpretation of results.
EXPERIMENT
AIM
Write a program to implement the Wilcoxon Signed Rank Test.
THEORY
Wilcoxon signed-rank test, also known as Wilcoxon matched pair test is a non-parametric
hypothesis test that compares the median of two paired groups and tells if they are identically
distributed or not.
We can use this when:
• Differences between the pairs of data are non-normally distributed.
• Independent pairs of data are identical (or matched).
CODE AND OUTPUTS
• Null Hypothesis: The groups - title length of fake news and title length of real news - are identically distributed.
• Alternate Hypothesis: The groups - title length of fake news and title length of real news - are not identically distributed.
1. Importing required Libraries

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('default')
2. Loading the dataset

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()

3. Information about the dataset
df.info()

(Output: 10885 rows and 22 columns; dtypes: bool(1), float64(12), int64(4), object(5).)
4. Wilcoxon Signed Rank Test

import pandas as pd
from scipy.stats import wilcoxon

# load the dataset from a csv file
data = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')

# perform the Wilcoxon signed-rank test
stat, p_value = wilcoxon(data['n'], data['v'])

# print the results
print('Wilcoxon signed-rank statistic:', stat)
print('p-value:', p_value)

Wilcoxon signed-rank statistic: 253.0
p-value: 0.0
LEARNING
Key learnings for implementing the Wilcoxon Signed Rank test in a program include understanding its applications in non-parametric statistical analysis, familiarity with assumptions and requirements such as paired data and ordinal or continuous variables, implementation in a programming language or statistical software, interpretation of results including test statistics, p-values, and confidence intervals, and consideration of appropriate use and limitations of the Wilcoxon Signed Rank test for comparing paired data and valid interpretation of results.
EXPERIMENT
AIM
Write a program to implement the Nemenyi test.
THEORY
The Friedman Test is used to find whether there exists a significant difference between the means
of more than two groups. In such groups, the same subjects show up in each group. If the p-value
of the Friedman test turns out to be statistically significant then we can conduct the Nemenyi test
to find exactly which groups are different. This test is also known as Nemenyi posthoc test.
CODE AND OUTPUTS
• Null Hypothesis: There is no significant difference in the score values.
• Alternate Hypothesis: At least 2 values differ from one another.
1. Importing required Libraries

import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
plt.style.use('default')

2. Loading the dataset

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()

3. Information about the dataset

(Output: as in the previous experiments.)
4. Friedman Test

import pandas as pd
from scipy.stats import friedmanchisquare

# load the dataset from a csv file
data = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')

# perform the Friedman test
friedman_stat, p_value = friedmanchisquare(data['n'], data['v'], data['l'])

print('Friedman statistic:', friedman_stat)
print('p-value:', p_value)

Friedman statistic: 10972.614517524018
p-value: 0.0
5. Nemenyi Test

import numpy as np
import scikit_posthocs as sp

# sample1, sample2, sample3 hold the three score groups being compared
# (their definition is not legible in the original scan)
d = np.array([sample1, sample2, sample3])
sp.posthoc_nemenyi_friedman(d.T)

(Output: a matrix of pairwise p-values for the three groups.)
OBSERVATION

• From the outputs received, we reject the Null Hypothesis.
• From the output table we can clearly conclude that the two groups with statistically significantly different means are Group 1 and Group 2.
LEARNING
Key learnings for implementing the Nemenyi test in a program include understanding its
applications in posthoc analysis of multiple comparison tests, familiarity with requirements and
assumptions such as ranked or continuous data and multiple group comparisons, implementation
in a programming language or statistical software, interpretation of results including critical
difference values and significance levels, and consideration of appropriate use and limitations of
the Nemenyi test for posthoc analysis and valid interpretation of results in the context of
statistical hypothesis testing.
EXPERIMENT
AIM
Write down the threats to validity observed in the experiments conducted for the models.
THEORY
In empirical software engineering experiments, especially in machine learning-based defect prediction, it is crucial to assess and record threats to validity. These threats impact the reliability, reproducibility, and generalizability of the results.
Federated Transfer Learning (FTL) combines federated learning (training models collaboratively
without sharing raw data) and transfer learning (using knowledge from a source domain to improve
learning in a target domain). While FTL enhances privacy and allows knowledge sharing across
datasets, it introduces specific challenges that must be analyzed for validity.
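As a concrete reference for the federated side of FTL, a minimal federated-averaging (FedAvg-style) sketch over client model weights; all names here are illustrative and not from the original experiment:

import numpy as np

def federated_average(client_weights):
    """Average the corresponding weight arrays of locally trained client models."""
    return [np.mean(layers, axis=0) for layers in zip(*client_weights)]

# each client trains on its private data and shares only model weights:
# client_weights = [model_a.get_weights(), model_b.get_weights(), ...]
# global_weights = federated_average(client_weights)
# global_model.set_weights(global_weights)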
The threats to validity are typically categorized into four types:
1. Internal Validity: whether observed effects are caused by the experiment itself and not external factors.
2. External Validity: how generalizable the findings are to other contexts or datasets.
3. Construct Validity: whether the experiment truly measures what it claims to measure.
4. Conclusion Validity: how accurately conclusions are drawn based on statistical evidence.
OBSERVATION
Threats to Validity Identified:
1. Internal Validity:
• Data preprocessing differences across federated nodes.
• Imbalanced class distribution affecting defect prediction.
• Inconsistent feature representation among clients.
2. External Validity:
• Limited to open-source datasets (e.g., NASA, PROMISE).
• Inability to generalize findings to industrial, closed-source software.
3. Construct Validity:
• Using "accuracy" as the sole performance metric can be misleading.
• Defect definitions differ across datasets, impacting label consistency.
4. Conclusion Validity:
• Small number of datasets (clients) reduces statistical power.
• Lack of statistical testing like t-test, Nemenyi test, etc., weakens confidence in model comparisons.
LEARNING
Key learnings include how to identify and analyze threats to validity in machine learning experiments. In the context of Federated Transfer Learning for software defect prediction, special care must be taken to ensure consistency, statistical robustness, and generalizability. Addressing these threats improves the reliability and impact of the experimental results.
EXPERIMENT
AIM
Explore tools such as SPSS, KEIL, PYTHON, and R.
THEORY
Modern machine learning and statistical modeling require robust tools and environments. For this experiment, the focus is on exploring the following tools and understanding how they can be used for analyzing software defect data and building prediction models.
1. PYTHON
Python is a powerful open-source programming language widely used in data science, machine learning, and AI.
+ Relevance to Experiment:
Python was the main language used for implementing the federated learning and defect prediction models, through libraries such as:
• scikit-learn for machine learning models
• pandas and numpy for data manipulation
• matplotlib and seaborn for visualization
• tensorflow-federated (TFF) or Flower for federated learning frameworks
+ Key Features:
• Easy integration with data pipelines
• Vast community support
• Can handle both statistical and deep learning models
• Good support for transfer learning using Keras or PyTorch
2. SPSS (Statistical Package for the Social Sciences)
SPSS is a GUI-based statistical analysis tool used for hypothesis testing, regression analysis, and data visualization.
+ Relevance to Experiment:
Useful for performing statistical tests such as:
• t-test (one-sample, paired, independent)
• chi-square test
• ANOVA, Friedman, Wilcoxon, Nemenyi tests
+ Advantages:
• User-friendly interface
• Built-in functions for descriptive statistics and validity testing
• Suitable for researchers unfamiliar with programming
3. R Programming Language
R is a language specifically designed for statistical computing and data visualization.
+ Relevance to Experiment:
• Can be used to validate machine learning results through statistical tests
• Ideal for plotting ROC curves, boxplots, or confusion matrices
• Packages like caret, e1071, and mlr support classification tasks
+ Strengths:
• Rich set of statistical packages
• Excellent data visualization with ggplot2
• Preferred in academic research for statistical modeling
4. KEIL
KEIL is primarily used for embedded system development, especially microcontroller
programming.
+ Relevance to Experiment:
Not directly related to software defect prediction in federated learning, but:
• Can be explored when analyzing embedded software defects or real-time systems.
• Useful for low-level software testing and debugging in hardware-oriented projects.
LEARNING

Learned that combining tools strengthens the overall research workflow and improves the reliability and explainability of results, and realized that statistical tests performed in SPSS/R complement the performance metrics derived in Python, helping validate the learning model's effectiveness more rigorously.