Empirical Software Engineering

This document collects a series of laboratory experiments in empirical software engineering, centred on software defect prediction with transfer learning. It surveys published empirical studies, identifies research gaps and research questions, performs exploratory analysis of open-source defect datasets, builds and statistically compares machine learning models, and closes with threats to validity and a survey of supporting tools. The findings suggest that transfer learning can enhance prediction accuracy, particularly when leveraging data from similar projects.
EXPERIMENT - 01

AIM
Collection of empirical studies.

THEORY
Empirical research is research based on the observation and measurement of phenomena, as directly experienced by the researcher. The data thus gathered may be compared against a theory or hypothesis, but the results are still based on real-life experience. An example of empirical analysis: if a researcher is interested in finding out whether listening to happy music promotes prosocial behaviour, an experiment can be conducted in which one group of the audience is exposed to happy music and the other is not exposed to music at all.

TABLE - 1

S.No. | Paper Title | Publishing Year | Author's Name | Conference/Journal Name | No. of Citations
1 | Software Defect Prediction based on Federated Transfer Learning | 2020 | Aili Wang, Yutong Zang, Yixin Yan | 2020 IEEE/ACS International Conference on Computer Systems and Applications (AICCSA) | 5
2 | Defect Prediction method based on Federated Transfer Learning and Knowledge Distillation | 2022 | Wenjun Zhang, Kelvin Wong, Dhanjoo Ghista (Tongji University, Shanghai, China) | Computer Integrated Manufacturing Systems (CIMS) | 2
3 | A perspective survey on deep transfer learning for Defect Prediction | 2022 | Weihua Li, Ruyi Huang, Jipu Li, Yixiao Liao, Zhuyun Chen | Mechanical Systems and Signal Processing, Volume 167A, 15 March 2022 | 102
4 | Software Defect Prediction Using Feature-Based Transfer Learning | 2015 | Beijun Shen, Yong Xia (first author illegible in the source) | Proceedings of the 7th Asia-Pacific Symposium on Internetware | n/a (illegible)
5 | An Empirical Study on Transfer Learning for Software Defect Prediction | n/a (illegible) | Xiang Gu, Xiaolin Ju | International Workshop on Intelligent Bug Fixing (IBF) | n/a (illegible)
6 | Software defect prediction via transfer learning based neural network | 2015 | Qimeng Cao, Qing Sun, Qinghua Cao, Huobin Tan | 2015 First International Conference on Reliability Systems Engineering (ICRSE) | 10
7 | Cross Project Defect Prediction via Balanced Distribution Adaptation Based Transfer Learning | 2019 | Zhou Xu, Shuai Pang, Tao Zhang, Xia-Pu Luo, Jin Liu | Journal of Computer Science and Technology, Vol. 34, 2019 | 30
8 | Deep Learning for Software Defect Prediction: A Survey | 2020 | Safa Omri | IEEE/ACM 42nd International Conference on Software Engineering Workshops | 26
9 | A survey on Software defect prediction using deep learning | 2021 | Elena Akimova, Alexander Bersenev, Artem Deikov, et al. | Empirical Software Engineering journal | 25
10 | Cross-Project Software Defect Prediction Based on Feature Selection and Transfer Learning | 2020 | Tianwei Lee, Jingfeng Xue, Weijie Han | International Conference on Machine Learning for Cyber Security | 2
11 | Homogeneous Transfer Learning for Defect Prediction | 2022 | Meetesh Nevendra, Pradeep Singh | International Conference on Information Systems and Management Science (ISMS) 2022 | 90
12 | Software visualization and deep transfer learning for effective software defect prediction | 2020 | Jinyin Chen, Keke Hu, Yue Yu, Qi Xuan, Yi Liu | ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering | 20
13 | Transfer learning for cross-company software defect prediction | 2012 | Ying Ma, Guangchun Luo, Xue Zeng, Aiguo Chen | Information and Software Technology, Volume 54, Issue 3 | 366
14 | Multiview Transfer Learning for Software Defect Prediction | 2019 | Jinyin Chen, Yitao Yang, Keke Hu, Qi Xuan, Yi Liu | IEEE Transactions on Software Engineering, Volume 34, Issue 2 | 25
15 | Transfer Learning Code Vectorizer based Machine Learning Models for Software Defect Prediction | 2020 | Rituraj Singh, Jasmeet Singh, Mehrab Singh Gill, Ruchika Malhotra, Garima | 2020 International Conference on Computational Performance Evaluation (ComPE) | 6
EXPERIMENT - 02

AIM
Identify research gaps from the empirical studies. Collection of datasets from open-source repositories.

THEORY
A research gap is, simply, a topic or area for which missing or insufficient information limits the ability to reach a conclusion for a question. A research question is a question that a study or research project aims to answer. This question often addresses an issue or a problem, which, through analysis and interpretation of data, is answered in the study's conclusion.

TABLE - 2 (Research Gaps)

S.No. | Paper Title | Research Gaps
1 | Software Defect Prediction based on Federated Transfer Learning | Lack of benchmark datasets: one of the major challenges in FTL for SDP is the lack of benchmark datasets. This limits the comparability of different approaches and makes it difficult to evaluate their effectiveness. There is therefore a need to develop publicly available benchmark datasets that can be used to evaluate the performance of different FTL-based SDP models.
2 | Defect Prediction method based on Federated Transfer Learning and Knowledge Distillation | Model generalization: FTL and KD are intended to improve the generalization of the model across different data sources. However, there is a need to investigate how to optimize the transfer of knowledge from the source model to the target model to improve generalization.
3 | A perspective survey on deep transfer learning for Defect Prediction | The study focuses on a bagging-based ensemble classification approach. It would be interesting to investigate the effectiveness of other ensemble techniques, such as boosting, stacking, and hybrid approaches, in software defect prediction.
4 | Software Defect Prediction Using Feature-Based Transfer Learning | Real-world applicability: the proposed tool needs to be tested in real-world settings to evaluate its effectiveness in practice. This includes evaluating its performance on industry-scale datasets and investigating its adoption in software development processes.
5 | An Empirical Study on Transfer Learning for Software Defect Prediction | Comparison with other techniques: the proposed tool is not compared with other state-of-the-art techniques in software defect prediction. There is therefore a need to compare its performance with other techniques, such as deep learning-based models, decision tree-based models, and Bayesian networks.
6 | Software defect prediction via transfer learning-based neural network | The effectiveness of transfer learning-based neural networks in software defect prediction must be evaluated in real-world settings. This includes evaluating their performance on industry-scale datasets and investigating their adoption in software development processes.
7 | Cross-Project Defect Prediction via Balanced Distribution Adaptation-Based Transfer Learning | Feature selection: the study does not consider feature selection techniques to identify the most relevant features for defect prediction. Investigating the effectiveness of feature selection techniques, such as wrapper, filter, and embedded methods, can improve the accuracy and generalization of the proposed approach.
8 | Deep Learning for Software Defect Prediction: A Survey | Model interpretability: the proposed approach uses a black-box model, which can be difficult to interpret. Investigating techniques for improving its interpretability, such as feature importance ranking, attention mechanisms, and model visualization, can help in understanding its decision-making process.
9 | A survey on Software defect prediction using deep learning | Although transfer learning is briefly discussed in the paper, there is a need to investigate its effectiveness in software defect prediction using deep learning. This includes investigating techniques such as domain adaptation, multi-task learning, and adversarial transfer learning.
10 | Cross-Project Software Defect Prediction Based on Feature Selection and Transfer Learning | Real-world applicability: the effectiveness of deep learning-based models in software defect prediction needs to be evaluated in real-world settings. This includes evaluating their performance on industry-scale datasets and investigating their adoption in software development processes.
11 | Homogeneous Transfer Learning for Defect Prediction | Scalability of transfer learning models: transfer learning models can be computationally expensive, especially when dealing with large-scale software projects. There is a need to investigate the scalability of transfer learning models and develop efficient techniques for large-scale defect prediction.
12 | Software visualization and deep transfer learning for effective software defect prediction | Incorporation of domain knowledge: transfer learning models can benefit from the incorporation of domain knowledge, such as software metrics and developer expertise. There is a need to investigate the most effective ways to incorporate such knowledge into transfer learning models.
13 | Transfer learning for cross-company software defect prediction | Lack of standardized datasets: one of the key challenges in transfer learning for cross-company software defect prediction is the availability of standardized datasets that can be used for evaluation purposes. More standardized datasets are needed to compare the performance of different transfer learning algorithms.
14 | Multiview Transfer Learning for Software Defect Prediction | Limited studies on transfer learning approaches: despite the potential benefits of transfer learning for software defect prediction, only a limited number of studies have explored different transfer learning approaches. More research is needed on techniques such as domain adaptation, transfer clustering, and transfer learning with deep neural networks.
15 | Transfer Learning Code Vectorizer-based Machine Learning Models for Software Defect Prediction | Code vectorization is a critical step in using machine learning models for software defect prediction. There are several techniques for vectorizing code, including bag-of-words, n-grams, and deep learning-based approaches; however, more research is needed comparing the effectiveness of the different code vectorization techniques in the context of transfer learning. Further research could investigate the impact of different code vectorization techniques on the performance of transfer learning-based machine learning models for software defect prediction.

TABLE - 3 (Research Questions)

S.No. | Research Question
1 | What is the effectiveness of federated transfer learning in predicting software defects across multiple organizations with varying data distributions and privacy constraints?
2 | What is the impact of a defect prediction approach that utilizes federated transfer learning and knowledge distillation in improving the performance of software defect prediction models?
3 | What is the current state of research on deep transfer learning for defect prediction, and how effective is it compared to traditional defect prediction models?
4 | What is the effectiveness of feature-based transfer learning in predicting software defects across multiple software projects?
5 | What is the effectiveness of transfer learning in predicting software defects, and how does it compare to traditional machine learning techniques?
6 | What is the impact of data distribution across organizations on the accuracy of federated transfer learning for software defect prediction?
7 | What are the optimal strategies for selecting and aggregating data from multiple organizations in federated transfer learning for software defect prediction, considering varying data distributions and privacy constraints?
8 | What are the current state-of-the-art deep learning techniques for software defect prediction, and how do they compare in terms of their effectiveness and limitations?
9 | What are the current trends, techniques, and challenges in software defect prediction using deep learning?
10 | What are the most effective transfer learning techniques, such as fine-tuning, feature extraction, or model adaptation, for software defect prediction in a federated transfer learning setting?
11 | What is the impact of homogeneous transfer learning on defect prediction in software development?
12 | What are the challenges and limitations of federated transfer learning for software defect prediction, such as communication efficiency, model convergence, and privacy concerns, and how can they be addressed?
13 | What are the privacy implications of using federated transfer learning for software defect prediction, and how can these concerns be addressed?
14 | What are the best practices for designing and training federated transfer learning models for software defect prediction, and how can these models be effectively deployed in real-world scenarios?

TABLE - 4 (Answers)

S.No. | Answer
1 | Federated transfer learning has the potential to improve the accuracy of software defect prediction models by leveraging the collective knowledge of multiple organizations while ensuring the privacy and security of their data. By using federated transfer learning, organizations can train models on data from other organizations without sharing the raw data, thereby addressing data privacy concerns. Moreover, by leveraging data from multiple sources, federated transfer learning can reduce bias in the data and improve the robustness and generalizability of the prediction models. However, its effectiveness depends on factors such as the quality and quantity of the data, the similarity of the data distributions across organizations, and the effectiveness of the federated learning algorithms. Further research is therefore needed to evaluate the potential of federated transfer learning in software defect prediction and to identify the best practices and challenges associated with this approach.
2 | The approach involves training a model on data from multiple organizations through federated transfer learning and then distilling the knowledge into a smaller model using knowledge distillation. The performance of the proposed method can be assessed through metrics such as accuracy, precision, recall, and F1 score, and compared to traditional defect prediction methods. The evaluation results can provide insights into the potential of the proposed approach for enhancing the accuracy and efficiency of software defect prediction.
3 | Deep transfer learning has gained increasing attention in recent years as a potentially effective method for defect prediction, leveraging knowledge learned from related tasks to improve prediction accuracy. To gain insight into the current state of research in this area, a survey was conducted reviewing recent studies on deep transfer learning for defect prediction. The survey found that deep transfer learning has shown promising results, effectively transferring knowledge from source domains to target domains with limited labelled data and outperforming traditional models such as logistic regression and decision trees. However, challenges such as the need for large amounts of data and appropriate domain selection were identified, along with potential transferability issues. Overall, the survey concludes that deep transfer learning has great potential for defect prediction and could prove a valuable tool for software development teams.
4 | Feature-based transfer learning has become a popular approach in software defect prediction due to its ability to leverage knowledge from similar software projects to improve the accuracy of the prediction model. The study evaluates the effectiveness of feature-based transfer learning in predicting software defects across different software projects: data was collected from multiple software projects, feature-based transfer learning was applied to train a prediction model, and its performance was compared with a model trained from scratch using only the target-project data. The results showed that the transfer learning model outperformed the model trained from scratch, with an average improvement of 10% in prediction accuracy. The findings suggest that feature-based transfer learning can be an effective approach to improving the accuracy of software defect prediction models when training data is limited or when data is available from similar projects.
5 | The study "An Empirical Study on Transfer Learning for Software Defect Prediction" investigates the effectiveness of transfer learning in predicting software defects. Transfer learning is a machine learning technique that reuses knowledge gained from one task to improve the performance of a different but related task. The researchers conducted an empirical investigation by comparing the performance of transfer learning models to traditional machine learning models. The study demonstrated the effectiveness of transfer learning in improving the performance of software defect prediction models; its results can help software developers and researchers better understand the potential of transfer learning and apply it to improve the quality of software development.
6 | The research investigates the effectiveness of transfer learning-based neural networks in predicting software defects. The study collects software data from various sources and applies transfer learning techniques to improve the model's predictive performance. The transfer learning-based neural network is compared to traditional machine learning models, such as logistic regression and decision trees, using metrics such as accuracy, precision, recall, and F1 score. The results provide insights into the potential of transfer learning-based neural networks in software defect prediction and help developers choose the best approach to improving software quality.
7 | The Balanced Distribution Adaptation-based transfer learning approach for cross-project defect prediction aims to improve prediction accuracy by adapting the data distributions across different projects. The research question investigates the effectiveness of this approach in addressing the challenges of transferring knowledge from one project to another, where data distributions may be imbalanced. The study compares the Balanced Distribution Adaptation-based approach with traditional defect prediction methods and evaluates its effectiveness in improving prediction accuracy in cross-project scenarios. The results contribute to understanding the efficacy of this transfer learning approach in addressing data distribution challenges and may provide insights for practitioners and researchers in software engineering seeking more accurate and effective defect prediction across projects.
8 | The survey "Deep Learning for Software Defect Prediction: A Survey" provides an overview of the current state-of-the-art deep learning techniques used for software defect prediction. It reviews and analyses the existing literature on deep learning models that have been applied to defect prediction tasks, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformer-based models, and explores their effectiveness in terms of prediction accuracy, robustness, scalability, and interpretability. Additionally, the limitations of these deep learning models, such as potential biases, data requirements, and interpretability challenges, are examined. The findings provide insight into the current landscape of deep learning for software defect prediction, identify gaps and challenges, and suggest directions for future research.
9 | The research question investigates the state of the art in software defect prediction using deep learning techniques. This involves a survey of current trends and practices in the field, including the types of deep learning models used, the datasets and features employed, and the evaluation metrics used for performance assessment. The survey also examines the challenges faced in software defect prediction using deep learning, such as data quality, the interpretability of deep learning models, and class imbalance. The findings provide insight into the current landscape and highlight areas for further research and improvement.
10 | The effectiveness of cross-project software defect prediction can be influenced by the use of feature selection techniques and transfer learning approaches. Feature selection aims to identify a subset of relevant features from a large set, while transfer learning leverages knowledge learned from one project to improve prediction performance in another. Understanding the impact of feature selection and transfer learning on cross-project software defect prediction provides insights into optimizing prediction accuracy and efficiency in software development practice.
11 | The impact of homogeneous transfer learning on defect prediction in software development can vary depending on several factors. Homogeneous transfer learning transfers knowledge or models from a source domain to a target domain within the same organization or software project, without considering differences in data distributions or privacy constraints. Its effectiveness for defect prediction can be evaluated through empirical research that compares the performance of transferred models with baseline models trained only on the target-domain data or trained from scratch.
12 | The inclusion of diverse data sources can enrich the feature representation of the software data used for training federated transfer learning models. Code metrics, which provide quantitative measures of software code quality and complexity, capture structural and functional characteristics of the codebase. Developer comments contain valuable insights and contextual information about the code that is not present in the code itself. User feedback, such as bug reports or customer feedback, provides real-world usage information and highlights potential defects or issues that may not be captured by other data sources. Incorporating such diverse data sources into the federated transfer learning process can result in more comprehensive and informative feature representations, potentially leading to improved predictive performance.
13 | One significant privacy concern is the potential leakage of sensitive information during the federated transfer learning process. When data from different organizations are combined to train a shared model, there is a risk of exposing sensitive information about the organizations, their software development practices, or their customers, including proprietary or confidential information, intellectual property, and customer data that organizations may not want to share. Another concern is the potential violation of data privacy regulations or legal requirements: organizations may be subject to data protection laws such as the General Data Protection Regulation (GDPR) in the European Union, which impose strict rules on the collection, storage, and processing of personal data. Federated transfer learning may involve transferring data across organizational boundaries, which can raise compliance issues with these laws, especially if the data used for training the shared model contain personal or sensitive information.
14 | Designing and training federated transfer learning models for software defect prediction requires careful consideration of various best practices to ensure effective performance and deployment in real-world scenarios. These include careful selection of participating organizations, thorough data preprocessing, appropriate transfer learning techniques, efficient and secure model training, and considerations for real-world deployment. Adhering to these best practices can help ensure the effectiveness and practicality of federated transfer learning models for software defect prediction in real-world scenarios.

TABLE - 5 (Papers corresponding to Research Questions)

S.No. | Research Question | S.No. of Papers that correspond to that Question
1 | What is the effectiveness of federated transfer learning in predicting software defects across multiple organizations with varying data distributions and privacy constraints? | 7, 10, 13
2 | What is the impact of a defect prediction approach that utilizes federated transfer learning and knowledge distillation in improving the performance of software defect prediction models? | 1, 2
3 | What is the current state of research on deep transfer learning for defect prediction, and how effective is it compared to traditional defect prediction models? | 3, 6, 9
4 | What is the effectiveness of feature-based transfer learning in predicting software defects across multiple software projects? | (not legible in the source)
5 | What is the effectiveness of transfer learning in predicting software defects, and how does it compare to traditional machine learning techniques? | 1-15
6 | What is the impact of data distribution across organizations on the accuracy of federated transfer learning for software defect prediction? | 7, 8, 10, 13
7 | What are the optimal strategies for selecting and aggregating data from multiple organizations in federated transfer learning for software defect prediction, considering varying data distributions and privacy constraints? | 14
8 | What are the current state-of-the-art deep learning techniques for software defect prediction, and how do they compare in terms of their effectiveness and limitations? | 2, 6, 7, 8, 9, 13
9 | What are the current trends, techniques, and challenges in software defect prediction using deep learning? | 3, 8, 9, 12
10 | What are the most effective transfer learning techniques, such as fine-tuning, feature extraction, or model adaptation, for software defect prediction in a federated transfer learning setting? | 1, 4, 7, 14, 15
11 | What is the impact of homogeneous transfer learning on defect prediction in software development? | 11
12 | What are the challenges and limitations of federated transfer learning for software defect prediction, such as communication efficiency, model convergence, and privacy concerns, and how can they be addressed? | 1, 2, 5, 6, 9, 10
13 | What are the privacy implications of using federated transfer learning for software defect prediction, and how can these concerns be addressed? | 2, 6, 13
14 | What are the best practices for designing and training federated transfer learning models for software defect prediction, and how can these models be effectively deployed in real-world scenarios? | 1-15

LEARNING
The research question is written so that it outlines various aspects of the study, including the population and variables to be studied and the problem the study addresses.

EXPERIMENT - 03

AIM
Write a program to perform an exploratory analysis of the dataset.

THEORY
Exploratory Data Analysis (EDA) is an approach to data analysis using visual techniques. It is used to discover trends and patterns, or to check assumptions, with the help of statistical summaries and graphical representations.

CODE AND OUTPUTS

1. Read the dataset and print the first five rows using the head() function:

import pandas as pd
df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()

2. Find the shape of the data using the shape attribute:

df.shape
# (10885, 22)

3. The describe() function applies basic statistical computations (count, mean, standard deviation, quartiles) to the dataset:

df.describe()

4. Use the info() method to learn about the columns and their data types:

df.info()
# RangeIndex: 10885 entries, 0 to 10884
# 22 columns: McCabe and Halstead metrics (loc, v(g), ev(g), iv(g), n, v, l, d, i, e, b, t),
# line counts (lOCode, lOComment, lOBlank, locCodeAndComment),
# operator/operand counts (uniq_Op, uniq_Opnd, total_Op, total_Opnd), branchCount,
# and the boolean target column defects
# dtypes: bool(1), float64(12), int64(4), object(5); memory usage: 1.8+ MB

5. Check whether there are any missing values in the dataset:

df.isnull().sum()

Data visualization
It is the process of analysing data in the form of graphs or maps, which makes it much easier to understand the trends and patterns in the data. There are various types of visualization: univariate, bivariate and multivariate analysis.

Histogram: can be used for both univariate and bivariate analysis. (Output: histogram of the defects label.)
Boxplot: can also be used for univariate and bivariate analyses. (Output: boxplot of a Halstead metric, with values ranging up to roughly 20000.)
Scatter plot: can be used for bivariate analyses. (Output: scatter plot of two metric columns; see the sketch below.)
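The plotting cells above survive only as screenshots. Below is a minimal sketch of equivalent calls, assuming the jm1 columns used earlier (defects, v, n); the exact styling of the original figures is not recoverable.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')

# univariate: histogram of the class label
sns.histplot(x='defects', data=df)
plt.show()

# univariate: boxplot of the Halstead volume metric
sns.boxplot(x='v', data=df)
plt.show()

# bivariate: scatter plot of program length against volume
plt.scatter(df['n'], df['v'], s=3)
plt.xlabel('n (program length)')
plt.ylabel('v (volume)')
plt.show()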
Handling outliers
An outlier is a data item/object that deviates significantly from the rest of the (so-called normal) objects.

import seaborn as sns
sns.boxplot(x='v', data=df)   # box plot exposing extreme values of the Halstead volume metric
# (the original notebook also draws a distribution plot of the same column on a (30, 10) figure)

EXPERIMENT - 04

AIM
Perform feature reduction techniques on the collected dataset.

The main focus of this kernel is the RReliefF algorithm, but some time is first spent on data preprocessing to make the job easier. (Output: the normalised feature matrix, with every value scaled into [0, 1], e.g. rows such as [0.894, 0.364, 0.626, 0.455, 0.716, 0.672].)

LEARNING
The following are key learnings for performing feature reduction techniques on a collected dataset. First, correlation-based feature evaluation helps identify redundant or highly correlated features that can potentially be removed. Second, Relief-family attribute evaluation, using algorithms like ReliefF or SURF, assesses the relevance of features based on their contribution to the prediction task. Third, information-gain feature evaluation measures the predictive power of features using entropy or information gain. Lastly, Principal Component Analysis (PCA) can effectively reduce dimensionality by projecting the dataset onto a lower-dimensional space while retaining the most important structure. Experimenting with different techniques and selecting the most appropriate one for the specific dataset and prediction task is crucial for successful feature reduction; a sketch of two of these techniques follows.
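None of the feature-reduction code survives the scan. The sketch below illustrates two of the techniques named in the learning above, a correlation filter and PCA, on the jm1 metrics; the 0.9 correlation threshold and the 95% variance target are illustrative choices, not values from the original experiment.

import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.decomposition import PCA

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')

# coerce everything to numeric and drop columns that contain unparsable entries
X = df.drop('defects', axis=1).apply(pd.to_numeric, errors='coerce').dropna(axis=1)

# 1. correlation-based filter: drop one of every pair of features with |r| > 0.9
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
X_filtered = X.drop(columns=to_drop)

# 2. PCA on the normalised features, keeping 95% of the variance
X_scaled = MinMaxScaler().fit_transform(X_filtered)
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)
print('kept', X_reduced.shape[1], 'components out of', X_filtered.shape[1], 'features')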
EXPERIMENT - 05

AIM
Develop a machine learning model for the selected topic (minimum 10 datasets and 10 techniques).

THEORY
SVM: a support vector machine is a supervised learning algorithm that performs classification or regression of data groups.
Logistic Regression: a supervised classification algorithm used to predict the probability of a target variable.
Naive Bayes: a supervised learning algorithm based on Bayes' theorem, used for solving classification problems.
Decision Tree: a tree-structured classifier in which internal nodes represent the features of a dataset, branches represent decision rules, and each leaf node represents an outcome.
Random Forest: a classifier that trains a number of decision trees on various subsets of the given dataset and averages them to improve predictive accuracy.
XGBoost: makes use of fast parallel prefix-sum operations to scan through all possible splits, as well as parallel radix sorting to repartition data.
KNN: a non-parametric, supervised classifier that uses proximity to make classifications or predictions about the grouping of an individual data point.
LSTM: a variety of recurrent neural network (RNN) capable of learning long-term dependencies, especially in sequence prediction problems.
CatBoost: an algorithm for gradient boosting on decision trees.
ANN: an artificial neural network is an attempt to simulate the network of neurons that make up a human brain, so that the computer can learn and make decisions.

CODE AND OUTPUT

1. Dataset: ant-1.3

# importing the libraries
import pandas as pd
import numpy as np
import seaborn as sns

# loading the dataset into `data` (the loading cell survives only as a screenshot)

# training/test split
x = data.drop(['bug'], axis=1)
y = data['bug']
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=5)

LOGISTIC REGRESSION
(The logistic regression cell and its accuracy output are not legible in the scan.)

SVM

from sklearn import svm
from sklearn.metrics import accuracy_score

classifier = svm.SVC(kernel='linear', gamma='auto')
classifier.fit(x_train, y_train)
y_predict = classifier.predict(x_test)

# calculate and print the accuracy of the model
accuracy = accuracy_score(y_test, y_predict)
print('Accuracy:', accuracy)
# Accuracy: 0.84

NAIVE BAYES

from sklearn.naive_bayes import GaussianNB

gnb = GaussianNB()
# train the Naive Bayes classifier using the training data
gnb.fit(x_train, y_train)
# make predictions on the testing data
y_pred = gnb.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)
# Accuracy: 0.88

DECISION TREE

from sklearn.tree import DecisionTreeClassifier

# create a Decision Tree classifier object and train it
dtc = DecisionTreeClassifier()
dtc.fit(x_train, y_train)
y_pred = dtc.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)
# Accuracy: 0.8

RANDOM FOREST
(A Random Forest classifier object is created, fitted and used for prediction; the cell and its accuracy output are only partially legible in the scan.)

XGBOOST

import xgboost as xgb

# convert data into DMatrix format
dtrain = xgb.DMatrix(x_train, label=y_train)
dtest = xgb.DMatrix(x_test, label=y_test)

params = {'max_depth': 3, 'objective': 'multi:softmax', 'num_class': 3}
num_rounds = 100
xgb_model = xgb.train(params, dtrain, num_rounds)

# make predictions on the testing data
y_pred = xgb_model.predict(dtest)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)
# Accuracy: 0.76

KNN

from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=5)
# train the KNN classifier using the training data
knn.fit(x_train, y_train)
y_pred = knn.predict(x_test)
accuracy = accuracy_score(y_test, y_pred)
print('Accuracy:', accuracy)
# Accuracy: 0.8

CATBOOST
(The CatBoost cell is not legible in the scan; a reconstructed sketch follows.)
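A minimal sketch of what the missing CatBoost cell typically looks like, reusing the x_train/y_train split above; the iteration count and learning rate are assumed values, not those of the original run.

from catboost import CatBoostClassifier
from sklearn.metrics import accuracy_score

# gradient boosting on decision trees
cb = CatBoostClassifier(iterations=200, learning_rate=0.1, verbose=0)
cb.fit(x_train, y_train)

# evaluate on the held-out split
y_pred = cb.predict(x_test)
print('Accuracy:', accuracy_score(y_test, y_pred))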
ANN

from keras.models import Sequential
from keras.layers import Dense

# load the dataset
# df = pd.read_csv('dataset.csv')
# X = df.drop(columns=['output'])
# y = df['output']

model = Sequential()
model.add(Dense(10, input_dim=x.shape[1], activation='relu'))
model.add(Dense(1, activation='linear'))

# compile the model
model.compile(loss='mse', optimizer='adam', metrics=['mae'])

# train the model
model.fit(x, y, epochs=100, batch_size=32)

# make predictions and evaluate
y_pred = model.predict(x)
mse, mae = model.evaluate(x, y)
print('MSE:', mse)
print('MAE:', mae)

Output (training log, abridged): the loss falls from about 42228 at epoch 2 to about 55.4 at epoch 100.
MSE: 55.4179573059082
MAE: 3.4683432579040527

2. Dataset: ant-1.4

# importing the libraries
import pandas as pd
import numpy as np
import seaborn as sns

(The same pipeline is repeated for the remaining datasets; those pages are not legible in the scan. A compact harness for running all techniques over all datasets is sketched below.)
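The aim calls for ten techniques over ten datasets, but only the first dataset's cells survive. The harness below sketches how the scikit-learn models above can be looped over each dataset; the file names are hypothetical placeholders for the ten defect datasets.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Naive Bayes': GaussianNB(),
    'Decision Tree': DecisionTreeClassifier(),
    'Random Forest': RandomForestClassifier(),
    'KNN': KNeighborsClassifier(n_neighbors=5),
}

for path in ['ant-1.3.csv', 'ant-1.4.csv']:   # hypothetical names; extend to all ten datasets
    data = pd.read_csv(path)
    x = data.drop(['bug'], axis=1)
    y = data['bug']
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=5)
    for name, model in models.items():
        model.fit(x_train, y_train)
        acc = accuracy_score(y_test, model.predict(x_test))
        print(f'{path} {name}: {acc:.3f}')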
EXPERIMENT - 06

AIM
Consider the model developed in Experiment 5 and:
1. State the hypothesis.
2. Formulate an analysis plan.
3. Analyse the sample data.
4. Interpret results.
5. Estimate type-I and type-II error.

INTRODUCTION
State the hypothesis: the hypothesis is a statement or assumption that is being tested using a machine learning model. In machine learning, the hypothesis is usually framed as a predictive model that maps input variables to output variables.
Formulate an analysis plan: the analysis plan outlines the steps that will be taken to test the hypothesis. This includes selecting a suitable machine learning algorithm, collecting and preparing the data, training and testing the model, and evaluating its performance. The plan should also specify any statistical tests or metrics that will be used to assess the model's accuracy.
Analyse the sample data: the sample data is used to train and test the machine learning model. This involves feeding the input variables into the model and comparing the predicted output to the actual output.
Interpret results: the results of the analysis are used to draw conclusions about the hypothesis being tested. If the model performs well on the sample data, it may be considered a good predictor of the outcome variable.
Estimate type-I and type-II error: a type-I error, also known as a false positive, occurs when the model incorrectly predicts a positive outcome when the actual outcome is negative. A type-II error, also known as a false negative, occurs when the model incorrectly predicts a negative outcome when the actual outcome is positive.

OUTPUT

1. State the hypothesis.
- The linguistic and contextual features of news articles can be used to predict whether an article is likely to contain false information.
- Machine learning models trained on this dataset can accurately classify news articles as true or false based on their content and metadata.
- A supervised learning approach that utilizes multiple types of features, such as linguistic features (e.g., sentiment analysis, part-of-speech tagging) and contextual features (e.g., source credibility, temporal and social signals), can lead to an accurate and robust fake-news detection system.

2. Formulate an analysis plan.
The analysis plan consists of the following steps:
1. Importing the dataset: the data analysis pipeline begins with the import or creation of a working dataset; the exploratory analysis phase begins immediately after. Importing a dataset is simple with pandas, through functions dedicated to reading the data.
2. Analysis of the dataset: an especially important activity in the routine of a data analyst or scientist. It enables an in-depth understanding of the dataset, defines or discards hypotheses, and creates predictive models on a solid basis. It uses data manipulation techniques and several statistical tools to describe and understand the relationships between variables and how these can impact the business.
3. Understanding the variables: whereas the previous step describes the dataset in its entirety, this step accurately describes all the variables of interest; it can therefore also be called bivariate analysis.
4. Modelling: at the end of the process, a business report can be consolidated or the data modelling phase continued. Logistic Regression, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting, and Support Vector Machine are used for modelling the dataset.
5. Interpreting the results: the results of the analysis are used to draw conclusions about the hypothesis being tested. If the model performs well on the sample data, it may be considered a good predictor of the outcome variable; however, its accuracy may need to be validated on new, unseen data to ensure that it is generalizable.

3. Analyse the sample data.

Importing the working dataset (a fake-news dataset with engagement counts and a Rating label) and cleaning the data:

numeric_features = ['share_count', 'reaction_count', 'comment_count']
categorical_features = ['Category', 'Page', 'Post Type', 'Debate']
target = data['Rating']

Bivariate analysis (numerical-categorical):

data.groupby('Category')['share_count'].mean()
# (output: the mean share count per Category)

Missing values:

from sklearn.impute import SimpleImputer

imputer = SimpleImputer(strategy='median')
numeric_data = imputer.fit_transform(data[numeric_features])

imputer = SimpleImputer(strategy='most_frequent')
categorical_data = imputer.fit_transform(data[categorical_features])

(The modelling cells that follow are not legible in the scan; a reconstructed sketch is given below.)
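A sketch of the modelling step described in the plan, reusing the imputed arrays above; the encoder settings and the choice of logistic regression as the representative model are assumptions, since the original modelling cells are not legible.

import numpy as np
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# one-hot encode the imputed categorical features and stack them with the numeric ones
encoder = OneHotEncoder(handle_unknown='ignore', sparse_output=False)  # use sparse=False on scikit-learn < 1.2
encoded = encoder.fit_transform(categorical_data)
features = np.hstack([numeric_data, encoded])

x_train, x_test, y_train, y_test = train_test_split(features, target, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000)
clf.fit(x_train, y_train)
print('Accuracy:', accuracy_score(y_test, clf.predict(x_test)))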
4. Interpret the results.

For the dataset:

S.No. | Model | Performance
1 | Logistic Regression | Accuracy = 0.76367
2 | Decision Tree Classifier | Accuracy = 0.71553
3 | Random Forest Classifier | Accuracy = 0.77242
4 | Gradient Boosting | Accuracy = 0.76367
5 | Support Vector Machine | Accuracy = 0.739606

Logistic Regression: accuracy 0.7637; the classification report is not legible in the scan.

Decision Tree Classifier: accuracy 0.7155. The classification report covers the four classes (mixture of true and false, mostly false, mostly true, no factual content; support 55, 22, 333, 47 of 457), with "mostly true" dominating at an f1-score of about 0.84 and the minority classes scoring far lower.

Random Forest Classifier: accuracy 0.7724; the report has the same structure, with per-class figures only partially legible in the scan.

Gradient Boosting: accuracy 0.7724.

Classification report (precision / recall / f1-score, support):
mixture of true and false: 0.38 / 0.16 / 0.23 (55)
mostly false: 0.43 / 0.14 / 0.21 (22)
mostly true: 0.81 / 0.94 / 0.87 (333)
no factual content: 0.72 / 0.60 / 0.65 (47)
accuracy: 0.77 (457)
macro avg: 0.58 / 0.46 / 0.49 (457)
weighted avg: 0.73 / 0.77 / 0.74 (457)

SVM: accuracy 0.7396; the report is not legible in the scan.

LEARNING
A Type I error is a false positive conclusion, while a Type II error is a false negative conclusion; a sketch for estimating both rates from a confusion matrix follows.
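Item 5 of the aim, estimating the type-I and type-II error, has no surviving output. The sketch below shows how both rates can be read off a confusion matrix once the four-class label has been binarised (here against the majority class "mostly true", an illustrative choice), assuming the fitted classifier clf from the sketch above.

from sklearn.metrics import confusion_matrix

# binarise the label: "mostly true" vs. everything else
y_true = (y_test == 'mostly true')
y_pred = (clf.predict(x_test) == 'mostly true')

# for boolean labels, ravel() yields tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

type_1_error = fp / (fp + tn)   # false positive rate
type_2_error = fn / (fn + tp)   # false negative rate
print('Type-I error rate :', type_1_error)
print('Type-II error rate:', type_2_error)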
EXPERIMENT - 07

AIM
Write a program to implement the t-test.

THEORY
A t-test is a type of inferential statistic used to determine whether there is a significant difference between the means of two groups, which may be related through certain features. There are three types of t-tests, categorized as dependent and independent t-tests:
1. Independent samples t-test: compares the means of two groups.
2. Paired sample t-test: compares means from the same group at different times (say, one year apart).
3. One sample t-test: tests the mean of a single group against a known mean.

CODE AND OUTPUTS

1. Importing required libraries:

import string
import pandas as pd
from scipy import stats
from wordcloud import WordCloud, STOPWORDS

2. Loading the dataset:

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()

3. Information about the dataset: df.info() reports the 10885 rows and 22 metric columns described in Experiment 3.

4. Selecting features:

a = df['n']
b = df['v']

5. Performing the test:

t2 = stats.ttest_ind(a, b)
# Ttest_indResult(statistic=-29.053826112647537, pvalue=5.747783245375447e-182)

Observation: the p-value is small (less than 0.05) for all the features, hence the null hypothesis is rejected, which implies the group means are not the same for all categories.

Null hypothesis: the difference in mean title length between fake news and real news is 0.
Alternate hypothesis: the difference in mean title length between fake news and real news is not 0.

OBSERVATION
We observe a statistically significant difference (p-value = 0.01583) between the title lengths of real and fake news. The title length of fake news is slightly larger than that of real news: the fake-news title-length distribution is centred at a mean of 7.83, while the real-news distribution is slightly right-skewed with a mean of 7.02. The t-test thus gives evidence that real-news titles are significantly shorter than fake-news titles.

LEARNING
Key learnings for implementing the t-test include understanding its applications in statistical hypothesis testing, considering assumptions such as normality and homogeneity of variances, implementing the test in a programming language or statistical software, interpreting results including p-values and confidence intervals, and considering sample size, power analysis, and effect size for appropriate interpretation and decision-making.

EXPERIMENT - 08

AIM
Write a program to implement the chi-square test.

THEORY
One of the primary tasks involved in any supervised machine learning venture is to select the best features from the given dataset to obtain the best results. One way to select these features is the chi-square test. Mathematically, a chi-square test is done on two distributions to determine the level of similarity of their respective variances. In its null hypothesis, it assumes that the given distributions are independent. This test can thus be used to determine the best features for a given dataset by determining the features on which the output class label is most dependent. It involves the use of a contingency table: a contingency table (also called a crosstab) is used in statistics to summarise the relationship between several categorical variables.

CODE AND OUTPUTS

Null hypothesis: there is no relation between News_Type and Article_Subject.
Alternate hypothesis: there is a significant relation between News_Type and Article_Subject.

1. Importing required libraries:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

2. Loading the dataset:

df = pd.read_csv('/content/drive/MyDrive/Colab Notebooks/jm1.csv.xls')
df.head()

3. Information about the dataset: df.info().

(The cells with the actual chi-square computation are missing from the scan; a reconstructed example follows.)
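A reconstructed example under the stated hypotheses. The dataframe below is a small placeholder standing in for the fake-news data, since the original file is not identified in the surviving cells; chi2_contingency from scipy performs the test on the crosstab.

import pandas as pd
from scipy.stats import chi2_contingency

# placeholder data with the two categorical variables from the hypotheses
news = pd.DataFrame({
    'News_Type': ['fake', 'real', 'fake', 'real', 'fake', 'real', 'fake', 'real'],
    'Article_Subject': ['politics', 'politics', 'health', 'sports', 'politics', 'health', 'health', 'sports'],
})

# contingency table (crosstab) of the two variables
table = pd.crosstab(news['News_Type'], news['Article_Subject'])

chi2, p_value, dof, expected = chi2_contingency(table)
print('chi-square statistic:', chi2)
print('degrees of freedom:', dof)
print('p-value:', p_value)
# reject the null hypothesis of independence when p_value < 0.05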
We can use this when: © Differences between the pairs of data are non-normally distributed. ‘© Independent pairs of data are identical. (or matched) CODE AND OUTPUTS ‘© Null Hypothesis: The groups - title length of fake news and title length of real news are identically distributed. © Alternate Hypothesis: The groups - ttle length of fake news and title length of real news are not identically distributed. 1. Importing required Libraries import pandas as pd import nui y as np import seaborn as s import matplotlib.pyplot as plt plt.style.use(‘default') 2. Loading the dataset at = pd-read_esv(' /content /drive/Mybrive/Colab Notebooks /jml.csv-x1s") af.nead() 31 3. Information about the dataset ove 10685 non-nsi 3 Ss) 40888 sonst 17 unta_oond 10685 non-null object 19 total_oend foes nonnull bjest 20 branchcount 10685 non-null ebject stypess bool (1), ttoats4(i2), imt6e4), Obsect(5) 4, Wilcoxon Signed Rank Test import pandas es pd fron seipy.stats inport wileoxon 4 Load the dataset from @ csv file data = pd.read_cev( /content/drive/MyDrive/Colab Notebooks/‘nl.csv.:18") # perform the Wilcoxon eigned- stat, pvalue = wilcoxon(data{'n'}, data{'v'}) ke teat # Print the results print ("Wilcoxon signed-rank statistic: print(‘pevalue:', pvalue) stat) Wilcoxon signed-rank statistic: 253.0 pevalue: 0.0 LEARNING Key leamings for implementing the Wilcoxon Signed Rank test in a program include understanding its applications in non-parametric statistical analysis, familiarity with assumptions and requirements such as paired data and ordinal or continuous variables, implementation in a programming language or statistical software, interpretation of results including test statisti, p= values, and confidence intervals, and consideration of appropriate use and limitations of the Wilcoxon Signed Rank test for comparing paired data and valid interpretation of results. 92 AIM ‘Write a program to implement the Nemenyi test. THEORY The Friedman Test is used to find whether there exists a significant difference between the means of more than two groups. In such groups, the same subjects show up in each group. If the p-value of the Friedman test turns out to be statistically significant then we can conduct the Nemenyi test to find exactly which groups are different. This test is also known as Nemenyi posthoc test. CODE AND OUTPUTS ‘© Null Hypothesis: There is no significant difference in the score values © Alternate Hypothesis: At least 2 values differ from one another. 1. Importing required Libraries import pandas as pd import nump: as np import seaborn as sns import matplotlib.p' Lot as plt plt.style.use(‘default') 2. Loading the dataset at = pd-read_esv(' /content /drive/Mybrive/Colab Notebooks /jml.csv.x1s") af.nead() 3. Information about the dataset 93 4, Friedman Test import pandas a= pa . from a csv file 1¥(" content rive /Myorive/Colab Notebooks/ jmlcev-x18") # toad the data: ace = pe.read < # vertorm the Friedman test friednan stat, pvalue = friednanchieguare(dstal ‘n’], daes{'v'], datal’s']) print("Friedsan statistic:', friedsan_stat) prine(‘p-value:', p_value) Friedean statintic: 10972.614517524018 5. Nemenyi Test d=np-array({sample1, sample2, sample3]) sp.posthoc_nemenyi_friedman(d.T) OBSERVATION © From the outputs received, we reject the Null hypothesi ‘® From the output table we can clearly conclude that the 2 groups to have statistically significantly different means are Group 1 and Group 2. 
LEARNING Key learnings for implementing the Nemenyi test in a program include understanding its applications in posthoc analysis of multiple comparison tests, familiarity with requirements and assumptions such as ranked or continuous data and multiple group comparisons, implementation in a programming language or statistical software, interpretation of results including critical difference values and significance levels, and consideration of appropriate use and limitations of the Nemenyi test for posthoc analysis and valid interpretation of results in the context of statistical hypothesis testing. 94 AIM ‘Write down the threats to validity observed in the experiment conducted for models THEORY In empirical software engineering experiments, especially in machine learning-based defect prediction, i's crucial to assess and record threats to validity. These threats impact the reliability, reproducibility, and generalizability of the results. Federated Transfer Learning (FTL) combines federated learning (training models collaboratively without sharing raw data) and transfer learning (using knowledge from a source domain to improve learning in a target domain). While FTL enhances privacy and allows knowledge sharing across datasets, it introduces specific challenges that must be analyzed for validity. The threats to validity are typically categorized into four types: Internal Validity — Whether observed effects are caused by the experiment itself and not external factors. External Validity — How generalizable the findings are to other contexts or datasets. Construet Validity — Whether the experiment truly measures what it claims to measu Conelusion Validity — How accurately conclusions are drawn based on statistical evidence. OBSERVATION Threats to Validity Identified: 1. Internal Validity: © Data preprocessing differences across federated nodes. © Imbalanced class distribution affecting defect prediction. ‘© Inconsistent feature representation among clients, 2. External Validi © Limited to open-source datasets (¢.g., NASA, PROMISE). © Inability to generalize findings to industrial, closed-source software, 3. Construct Validity: © Using “accuracy” as the sole performance metric can be misleading, 95, © Defect definitions differ across datasets, impacting label consistency. 4, Conclusion Validity: Small number of datasets (clients) reduces statistical power. o Lack of statistical testing like t-test, Nemenyi test, etc., weakens confidence in model comparisons, LEARNING Key learnings for identifying and analyzing threats to validity in machine learning experiments. In the context of Federated Transfer Learning for software defect prediction, special care must be taken to ensure consistency, statistical robustness, and generalizability. Addressing these threats improves the reliability and impact of the experimental results. 96 EXPER! ENT AIM Explore tools such as SPSS, KEIL, PYTHON, and R. THEORY Modern machine learning and statistical modeling require robust tools and environments. For this experiment, the focus is on exploring the following tools and understanding how they can be used for analyzing software defect data and building prediction models 1. PYTHON Python is a powerful open-source programming language widely used in data science, machine learning, and AL. + Relevance to Experiment: Python was the main language used for implementing federated learning and defect prediction models. 
Libraries such as: scikit-learn for machine learning models pandas and numpy for data manipulation © matplotlib and seaborn for visualization © tensorflow-federated (TFF) or Flower for federated learning frameworks + Key Features: © Easy integration with data pipelines © Vast community support © Can handle both statistical and deep learning models © Good support for transfer learning using Keras or PyTorch 2. SPSS (Statistical Package for the Social Sciences) SPSS is a GUL-based statistical analysis tool used for hypothesis testing, regression analysis, and data visualization. + Relevance to Experiment: Useful for performing statistical tests such as: © test (one-sample, paired, independent) chi-square test 97 2 ANOVA, Friedman, Wilcoxon, Nemenyi tests + Advantages: © User-friendly interface © Built-in functions for descriptive statistics and validity testing © Suitable for researchers unfamiliar with programming 3. R Programming Language R is a language specifically designed for statistical computing and data visualization. + Relevance to Experiment: © Can be used to validate machine learning results through statistical tests © Ideal for plotting ROC curves, boxplots, or confusion matrices © Packages like caret, ©1071, and mir support classification tasks + Strengths: © Rich set of statistical packages © Excellent data visualization with ggplot2 © Preferred in academic research for statistical modeling 4. KEIL KEIL is primarily used for embedded system development, especially microcontroller programming. + Relevance to Experiment: Not directly related to software defect prediction in federated learning, but: © Can be explored when analyzing embedded software defects or real-time systems. ‘© Useful for low-level software testing and debugging in hardware-oriented projects. LEARNING Leamed that combining tools strengthens the overall research workflow and improves the reliability and explainability of results. and realized that statistical tests performed in SPSS/R complement the performance metrics derived in Python, helping validate the learning model’s effectiveness more rigorously. 98
