Decision Tree for Regression in R Programming
Last Updated: 30 Jun, 2025

A decision tree is a machine learning algorithm used to predict continuous values (regression) or categories (classification). In regression, a decision tree is used when the dependent variable is continuous, for example predicting a car's mileage from its engine power.

Working of the Decision Tree Algorithm for Regression

A decision tree for regression works by recursively partitioning the dataset into subsets based on feature values. The goal is to minimize the variance of the target variable (dependent variable) within each subset.

1. Splitting the Data

At each node, the decision tree selects the feature X_j and a split point s_j such that the sum of variances within the resulting subsets is minimized. The variance of the target variable Y within a subset is calculated as:

\text{Var}(Y|X_j, s_j) = \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{y})^2

Where:
- y_i is the actual target value of the i-th sample,
- \bar{y} is the mean of the y values in that subset,
- N is the number of samples in the subset.

The split at each node aims to find the feature X_j and threshold s_j that minimize the total (weighted) variance across the two child nodes:

\text{Total Variance} = \frac{N_L}{N} \text{Var}(Y|X_j, s_j)_L + \frac{N_R}{N} \text{Var}(Y|X_j, s_j)_R

Where:
- N_L and N_R are the numbers of samples in the left and right child nodes,
- \text{Var}(Y|X_j, s_j)_L and \text{Var}(Y|X_j, s_j)_R are the variances in the left and right child nodes.

(A short R sketch after the model-fitting step below illustrates this calculation on the mtcars data.)

2. Leaf Node Prediction

Once the data has been split sufficiently, each leaf node contains a subset of the data, and the prediction for that subset is the mean value of the target variable Y:

\hat{y} = \frac{1}{N_{\text{leaf}}} \sum_{i=1}^{N_{\text{leaf}}} y_i

Where:
- \hat{y} is the predicted value for all data points in that leaf,
- N_{\text{leaf}} is the number of samples in that leaf.

3. Making Predictions

To predict a new data point \mathbf{x}, the decision tree follows the splits from the root down to the appropriate leaf node. Once in the leaf, the prediction is the mean of the target values in that node:

\hat{y}_{\text{new}} = \frac{1}{N_{\text{leaf}}} \sum_{i=1}^{N_{\text{leaf}}} y_i

Where:
- N_{\text{leaf}} is the number of samples in the leaf that the new data point \mathbf{x} falls into.

Implementation of Decision Tree for Regression in R

We will now demonstrate how to predict mpg (miles per gallon) with a regression decision tree.

1. Installing and Loading the Required Package

We install the rpart package, which contains the functions needed for decision tree regression.

```r
install.packages("rpart")
library(rpart)
```

2. Loading the Dataset

We load the mtcars dataset, which is built into R, and use the head() function to display its first few rows.

```r
data(mtcars)
head(mtcars)
```

Output: the first six rows of the mtcars dataset.

3. Fitting the Model

We create a regression decision tree that predicts mpg from disp, hp and cyl using the rpart() function.

- mpg ~ disp + hp + cyl: the model formula, where mpg is the dependent variable and disp, hp and cyl are the independent variables.
- method = "anova": specifies that we are performing regression, where the algorithm minimizes variance to predict continuous values.
- data = mtcars: the dataset used to build the model, in this case mtcars.

```r
fit <- rpart(mpg ~ disp + hp + cyl, method = "anova", data = mtcars)
```
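To make the splitting criterion above concrete, here is a minimal sketch that computes the weighted child-node variance for one candidate split of mtcars by hand. The threshold of 200 on disp and the helper pop_var() are illustrative choices for this example, not part of rpart's internals.

```r
# Illustrative only: weighted variance for one candidate split of mtcars
# on disp at an arbitrary threshold of 200 (not a value chosen by rpart).
data(mtcars)

pop_var <- function(y) mean((y - mean(y))^2)  # variance with 1/N, as in the formula above

threshold <- 200
left  <- mtcars$mpg[mtcars$disp <  threshold]   # left child: disp below the threshold
right <- mtcars$mpg[mtcars$disp >= threshold]   # right child: disp at or above it

n <- nrow(mtcars)
weighted_var <- (length(left) / n) * pop_var(left) +
                (length(right) / n) * pop_var(right)

pop_var(mtcars$mpg)  # variance of mpg before the split
weighted_var         # weighted variance after the split; a good split makes this smaller
```

rpart() repeats this comparison over every feature and candidate threshold at each node and keeps the split with the lowest weighted variance.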
4. Plotting the Decision Tree

We visualize the decision tree by plotting it and saving it as a PNG image.

- png(): opens a PNG device so the plot is saved to a file.
- plot(): draws a visual representation of the decision tree.
- text(): adds text labels to the plot, including the number of observations at each node.
- dev.off(): closes the plotting device, saving the image.

```r
png(file = "decTree2GFG.png", width = 600, height = 600)
plot(fit, uniform = TRUE, main = "MPG Decision Tree using Regression")
text(fit, use.n = TRUE, cex = .9)
dev.off()
```

Output: the plotted decision tree structure.

5. Printing the Decision Tree Model

We print the decision tree model to inspect the splits and other details.

- print(fit): displays the decision tree structure, showing how the data was split at each node, the resulting subsets and the final predictions.

```r
print(fit)
```

Output: the text summary of the fitted decision tree.

6. Predicting MPG with Test Data

We create a small test dataset and use the decision tree model to predict the mpg value for a new set of inputs.

- df: a new data frame containing the input values for which we want to predict mpg.
- predict(): uses the trained model (fit) to predict the mpg value for the new data; the regression method was already fixed when the model was fitted with method = "anova", so predict() needs no extra arguments here.

```r
df <- data.frame(disp = 351, hp = 250, cyl = 8)
cat("Predicted value:\n")
predict(fit, df)
```

Output:

Predicted value:
13.41429

Advantages of Decision Trees

- Considers every candidate split: at each node the algorithm evaluates all features and thresholds, giving a systematic model for predictions.
- Easy to use: simple to implement and interpret for both classification and regression tasks.
- Handles missing values: decision trees can manage datasets with missing values without requiring special preprocessing.

Disadvantages of Decision Trees

- Computationally intensive: for large datasets, growing a decision tree can be slow.
- Limited learning capability: a single decision tree may not perform well on complex data. Techniques like Random Forests can be used for improved performance.
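As a quick follow-up to the prediction step, the sketch below (an addition to the walkthrough above; it assumes the fit object from step 3 is still in the workspace) compares the tree's in-sample predictions with the observed mpg values and reports a root mean squared error.

```r
# Rough in-sample check of the fitted tree (illustrative only; a proper
# evaluation would use held-out data or cross-validation).
pred <- predict(fit, mtcars)

# Root mean squared error of the predictions on the training data
rmse <- sqrt(mean((mtcars$mpg - pred)^2))
rmse

# Observed vs predicted mpg for the first few cars
head(data.frame(observed = mtcars$mpg, predicted = pred))
```

Because every observation in a leaf receives the same predicted value (the leaf mean), the predicted column contains only a handful of distinct values, one per leaf.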