Regression analysis is a predictive modeling technique used to investigate relationships between dependent and independent variables. It can be used for forecasting, time series modeling, and estimating causal effects. The dependent variable is the quantity being predicted from the independent variables. Independent variables are manipulated by researchers, and their effect on the dependent variable is then measured. For example, salary, age, and marital status can serve as independent variables for predicting cost of living, the dependent variable.
Unit 3 Big Data
Regression Modelling
• Regression analysis is a predictive modelling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables.
• This technique is used for forecasting, time series modelling and finding cause-and-effect relationships between the variables.
Dependent and Independent variables
• In data science, variables refer to the properties or characteristics of certain events or objects.
• There are two main types of variables in regression analysis: dependent variables and independent variables.
• Independent variables – These variables are manipulated or altered by researchers, and their effects are later measured and compared.
• They are also referred to as predictor variables.
• They are called predictor variables because they predict or forecast the values of dependent variables in a regression model.
• Dependent variables – These variables measure the effect of the independent variables on the testing units.
• In other words, their values depend on the independent variables.
• They are also referred to as predicted variables, because their values are predicted from the independent (predictor) variables.
• In data models, independent variables can have different names, such as “regressor”, “explanatory variable”, “input variable”, “controlled variable”, etc.
• Dependent variables, on the other hand, are called “regressand”, “response variable”, “measured variable”, “observed variable”, “responding variable”, “explained variable”, “outcome variable”, “experimental variable”, or “output variable”.
• Below are a few examples that illustrate the usage and significance of dependent and independent variables in a wider sense:
• Suppose you want to estimate the cost of living of a person using a regression model. In that case, you take factors such as salary, age, marital status, etc. as the independent variables. The cost of living of a person depends strongly on these factors, so it is designated as the dependent variable.
• Another scenario is a student's poor performance in an examination. The independent variables could be factors such as poor memory, inattentiveness in class, irregular attendance, etc. Since these factors affect the student's score, the dependent variable in this case is the student's score.
• Suppose you want to measure the effect of different quantities of nutrient intake on the growth of a newborn child. In that case, the amount of nutrient intake is the independent variable, and the dependent variable is the growth of the child, which can be measured by indicators such as height and weight.
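As a rough illustration of the nutrient intake example above, the following Python sketch fits a straight line by ordinary least squares, with nutrient intake as the independent variable and growth as the dependent variable. The numbers are invented for illustration and the function names (`fit_simple_linear_regression`, `predict`) are hypothetical; a single-predictor linear model is assumed here only because it is the simplest case, not because the slides prescribe it.

```python
# Hypothetical data, invented purely for illustration (arbitrary units).
nutrient_intake = [1.0, 2.0, 3.0, 4.0, 5.0]   # independent variable (x)
growth = [2.1, 3.9, 6.2, 7.8, 10.1]           # dependent variable (y)


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Predict the dependent variable for a new value of the independent variable."""
    return intercept + slope * x_new


intercept, slope = fit_simple_linear_regression(nutrient_intake, growth)
print(f"growth = {intercept:.2f} + {slope:.2f} * nutrient_intake")
print(f"predicted growth at intake 6.0: {predict(intercept, slope, 6.0):.2f}")
```

With several independent variables, as in the cost of living example (salary, age, marital status), the same idea extends to multiple regression, where each predictor gets its own coefficient.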
What is the difference between Regression and Classification?
• Regression and classification are both supervised learning methods, which means that they use labelled training datasets to train their models and make future predictions.
• Thus, these two methods are often grouped under the same category in machine learning.
• Supervised learning, also known as supervised machine learning, is a subcategory of machine learning and artificial intelligence. It is defined by its use of labelled datasets to train algorithms to classify data or predict outcomes accurately.
• However, the key difference between them is the output variable. In regression, the output is numerical or continuous, whereas in classification the output is categorical or discrete.
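The difference in output type can be sketched in Python with one small labelled dataset and two targets. The example below uses k-nearest neighbours only as a convenient stand-in for both tasks (the slides do not name an algorithm): the regression variant averages numeric scores, while the classification variant takes a majority vote over category labels. The data, the feature (hours studied), and all function names are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical labelled training data, invented for illustration.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]                      # input feature
scores = [35.0, 42.0, 50.0, 61.0, 70.0, 82.0]               # numeric target
results = ["fail", "fail", "pass", "pass", "pass", "pass"]  # categorical target


def nearest_indices(x_train, x_new, k):
    """Indices of the k training points closest to x_new."""
    order = sorted(range(len(x_train)), key=lambda i: abs(x_train[i] - x_new))
    return order[:k]


def knn_regress(x_train, y_train, x_new, k=3):
    """Regression: output is a number (mean of the neighbours' targets)."""
    idx = nearest_indices(x_train, x_new, k)
    return sum(y_train[i] for i in idx) / k


def knn_classify(x_train, labels, x_new, k=3):
    """Classification: output is a category (majority label among neighbours)."""
    idx = nearest_indices(x_train, x_new, k)
    return Counter(labels[i] for i in idx).most_common(1)[0][0]


predicted_score = knn_regress(hours, scores, 4.4)      # a continuous value
predicted_result = knn_classify(hours, results, 4.4)   # a discrete label
print(f"regression output: {predicted_score:.1f}")
print(f"classification output: {predicted_result}")
```

Both functions learn from the same labelled examples; only the type of the target, and therefore of the prediction, differs.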