Project 1

This project involves analyzing factors that influence housing prices using real estate data. The objective is to explore the relationships between housing prices and attributes like square footage, number of bedrooms, and neighborhood quality. Tasks include exploring the data, identifying correlated variables, building a regression model with house price as the dependent variable, and interpreting the results. Issues like omitted variable bias and heteroskedasticity are also to be discussed. Findings and recommendations will be presented to a real estate agency.

Uploaded by

aydanabbasova2

Project: Understanding Factors Influencing Housing Prices

Introduction:

In this project, you will take on the role of data analysts working for a real estate agency.
The agency is interested in gaining a deeper understanding of the factors that influence
housing prices within a specific city. Your analysis will contribute to more informed
property valuation and investment decisions.

Objective:

Your objective is to conduct a comprehensive analysis of housing prices using real-world
data. You will explore the relationships between housing prices and various
attributes, with an emphasis on factors that can influence the value of residential
properties.

Dataset Description:

You will be provided with a dataset containing information on houses within the city.
The dataset includes the following variables:

1. House Price: The price of the house. This is the dependent variable you'll be
analyzing.
2. Square Footage: The size of the house in square feet.
3. Number of Bedrooms: The number of bedrooms in the house.
4. Number of Bathrooms: The number of bathrooms in the house.
5. Neighborhood Quality: A rating of the neighborhood where the house is
located.
6. Year Built: The year the house was constructed.
7. Garage Presence: A binary variable indicating the presence or absence of a
garage.
8. Garage Size: The size or capacity of the garage.
9. Backyard Size: The size of the backyard in square feet.
10. School Rating: A rating of the nearest school's quality.
11. Distance to Work: The distance from the house to a common workplace
location.
12. Crime Rate: A rating of the crime rate in the neighborhood.

Project Tasks:

1. Data Exploration:
• Begin by exploring the dataset to understand its structure and key
statistics.
• Identify any missing values and decide on strategies for handling them.
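A minimal sketch of the exploration step, assuming the data has been loaded into a NumPy array. The dataset itself is not reproduced here, so the rows are synthetic and the column names (house_price, square_footage, num_bedrooms) are hypothetical stand-ins for three of the twelve variables. It prints summary statistics, counts missing values, and shows two common handling strategies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the provided dataset: 200 houses, 3 of the 12 variables.
n = 200
columns = ["house_price", "square_footage", "num_bedrooms"]
sqft = rng.normal(1800, 400, n)
beds = rng.integers(1, 6, n).astype(float)
price = 50_000 + 120 * sqft + 8_000 * beds + rng.normal(0, 20_000, n)
data = np.column_stack([price, sqft, beds])

# Blank out ten square-footage entries to imitate missing values.
data[rng.choice(n, size=10, replace=False), 1] = np.nan

# Structure and key statistics (nan-aware, so gaps do not distort the summary).
print("rows, columns:", data.shape)
for j, name in enumerate(columns):
    col = data[:, j]
    print(f"{name}: mean={np.nanmean(col):.1f} std={np.nanstd(col, ddof=1):.1f} "
          f"min={np.nanmin(col):.1f} max={np.nanmax(col):.1f}")

# Missing values per column.
missing = np.isnan(data).sum(axis=0)
print(dict(zip(columns, missing.tolist())))

# Strategy A: listwise deletion (drop any row that has a gap).
complete = data[~np.isnan(data).any(axis=1)]

# Strategy B: median imputation (fill each gap with its column median).
medians = np.nanmedian(data, axis=0)
imputed = np.where(np.isnan(data), medians, data)
```

Deletion keeps only fully observed rows at the cost of sample size; imputation keeps every row but shrinks the variance of the filled column. Which one is appropriate depends on how many values are missing and why.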
2. Correlation Analysis:
• Conduct a correlation analysis to determine which independent variables
are strongly correlated with house prices. Identify at least three
independent variables that exhibit a significant correlation with house
prices.
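The correlation step could look roughly like the sketch below. It uses synthetic data (the coefficients and variable names are assumptions for illustration, not properties of the real dataset), computes the Pearson correlation of each candidate predictor with price, and converts it to a t-statistic via t = r * sqrt((n - 2) / (1 - r^2)).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300

# Synthetic, hypothetical predictors; distance_to_work is built to be unrelated to price.
sqft = rng.normal(1800, 400, n)
school = rng.uniform(1, 10, n)
crime = rng.uniform(1, 10, n)
distance = rng.uniform(1, 30, n)
price = 40_000 + 110 * sqft + 9_000 * school - 7_000 * crime + rng.normal(0, 25_000, n)

predictors = {
    "square_footage": sqft,
    "school_rating": school,
    "crime_rate": crime,
    "distance_to_work": distance,
}

# Pearson correlation of each predictor with house price.
corr = {name: float(np.corrcoef(x, price)[0, 1]) for name, x in predictors.items()}

# t-statistic for H0: correlation = 0, with n - 2 degrees of freedom.
t_stat = {name: r * np.sqrt((n - 2) / (1 - r ** 2)) for name, r in corr.items()}

# Rank by absolute correlation and keep the three strongest candidates.
ranked = sorted(corr, key=lambda name: abs(corr[name]), reverse=True)
top_three = ranked[:3]

for name in ranked:
    print(f"{name}: r={corr[name]:+.3f} t={t_stat[name]:+.2f}")
print("selected:", top_three)
```

For a sample of this size, an absolute t-statistic above roughly 1.96 corresponds to significance at the 5% level (two-sided, normal approximation).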
3. Regression Analysis:
• Build a multiple linear regression model with house price as the
dependent variable.
• Include the three independent variables identified in the correlation
analysis as predictors.
• Analyze the coefficients and significance of the independent variables in
predicting house prices.
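As an illustration of the regression step, the sketch below fits ordinary least squares with NumPy and derives classical standard errors and t-statistics by hand. The data is synthetic and the three predictors are assumed to be the ones selected in the correlation step; in practice a statistics package would report the same quantities.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
names = ["intercept", "square_footage", "school_rating", "crime_rate"]

# Synthetic, hypothetical data with known coefficients.
sqft = rng.normal(1800, 400, n)
school = rng.uniform(1, 10, n)
crime = rng.uniform(1, 10, n)
price = 40_000 + 110 * sqft + 9_000 * school - 7_000 * crime + rng.normal(0, 25_000, n)

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n), sqft, school, crime])

# OLS estimate of the coefficients.
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

# Residual variance and classical (homoskedastic) standard errors.
resid = price - X @ beta
dof = n - X.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

# t-statistics and R-squared.
t_stats = beta / se
r_squared = 1 - resid @ resid / np.sum((price - price.mean()) ** 2)

for name, b, s, t in zip(names, beta, se, t_stats):
    print(f"{name}: coef={b:.1f} se={s:.1f} t={t:.2f}")
print(f"R^2 = {r_squared:.3f}")
```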
4. Interpretation:
• Interpret the coefficients of the independent variables to understand their
impact on house prices.
• Provide insights into which factors contribute most significantly to housing
price variations.
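Raw coefficients are measured in different units (price per square foot versus price per rating point), so they cannot be compared directly. One common way to compare them, sketched below on synthetic data with hypothetical variable names, is to rescale each slope by the ratio of the predictor's standard deviation to the standard deviation of price.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
names = ["square_footage", "school_rating", "crime_rate"]

# Synthetic, hypothetical data.
sqft = rng.normal(1800, 400, n)
school = rng.uniform(1, 10, n)
crime = rng.uniform(1, 10, n)
price = 40_000 + 110 * sqft + 9_000 * school - 7_000 * crime + rng.normal(0, 25_000, n)

X = np.column_stack([sqft, school, crime])
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, price, rcond=None)
slopes = beta[1:]  # drop the intercept

# Standardized coefficient: change in price (in standard deviations)
# per one-standard-deviation change in the predictor.
std_beta = slopes * X.std(axis=0, ddof=1) / price.std(ddof=1)

for name, b, sb in zip(names, slopes, std_beta):
    print(f"{name}: raw={b:.1f} per unit, standardized={sb:+.3f}")

most_influential = names[int(np.argmax(np.abs(std_beta)))]
print("largest standardized effect:", most_influential)
```

The raw slope answers "how much does price change per extra unit", while the standardized slope gives a rough ranking of which factor accounts for the most price variation in this sample.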
5. Discussion on Omitted Variable Bias (OVB):
• Explore the concept of omitted variable bias (OVB) in the context of the
dataset. What happens if the garage size variable is omitted from the
analysis, given its strong correlation with garage presence?
• Discuss how an omitted variable could affect the coefficients and
inferences drawn from the model.
• Generate an artificial instrumental variable that is highly correlated with
the omitted variable, and use it as an instrument to address the omitted
variable bias.
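The sketch below illustrates the OVB mechanics on synthetic data (all coefficients and names are assumptions). It compares the regression with and without garage size and checks the in-sample identity: short coefficients = long coefficients + (coefficient on the omitted variable) x (coefficients from regressing the omitted variable on the included ones). It then builds an artificial variable that tracks garage size and adds it to the short model. Note that in textbook instrumental-variable estimation an instrument must be uncorrelated with the omitted factor, so a variable constructed to correlate with the omitted variable acts as a proxy regressor here, not as a two-stage least squares instrument; that is one possible reading of the task, not the only one.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y regressed on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

rng = np.random.default_rng(5)
n = 1000

# Synthetic, hypothetical data: garage_size is 0 without a garage, else 1 to 3 cars.
sqft = rng.normal(1800, 400, n)
garage = (rng.random(n) < 0.6).astype(float)
garage_size = garage * rng.integers(1, 4, n)
price = (50_000 + 100 * sqft + 5_000 * garage + 10_000 * garage_size
         + rng.normal(0, 10_000, n))

X_long = np.column_stack([np.ones(n), sqft, garage, garage_size])
X_short = X_long[:, :3]  # garage_size omitted

b_long = ols(X_long, price)
b_short = ols(X_short, price)

# OVB identity: regress the omitted variable on the included ones.
gamma = ols(X_short, garage_size)
implied_short = b_long[:3] + b_long[3] * gamma

# Artificial variable highly correlated with the omitted one (noisy copy).
z = garage_size + rng.normal(0, 0.3, n)
b_proxy = ols(np.column_stack([X_short, z]), price)

print(f"garage coef, full model:      {b_long[2]:.0f}")
print(f"garage coef, size omitted:    {b_short[2]:.0f}")
print(f"garage coef, implied by OVB:  {implied_short[2]:.0f}")
print(f"garage coef, with artificial: {b_proxy[2]:.0f}")
```

With garage size omitted, the garage presence coefficient absorbs the value of the garage's capacity and is pushed upward. Adding the artificial variable removes most, but not all, of that bias, because the noise in the artificial variable leaves part of the garage size effect unexplained.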
6. Heteroskedasticity Problem:
• Neighborhood Quality and Crime Rate are two strongly correlated
variables.
• Discuss how this strong correlation affects statistical inference,
particularly when the sample is small.
• You may restrict yourself to a very small portion of the sample to illustrate
the heteroskedasticity problem.
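Strictly speaking, strong correlation between predictors is multicollinearity, while heteroskedasticity means the error variance is not constant. The sketch below touches both on synthetic data (names and magnitudes are assumptions): it computes the variance inflation factor for the two correlated predictors, fits the model on the full sample and on a 25-row subsample, and reports classical and White (HC1) heteroskedasticity-robust standard errors side by side.

```python
import numpy as np

def ols(X, y):
    """Return coefficients, classical SEs, and White (HC1) robust SEs."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    se_classical = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))
    # Sandwich estimator: (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n / (n - k).
    meat = X.T @ (X * resid[:, None] ** 2)
    cov_hc1 = XtX_inv @ meat @ XtX_inv * n / (n - k)
    return beta, se_classical, np.sqrt(np.diag(cov_hc1))

rng = np.random.default_rng(4)
n = 400

# Synthetic, hypothetical data: crime falls as neighborhood quality rises.
quality = rng.uniform(1, 10, n)
crime = 11 - quality + rng.normal(0, 0.7, n)
# Error spread grows with quality, so the errors are heteroskedastic.
noise = rng.normal(0, 1, n) * 4_000 * quality
price = 150_000 + 12_000 * quality - 6_000 * crime + noise

X = np.column_stack([np.ones(n), quality, crime])

# With two predictors, VIF = 1 / (1 - r^2), where r is their correlation.
r = np.corrcoef(quality, crime)[0, 1]
vif = 1 / (1 - r ** 2)

beta_full, se_full, se_rob_full = ols(X, price)

# Restrict to a very small portion of the sample.
idx = rng.choice(n, size=25, replace=False)
beta_small, se_small, se_rob_small = ols(X[idx], price[idx])

print(f"corr(quality, crime) = {r:.3f}, VIF = {vif:.1f}")
print("full sample  SE (classical):", np.round(se_full[1:], 0))
print("full sample  SE (robust):   ", np.round(se_rob_full[1:], 0))
print("small sample SE (classical):", np.round(se_small[1:], 0))
print("small sample SE (robust):   ", np.round(se_rob_small[1:], 0))
```

A high VIF inflates the standard errors of both coefficients, and the small subsample inflates them further, so individual coefficients can look insignificant even when the predictors jointly matter. Where classical and robust standard errors diverge, inference based on the classical ones is unreliable.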
7. Inclusion of Polynomial Terms:
• Include squared terms, cubic terms, and interactions, and discuss how their
inclusion affects the results.
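One way to approach this task is sketched below, on synthetic data in which price truly depends on a squared term and an interaction (an assumption made for illustration). Predictors are standardized before taking powers, which keeps the cubes on a manageable scale, and the linear and polynomial fits are compared through R-squared and adjusted R-squared.

```python
import numpy as np

def fit(X, y):
    """OLS fit returning coefficients, R-squared, and adjusted R-squared."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape  # k counts the intercept column
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k)
    return beta, r2, adj_r2

rng = np.random.default_rng(6)
n = 400

# Synthetic, hypothetical predictors, standardized to mean 0 and std 1.
sqft = rng.normal(1800, 400, n)
school = rng.uniform(1, 10, n)
s = (sqft - sqft.mean()) / sqft.std(ddof=1)
q = (school - school.mean()) / school.std(ddof=1)

# Assumed true relationship: diminishing returns to size plus an interaction.
price = (300_000 + 40_000 * s + 15_000 * q - 8_000 * s ** 2 + 10_000 * s * q
         + rng.normal(0, 20_000, n))

ones = np.ones(n)
X_lin = np.column_stack([ones, s, q])
X_poly = np.column_stack([ones, s, q, s ** 2, q ** 2, s ** 3, q ** 3, s * q])

beta_lin, r2_lin, adj_lin = fit(X_lin, price)
beta_poly, r2_poly, adj_poly = fit(X_poly, price)

print(f"linear:     R^2={r2_lin:.3f} adjusted={adj_lin:.3f}")
print(f"polynomial: R^2={r2_poly:.3f} adjusted={adj_poly:.3f}")
print(f"coef on s^2: {beta_poly[3]:.0f}, coef on s*q: {beta_poly[7]:.0f}")
```

R-squared can never fall when terms are added, so adjusted R-squared (or out-of-sample error) is the more informative comparison. Once a squared or interaction term is present, the effect of a variable is no longer a single number: it depends on the level of that variable and of the variables it interacts with.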
8. Conclusions and Recommendations:
• Summarize the findings of your analysis, including key factors influencing
housing prices.
• Provide recommendations to the real estate agency based on your
analysis.

Presentation:

At the end of the project, you will be required to present your findings and insights to
the real estate agency. You will explain the methodology, results, and recommendations
based on your analysis.

This project provides a practical opportunity to understand the complexities of multiple
linear regression, identify factors influencing housing prices, and recognize the potential
impact of omitted variable bias.
