Correlation_sample
How will you ensure that the data you collect on crime rates and poverty
levels in each country are reliable and comparable?
Introduction:
When I was a kid, I was a fan of crime movies. At the time I did not pay attention to which
country a film was set in. Later I realized that most of these movies were filmed in countries
where there is poverty. For that reason I wanted to research how poverty affects crime in
a country, and in this exploration I will investigate the relationship between crime and poverty.
Nowadays crime rates have increased a lot around the world, especially in well-developed
countries such as Canada, Switzerland, and Portugal. People assume that crime rates are
increasing because of poverty in the country. For that reason I wanted to find out if there really is
any connection between poverty and crime.
This investigation delves into the connection between crime and poverty across 50 countries,
aiming to uncover how these two factors relate to each other. Crime and poverty are big issues
affecting many people around the world. By looking at countries like Latvia, Pakistan, Canada,
and Finland, we can see how different places deal with crime and poverty. I want to understand
if there's a link between how poor a country is and how much crime happens there.
I am looking into this because understanding this link can help us make better
decisions about how to deal with crime and poverty. If I find a strong connection, it could mean
that reducing poverty might also help reduce crime in some places.
To do this, I will use data and numbers to see if there's a pattern. I will analyze information about
crime rates and poverty levels in each country. By comparing these factors, I hope to find out if
there's a relationship between poverty and crime.
My study is important because it could help governments and organizations create better
strategies for solving crime and poverty. By understanding the connection between the two, I
might be able to find more effective ways to make communities safer and improve people's lives
around the world.
Hypothesis:
My hypothesis is that there is a relationship between the rates of crime and poverty
in 50 different countries. In the 1993 science-fiction movie Demolition Man, a rebel named
Edgar said that being poor may result in an increase in criminal activity, so I expect that higher
levels of poverty are linked to higher crime rates. To test this hypothesis and find any noteworthy
trends or relationships, data on crime rates and poverty indicators will be analyzed statistically
using linear correlation.1
Plan:
1. Using random sampling to select 50 countries
2. Finding the crime rates of the 50 countries
3. Finding the poverty rates of the 50 countries
4. Entering the data into the TI-84 calculator and plotting a scatter graph
5. Finding the line of best fit
6. Finding the linear regression equation
7. Choosing countries that are not in my data
8. Checking how realistic the regression line is, 5 times
9. Finding the percentage error
10. Considering outliers
11. Investigating the effect of outliers on the linear regression
1 Louise Gaille, “How Poverty Influences Crime Rates,” Vittana.org, December 16, 2019,
https://vittana.org/how-poverty-influences-crime-rates.
Dependent and Independent variable
The dependent variable depends on other variables. Independent variables are not affected by any
other variables that the study measures.2 The independent variable is the cause; in my
investigation it is the poverty rate. The dependent variable is the effect, and its value depends on
changes in the independent variable; in my investigation it is the crime rate.3
Country        Poverty (x)    Crime (y)
Andorra        8              12.87
Kuwait         2              32.97
Somalia        73             65.20
Taiwan         9              16.71
Anguilla       23             20.10
Libya          40             60.42
Iran           48.6           49.50
Switzerland    16             25.26
Belarus        5              51.02
Scatter graph
A scatter plot uses dots to represent values for two different variables. The position of each point
on the horizontal and vertical axes shows the values for a data point. Scatter plots are used to
observe relationships between variables.4
The relationship shown in a scatter plot is usually described as weak or strong. The more spread
out the data points are, the weaker the relationship. If the points are clearly clustered, or closely
follow a curve or line, the relationship is described as strong.5
4 “Describing Scatter Plots,” Describing Scatter Plots - Introduction to Google Sheets and SQL,
https://runestone.academy/ns/books/published/ac1/scatter_plots_and_correlation/describing_scatter_plots.html#.
The formula for linear regression is y = mx + c, where m (the gradient) and c (the y-intercept)
are constant for all possible values of x and y.
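As a rough illustration of how m and c can be found, here is a minimal Python sketch of the least-squares line. It is only an assumption about how the calculation could be done by hand; my actual results came from the TI-84. The function name `least_squares_line` is made up for this example, and the data are only the nine countries listed in this document, not all 50, so the output will not match my regression line exactly.

```python
# Only the nine countries listed in this document; the full study uses 50.
poverty = [8, 2, 73, 9, 23, 40, 48.6, 16, 5]  # x values
crime = [12.87, 32.97, 65.20, 16.71, 20.10, 60.42, 49.50, 25.26, 51.02]  # y values


def least_squares_line(xs, ys):
    """Return (m, c) for the line y = m*x + c that minimises the squared error."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Gradient from the standard least-squares formula.
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # The line passes through the mean point (mean of x, mean of y).
    c = (sum_y - m * sum_x) / n
    return m, c


m, c = least_squares_line(poverty, crime)
print(f"y = {m:.2f}x + {c:.2f}")
```

The same sums (of x, y, xy and x squared) are reused in the correlation coefficient, which is why it is convenient to compute them once.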
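\[
r = \frac{n\left(\sum xy\right) - \left(\sum x\right)\left(\sum y\right)}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}
\]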
The formula above shows how the correlation coefficient r can be calculated without using a calculator.
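To show how this calculation works step by step, here is a minimal Python sketch of the standard sum form of Pearson's correlation coefficient. The function name `pearson_r` is hypothetical, and the data are only the nine countries listed in this document, so the value it prints is not the r value of my full 50-country investigation.

```python
import math

# Only the nine countries listed in this document; the full study uses 50.
poverty = [8, 2, 73, 9, 23, 40, 48.6, 16, 5]  # x values
crime = [12.87, 32.97, 65.20, 16.71, 20.10, 60.42, 49.50, 25.26, 51.02]  # y values


def pearson_r(xs, ys):
    """Pearson's correlation coefficient computed from the raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = pearson_r(poverty, crime)
print(f"r = {r:.3f}")
```

A value close to +1 would indicate a strong positive linear correlation, and a value close to 0 would indicate little or no linear correlation.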
y=0.59x + 29.5
R in a regression analysis is called the correlation coefficient, and it is defined as the correlation
or relationship between an independent and a dependent variable. It ranges from -1 to +1.10
For this result I used a line of regression, with poverty on the x-axis and crime rate on the y-axis.
Percentage errors
Percent error is the difference between an approximate or measured value and an exact or known
value, expressed as a percentage of the exact value. Percent errors indicate how big our errors are
when we measure something in an analysis process. Smaller percent errors indicate that we are
close to the accepted or original value.
E = Estimated value
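As an illustration, here is a minimal Python sketch of a percentage error calculation using the regression line y = 0.59x + 29.5. The function names `predicted_crime` and `percent_error` and the parameter name `actual` are made up for this example. I am assuming that the estimated value is the crime rate from the regression line and that the known value is the crime rate from the data, and I use Andorra (poverty 8, crime 12.87) only as an example row.

```python
def predicted_crime(poverty):
    """Crime rate predicted by the regression line y = 0.59x + 29.5."""
    return 0.59 * poverty + 29.5


def percent_error(estimated, actual):
    """Percent error of an estimate relative to the actual (known) value."""
    return abs(estimated - actual) / abs(actual) * 100


# Example row: Andorra, poverty 8 and crime 12.87 (from the data table).
estimate = predicted_crime(8)
error = percent_error(estimate, 12.87)
print(f"estimated crime rate = {estimate:.2f}, percent error = {error:.1f}%")
```

Taking the absolute value means the percentage error is always positive, whether the regression line overestimates or underestimates the crime rate.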
10 “The Meaning of R, R Square, Adjusted R Square, R Square Change and F Change in a Regression Analysis,” Analysis INN., March 13, 2020,
https://www.analysisinn.com/post/the-meaning-of-r-r-square-adjusted-r-square-r-square-change-and-f-change-in-a-regression-analysis/#:~:text=R%20in%20a%20regression%20analysis,independent%20and%20a%20dependent%20variable.
11
Country    Poverty (x)    Crime (y)    Crime rate from regression line    Percentage error
Effect of Outlier
An outlier is a data point that is very different from the rest of a group. Basically, it is up to the
person looking at the data to decide what counts as very different. Before we can point out which
data points are very different, we need to understand what the normal data points look like.14
Outlier formula:
Low outlier: below Q1 − 1.5 × IQR
Upper outlier: above Q3 + 1.5 × IQR
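The outlier formula can be sketched in Python as follows. This is only an illustration: the function names `outlier_fences` and `find_outliers` are made up, and the numbers in the example call are not taken from my data.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) boundaries from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """List the values that lie below the lower or above the upper boundary."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Illustrative numbers only: with Q1 = 10 and Q3 = 20 the IQR is 10,
# so the boundaries are 10 - 15 and 20 + 15.
print(outlier_fences(10, 20))
print(find_outliers([1, 50, 12], 10, 20))
```

Any value outside the two boundaries is treated as an outlier; values exactly on a boundary are not.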
11 “Percent Error - Definition, Formula, and Solved Examples,” BYJUS, January 6, 2020,
https://byjus.com/maths/percent-error/
12 “Random Country Generator - Test Where You Land and Learn about It.,” Random Country - Explore
the World, May 17, 2022, https://random.country/
13 “Crime,” Cost of Living, https://www.numbeo.com/crime/
14 7.1.6. what are outliers in the data?, https://www.itl.nist.gov/div898/handbook/prc/section1/prc16.htm
Calculations
To find Q1 and Q3, we need to know what they are. Q1 is the middle point of the lower
half of the data. I found it by taking the middle value of the data that is below the middle
of the whole set. Q3 is the middle point of the upper half of the data. I found it by taking
the middle value of the data that is above the middle of the whole set. To find the IQR,
I find the difference between Q3 and Q1.
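The method described above can be sketched in Python. This is a minimal sketch under one common convention, which I am assuming here: when the number of values is odd, the overall median is left out of both halves. The function names `median` and `quartiles` are made up for this example, and the input is only the nine poverty values listed in this document, so the results will differ from the quartiles of the full 50-country data.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # For an odd count, skip the overall median (assumed convention).
    upper = data[half:] if len(data) % 2 == 0 else data[half + 1:]
    return median(lower), median(upper)


# Only the nine poverty values listed in this document.
poverty = [8, 2, 73, 9, 23, 40, 48.6, 16, 5]
q1, q3 = quartiles(poverty)
iqr = q3 - q1
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr:.2f}")
```

Other software may use a slightly different quartile convention, so small differences from calculator values are possible.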
I am finding Q1, Q3, and the IQR using the poverty rate data.
Mean: 23.81
Q1: 12.35
Q3: 28.725
IQR: 16.375
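Using these values, the outlier boundaries can be worked out as below. This is a worked sketch that assumes the 1.5 × IQR rule is applied to the poverty rate values, as stated above.

```latex
\begin{align*}
\text{IQR} &= Q_3 - Q_1 = 28.725 - 12.35 = 16.375 \\
1.5 \times \text{IQR} &= 1.5 \times 16.375 = 24.5625 \\
\text{Lower boundary} &= Q_1 - 1.5 \times \text{IQR} = 12.35 - 24.5625 = -12.2125 \\
\text{Upper boundary} &= Q_3 + 1.5 \times \text{IQR} = 28.725 + 24.5625 = 53.2875
\end{align*}
```

Since a poverty rate cannot be negative, no country can fall below the lower boundary. A poverty rate above 53.2875, such as Somalia's value of 73, lies above the upper boundary.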
Country        Poverty (x)    Crime (y)
Andorra        8              12.87
Kuwait         2              32.97
Taiwan         9              16.71
Anguilla       23             20.10
Libya          40             60.42
Switzerland    16             25.26
Belarus        5              51.02
Evaluation
From the percentage error we can see that there is a strong correlation, but my percentage error was
higher than expected. While I was researching, I found multiple sources for each country. If I
had chosen different sources, my percentage error could have been lower.
A second issue could be that I chose only 50 countries; I needed to choose more than 50
countries to decrease the percentage error. From this investigation, I concluded that in future
investigations I need to choose more than 50 countries to find a better result. In future
investigations I will also use a newer type of TI-84 calculator to find a more reliable result. Every year
technology improves, and with new graphical calculators my results will be better.
Conclusion
It is important to check the level of correlation between poverty rate and crime rate. From the
scatter diagram and the correlation coefficient (r value) we can see that there is a strong positive
linear correlation. Pearson's correlation coefficient, calculated on the GDC, supports this result. The
statement in the Demolition Man movie was that being poor may result in an increase in criminal
activity, so it was expected that higher levels of poverty are linked to higher crime rates. My
investigation supports the statement made in the Demolition Man movie.
Bibliography
9. “Random Country Generator - Test Where You Land and Learn about It.,” Random
Country - Explore the World, May 17, 2022, https://random.country/ Accessed on
November 2023
10. “The Meaning of R, R Square, Adjusted R Square, R Square Change and F Change in a
Regression Analysis,” Analysis INN., March 13, 2020,
https://www.analysisinn.com/post/the-meaning-of-r-r-square-adjusted-r-square-r-square-change-and-f-change-in-a-regression-analysis/#:~:text=R%20in%20a%20regression%20analysis,independent%20and%20a%20dependent%20variable.
Accessed on November 2023