
Predictive Tool: Rental Bike Hiring Services

The document discusses a study that aimed to predict daily demand patterns for publicly provided bike hire services in a major city. Descriptive statistics and tables are provided about bike rental data from 2012, including variables like season, month, weather, and casual/registered rentals. A multiple regression analysis was conducted as the predictive tool to identify relationships between independent variables (like season, weather) and the dependent variable of daily bike rental demand. The analysis sought to help determine when and why demand for bike hire services increases.

Uploaded by

Maham Shahid

2021

Predictive Tool

RENTAL BIKE HIRING SERVICES


MAJOR CITY
Introduction

Publicly provided bike hire services in the major city are used more with every passing day. People travelling around the city frequently hire bikes, so demand for the service keeps rising. It is therefore important to find out when, and under what circumstances, demand for bike hire increases. This can be done by predicting the likely pattern of daily demand for a publicly provided bike hire service in a major city. The main purpose of the study is to identify that pattern with a predictive tool. The researcher took the following steps to achieve the research objectives:

• To summarize the data using both graphics and descriptive statistics.

• To propose a predictive tool.

• To reflect on the requirements for a data analytics project to be effective.

Key Features of Data

The data cover the year 2012. The count is the sample size: 537 observations were collected for the study. Two seasons, spring and summer, were selected for data collection, and the months analyzed run from March to August. The researcher chose working days, so no holiday variable was used; Sundays were excluded and the remaining six days of the week were included. For a symmetric distribution the mean and median should lie close together. The tables below show that they do for most variables, which suggests roughly symmetric distributions. The exceptions are casual rentals (mean 761.82, median 642) and registered rentals (mean 3176.94, median 3300), where the two differ noticeably.

Kurtosis is negative for season, month, weekday, working day, weather, temp, atemp, humidity and registered, which indicates a flatter peak and thinner tails than the normal distribution. Kurtosis is positive for holiday, wind speed and casual, which indicates a sharper peak and heavier tails than the normal distribution. The peak is extreme for the holiday variable, with a kurtosis of 31.13. This value is far outside the usual range, and such a distribution is known as leptokurtic. Skewness for holiday is also high at 5.75, the value furthest from zero.

Positive skewness exists for season, month, weekday, weather, wind speed, casual and registered. Most of these values are close to zero, so the data are only slightly right-skewed. Negative skewness exists for humidity, temp and atemp. These values are also close to zero, so the left skew is negligible. To conclude, skewness is negligible for most variables, with holiday (5.75) and casual (1.48) as the clear exceptions. Standard deviations are small for all variables except casual (666.85) and registered (1299.44), whose daily counts vary widely.

Table 1 Descriptive Analysis

Statistic           Season     Month      Holiday    Weekday    Working Day
Mean                2.189944   5.510242   0.027933   2.994413   0.685289
Standard Error      0.045681   0.144141   0.007117   0.086447   0.020059
Median              2          5          0          3          1
Mode                2          1          0          6          1
Standard Deviation  1.05857    3.340221   0.164934   2.003254   0.464834
Sample Variance     1.12057    11.15708   0.027203   4.013028   0.21607
Kurtosis            -1.00816   -0.92955   31.12897   -1.25076   -1.36477
Skewness            0.457921   0.439914   5.745699   0.006301   -0.80021
Range               3          11         1          6          1
Minimum             1          1          0          0          0
Maximum             4          12         1          6          1
Sum                 1176       2959       15         1608       368
Count               537        537        537        537        537

Table 1 Descriptive Analysis (continued)

Statistic           Weather    Temp       Atemp      Humidity
Mean                1.400372   0.475028   0.457009   0.626297892
Standard Error      0.023791   0.007739   0.006926   0.006490194
Median              1          0.4675     0.461475   0.624583
Mode                1          0.265833   0.243058   0.605
Standard Deviation  0.551321   0.179334   0.160488   0.150398956
Sample Variance     0.303955   0.032161   0.025756   0.022619846
Kurtosis            -0.08813   -1.03922   -0.90994   -0.159728303
Skewness            0.970535   -0.01084   -0.09584   -0.107998409
Range               2          0.790037   0.761826   0.9725
Minimum             1          0.05913    0.07907    0
Maximum             3          0.849167   0.840896   0.9725
Sum                 752        255.09     245.4136   336.321968
Count               537        537        537        537

Table 1 Descriptive Analysis (continued)

Statistic           Wind Speed  Casual     Registered
Mean                0.196079    761.8194   3176.944
Standard Error      0.003348    28.77681   56.07503
Median              0.188839    642        3300
Mode                0.136817    120        1707
Standard Deviation  0.077576    666.8526   1299.441
Sample Variance     0.006018    444692.4   1688548
Kurtosis            0.450778    2.104919   -0.42969
Skewness            0.616556    1.4794     0.088805
Range               0.485071    3401       6040
Minimum             0.022392    9          416
Maximum             0.507463    3410       6456
Sum                 105.2943    409097     1706019
Count               537         537        537
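
The statistics in Table 1 could be reproduced along the following lines. This is a minimal sketch, not the author's procedure: it assumes the table was produced with the bias-corrected sample skewness and excess kurtosis formulas that spreadsheet tools commonly use, and the `describe` function and the `sample` values are hypothetical illustrations rather than the study data.

```python
import math


def describe(values):
    """Summary statistics for a list of numbers (needs at least 4 distinct-ish values)."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance divides by n - 1, matching the "Sample Variance" row.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)

    ordered = sorted(values)
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    # Sums of standardized deviations raised to the 3rd and 4th power.
    z3 = sum(((x - mean) / sd) ** 3 for x in values)
    z4 = sum(((x - mean) / sd) ** 4 for x in values)

    # Assumed formulas: bias-corrected sample skewness and excess kurtosis.
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    return {
        "mean": mean,
        "standard_error": sd / math.sqrt(n),
        "median": median,
        "standard_deviation": sd,
        "sample_variance": variance,
        "kurtosis": kurtosis,
        "skewness": skewness,
        "range": ordered[-1] - ordered[0],
        "count": n,
    }


# Hypothetical values for illustration only (not taken from the study).
sample = [3, 4, 5, 2, 3, 4, 5, 6, 4, 7]
for name, value in describe(sample).items():
    print(f"{name:20s}{value:.6f}")
```

Because the kurtosis here is excess kurtosis, a normal distribution scores about zero, which is why negative values in Table 1 are read as flatter than normal and positive values as more peaked.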

Predictive Tool

The predictive tool used for the study is regression analysis. Regression analysis identifies whether a relationship exists between independent variables and a dependent variable. There are two types of linear regression: simple and multiple. Simple linear regression uses one independent variable, whereas multiple regression uses several. Since the study has more than one independent variable, it uses multiple regression, which predicts the value of the criterion (the dependent variable) from a linear combination of the predictors. The regression equation is as follows:


Y = a + b1X1 + b2X2 + b3X3 + …

Where

Y is the dependent variable

a is the constant (intercept)

b1, b2, b3, … are the unstandardized B coefficients, one for each independent variable

X1, X2, X3, … are the independent variables.

The dependent variable for the study is total rental bikes. The independent variables are season, month, weekday, weather, temp, atemp, humidity and wind speed.
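
To illustrate how the constant and the B coefficients of a multiple regression are estimated, here is a minimal ordinary least squares sketch that solves the normal equations with plain Python. It is not the tool used in the study; the `fit_ols` function, the two predictors and the made-up data are all hypothetical.

```python
def fit_ols(rows, y):
    """Return [a, b1, ..., bk] that minimize the sum of squared residuals."""
    # Prepend a column of ones so the first coefficient is the constant a.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])

    # Build the augmented normal equations [X'X | X'y].
    aug = []
    for i in range(p):
        row = [sum(x[i] * x[j] for x in X) for j in range(p)]
        row.append(sum(x[i] * yi for x, yi in zip(X, y)))
        aug.append(row)

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(p):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]

    return [aug[i][p] for i in range(p)]


# Hypothetical data: (season, humidity) pairs and rentals generated
# exactly from Y = 100 + 50 * season + 200 * humidity.
rows = [(1, 0.30), (2, 0.55), (3, 0.20), (1, 0.70), (2, 0.40), (3, 0.65)]
y = [100 + 50 * season + 200 * humidity for season, humidity in rows]

coefficients = fit_ols(rows, y)
print([round(c, 6) for c in coefficients])
```

Since the made-up response is an exact linear function of the predictors, the fit recovers the generating constant and coefficients; real data would leave residuals, which is what the ANOVA table quantifies.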

Regression Statistics

Multiple R           0.818361
R Square             0.669714
Adjusted R Square    0.620418
Standard Error       707.6565
Observations         78

ANOVA

            df    SS          MS          F              Significance F
Regression  10    68032950    6803295     13.58545692    1.14E-12
Residual    67    33552112    500777.8
Total       77    1.02E+08

            Coefficients  Standard Error  t Stat     P-value      Lower 95%      Upper 95%      Lower 95.0%    Upper 95.0%
Intercept   3475.058      873.1227        3.980034   0.000171936  1732.297911    5217.8183      1732.297911    5217.8183
Season      722.556       268.2088        2.694005   0.008        187.2089494    1257.902962    187.2089494    1257.902962
Month       -63.2895      123.2381        -0.51355   0.609        -309.2737931   182.6947413    -309.2737931   182.6947413
Weekday     147.1247      59.22326        2.484238   0.014        28.91455638    265.3347964    28.91455638    265.3347964
Weather     -735.057      215.9008        -3.40461   0.001        -1165.996732   -304.1173141   -1165.996732   -304.1173141
Temp        -9786.28      8304.697        -1.1784    0.242        -26362.5289    6789.96198     -26362.5289    6789.96198
Atemp       17857.29      9152.738        1.951033   0.05         -411.6506167   36126.23438    -411.6506167   36126.23438
Humidity    -2270.33      833.4156        -2.72412   0.008        -3933.8295     -606.8206421   -3933.8295     -606.8206421
Wind Speed  -2922.79      1100.95         -2.65479   0.009        -5120.293392   -725.2844286   -5120.293392   -725.2844286
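
The regression statistics follow from the ANOVA sums of squares by simple arithmetic. The sketch below uses only the figures reported in the output and should reproduce the reported values up to rounding; the variable names are illustrative.

```python
import math

# Figures copied from the ANOVA table.
ss_regression = 68032950.0
ss_residual = 33552112.0
df_regression = 10
df_residual = 67
n_obs = 78

ss_total = ss_regression + ss_residual

# R square: share of total variation explained by the regression.
r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

# Adjusted R square penalizes the number of estimated terms.
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_residual

# Mean squares, F statistic, and the standard error of the regression.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual
std_error = math.sqrt(ms_residual)

print(f"Multiple R        {multiple_r:.6f}")
print(f"R Square          {r_squared:.6f}")
print(f"Adjusted R Square {adj_r_squared:.6f}")
print(f"Standard Error    {std_error:.4f}")
print(f"F                 {f_stat:.5f}")
```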

The output above comes from the regression analysis performed to test whether a significant relation exists between total rental bikes and the predictors: season, month, weekday, weather, temp, atemp, humidity and wind speed. The share of the variance in total rental bikes explained by these predictors is read from R square, also known as the coefficient of determination, which ranges from 0 to 1. An R square from 0 to 0.5 is considered a weak model, above 0.5 to 0.69 an average model, and 0.7 to 0.99 a strong model; a value of 1 means the model fits perfectly. R square for this model is 0.67, meaning the predictors explain about 67% of the variance in total rental bikes. By these thresholds the model sits at the top of the average range, close to the 0.7 cut-off for a strong model. The multiple R value of 0.818 is the linear correlation between the observed and predicted totals, and shows an overall positive correlation between total rental bikes and the predictors. The F value in the ANOVA table (13.59, Significance F = 1.14E-12) is significant at α = 0.05. The coefficients for season, weekday and atemp are positive: holding the other predictors constant, a one-unit increase in any of them raises predicted rentals by the size of its coefficient (for example, about 722.56 bikes per unit of season). The coefficients for month, weather, temp, humidity and wind speed are negative: a one-unit increase in any of them lowers predicted rentals by the size of its coefficient, and vice versa. The results show the following key findings:

1. Rental bike hiring demand increases with:

1.1. season, i.e. spring and summer;

1.2. weekday, i.e. working days having no holidays;

1.3. atemp.

2. Rental bike hiring demand decreases with an increase in:

2.1. month, i.e. other than from March to June;

2.2. weather, i.e. cloudy, mist, etc.;

2.3. temp, i.e. for more than normal temp;

2.4. humidity, i.e. for the max value of 100;

2.5. wind speed, i.e. for the max value of 67.
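
The sign and significance of each coefficient can be checked directly from the coefficient table. The sketch below recomputes the t statistic and the 95% interval from each coefficient and its standard error. The critical value 1.996 (two-sided 5% level, 67 residual degrees of freedom) is an assumption that matches the intervals in the table to rounding, and the `summarize` helper is hypothetical.

```python
# Assumed two-sided 5% critical value of Student's t with 67 degrees of freedom.
T_CRIT = 1.996

# (coefficient, standard error) pairs copied from the regression output.
coefficient_table = {
    "Season": (722.556, 268.2088),
    "Month": (-63.2895, 123.2381),
    "Weekday": (147.1247, 59.22326),
    "Weather": (-735.057, 215.9008),
    "Temp": (-9786.28, 8304.697),
    "Atemp": (17857.29, 9152.738),
    "Humidity": (-2270.33, 833.4156),
    "Wind Speed": (-2922.79, 1100.95),
}


def summarize(coef, se, t_crit=T_CRIT):
    """Return (t statistic, lower bound, upper bound, interval excludes zero)."""
    t_stat = coef / se
    lower = coef - t_crit * se
    upper = coef + t_crit * se
    excludes_zero = lower > 0 or upper < 0
    return t_stat, lower, upper, excludes_zero


for name, (coef, se) in coefficient_table.items():
    t_stat, lower, upper, excludes_zero = summarize(coef, se)
    direction = "increases" if coef > 0 else "decreases"
    flag = "interval excludes zero" if excludes_zero else "interval includes zero"
    print(f"{name:11s} demand {direction}; t = {t_stat:8.4f}; "
          f"95% CI [{lower:12.2f}, {upper:12.2f}]; {flag}")
```

Under this check the intervals for month, temp and atemp include zero, which is consistent with the negative lower bound reported for atemp in the table.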


Furthermore, a significant relation exists at α = 0.05 between total rental bikes and season, weekday, weather, atemp, humidity and wind speed. Atemp sits exactly at the threshold (reported p-value 0.05), and the lower bound of its 95% interval is slightly negative (-411.65). Month (p = 0.609) and temp (p = 0.242) have no significant relation with rental bike hiring, although they are correlated with it. Therefore, the predictive equation for the demand for bike hiring services is as below:

Y = a + b1X1 + b2X2 + b3X3 + …

Total number of rental bikes = constant + b1 (season) + b2 (weekday) + b3 (weather) + b4 (atemp) + b5 (humidity) + b6 (wind speed)

Total number of rental bikes = 3475.06 + 722.56 (season) + 147.12 (weekday) + (-735.06) (weather) + 17857.29 (atemp) + (-2270.33) (humidity) + (-2922.79) (wind speed)

Total number of rental bikes = 3475.06 + 722.56 (season) + 147.12 (weekday) - 735.06 (weather) + 17857.29 (atemp) - 2270.33 (humidity) - 2922.79 (wind speed)
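
As a sketch of how the final equation could be applied, the function below evaluates it for one day's conditions. The coefficients are copied from the regression output; the function name, the argument names and the example inputs are hypothetical, and the inputs are assumed to use the same coding and scaling as the variables in Table 1.

```python
INTERCEPT = 3475.058

# Coefficients of the predictors kept in the final equation.
COEFFICIENTS = {
    "season": 722.556,
    "weekday": 147.1247,
    "weather": -735.057,
    "atemp": 17857.29,
    "humidity": -2270.33,
    "wind_speed": -2922.79,
}


def predict_total_rentals(**predictors):
    """Evaluate constant + sum of (coefficient * predictor value)."""
    missing = set(COEFFICIENTS) - set(predictors)
    if missing:
        raise ValueError(f"missing predictors: {sorted(missing)}")
    return INTERCEPT + sum(
        COEFFICIENTS[name] * predictors[name] for name in COEFFICIENTS
    )


# Hypothetical day, with values inside the ranges shown in Table 1.
day = dict(season=2, weekday=3, weather=1,
           atemp=0.30, humidity=0.60, wind_speed=0.20)
print(round(predict_total_rentals(**day)))

# Effect of a worse weather category with everything else unchanged.
worse = dict(day, weather=2)
print(round(predict_total_rentals(**worse) - predict_total_rentals(**day)))
```

Note that these coefficients were estimated together with month, temp and any other terms in the fitted model, so predictions from the reduced equation should be treated as indicative rather than exact.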

Conclusion

The study derived a predictive equation through regression to show the likely pattern of daily demand for a publicly provided bike hire service in a major city. The equation is based on data collected for the year 2012. The variables that contribute significantly to demand for bike hiring are season, weekday, weather, atemp, humidity and wind speed. The predictive equation above will help service providers predict future demand. It is generalized to the city from which the data were collected. The prediction will also help service providers plan strategies around the variables that showed negative coefficients, which should support continued growth in demand for bike hiring services.
