STAT2110 Instructions For Practical Work 2021
WRITTEN ASSIGNMENT
This assignment is done in groups of 1-3 students; you may choose your own group.
The assignment report is submitted via Moodle as a PDF file. In a group of several students,
each student submits the same report separately.
For students who follow Mode 1 to complete this course, the deadline for submitting this
assignment is Sunday 16 January 2022.
For students who follow Mode 2 to complete this course, the deadline for submitting this
assignment is the exam date.
Feedback is given to each student in Moodle. The report will either be accepted or be
returned for corrections. A corrected report should be submitted within 3 weeks of the date
on which the report was returned. The report may be corrected and resubmitted at most twice.
The writing guidelines of the University of Vaasa should be followed when writing the report.
However, the cover page of the report should contain the following information: the name of the
written assignment, and the name(s) and student number(s) of the student(s).
The report can be written either in English or in Finnish.
Maximum number of pages for the report: 10 pages (cover page and possible appendices
not included).
1 Introduction
This is a brief introduction in which you describe the main features of your research:
- what the cases are
- what your two research problems/questions are
(for instance: do younger people have more loan defaults than older people
(= relationship between age and default);
is there a linear relationship between income and credit card debt?)
4 Summary
Finally, you summarize your research as a whole.
The dataset for the empirical work is empwork2021. You can download the empwork2021.xlsx file
from Moodle.
(You can also use a dataset of your own. In that case, you should still try to apply these
instructions. When submitting the report in Moodle, you should also submit your SAS dataset.)
Here is a short description of the dataset empwork2021:
The dataset is a hypothetical file of a bank’s youngish customers. It contains financial and
demographic information on 279 customers. The variables in the dataset are:
For your two research problems/questions you need to select at least 3 variables. You can also
create new variables on the basis of the existing ones by using variable transformations or
classifications. You formulate exactly two research problems on the basis of the selected variables.
The research problems should concern statistical dependencies.
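As a rough illustration of what a classification and a variable transformation can look like, here is a minimal sketch in Python (the course tooling may differ; Python is used here only for illustration). The values, the 30-year cut-off, and the names classify_age, card_debt and debt_ratio are all invented for the example and are not taken from empwork2021.

```python
# Hypothetical example: all values, the cut-off and the variable names
# are invented for illustration and are not taken from empwork2021.

def classify_age(age, cutoff=30):
    """Classification: turn a numeric age into a two-level age group."""
    return "younger" if age < cutoff else "older"

ages = [22, 27, 31, 35, 29, 40]
income = [25.0, 32.0, 41.0, 28.0, 55.0, 37.0]
card_debt = [5.0, 8.0, 10.25, 1.4, 11.0, 3.7]

# New categorical variable created by classification
age_groups = [classify_age(a) for a in ages]

# New quantitative variable created by transformation (debt relative to income)
debt_ratio = [d / i for d, i in zip(card_debt, income)]

print(age_groups)
print([round(r, 3) for r in debt_ratio])
```

A classified variable such as the age group can then be used in a research problem about dependency between two categorical variables, while a ratio stays quantitative.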
You start the statistical analysis of the dataset by describing and/or illustrating the distributions
of the selected variables. You can do this by creating suitable statistical graphs and/or by
calculating basic descriptive statistics that are appropriate for describing either the location and
dispersion or the frequency distribution of a single variable.
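The distinction between location and dispersion on the one hand and a frequency distribution on the other can be sketched as follows. This is only an illustrative Python example using the standard library; the values and the variable names income and default are invented and are not taken from empwork2021.

```python
import statistics
from collections import Counter

# Invented illustrative values; not taken from empwork2021.
income = [25, 32, 41, 28, 55, 37]                   # quantitative variable
default = ["no", "yes", "no", "no", "yes", "no"]    # categorical variable

# Location and dispersion of a quantitative variable
mean_income = statistics.mean(income)
median_income = statistics.median(income)
sd_income = statistics.stdev(income)   # sample standard deviation (divisor n - 1)

# Frequency distribution of a categorical variable
freq = Counter(default)
rel_freq = {level: count / len(default) for level, count in freq.items()}

print(f"mean = {mean_income:.2f}, median = {median_income}, sd = {sd_income:.2f}")
print(dict(freq), rel_freq)
```

Means and standard deviations suit quantitative variables, whereas counts and proportions are the natural summary for categorical ones.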
Next, examine the possible (pairwise) relationships between the variables you
have chosen. (You can also examine multivariate relationships.) Apply just one appropriate analysis
method and statistical test per relationship. Altogether you must perform at least two statistical
tests of dependency.
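As one possible illustration of a test of dependency, the sketch below computes a Pearson chi-square test of independence for a 2x2 cross-tabulation of age group by default, echoing the example question in the Introduction. It is a Python sketch using only the standard library; the counts are invented and are not taken from empwork2021, and whether this test is appropriate for your variables depends on the assumptions described in the lecture notes.

```python
import math

# Invented counts for illustration only; not taken from empwork2021.
# Rows: age group (younger, older); columns: default (yes, no).
observed = [[30, 70],
            [15, 85]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts under independence: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic (no continuity correction)
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# A 2x2 table has 1 degree of freedom, and the upper-tail probability of a
# chi-square variable with 1 degree of freedom equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, p = {p_value:.4f}")
```

The closed-form p-value used here holds only for 1 degree of freedom; larger tables need the general chi-square distribution, which statistical software provides.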
The following table gives some suggestions for which statistical analysis method and test you might
use to examine, for instance, pairwise relationships. Detailed information on these methods (for
instance, the assumptions of the tests) can be found in the lecture notes.