[TRANS_ MMW] Data Management Pt. 1
WEEK 1: DATA MANAGEMENT
● Continuous Variables
○ Continuous data has an infinite number of possible values that can be selected within a given range.
○ This type of data can’t be counted but it can be measured, e.g. temperature range.
D LEVELS OF MEASUREMENT
NOMINAL
● Values in the variable are used to label or classify. Nominal data has no order.
● Words, letters and alphanumeric symbols can be used.
● e.g. school type (public, private), religious affiliation (Catholic, Christian, Protestant, Muslim), or a survey in which 1 is used to represent male and 2 is used to represent female.
INTERVAL
● Interval values tell the distances between the measurements in addition to the classification and ordering.
● Interval data do not have a true zero point.
● e.g. temperature: 0ºC does not mean the absence of temperature.
RATIO
● Is the most informative level of measurement. It combines the first three levels of measurement.
● Ratio values also order units that have the same difference.
● Ratio values are the same as interval values, with the difference that they have an absolute zero, e.g. height, weight, length, etc.
B2 STAGES IN STATISTICAL INVESTIGATION
E SAMPLING METHODS
SAMPLING METHODS
● Sampling is a way of selecting individual members or a subset of the population to make statistical inferences from them and estimate characteristics of the whole population.
● The population is the entire group that you want to draw conclusions about.
● The sample is the specific group of individuals that you will collect data from.
E1 TYPES OF SAMPLING METHODS
PROBABILITY SAMPLING
● Means that every member of the population has a chance of being selected.
● It is mainly used in quantitative research.
E2 PROBABILITY SAMPLING TECHNIQUES
SIMPLE RANDOM SAMPLE
● Every member of the population has an equal chance of being selected.
● Your sampling frame should include the whole population.
● Two ways of simple random sampling: the lottery or fishbowl technique and the table of random numbers.
SYSTEMATIC SAMPLING
● Is similar to simple random sampling, but it is usually slightly easier to conduct.
● Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.
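The two probability sampling techniques above can be compared with a minimal Python sketch. The function names and the numbered population of 100 are hypothetical, chosen only for illustration. The random starting point in the systematic version is a common convention and an assumption here, since the notes only say that individuals are chosen at regular intervals.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every member has an equal chance of selection.

    This plays the role of the lottery (fishbowl) technique.
    """
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Pick a starting member, then take every k-th member after it.

    Assumption: the interval is k = N // n and the start is chosen at
    random among the first k members.
    """
    k = len(population) // n
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


# Hypothetical population: members numbered 1 to 100.
population = list(range(1, 101))
print(simple_random_sample(population, 10, seed=1))
print(systematic_sample(population, 10, seed=1))
```

With a population of 100 and a sample of 10, the systematic sample always steps through the list 10 members at a time, while the simple random sample has no fixed spacing.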
E3 NON-PROBABILITY SAMPLING TECHNIQUES
CONVENIENCE SAMPLING
● Simply includes the individuals who happen to be most accessible to the researcher.
INTERVIEWS
● The researcher asks questions of a large sample of people, either through direct interviews or by means of mass communication such as phone or mail.
● This method is by far the most common means of data gathering.
QUESTIONNAIRES
● Are a simple, straightforward data collection method.
● Respondents get a series of questions, either open-ended or closed-ended, related to the matter at hand.
H PRESENTATION OF DATA
TEXTUAL PRESENTATION
● In this method, collected data are presented in narrative and paragraph form. This mode of presentation combines text and figures in a statistical report.
● Examples:
○ 65% of email users worldwide access their email via a mobile device. Emails that are optimised for mobile generate 15% higher click-through rates.
○ 56% of brands using emojis in their email subject lines had a higher open rate.
TABULAR PRESENTATION
● This mode of presentation is better than textual form. The data are systematically presented through tables consisting of vertical columns and horizontal rows with headings, for an easier and more comprehensible comparison of figures.
● Examples:
○ Excel or Google Sheets
GRAPHICAL PRESENTATION
● Data gathered are presented in visual or pictorial form. This enables the researcher to get a clear view of the relationships in the data through pictures and colored maps.
● Examples:
○ line graph, bar graph, circle graph or pie chart, pictograph or pictogram, etc.
H1 TYPES OF GRAPHICAL PRESENTATION
LINE GRAPH
● It is the most widely used practical device, effective in showing a trend over a period.
BAR GRAPH
● It is the simplest form of graphic presentation. It is generally intended for comparison of simple magnitudes.
● It may be either a horizontal bar graph or a vertical bar graph.
CIRCLE GRAPH OR PIE CHART
● It is a circle divided into parts whose sizes are proportional to the magnitudes or percentages they represent.
● It is used to show component parts of a whole.
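For the circle graph or pie chart, "proportional" means each category's share of the total sets both its percentage and its sector angle out of 360°. The following is a minimal sketch. The function name `pie_sectors` and the school-type counts are hypothetical and not taken from the notes.

```python
def pie_sectors(counts):
    """Return {label: (percent, angle_in_degrees)} for each category.

    Each sector is proportional to the category's share of the total.
    """
    total = sum(counts.values())
    return {
        label: (value * 100 / total, value * 360 / total)
        for label, value in counts.items()
    }


# Hypothetical data: 30 respondents from public schools, 10 from private.
sectors = pie_sectors({"public": 30, "private": 10})
for label, (percent, angle) in sectors.items():
    print(f"{label}: {percent}% of the circle, {angle} degrees")
```

The percentages add up to 100 and the angles add up to 360, which is a quick check that the parts make up the whole.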
PICTOGRAPH OR PICTOGRAM
● A pictograph uses pictorial symbols to indicate data, for example population figures.
CLASS MARK
● Also known as class midpoint. It is the average of the lower and upper limits or boundaries of each class.
● Class mark may be represented by the letter x.
CLASS INTERVAL
● The range of values used in defining a class. It is simply the length of each class.
● It is the difference or distance between the upper and lower class boundaries of each class, and is affected by the nature of the data and by the number of classes.
CLASS SIZE
● The width of each class interval.
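The class mark and class size definitions can be checked with a short Python sketch. The class 10 to 19 is a hypothetical example. The rule that boundaries lie half a unit beyond the class limits is a common convention for whole-number data and an assumption here, since the notes do not state how boundaries are obtained.

```python
def class_mark(lower, upper):
    """Class mark (midpoint) x: the average of the lower and upper limits."""
    return (lower + upper) / 2


def class_boundaries(lower_limit, upper_limit, unit=1):
    """Class boundaries, assumed to lie half a unit beyond each limit."""
    half = unit / 2
    return lower_limit - half, upper_limit + half


def class_size(lower_boundary, upper_boundary):
    """Class size: the distance between the upper and lower boundaries."""
    return upper_boundary - lower_boundary


# Hypothetical class with limits 10 and 19 (whole-number data).
low, high = class_boundaries(10, 19)
print(class_mark(10, 19))     # 14.5
print((low, high))            # (9.5, 19.5)
print(class_size(low, high))  # 10.0
```

Averaging the boundaries instead of the limits gives the same class mark, which is why the definition allows either.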
I2 STEPS IN CONSTRUCTING A
FREQUENCY DISTRIBUTION TABLE
FREQUENCY POLYGON
● A closed broken-line curve constructed by plotting the class marks on the horizontal or x-axis against the class frequencies, which are plotted on the vertical or y-axis.
I3 DERIVED DISTRIBUTION
RELATIVE FREQUENCY DISTRIBUTION
● Denoted by %RF, it is derived by getting the ratio of the number of items in each class to the total frequency.
● The relative frequency distribution may
be expressed in percent. Its total sum
must be equal to 100%.
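The relative frequency computation can be sketched in a few lines of Python. The function name and the class frequencies 5, 8 and 7 are hypothetical, used only to show the arithmetic.

```python
def relative_frequencies(frequencies):
    """Return %RF for each class: class frequency / total frequency * 100."""
    total = sum(frequencies)
    return [f * 100 / total for f in frequencies]


# Hypothetical class frequencies for three classes (total frequency = 20).
rf = relative_frequencies([5, 8, 7])
print(rf)       # [25.0, 40.0, 35.0]
print(sum(rf))  # 100.0
```

If the percentages do not add up to 100% (apart from small rounding differences), a class frequency or the total has been miscounted.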