
Statistical Concepts and Market Returns

The document discusses various statistical concepts including descriptive and inferential statistics, populations and samples, measurement scales, measures of central tendency, and other statistical measures. It provides definitions and examples of nominal, ordinal, interval, and ratio measurement scales. It also defines and explains how to calculate parameters, sample statistics, frequency distributions, measures of central tendency including means, medians, and modes, as well as other statistical measures such as ranges, variations, standard deviations, quartiles, and percentiles.

Statistical Concepts And Market Returns

LOS -1) Distinguish between descriptive and inferential statistics, between a population and a
sample, and among the types of measurement scales.

The word statistics can refer either to numerical data or to the methods used to collect and analyze
data. Descriptive statistics summarize the important aspects of a large data set effectively, turning
data into useful information. Statistical inference allows us to make judgments and estimates about a
larger set of data (the population) from a smaller set of data (a sample). This reading focuses on
descriptive statistics.

A population is the entire set of data of interest. A sample is a subset of a population. A sample
statistic is a quantity used to describe a sample; e.g., the mean of a sample is a sample statistic.

NOMINAL SCALES
When measuring using a nominal scale, one simply names or categorizes responses. Gender,
handedness, favorite color, and religion are examples of variables measured on a nominal scale. The
essential point about nominal scales is that they do not imply any ordering among the responses. For
example, when classifying people according to their favorite color, there is no sense in which green is
placed "ahead of" blue. Responses are merely categorized. Nominal scales embody the lowest level
of measurement.

ORDINAL SCALES
A researcher wishing to measure consumers' satisfaction with their microwave ovens might ask them
to specify their feelings as either "very dissatisfied," "somewhat dissatisfied," "somewhat satisfied," or
"very satisfied." The items in this scale are ordered, ranging from least to most satisfied. This is what
distinguishes ordinal from nominal scales. Unlike nominal scales, ordinal scales allow comparisons of
the degree to which two subjects possess the dependent variable. For example, our satisfaction
ordering makes it meaningful to assert that one person is more satisfied than another with their
microwave ovens. Such an assertion reflects the first person's use of a verbal label that comes later in
the list than the label chosen by the second person.

INTERVAL SCALES
Interval scales are numerical scales in which intervals have the same interpretation throughout. As an
example, consider the Fahrenheit scale of temperature. The difference between 30 degrees and 40
degrees represents the same temperature difference as the difference between 80 degrees and 90
degrees. This is because each 10-degree interval has the same physical meaning (in terms of the
kinetic energy of molecules).

RATIO SCALES
The ratio scale of measurement is the most informative scale. It is an interval scale with the additional
property that its zero position indicates the absence of the quantity being measured. You can think of
a ratio scale as the three earlier scales rolled up in one. Like a nominal scale, it provides a name or
category for each object (the numbers serve as labels). Like an ordinal scale, the objects are ordered
(in terms of the ordering of the numbers). Like an interval scale, the same difference at two places on
the scale has the same meaning. And in addition, the same ratio at two places on the scale also
carries the same meaning.

LOS -2) Define a parameter, a sample statistic, and a frequency distribution.

A parameter is any descriptive measure of a population characteristic, e.g., the mean of a population.

A sample statistic is a quantity computed from or used to describe a sample.

A frequency distribution is a table that displays the frequency of various outcomes in a sample. Each
entry in the table contains the frequency or count of the occurrences of values within a particular
group or interval, and in this way, the table summarizes the distribution of values in the sample.

LOS -3) Calculate and interpret relative frequencies and cumulative relative frequencies, given a
frequency distribution.

The relative frequency (or empirical probability) of an event is its absolute frequency normalized by
the total number of events:

$$f_i = \frac{n_i}{N}$$

where $n_i$ is the absolute frequency of event $i$ and $N$ is the total number of events. The values
of $f_i$ for all events can be plotted to produce a frequency distribution.
https://www.youtube.com/watch?v=7jUIt39tUBM

The cumulative relative frequency of a value is its cumulative frequency (the number of observations
less than or equal to that value) divided by the total number of observations. It can be expressed as
a percentage.
https://www.youtube.com/watch?v=kBX9aNdOYDg
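
The calculation of relative and cumulative relative frequencies can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `frequency_table`, the interval boundaries, and the return data are all made up for the example, and every value is assumed to lie inside the given boundaries.

```python
def frequency_table(values, edges):
    """Return rows of (interval, absolute, relative, cumulative relative).

    Each interval is [low, high), except the last one, which also includes
    its upper boundary. Assumes every value lies within the edges.
    """
    n = len(values)
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            in_interval = edges[i] <= v < edges[i + 1]
            at_top_edge = i == last and v == edges[-1]
            if in_interval or at_top_edge:
                counts[i] += 1
                break

    rows = []
    cumulative = 0
    for i, count in enumerate(counts):
        cumulative += count  # running total of absolute frequencies
        rows.append(((edges[i], edges[i + 1]), count, count / n, cumulative / n))
    return rows


# Hypothetical annual returns in percent (made-up data for illustration).
returns = [-12, -3, 1, 4, 5, 7, 8, 11, 14, 22]
table = frequency_table(returns, edges=[-20, -10, 0, 10, 20, 30])
for interval, absolute, relative, cum_relative in table:
    print(interval, absolute, f"{relative:.0%}", f"{cum_relative:.0%}")
```

The relative frequencies sum to one, and the cumulative relative frequency of the last interval is always 100%.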

LOS -4) Describe properties of a data set presented as a histogram or a frequency polygon.

A histogram is a graphical representation of the distribution of data. It is an estimate of the
probability distribution of a continuous (quantitative) variable.
https://www.youtube.com/watch?v=KCH_ZDygrm4

Frequency polygons are a graphical device for understanding the shapes of distributions. They serve
the same purpose as histograms, but are especially helpful for comparing sets of data. Frequency
polygons are also a good choice for displaying cumulative frequency distributions.
https://www.youtube.com/watch?v=i14Ac6M5Xzk

LOS -5) Calculate and interpret measures of central tendency, including population mean,
sample mean, arithmetic mean, weighted average or mean, geometric mean, harmonic mean,
median and mode.

A measure of central tendency specifies where the data are centered.

The arithmetic mean is the sum of the observations divided by the number of observations. The
population mean is the arithmetic mean of the population:

$$\mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

The sample mean is the arithmetic mean of a sample:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

The median is the numerical value separating the higher half of a data sample, a population, or
a probability distribution, from the lower half. In a sorted sample with an odd number of
observations, the median occupies the (n+1)/2 position. With an even number of observations, the
median is the mean of the items occupying the n/2 and (n+2)/2 positions (the two middle items).
Unlike the mean, the median is not affected by extreme values. Its disadvantages are that it does not
use all the information about the size and magnitude of the observations, and that calculating it is
more involved because the data must first be sorted.

The mode is the value that occurs most frequently in a distribution. A distribution can have one mode
(unimodal), more than one mode (bimodal, etc.), or no mode at all. It is the only measure of central
tendency that can be used with nominal data.

An advantage of the mean over the median and mode is that it uses all the information about the size
and magnitude of the observations. It is also easy to work with mathematically. A drawback of the
mean is its sensitivity to extreme values.
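
The three measures can be compared on a small data set with Python's standard `statistics` module. The sample below is hypothetical and includes one deliberate outlier to show how strongly the mean reacts to an extreme value while the median barely moves.

```python
import statistics

# Hypothetical sample of returns in percent; 40 is a deliberate outlier.
sample = [2, 3, 3, 5, 7, 40]

mean = statistics.mean(sample)        # sum of observations / n
median = statistics.median(sample)    # mean of the two middle items (n is even)
modes = statistics.multimode(sample)  # list of all most frequent values

print(mean, median, modes)

# Dropping the outlier changes the mean a lot and the median only a little.
trimmed = sample[:-1]
print(statistics.mean(trimmed), statistics.median(trimmed))
```

`statistics.multimode` returns a list, so it also handles bimodal data; it requires Python 3.8 or later.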

The weighted mean is similar to an arithmetic mean (the most common type of average), where
instead of each of the data points contributing equally to the final average, some data points
contribute more than others. If all the weights are equal, then the weighted mean is the same as
the arithmetic mean.

When the weighted mean is taken over forward-looking data, with probabilities as the weights, it is
called the expected value.
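
A short sketch of the weighted mean, used first for a portfolio return and then for an expected value. The helper name `weighted_mean` and all numbers are hypothetical and chosen only for illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical portfolio: asset-class returns (%) and portfolio weights.
asset_returns = [10.0, 4.0, 2.0]
weights = [0.5, 0.25, 0.25]
portfolio_return = weighted_mean(asset_returns, weights)

# Hypothetical forward-looking scenarios: the weights are probabilities,
# so the weighted mean is the expected value.
scenario_returns = [20.0, 8.0, -12.0]
probabilities = [0.25, 0.5, 0.25]
expected_return = weighted_mean(scenario_returns, probabilities)

print(portfolio_return, expected_return)
```

With equal weights the function reduces to the arithmetic mean, which is a quick way to sanity-check it.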

The geometric mean is a type of mean or average, which indicates the central tendency or typical
value of a set of numbers by using the product of their values (as opposed to the arithmetic
mean which uses their sum). The geometric mean is defined as the nth root of the product of n
numbers.

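The geometric mean of a data set $x_1, x_2, \ldots, x_n$ is given by:

$$G = \sqrt[n]{x_1 x_2 \cdots x_n} = \left( \prod_{i=1}^{n} x_i \right)^{1/n}$$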

The geometric mean return formula is used to calculate the average rate per period on an investment
that is compounded over multiple periods. The geometric mean return may also be referred to as the
geometric average return.
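
As a hedged illustration of the geometric mean return, the sketch below uses the usual convention of adding 1 to each periodic return before taking the product, then subtracting 1 at the end. The function name and the two-period returns are hypothetical.

```python
import math


def geometric_mean_return(period_returns):
    """Average compound return per period; returns are decimals (0.10 = 10%)."""
    growth = math.prod(1.0 + r for r in period_returns)
    return growth ** (1.0 / len(period_returns)) - 1.0


# Hypothetical two-period example: +100% followed by -50%.
# Ending wealth equals starting wealth, so the compound return is zero.
returns = [1.00, -0.50]
arithmetic = sum(returns) / len(returns)
geometric = geometric_mean_return(returns)

print(arithmetic, geometric)
```

The arithmetic mean of these two returns is 25% even though the investor ends up exactly where they started, which is why the geometric mean is preferred for multi-period compounding.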

The harmonic mean is computed by the following steps:

1. Taking the reciprocal of each observation, or 1/X,

2. Adding these terms together,

3. Averaging the sum by dividing by n, or the total number of observations,

4. Taking the reciprocal of this result.

The harmonic mean is most associated with questions about dollar cost averaging, but its use is
limited. Arithmetic mean, weighted mean and geometric mean are the most frequently used measures
and should be the main emphasis of study.

The harmonic mean H of the positive real numbers x1, x2, ..., xn > 0 is defined to be

$$H = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
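
The four steps listed above translate directly into code. The example is a sketch under a simple assumption about dollar cost averaging: the same dollar amount is invested at each price, so the average cost per share is the harmonic mean of the prices. The prices are hypothetical.

```python
def harmonic_mean(values):
    """Reciprocals, sum, divide by n, then take the reciprocal of the result."""
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive values")
    reciprocals = [1.0 / v for v in values]  # step 1
    total = sum(reciprocals)                 # step 2
    average = total / len(values)            # step 3
    return 1.0 / average                     # step 4


# Hypothetical share prices in two months, equal dollars invested each month.
prices = [10.0, 40.0]
average_cost = harmonic_mean(prices)
arithmetic_mean = sum(prices) / len(prices)

print(average_cost, arithmetic_mean)
```

To check the result by hand: investing 200 at each price buys 20 shares and 5 shares, so 400 buys 25 shares at an average cost of 16 per share, below the arithmetic mean price of 25.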

LOS -6) Calculate and interpret quartiles, quintiles, deciles and percentiles.

These terms are most associated with cases where the point of central tendency is not the main goal
of the research study. For example, in a distribution of five-year performance returns for money
managers, we may not be interested in the median performer (i.e. the manager at the 50% level), but
rather in those in the top 10% or top 20% of the distribution. Recall that the median essentially
divides a distribution in half.

By the same process, quartiles are the result of a distribution being divided into four parts;
quintiles refer to five parts; deciles, 10 parts; and percentiles, 100 parts. Ranking from the top, a
manager in the second quintile would be better than 60% of managers (the bottom three quintiles) and
below 20% (the top quintile), i.e. somewhere between 20% and 40% in percentile terms. On the same
ranking, a manager at the 21st percentile has 20 percentiles above and 79 percentiles below.
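
Cut points for quartiles, quintiles and deciles can be computed with `statistics.quantiles` from the Python standard library (Python 3.8 or later). This is only a sketch on hypothetical data. Note that the function sorts the data in ascending order, so its cut points count from the lowest observation upward, and its default method interpolates between neighboring observations when a position is not a whole number.

```python
import statistics

# Hypothetical five-year returns (%) for eleven managers.
returns = [-8, -3, 0, 2, 4, 5, 7, 9, 12, 15, 21]

quartiles = statistics.quantiles(returns, n=4)   # 3 cut points -> 4 parts
quintiles = statistics.quantiles(returns, n=5)   # 4 cut points -> 5 parts
deciles = statistics.quantiles(returns, n=10)    # 9 cut points -> 10 parts

print(quartiles)
print(quintiles)
print(deciles)

# The second quartile is the median.
print(quartiles[1] == statistics.median(returns))
```

A division into k parts always needs k - 1 cut points, which is why the quartile list has three entries.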

LOS -7) Calculate and interpret 1) a range and a mean absolute deviation 2) variance and
standard deviation of a population and of a sample.

The range is the difference between the maximum and minimum values. Its advantage is ease of
calculation. Its disadvantage is that it uses only two data points, so it cannot tell us how the data
are distributed.

The mean absolute deviation (MAD), also referred to as the "mean deviation" or sometimes "average
absolute deviation", is the mean of the data's absolute deviations about the data's mean: the average
(absolute) distance from the mean. Three steps:

a) Find the mean of all values
b) Find the distance of each value from that mean (subtract the mean from each value, ignore
minus signs)
c) Then find the mean of those distances
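
The range and the three MAD steps above can be sketched as follows. The function names and the data are hypothetical.

```python
def value_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)


def mean_absolute_deviation(values):
    mean = sum(values) / len(values)              # step a: mean of all values
    distances = [abs(v - mean) for v in values]   # step b: distances, sign ignored
    return sum(distances) / len(distances)        # step c: mean of the distances


data = [2, 4, 6, 8, 10]
data_range = value_range(data)
mad = mean_absolute_deviation(data)

print(data_range, mad)
```

Here the mean is 6, the distances are 4, 2, 0, 2 and 4, and their mean is 12 / 5.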

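Population variance is the arithmetic average of the squared deviations around the mean.

In general, the population variance of a finite population of size N with values x_i is given by

$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$

where

$$\mu = \frac{1}{N} \sum_{i=1}^{N} x_i$$

Population standard deviation is the square root of the population variance.

The unbiased sample variance:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

where $\bar{x}$ is the sample mean. We divide the sum of squared deviations by n-1 rather than by the
sample size n. Using n-1 instead of n improves the statistical properties of the sample variance: it
makes the sample variance an unbiased estimator of the population variance. The quantity n-1 is also
called the degrees of freedom in estimating the population variance.

Sample standard deviation is the square root of the sample variance.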
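
The only difference between the population and sample calculations is the divisor, which a short sketch makes visible. The function names and data are hypothetical; Python's `statistics` module offers `pvariance`, `variance`, `pstdev` and `stdev` for the same purpose.

```python
import math


def population_variance(values):
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n          # divide by N


def sample_variance(values):
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two observations")
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)  # divide by n - 1


data = [2, 4, 6, 8, 10]
pop_var = population_variance(data)
samp_var = sample_variance(data)
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print(pop_var, samp_var, pop_sd, samp_sd)
```

The squared deviations sum to 40, so the population variance is 40 / 5 and the sample variance is 40 / 4. The sample figure is always the larger of the two for the same data.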

LOS -8) Calculate and interpret the proportion of observations falling within a specified number
of standard deviations of the mean using Chebyshev’s inequality.

Chebyshev’s inequality says that at least $1 - 1/K^2$ of the observations must fall within K standard
deviations of the mean, where K is any real number greater than one. The inequality holds for any
distribution with a finite variance. For example, at least 75% of observations lie within two
standard deviations of the mean (K = 2).
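
A quick numerical check of the inequality, as a sketch: the first function gives the guaranteed minimum proportion, and the second measures the actual proportion in a hypothetical data set that contains one extreme value. The function names and data are made up for the example, and the population standard deviation of the data is used.

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1.0 - 1.0 / k**2


def fraction_within(values, k):
    """Observed proportion of values within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    inside = [v for v in values if abs(v - mean) <= k * sd]
    return len(inside) / len(values)


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
for k in (1.5, 2, 3):
    print(k, chebyshev_lower_bound(k), fraction_within(data, k))
```

The observed proportion is never below the bound, however skewed the data are. The bound is conservative: for many distributions the actual proportion is much higher.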

LOS -9) Calculate and interpret coefficient of variation and Sharpe ratio.

The coefficient of variation (CV) is the ratio of the standard deviation of a set of observations to
their mean value:

$$CV = \frac{s}{\bar{x}}$$

Advantages

The coefficient of variation is useful because the standard deviation of data must always be
understood in the context of the mean of the data. In contrast, the actual value of the CV is
independent of the unit in which the measurement has been taken, so it is a dimensionless number.
For comparison between data sets with different units or widely different means, one should use the
coefficient of variation instead of the standard deviation.

Disadvantages

When the mean value is close to zero, the coefficient of variation will approach infinity and is therefore
sensitive to small changes in the mean. This is often the case if the values do not originate from a
ratio scale.

Unlike the standard deviation, it cannot be used directly to construct confidence intervals for the
mean.
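
The unit independence of the coefficient of variation is easy to demonstrate. In this sketch the same hypothetical amounts are expressed in dollars and in cents: the standard deviation changes by a factor of 100, the CV does not. Using the sample standard deviation is an assumption of the example.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical amounts, first in dollars, then the same amounts in cents.
dollars = [2, 4, 6, 8, 10]
cents = [100 * d for d in dollars]

print(statistics.stdev(dollars), statistics.stdev(cents))
print(coefficient_of_variation(dollars), coefficient_of_variation(cents))
```

The guard against a zero mean reflects the disadvantage noted above: as the mean approaches zero, the ratio becomes unstable.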

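Sharpe Ratio - A ratio developed by Nobel laureate William F. Sharpe to measure risk-adjusted
performance. The Sharpe ratio is calculated by subtracting the risk-free rate - such as that of the
10-year U.S. Treasury bond - from the rate of return for a portfolio and dividing the result by the
standard deviation of the portfolio returns. The Sharpe ratio formula is:

$$\text{Sharpe ratio} = \frac{R_p - R_f}{\sigma_p}$$

where $R_p$ is the portfolio return, $R_f$ is the risk-free rate, and $\sigma_p$ is the standard
deviation of the portfolio returns.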
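
A minimal sketch of the Sharpe ratio calculation on hypothetical data. Two choices here are assumptions of the example rather than part of the definition above: the portfolio return is taken as the mean of the periodic returns, and the sample standard deviation is used. The risk-free rate must be stated for the same period length as the returns.

```python
import statistics


def sharpe_ratio(portfolio_returns, risk_free_rate):
    """(mean portfolio return - risk-free rate) / standard deviation of returns."""
    mean_return = statistics.mean(portfolio_returns)
    volatility = statistics.stdev(portfolio_returns)
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return (mean_return - risk_free_rate) / volatility


# Hypothetical annual portfolio returns and risk-free rate, as decimals.
annual_returns = [0.02, 0.04, 0.06, 0.08, 0.10]
risk_free = 0.03

ratio = sharpe_ratio(annual_returns, risk_free)
print(round(ratio, 4))
```

A higher ratio means more excess return per unit of risk; a negative ratio means the portfolio earned less than the risk-free rate.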

LOS -10) Explain skewness and the meaning of a positively or negatively skewed return
distribution.

LOS -11) Describe relative locations of mean, median and mode for a unimodal nonsymmetrical
distribution.

LOS -12) Explain measures of sample skewness and kurtosis.

Skew, or skewness, can be mathematically defined as the average cubed deviation from the mean
divided by the standard deviation cubed. If the result of the computation is greater than zero, the
distribution is positively skewed; if it is less than zero, it is negatively skewed; and if it equals
zero, the distribution is symmetric. For interpretation and analysis, focus on downside risk.
Negatively skewed distributions have what statisticians call a long left tail (refer to graphs on
previous page), which for investors can mean a greater chance of extremely negative outcomes.
Positive skew would mean frequent small negative outcomes, and extremely bad scenarios are not as
likely.

A nonsymmetrical or skewed distribution occurs when one side of the distribution does not mirror the
other. Applied to investment returns, nonsymmetrical distributions are generally described as being
either positively skewed (meaning frequent small losses and a few extreme gains) or negatively
skewed (meaning frequent small gains and a few extreme losses).

[Figure: two panels, Positive Skew and Negative Skew]

For positively skewed distributions, the mode (point at the top of the curve) is less than the median
(the point where 50% are above/50% below), which is less than the arithmetic mean (sum of
observations/number of observations). The opposite rules apply to negatively skewed distributions:
the mode is greater than the median, which is greater than the arithmetic mean. Below is the sample
skewness formula.

$$S_K = \frac{n}{(n-1)(n-2)} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^3}{s^3}$$

where n is the number of observations, $\bar{x}$ is the sample mean, and s is the sample standard
deviation.
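
The sign of sample skewness can be checked numerically. The sketch below assumes one common definition of sample skewness, with the small-sample adjustment factor n / ((n - 1)(n - 2)); other definitions exist and give slightly different magnitudes but the same sign. The function name and data are hypothetical.

```python
import statistics


def sample_skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum((x - mean)^3) / s^3."""
    n = len(values)
    if n < 3:
        raise ValueError("sample skewness needs at least three observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    cubed = sum((x - mean) ** 3 for x in values)
    return (n / ((n - 1) * (n - 2))) * cubed / s**3


symmetric = [1, 2, 3, 4, 5]
right_tail = [1, 2, 3, 4, 20]          # one extreme gain
left_tail = [-x for x in right_tail]   # mirror image: one extreme loss

print(sample_skewness(symmetric))
print(sample_skewness(right_tail))
print(sample_skewness(left_tail))
```

Mirroring a data set flips the sign of its skewness and leaves the magnitude unchanged, which matches the description of long right and long left tails.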

Kurtosis refers to the degree of peakedness of a distribution. A distribution that is more peaked
than normal (leptokurtic) also has fatter tails, which means there is a greater chance of extreme
outcomes compared to a normal distribution.

The kurtosis formula measures the degree of peak. Kurtosis equals three for a normal distribution;
excess kurtosis expresses kurtosis above or below 3. A distribution that is more peaked than normal
is called leptokurtic, a distribution that is less peaked than normal is called platykurtic, and a
distribution with the same kurtosis as a normal distribution is called mesokurtic. Below is the
sample excess kurtosis formula.

$$K_E = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \cdot \frac{\sum_{i=1}^{n} (x_i - \bar{x})^4}{s^4} - \frac{3(n-1)^2}{(n-2)(n-3)}$$

Where:

n = sample size

s = sample standard deviation
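
As a hedged illustration, the sketch below computes sample excess kurtosis using one widely used small-sample definition (the same one behind common spreadsheet kurtosis functions). It is an assumption that this matches the formula intended in these notes. The function name and both data sets are hypothetical.

```python
import statistics


def sample_excess_kurtosis(values):
    """Adjusted sample excess kurtosis (zero for a normal distribution)."""
    n = len(values)
    if n < 4:
        raise ValueError("sample excess kurtosis needs at least four observations")
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation
    fourth = sum((x - mean) ** 4 for x in values)
    first = (n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3)) * fourth / s**4
    second = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return first - second


flat = list(range(1, 11))                          # evenly spread, thin tails
fat_tailed = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]    # peaked with two extremes

print(sample_excess_kurtosis(flat))
print(sample_excess_kurtosis(fat_tailed))
```

The evenly spread data give a negative value (platykurtic), while the peaked data with two extreme observations give a positive value (leptokurtic).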
