Stats Assigny

Statistics is the science of collecting, organizing, and analyzing data. There are two main categories: descriptive statistics, which presents data through tables and graphs without inferences, and inferential statistics, which uses mathematical tools to make predictions from data. There are several measures of central tendency used in statistics, including the mean, mode, median, and harmonic and geometric means. Each measure is calculated differently and has advantages and disadvantages for different types of data distributions.

Uploaded by

Parth Sharma
Copyright
© © All Rights Reserved

Introduction to Statistics

Statistics is a mathematical science including methods of
collecting, organizing and analyzing data in such a way
that meaningful conclusions can be drawn from them. In
general, its investigations and analyses fall into two
broad categories called descriptive and inferential
statistics.
Descriptive statistics deals with the processing of data
without attempting to draw any inferences from it. The
data are presented in the form of tables and graphs. The
characteristics of the data are described in simple terms.
Events that are dealt with include everyday happenings
such as accidents, prices of goods, business, incomes,
epidemics, sports data and population data.
Inferential statistics is a scientific discipline that uses
mathematical tools to make forecasts and projections by
analyzing the given data. This is of use to people
employed in such fields as engineering, economics,
biology, the social sciences, business, agriculture and
communications.
Measures of Central Tendency:
Mean

The arithmetic mean, also called the mean or average, is
calculated by summing all the individual observations or
items of a sample and dividing this sum by the number of
items in the sample. For example, as the result of a gas
analysis in a respirometer an investigator obtains the
following four readings of oxygen percentages:

14.9
10.8
12.3
23.3

Sum=61.3
He calculates the mean oxygen percentage as the sum of
the four items divided by the number of items—here, by
four. Thus the average oxygen percentage is

Mean = 61.3 / 4 =15.325%
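
As a quick check of the arithmetic above, the same calculation can be sketched in Python. The variable names are illustrative and are not part of the original text.

```python
# Oxygen percentages from the respirometer example in the text.
readings = [14.9, 10.8, 12.3, 23.3]

total = sum(readings)   # sum of the individual observations
n = len(readings)       # number of items in the sample
mean = total / n        # arithmetic mean

print(f"Sum = {total:.1f}, n = {n}, mean = {mean:.3f}%")
# prints "Sum = 61.3, n = 4, mean = 15.325%"
```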

Calculating a mean presents us with the opportunity for
learning statistical symbolism. An individual observation
is symbolized by $Y_i$, which stands for the ith observation
in the sample. Four observations could be written
symbolically as $Y_1$, $Y_2$, $Y_3$, $Y_4$.

We shall define n, the sample size, as the number of
items in a sample. In this particular instance, the sample
size n is 4. Thus, in a large sample, we can symbolize the
array from the first to the nth item as follows: $Y_1$, $Y_2$, …,
$Y_n$. When we wish to sum items, we use the following
notation:

$$\sum_{i=1}^{i=n} Y_i = Y_1 + Y_2 + \cdots + Y_n$$

The capital Greek sigma, Σ, simply means the sum of the
items indicated. The i = 1 means that the items should
be summed starting with the first one, and ending with
the nth one, as indicated by the i = n above the Σ. The
subscript and superscript are necessary to indicate how
many items should be summed. Below are seen
increasing simplifications of the complete notation
shown at the extreme left:

$$\sum_{i=1}^{i=n} Y_i = \sum_{i=1}^{n} Y_i = \sum_{i} Y_i = \sum Y_i = \sum Y$$

Properties of Arithmetic Mean:

1. The sum of the deviations of the items from the
arithmetic mean is always zero, i.e.
$\sum (X - \bar{X}) = 0$, where $\bar{X}$ is the arithmetic mean.
2. The sum of the squared deviations of the items from
the A.M. is a minimum; that is, it is less than the sum of
the squared deviations of the items from any other
value.
3. If each item in the series is replaced by the mean,
then the sum of these substitutions will be equal to
the sum of the individual items.
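
The three properties can be checked numerically. The sketch below reuses the four oxygen readings from the earlier example; the helper name `sum_sq_dev` and the trial shifts are assumptions made for illustration, and a handful of trial values only illustrates property 2 rather than proving it.

```python
items = [14.9, 10.8, 12.3, 23.3]
n = len(items)
mean = sum(items) / n

def sum_sq_dev(values, center):
    """Sum of squared deviations of the values from a chosen center."""
    return sum((v - center) ** 2 for v in values)

# Property 1: deviations from the mean sum to zero (up to rounding error).
sum_dev = sum(v - mean for v in items)

# Property 2: squared deviations are smallest when taken from the mean.
at_mean = sum_sq_dev(items, mean)
others = [sum_sq_dev(items, mean + shift) for shift in (-1.0, -0.1, 0.1, 1.0)]

# Property 3: replacing every item by the mean preserves the total.
replaced_total = mean * n

print(sum_dev, at_mean, others, replaced_total)
```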

Merits of A.M:

1. It is simple to understand and easy to calculate.
2. It is affected by the value of every item in the series.
3. It is rigidly defined.
4. It is capable of further algebraic treatment.
5. It is a calculated value and is not based on position in
the series.

Demerits of A.M:

1. It is affected by extreme items, i.e., very small and
very large items.
2. It can hardly be located by inspection.
3. In some cases the A.M. does not correspond to any actual
item. For example, the average number of patients
admitted to a hospital may be 10.7 per day.
4. The A.M. is not suitable for extremely asymmetrical
distributions.
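
The first demerit is easy to see with numbers. The values below are hypothetical, invented only to show how a single extreme item pulls the mean away from the bulk of the data.

```python
# Hypothetical values, invented for illustration only.
typical = [20, 22, 25, 27, 30]
with_extreme = typical + [500]   # the same data plus one very large item

mean_typical = sum(typical) / len(typical)
mean_with_extreme = sum(with_extreme) / len(with_extreme)

print(f"Mean without the extreme item: {mean_typical:.1f}")
print(f"Mean with the extreme item:    {mean_with_extreme:.1f}")
```

With the extreme item included, the mean is larger than five of the six observations, so it no longer describes a typical item.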
Mode
The mode is a statistical term that refers to the most
frequently occurring number found in a set of
numbers. The mode is found by collecting and
organizing data in order to count the frequency of
each result. The result with the highest count of
occurrences is the mode of the set, also referred to
as the modal value.

Examples of Mode:

For example, in the following list of numbers, 16 is
the mode since it appears more times than any other
number in the set:

3, 3, 6, 9, 16, 16, 16, 27, 27, 37, 48


A set of numbers can have more than one mode
(this is known as bimodal if there are 2 modes) if
there are multiple numbers that occur with equal
frequency, and more times than the others in the
set.
3, 3, 3, 9, 16, 16, 16, 27, 37, 48
In the above example, both the number 3 and the
number 16 are modes as they each occur three
times and no other number occurs more than that.

If no number in a set of numbers occurs more than
once, that set has no mode:

3, 6, 9, 16, 27, 37, 48
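
The three cases above (one mode, two modes, no mode) can be reproduced with a short Python sketch. The function name `modes` is an assumption for illustration; it follows the convention used here, returning an empty list when no number occurs more than once.

```python
from collections import Counter

def modes(values):
    """Return the most frequent values, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        # No number occurs more than once, so the set has no mode.
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([3, 3, 6, 9, 16, 16, 16, 27, 27, 37, 48]))  # prints [16]
print(modes([3, 3, 3, 9, 16, 16, 16, 27, 37, 48]))      # prints [3, 16]
print(modes([3, 6, 9, 16, 27, 37, 48]))                 # prints []
```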

Advantages of the mode include:

1. It is easy to understand and simple to calculate.
2. It is not affected by extremely large or small
values.
3. It can be located just by inspection in ungrouped
data and discrete frequency distributions.
4. It can be useful for qualitative data.
5. It can be computed in an open-end frequency
table.
6. It can be located graphically.
Disadvantages of the mode include:

1. It is not well defined.
2. It is not based on all the values.
3. It is stable only when the number of values is large,
so it is not well defined if the data consist of a
small number of values.
4. It is not capable of further mathematical
treatment.
5. Sometimes the data have one mode or more than one
mode, and sometimes the data have no mode at
all.
Harmonic Mean

A simple way to define a harmonic mean is to call it the
reciprocal of the arithmetic mean of the reciprocals of
the observations. The most important condition for it is
that none of the observations is zero, since the
reciprocal of zero is undefined.

A harmonic mean is used in averaging ratios. The most
common examples of ratios are speed (distance per unit
time), cost per unit of material, and work per unit time.
The harmonic mean (H.M.) of n observations $x_1, x_2, \ldots, x_n$ is

$$\text{H.M.} = \frac{1}{\frac{1}{n} \sum_{i=1}^{n} \frac{1}{x_i}}$$
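
A minimal Python sketch of this formula follows. The speeds are a hypothetical example (a trip made at 60 km/h one way and 40 km/h back over the same distance), and the function name is chosen for illustration. The standard library's `statistics.harmonic_mean` is used only as a cross-check.

```python
from statistics import harmonic_mean as stdlib_harmonic_mean

def harmonic_mean(xs):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if not xs:
        raise ValueError("at least one observation is required")
    if any(x == 0 for x in xs):
        raise ValueError("observations must be non-zero")
    n = len(xs)
    return 1 / (sum(1 / x for x in xs) / n)

# Hypothetical speeds in km/h over two equal distances.
speeds = [60, 40]
average_speed = harmonic_mean(speeds)

print(f"Harmonic mean speed:   {average_speed:.2f} km/h")
print(f"Standard library gives {stdlib_harmonic_mean(speeds):.2f} km/h")
```

The result is about 48 km/h, lower than the arithmetic mean of 50 km/h, because more time is spent travelling at the slower speed.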

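In the case of a frequency distribution, in which the value
$x_i$ occurs with frequency $f_i$, the harmonic mean is
given by

$$\text{H.M.} = \frac{1}{\frac{1}{N} \sum_{i=1}^{n} \frac{f_i}{x_i}}, \quad \text{where } N = \sum_{i=1}^{n} f_i$$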
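
The frequency form can be sketched the same way. The values and frequencies below are hypothetical, and the function name `harmonic_mean_freq` is an assumption for illustration. The sketch also confirms that the result matches the ungrouped formula applied to the expanded list of observations.

```python
def harmonic_mean_freq(values, freqs):
    """Harmonic mean of a frequency distribution: N / sum(f_i / x_i)."""
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    if any(x == 0 for x in values):
        raise ValueError("values must be non-zero")
    total_freq = sum(freqs)  # N in the formula
    return 1 / (sum(f / x for x, f in zip(values, freqs)) / total_freq)

# Hypothetical distribution: 2 occurs once, 4 twice, 8 four times.
values = [2, 4, 8]
freqs = [1, 2, 4]
grouped = harmonic_mean_freq(values, freqs)

# The same data written out item by item.
expanded = [2, 4, 4, 8, 8, 8, 8]
ungrouped = len(expanded) / sum(1 / x for x in expanded)

print(f"{grouped:.4f} {ungrouped:.4f}")
```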


Properties of Harmonic Mean

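1. If all the observations taken by a variable are equal to
a constant, say k, then the harmonic mean of the
observations is also k.
2. For positive observations, the harmonic mean is never
greater than the geometric mean, which in turn is never
greater than the arithmetic mean (H.M. ≤ G.M. ≤ A.M.);
the three are equal only when all observations are equal.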
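
Both properties can be illustrated with a small Python sketch. The data are hypothetical, the function names are chosen for illustration, and the geometric mean is taken here as the nth root of the product of the observations, computed through logarithms.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # nth root of the product, computed via logs; requires positive values.
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

def harmonic_mean(xs):
    return len(xs) / sum(1 / x for x in xs)

# Hypothetical positive observations.
data = [2, 4, 8]
am, gm, hm = arithmetic_mean(data), geometric_mean(data), harmonic_mean(data)
print(f"H.M. = {hm:.4f}, G.M. = {gm:.4f}, A.M. = {am:.4f}")

# Property 1: for constant observations every mean equals the constant.
constant = [5, 5, 5]
print(harmonic_mean(constant), geometric_mean(constant), arithmetic_mean(constant))
```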

Advantages of Harmonic Mean:

1. A harmonic mean is rigidly defined.
2. It is based upon all the observations.
3. It is not much affected by fluctuations of
sampling.
4. More weight is given to smaller items.

Disadvantages of Harmonic Mean:

1. It is not easily understood.
2. It is difficult to compute.
Geometric Mean

A geometric mean is a mean or average which shows the
central tendency of a set of numbers by using the
product of their values. For a set of n observations, the
geometric mean is the nth root of their product. The
geometric mean G.M. for a set of numbers $x_1, x_2, \ldots, x_n$
is given as

$$\text{G.M.} = (x_1 x_2 \cdots x_n)^{1/n}$$

or, equivalently,

$$\text{G.M.} = \left( \prod_{i=1}^{n} x_i \right)^{1/n} = \sqrt[n]{x_1 x_2 \cdots x_n}$$

The geometric mean of two numbers, say x and y, is the
square root of their product, $\sqrt{xy}$. For three numbers
x, y and z, it is the cube root of their product, i.e.,
$(xyz)^{1/3}$.
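
A minimal Python sketch of the definition follows, with hypothetical inputs for the two-number and three-number cases. It assumes Python 3.8 or later for `math.prod`. Rejecting negative inputs is a design choice of this sketch, not something the definition prescribes.

```python
import math

def geometric_mean(xs):
    """nth root of the product of n non-negative observations."""
    if not xs:
        raise ValueError("at least one observation is required")
    if any(x < 0 for x in xs):
        raise ValueError("negative observations are not supported")
    return math.prod(xs) ** (1 / len(xs))

# Two numbers: square root of the product, close to 6.
print(geometric_mean([4, 9]))

# Three numbers: cube root of the product, close to 4
# (floating-point roots may differ from the exact value in the last digits).
print(geometric_mean([2, 4, 8]))
```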

Properties of Geometric Means:


1. The logarithm of geometric mean is the arithmetic
mean of the logarithms of given values.
2. If all the observations assumed by a variable are equal
to a constant, say K > 0, then the G.M. of the observations
is also K.
3. The geometric mean of the ratio of two variables is
the ratio of the geometric means of the two
variables.
4. The geometric mean of the product of two variables
is the product of their geometric means.
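
These properties can be checked numerically. In the sketch below the two paired series `x` and `y` are hypothetical, and the function name is chosen for illustration. Comparisons use `math.isclose` because floating-point roots are not exact.

```python
import math

def geometric_mean(xs):
    # Positive observations assumed; math.prod needs Python 3.8 or later.
    return math.prod(xs) ** (1 / len(xs))

# Hypothetical paired observations of two positive variables.
x = [1, 2, 4]
y = [2, 8, 4]
gm_x, gm_y = geometric_mean(x), geometric_mean(y)

# Property 1: log of the G.M. equals the mean of the logs.
mean_of_logs = sum(math.log(v) for v in x) / len(x)
print(math.isclose(math.log(gm_x), mean_of_logs))

# Property 3: G.M. of the ratios equals the ratio of the G.M.s.
gm_ratio = geometric_mean([a / b for a, b in zip(x, y)])
print(math.isclose(gm_ratio, gm_x / gm_y))

# Property 4: G.M. of the products equals the product of the G.M.s.
gm_product = geometric_mean([a * b for a, b in zip(x, y)])
print(math.isclose(gm_product, gm_x * gm_y))
```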

Advantages of Geometric Mean:

1. A geometric mean is based upon all the
observations.
2. It is rigidly defined.
3. It is not much affected by fluctuations of
sampling.
4. It gives more weight to small items.

Disadvantages of Geometric Mean:

1. A geometric mean is not easily understood by a
non-mathematical person.
2. If any of the observations is zero, the geometric
mean becomes zero.
3. If any of the observations is negative, the geometric
mean may become imaginary or cease to be meaningful.
