QA

Quantitative Analysis (MU-Sem.6-Comp)
Prof. R. M. Baphana
(New Syllabus w.e.f. academic year 21-22) (M6-76), Tech-Neo Publications, A SACHIN SHAH Venture

CHAPTER 1
Introduction to Statistics

Functions, Importance, Uses and Limitations of Statistics. Statistical data: Classification, Tabulation, Diagrammatic and Graphic representation of data.

1.1 Introduction
    1.1.1 Definitions of Statistics
    1.1.2 Functions of Statistics
1.2 Importance and uses of statistics
    1.2.1 Statistics in Planning
    1.2.2 Statistics in State
    1.2.3 Statistics in Mathematics
    1.2.4 Statistics in Economics
    1.2.5 Statistics in Industry
    1.2.6 Statistics in Astronomy
    1.2.7 Statistics in Social Sciences
    1.2.8 Statistics in War
1.3 Limitations of Statistics
    1.3.1 Statistics does not Study Qualitative Phenomenon
    1.3.2 Statistics Fails to Cover (Study) Individuals
    1.3.3 Statistical Laws are not Exact
    1.3.4 Statistics Is Liable to be Misused
1.4 Statistical data-classification
    1.4.1 Functions of Classification
    1.4.2 Reasons for Data Classification
    1.4.3 Types of Data Classification
    1.4.4 Three Main Types of Data Classification
    1.4.5 Determining Data Risk
    1.4.6 Using a Data Classification Matrix
    1.4.7 An example of Data Classification
    1.4.8 Regression Algorithm Versus Classification Algorithm
    1.4.9 Application Domains
    1.4.10 Rules for Classification
    1.4.11 Bases of Classification
    1.4.12 Geographical Classification
    1.4.13 Chronological Classification
    1.4.14 Qualitative Classification
    1.4.15 Quantitative Classification
1.5 Tabulation-Meaning and its Importance
    1.5.1 The Parts of a Table
    1.5.2 Types of Tabulation
    1.5.3 Types of Table
    1.5.4 Solved Examples on Preparation of Tables
1.6 Diagrammatic and graphic representation of data
    1.6.1 Difference between Diagrams and Graphs
    1.6.2 General Rules for Constructing Diagrams
    1.6.3 Univariate analysis (U.A)
    1.6.4 Steps to be Followed for U.A.
1.7 Methods of Univariate Distribution
    1.7.1 Frequency Distribution (F.D.)
    1.7.2 Bar-Charts
    1.7.3 Bar-Graph
    1.7.4 Histogram
    1.7.5 Pie-Chart
    1.7.6 Pie-diagram
    1.7.7 Comparison Table (Bar Chart V/s Histogram)
    1.7.8 Comparison Between Histogram and Bar Graph
    1.7.9 Frequency Polygon
1.8 Important Questions for Exam
• Chapter Ends

1.1 INTRODUCTION

• The subject of statistics is as old as human civilisation, although in earlier times the sphere of its utility was very much restricted. The word statistics is derived from the Latin word 'Status', which means a political state.
• In ancient times the scope of statistics was primarily limited to the collection of the following data by governments for military and fiscal policies:
(i) Age and sex-wise population of the country.
(ii) Property and wealth of the country.
• The first helps the government to have an idea of the manpower of the country, so that it can safeguard the country against any outside aggression, and the second provides the government with information for the introduction of new taxes and levies.
• Noted Englishmen did pioneering work in developing the subject of statistics.
• One writer has used the following words: "R.A. Fisher is the real giant in the development of the theory of statistics."
• Indian statisticians have also made valuable contributions to the theory of statistics.
• Notable among them are C.R. Rao (Statistical Inference); Parthasarathy (Theory of Probability); P.C. Mahalanobis and P.V. Sukhatme (Sample Surveys); S.N. Roy (Multivariate Analysis); R.C. Bose, K.R. Nair and J.N. Srivastava (Design of Experiments). They and others have placed India's name on the world map of statistics.

1.1.1 Definitions of Statistics

The word statistics conveys different meanings in the singular and the plural sense.
When used in the plural, statistics means a numerical set of data; when used in the singular, it means the science of statistical methods used for collecting, analysing and drawing inferences from numerical data.

"Statistics may be defined as the aggregate of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to a reasonable standard of accuracy, collected in a systematic manner, for a predetermined purpose and placed in relation to each other." - Prof. Horace Secrist

• Someone has jokingly said: "Since statistics is the science of averages, if the head is kept in a boiler and the legs in a freezer, then the temperature of the stomach is statistics."

1.1.2 Functions of Statistics

The three main functions of statistics are:

(1) Collection of data
Following are the methods of collection of data:
(a) Direct personal enquiry method.
(b) Indirect oral investigation.
(c) By filling of schedules.
(d) By mailed questionnaires.
(e) Information from local agents and correspondents.
(f) By old records.
(g) By direct observational methods.

(2) Presentation of data
There are two kinds of statistical data:
(a) Primary data and
(b) Secondary data.

(3) Analysis of data
The requisites are:
(a) It should be complete.
(b) It should be consistent.
(c) It should be accurate.
(d) It should be homogeneous in respect of unit of information.

1.2 IMPORTANCE AND USES OF STATISTICS

• After the Second World War, the concept of the welfare state took root almost all over the world.
• The scope of statistics has widened to social and economic phenomena. Today statistics is viewed not only as a mere device for collecting numerical data but also as a means of sound techniques for handling the data and drawing inferences from them.
• Because of its mathematical treatment, it now covers all sciences: social, physical and natural.
• It finds numerous applications in diversified fields such as agriculture, industry, sociology, biometry, planning, economics, business, management, psychometry, insurance, accountancy and auditing and so on.
• The importance of statistics is well expressed by the Commissioner of Labour, Carroll D. Wright: "To a very striking degree our culture has become a statistical culture. Even a person who may never have heard of an index number is affected ... by ... of those index numbers which describe the cost of living. It is impossible to understand Psychology, Sociology, Economics, Finance or a Physical Science without some general idea of the meaning of an average, of variation, of sampling, of how to interpret charts and tables."
• According to H.G. Wells: "Statistical thinking will one day be as necessary for effective citizenship as the ability to read and write."
• Also, according to Bowley: "A knowledge of statistics is like a knowledge of foreign language or of algebra: it may prove of use at any time under any circumstances."
• We now discuss the importance of statistics in some different disciplines.

1.2.1 Statistics in Planning

• The modern age is termed the 'age of planning', and almost all organisations in government, business or management are resorting to planning for efficient working and for formulating policy decisions. To achieve this end, statistical data relating to production, consumption, prices, investment, income, expenditure and so on, together with advanced statistical techniques for handling such data (index numbers, time series analysis, demand analysis and forecasting techniques), are very important.
• In India, the use of statistics in planning was visualised long ago, and the National Sample Survey (N.S.S.) was set up in 1950 primarily for the collection of statistical data for planning in India.

1.2.2 Statistics in State

• The idea of the welfare state is taking root in almost all countries. Today statistical data and techniques are a must for the government in planning future economic programmes.
• Mortality (death) statistics serve as a guide to the health authorities for sanitary improvements, improved medical facilities and public cleanliness.
• Statistical units function in all the ministries and government departments in the centre and the states. Among the main government statistical agencies in India is the Central Statistical Organisation (C.S.O.).

1.2.3 Statistics in Mathematics

• The modern theory of statistics is based on the theory of probability. Also, the development of statistical techniques and theories for application to various sciences (social, physical and natural) is based on different mathematical models.
• The increasing role of mathematics in statistics has led to a new branch of statistics called 'Mathematical Statistics'. In short, "Statistics is a branch of applied mathematics which specialises in data."

1.2.4 Statistics in Economics

• Statistical data and advanced techniques of statistical analysis are immensely useful in the solution of a variety of economic problems such as production, consumption, distribution of income and wealth, wages, prices, profits, savings, expenditure, investment, unemployment, poverty etc.
• For example, studies of consumption statistics reveal the pattern of consumption of the various commodities by different sections of society and also give us some idea about their purchasing capacity and their standard of living.
• Studies of production statistics enable us to strike a balance between supply and demand, as described by the laws of supply and demand. Income and wealth statistics are helpful in reducing disparities of income.
• Statistics of prices are needed to study price theories and the general problem of inflation through the construction of cost of living and wholesale price index numbers.
• Exchange statistics reflect the commercial development of a nation and tell us about the money in circulation and the volume of business in the country.
• Advanced and sound statistical techniques have been used successfully in the analysis of cost functions, production functions and consumption functions.
• Time series analysis, index numbers, forecasting techniques and demand analysis are some of the very powerful statistical tools used in the analysis of economic data and in economic planning.
• Time series analysis is used for the study of series relating to prices, production and consumption of commodities, bank clearings, money in circulation etc.
• Index numbers are termed 'economic barometers'. They are numbers which exhibit the changes over a specified period of time in:
(i) Prices of different commodities.
(ii) Industrial / agricultural production.
(iii) Sales.
(iv) Imports and exports.
(v) Cost of living etc.
• Demand analysis consists in making an economic study of market data to determine the relation between:
(i) The price of a given commodity and its absorption capacity for the market, i.e. demand, and
(ii) The price of a commodity and its output, i.e. supply.

1.2.5 Statistics in Industry

• In industry, statistics is used in 'Quality control'. The main aim in any production process is to control the quality of the manufactured product so that it conforms to specifications.
This is called process control and is achieved through the powerful techniques of control charts and inspection plans.

1.2.6 Statistics in Astronomy

• Astronomers recorded statistical data about the movements of heavenly bodies like stars and planets for the study of eclipses.
• The principle of least squares was developed by Gauss to obtain the equation of the famous 'Normal Law of Errors' in astronomy. Gauss used the normal curve to describe the theory of accidental errors of measurement involved in the calculation of orbits of heavenly bodies.

1.2.7 Statistics in Social Sciences

• According to W.I. King, "The science of statistics is the method of judging collective, natural or social phenomenon from the results obtained from the analysis or collection of estimates."
• Every social phenomenon is affected to a marked extent by a multiplicity of factors which bring about variation in observations from time to time, place to place and object to object.
• The statistical tools of Regression and Correlation Analysis can be used to study the effect of each of these factors on the given observation.
• Sampling Techniques and Estimation Theory are powerful tools for any social survey.
• Statistical data and statistical techniques have been used extensively in the social sciences.
• Croxton and Cowden have remarked: "Without an adequate understanding of the statistical methods, the investigator in the social sciences may be like the blind man groping in a dark room for a black cat that is not there. The methods of statistics are useful in an ever-widening range of human activities in any field of thought in which numerical data may be had."

1.2.8 Statistics in War

• Statistics can also be used in war-time.
Without data concerning the military strength of the enemy, it is not possible to face a war.
• Pakistan lost the war in spite of the fact that it had much more sophisticated military equipment in terms of war aeroplanes. Statistical analysis revealed that this was due to the inadequate, insufficient and inferior training given to Pakistani military personnel in the use of the equipment.
• This shows that a war cannot be won merely by modernising the fleet of war planes, bombers, tanks etc.
• Army officials should be given sufficient and proper training for the effective use of the equipment.

1.3 LIMITATIONS OF STATISTICS

Although statistics is used widely in almost all sciences (social, physical and natural) and in almost all spheres of human activity, there are still some limitations which restrict its scope and utility.

1.3.1 Statistics does not Study Qualitative Phenomenon

• Statistics are numerical statements. Since statistics deals only with numerical data, it can be applied only to those phenomena which can be measured quantitatively.
• For example, statements such as "the standard of living of the people in Pune has gone up considerably as compared with last year" or "the population of India has increased considerably during the last few years" do not constitute statistics.
• Thus statistics cannot be used directly for the study of qualitative characteristics like health, beauty, honesty, welfare, poverty etc., which cannot be measured quantitatively.
• Even then, the techniques of statistical analysis can be applied to qualitative phenomena indirectly. We can assign them quantitative standards or express them numerically.
• For example, the attribute of intelligence in a group of individuals can be studied using the 'Intelligence Quotient' (IQ), which may be regarded as a quantitative measure of an individual's intelligence.
1.3.2 Statistics Fails to Cover (Study) Individuals

• A single item or figure cannot be regarded as statistics unless it is part of a particular field of data. Thus statistical methods do not give any recognition to an individual object, person or event.
• For example, the price of a single commodity, the profit of a particular concern or the production of a particular business house do not make statistics, because these figures are not comparable.
• But the sum-total of figures relating to the prices and consumption of various commodities, the sales and profits of a business house, the income, expenditure, production etc. over different places and different times will form statistics.
• For example, the average income of a group of people of a particular state in some given year has no meaning unless we are also given the figures of other groups in the same year.
• Thus statistics applies mainly to those problems where group characteristics are studied.

1.3.3 Statistical Laws are not Exact

• Statistical laws are based on probability inferences. Such inferences are approximate and not exact, as mathematical or scientific inferences are. Statistical laws are more or less true only on the average.
• For example, if we throw a die, the probability of getting a given number, say 5, is 1/6. That does not mean that if we throw a die 12 times, we will get the number 5 exactly two times; two (12 × 1/6) is only the average number expected.
• We may get the number 5 any of 0, 1, 2, ..., 12 times. But if we carry on the same experiment indefinitely, then we expect all six numbers to occur about equally often.

1.3.4 Statistics Is Liable to be Misused

• Statistics neither proves nor disproves anything. It is merely a tool which, if rightly used, may prove to be useful, but which, if misused by inexperienced, unskilled or dishonest statisticians, might lead to dangerous conclusions.
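Both points, that statistical laws hold only on the average (Section 1.3.3) and that a careless reading of a small sample can mislead, can be seen by simulating the die. The sketch below is only an illustration: the seed, the number of runs and the function names `prob_exactly` and `count_fives` are assumptions made here, not part of the text. It computes the exact chance of seeing 5 exactly twice in 12 throws (about 0.30), repeats the 12-throw experiment a few times, and then makes one long run.

```python
import math
import random
from collections import Counter

P_FIVE = 1 / 6  # probability of a 5 on one throw of a fair die


def prob_exactly(k, n=12, p=P_FIVE):
    """Binomial probability of exactly k fives in n independent throws."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def count_fives(throws, rng):
    """Throw a fair die `throws` times and count how often 5 turns up."""
    return sum(1 for _ in range(throws) if rng.randint(1, 6) == 5)


rng = random.Random(2021)  # arbitrary fixed seed (an assumption) for repeatable runs

# Exact chance of getting 5 exactly twice in 12 throws: far from certain.
p_two = prob_exactly(2)
print(f"P(exactly two 5s in 12 throws) = {p_two:.3f}")

# Ten short experiments of 12 throws each: the count can be anything from 0 to 12.
short_runs = [count_fives(12, rng) for _ in range(10)]
print("5s seen in ten runs of 12 throws:", short_runs)

# One long experiment: each face settles close to a relative frequency of 1/6.
long_throws = 60_000
faces = Counter(rng.randint(1, 6) for _ in range(long_throws))
for face in range(1, 7):
    print(face, round(faces[face] / long_throws, 3))
```

Judging the die from a single run of 12 throws would be a misuse of the law; only the long run shows the relative frequencies approaching 1/6.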
• According to Bowley, "Statistics only furnishes a tool though imperfect which is dangerous in the hands of those who do not know its use and deficiencies." Statistical methods are the most dangerous tools in the hands of the inexpert.
• The main point is that statistics deals with figures which are innocent in themselves, do not represent quality, and can easily be distorted or manipulated by politicians, unskilled workers and dishonest people for personal, selfish motives.
• Thus the most significant limitation of statistics is that it must be used by experts.

1.4 STATISTICAL DATA-CLASSIFICATION

We mention below some definitions of classification.

"Classification is the process of arranging data into sequences and groups according to their common characteristics, or separating them into different but related parts." - Secrist

"A classification is a scheme for breaking a category into a set of parts, called classes, according to some precisely defined differing characteristics possessed by all the elements of the category." - A.M. Tuttle

• Thus 'classification' is the arrangement of the data into different classes, which are determined by the nature, objectives and scope of the enquiry.
• For example, the students registered in Pune University during the academic year 2020-21 may be classified on the basis of the following criteria:
(i) Sex,
(ii) Age,
(iii) Religion,
(iv) The state to which they belong,
(v) Different faculties: Engineering, Medical, Arts, Science, Law, Commerce etc.,
(vi) Heights or weights,
(vii) Institution / College and so on.
• The same data can be classified into different groups in a number of ways.
The grouping may be based on physical, mental or social characteristics.
• The data in one class will differ from those of another class with respect to some characteristic, called the basis or criterion of classification.
• Thus we observe that, to analyse any statistical data, classification need not be restricted to one criterion or basis only. We can classify the data with respect to two or more bases simultaneously.
• This technique of dividing the given data into classes with respect to two or more criteria at the same time is called cross-classification.
• For example, the students in Pune University may be classified simultaneously with respect to sex, age, religion and so on.

1.4.1 Functions of Classification

The different (modes or) functions of classification are:
(1) Geographical classification: This is according to place, area or region.
(2) Condensation of data: Classification presents huge raw data in a condensed form which is easily comprehensible to the mind and highlights the main features contained in the data.
(3) It facilitates comparison: Classification allows us to make meaningful comparisons depending on the basis of classification.
(4) It helps to study relationships: Classification of the data with respect to two or more criteria enables us to study the relationship between the criteria, for example between the sex of the students and the faculty they join in the university.
(5) It gives statistical treatment of the data: Classification arranges huge heterogeneous data into relatively homogeneous groups according to their points of similarity. In this way the data are made more intelligible and useful, and further processing like tabulation, analysis and interpretation of the data becomes possible.

1.4.2 Reasons for Data Classification

• Data classification has improved significantly over time.
Today, the technology is used for a variety of purposes, often in support of data security initiatives.
• It is also used to maintain regulatory compliance and to meet various other business or personal objectives. In some cases, data classification is a regulatory requirement, as data must be searchable and retrievable within specified timeframes.
• For the purpose of data security, data classification is a useful tactic that facilitates proper security responses based on the type of data being retrieved, transmitted or copied.

1.4.3 Types of Data Classification

(1) Data classification often involves a multitude of tags and labels that define the type of data, its confidentiality and its integrity.
(2) Availability may also be taken into consideration in data classification processes.
(3) The data's level of sensitivity is often classified according to varying levels of importance or confidentiality, which then correlate with the security measures put in place to protect each classification level.

1.4.4 Three Main Types of Data Classification

These are considered industry standards:
(i) Content: Content-based classification inspects and interprets files, looking for sensitive information.
(ii) Context: Context-based classification looks at application, location or creator, among other variables, as indirect indicators of sensitive information.
(iii) User: User-based classification depends on a manual, end-user selection for each document. It relies on user knowledge and discretion at creation, edit, review or dissemination to flag sensitive documents.

Content-based, context-based and user-based approaches can each be right or wrong depending on the business need and the data type.
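The three approaches can be pictured with a very small sketch. Everything in it is hypothetical and chosen only for illustration: the `Document` fields, the regular expressions treated as "sensitive" content, and the folder and creator names treated as sensitive context. A real system would take these rules from the organisation's own policy.

```python
import re
from dataclasses import dataclass

# Hypothetical content rules: a 16-digit card-like number, or the word "confidential".
SENSITIVE_PATTERNS = [
    re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
    re.compile(r"\bconfidential\b", re.IGNORECASE),
]

# Hypothetical context rules: where the file lives and who created it.
SENSITIVE_LOCATIONS = {"finance", "hr"}
SENSITIVE_CREATORS = {"legal-team"}


@dataclass
class Document:
    text: str
    location: str            # folder or system the file is stored in
    creator: str
    user_flag: bool = False  # set manually by the end user


def content_based(doc):
    """Inspect the file contents for sensitive information."""
    return any(p.search(doc.text) for p in SENSITIVE_PATTERNS)


def context_based(doc):
    """Use location or creator as indirect indicators of sensitivity."""
    return doc.location in SENSITIVE_LOCATIONS or doc.creator in SENSITIVE_CREATORS


def user_based(doc):
    """Rely on the end user's manual selection."""
    return doc.user_flag


def classify(doc):
    """Return the names of the approaches that mark the document as sensitive."""
    rules = (("content", content_based), ("context", context_based), ("user", user_based))
    return [name for name, rule in rules if rule(doc)]


memo = Document(text="Card: 1234 5678 9012 3456", location="public", creator="clerk")
print(classify(memo))
```

A document may be flagged by one approach and missed by another, which is why the choice depends on the business need and the data type.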
1.4.5 Determining Data Risk

• In addition to the types of classification, an organisation must determine the relative risk associated with the types of data, how the data is handled and where it is stored or sent (end points).
• A common practice is to separate data and systems into three levels of risk.
(i) Low risk: If data is public and it is not easy to lose permanently (i.e. recovery is easy), this data collection and the systems surrounding it are likely to be a lower risk than others.
(ii) Moderate risk: Essentially, this is data that is not public or is used internally (i.e. by the organisation and/or its partners). However, it is not critical or sensitive enough to be "high risk". Proprietary operating procedures, cost of goods and some company documentation may fall into the moderate category.
(iii) High risk: Anything remotely sensitive or crucial to operational security goes into the high risk category, as do pieces of data that are extremely hard to recover if lost. All confidential, sensitive and necessary data falls into the high risk category.

1.4.6 Using a Data Classification Matrix

• Creating and labelling data may be easy for some organisations. If there are not a large number of data types, or perhaps your business has fewer transactions, determining the risk of your data and systems is likely to be less difficult.
• Many organisations dealing with a high volume or multiple types of data are likely to need a comprehensive way of determining their risk. For this, many use a "data classification matrix".
• Creating a matrix that rates data and/or systems by how likely they are to be compromised and how sensitive the data is will help you to determine quickly how to classify better and to protect all things sensitive.
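A minimal sketch of such a matrix is given below. The three levels follow Section 1.4.5, but the individual cell values, the function name `risk_level` and the sample inventory are illustrative assumptions, not a published standard; each organisation would fill in its own matrix.

```python
LEVELS = ("low", "moderate", "high")

# Rows: how likely the data or system is to be compromised.
# Columns: how sensitive the data is.
# The cell values below are illustrative assumptions only.
RISK_MATRIX = {
    "low":      {"low": "low",      "moderate": "low",      "high": "moderate"},
    "moderate": {"low": "low",      "moderate": "moderate", "high": "high"},
    "high":     {"low": "moderate", "moderate": "high",     "high": "high"},
}


def risk_level(likelihood, sensitivity):
    """Look up the overall risk for a likelihood/sensitivity pair."""
    if likelihood not in LEVELS or sensitivity not in LEVELS:
        raise ValueError("levels must be one of: " + ", ".join(LEVELS))
    return RISK_MATRIX[likelihood][sensitivity]


# Hypothetical inventory: data set -> (likelihood of compromise, sensitivity).
inventory = {
    "public price list": ("low", "low"),
    "operating procedures": ("moderate", "moderate"),
    "customer records": ("high", "high"),
}

for name, (likelihood, sensitivity) in inventory.items():
    print(f"{name}: {risk_level(likelihood, sensitivity)} risk")
```

Keeping the matrix as plain data rather than as nested conditions makes it easy to review and to revise when the classification policy changes.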
1.4.7 An example of Data Classification

• A number of different category lists can be applied to the information in a system. One way to classify sensitivity might include classes such as secret, confidential, business-use only and public.
• An organisation might also use a system that classifies information based on the type of qualities it drills down into.
• For example, content-based classification goes into the files looking for certain characteristics. Context-based classification examines the application, users, geographic location or creator associated with the data.
• User classification is based on what an end user chooses to create, edit and review.

Data reclassification

It is important for an organisation to update the classification system continuously by reassigning the values, ranges and outputs so that it meets the organisation's classification goals more effectively.

1.4.8 Regression Algorithm Versus Classification Algorithm

• Both regression and classification algorithms are standard data management styles. When it comes to the organisation of data, the biggest difference between regression and classification algorithms lies in the type of expected output.
• For any system that will produce a single set of potential results within a finite range, classification algorithms are ideal.
• When the results of an algorithm are continuous, such as an output of time or length, using a regression algorithm or linear regression algorithm is more efficient.

1.4.9 Application Domains

Classification has many applications. In some of these it is employed as a data mining procedure, while in others more detailed statistical modelling is undertaken.
(1) Computer vision
    (a) Medical imaging and medical image analysis.
    (b) Optical character recognition.
    (c) Video tracking.
(2) Drug discovery and development
    (a) Toxicogenomics.
    (b) Quantitative structure-activity relationship.
(3) Geo-statistics
(4) Speech recognition
(5) Handwriting recognition
(6) Biometric identification
(7) Biological classification
(8) Statistical natural language processing
(9) Document classification
(10) Internet search engines
(11) Credit scoring
(12) Pattern recognition
(13) Recommender system
(14) Micro-array classification

1.4.10 Rules for Classification

• Technically sound classification of the data in any statistical investigation depends on the nature of the data and the objectives of the enquiry.
• Consistent with the nature and objectives of the enquiry, the following guiding principles may be observed for good classification.

(i) It should be unambiguous
• The classes should be rigidly defined so that they do not lead to any ambiguity. That is, there should be no room for doubt or confusion regarding the placement of the observations in the given classes.
• For example, if we have to classify a group of individuals as 'employed' and 'unemployed', we must define in clear terms what we mean by an employed person and an unemployed person.

(ii) It should be exhaustive and mutually exclusive
• The classes must be exhaustive in the sense that each and every item in the data must belong to one of the classes.
• A good classification should be free from a residual class like 'others' or 'miscellaneous', because such classes do not reveal the characteristics of the data completely.
• But if the classes are very large in number, as is the case in classifying the various commodities consumed by people in a certain locality, then it is necessary to introduce a residual class.
• Also, the various classes should be mutually disjoint or non-overlapping, so that each observed value belongs to one and only one of the classes.
• For example, if we classify the students in a college as males, females and addicts to a particular drug, then the classification is faulty, because the group "addicts to a particular drug" includes both males and females.
• In this case a proper classification will be with respect to two criteria, i.e. with respect to sex (males and females), further dividing the students in each of these two classes into 'addicts' and 'non-addicts' to the given drug.

(iii) It should be stable
• In order to have meaningful comparisons of the results, an ideal classification must be stable, i.e. the same pattern of classification must be adopted throughout the analysis.
• For example, in the 1991 census, the population was classified with respect to profession as under:
(a) Main activity: Worker [Cultivator (C), Agricultural Labourer (AL), Household Industries (HHI), Other Works (OW)].
(b) Broad category: Non-worker [Household duties (H), Student (ST), Rentier or Retired person (R); Dependent, Beggars, Institutions and others (DBIO)].
• Here, the results obtained in the two categories can be compared meaningfully.

(iv) It should be suitable for the purpose
• The classification must be in keeping with the objectives of the enquiry.
• For example, if we want to study the relationship between university education and sex, it is worthless to classify the students with respect to age and religion.

(v) It should be flexible
• A good classification should be flexible, so that it can be adjusted to new and changed situations and conditions.
• No classification is good enough to be used for ever: changes in the classification are necessary with changes in time and circumstances.
• But flexibility should not result in instability of classification. The given population must be classified into some major groups which remain more or less stable, while allowing for adjustment due to changed circumstances or conditions.
• This can be done by sub-dividing these major groups into sub-groups or sub-classes which can be made flexible.
• Hence, the classification can maintain the character of flexibility along with stability.

1.4.11 Bases of Classification

The criteria or bases with respect to which the data are classified depend on the objectives and the purpose of the enquiry. Generally, the data can be classified on the following four bases:
(i) Geographical, i.e. area-wise or regional.
(ii) Chronological, i.e. with respect to occurrence of time.
(iii) Qualitative, i.e. with respect to some character or attribute.
(iv) Quantitative, i.e. with respect to numerical values or magnitudes.
We elaborate these one by one.

1.4.12 Geographical Classification

• The basis of this classification is the geographical or locational differences between the various items in the data, like cities, regions, areas, zones etc. For example, the density of population in different cities of India is given in the following table:

Density of population in different cities of India (per square kilometre)

City          Density
Calcutta      710
Mumbai        672
Delhi         451
Chennai       215
Chandigarh    57

• Geographical classifications are presented according to size or values. This lays more emphasis on the important area or region.

1.4.13 Chronological Classification

• Chronological classification is one in which the data are classified on the basis of difference in time, e.g. the population of a given country for different years, or the profits of a big business house over different years.
• We mention the population of India for different decades.
Population of India (in crores)
Year | Population
1931 | 27.9
1941 | 31.9
1951 | 36.1
1961 | 43.9
1971 | 54.8
1981 | 68.3
1991 | 84.4
2001 | 111.3
2011 | 122.7
2021 | 132.8

* Such time-series data occur quite frequently in economic and business statistics. They are classified chronologically, starting with the first period of occurrence.

1.4.14 Qualitative Classification

* When the data are qualitative, i.e. not capable of quantitative measurement (intelligence, occupation, employment, literacy etc.), the classification is called qualitative or descriptive classification, or classification with respect to attributes.
* In qualitative classification the data are classified according to the presence or absence of the attributes in the given units.
* If the data are classified into only two classes with respect to an attribute, i.e. by its presence or absence among the various units, the classification is called simple or dichotomous.
* Again, if the given population is classified into more than two classes with respect to a given attribute, it is called manifold classification.
* For example, the attribute intelligence can be divided into more than two classes.

1.4.15 Quantitative Classification

* If the data are classified on the basis of a quantitative measurement such as age, height, weight, production, prices, income, sales, profits, expenditure etc., it is termed quantitative classification.
* The quantitative phenomenon is called a variable, and hence this classification is also called classification by variables.
* For example, the daily earnings of different stores may be classified as :

Daily earnings (in 100 rupees) of 60 departmental stores
Daily earnings | Number of stores
Upto 100 | ...
101-200 | 13
201-300 | 9
301-400 | 8
401-500 | 9
501-600 | 5
601-700 | ...
701-800 | 4

* In the above classification, the daily earnings of the stores are termed the variable and the number of stores in each class the frequency.
* The above classification is also called a grouped frequency distribution.

Variable
* The quantitative phenomenon under study is termed a variable or a variate.
* Variables are of two types : (i) continuous variable, (ii) discrete (discontinuous) variable.

(i) Continuous variable
* A variable which can take all possible values (integral as well as fractional) in a given specified range is called a continuous variable.
* For example, height (in cms), weight (in lbs), distance (in kms).
* Precisely, a variable is said to be continuous if it is capable of passing from any given value to the next value by infinitely small gradations.

(ii) Discrete variable
* A variable which can take only discrete values is called a discrete variable.
* For example, family size (members of a family), the population of a city, typing mistakes per page etc.

1.5 TABULATION - MEANING AND ITS IMPORTANCE

* By tabulation is meant the systematic presentation of the information contained in the data, in rows and columns, according to some given characteristics.
* A. M. Tuttle has defined it as : "A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and notes to make clear the full meaning of data and their origin."
* Professor Bowley defines it as : "The intermediate process between the accumulation of data in whatever form they are obtained and the final reasoned account of the result shown by the statistics."
* Tabulation is one of the most important devices for presenting the data in a condensed and comprehensible form.
* It tries to present the maximum information contained in the data in the minimum possible space, while maintaining the quality and usefulness of the data.
* According to Prof. Bowley : "In the tabulation of data common sense is the chief requisite and experience is the chief teacher."

1.5.1 The Parts of a Table

The parts of a table vary from problem to problem, depending upon the nature of the data and the purpose of the investigation. The following parts are important in a good statistical table :
(i) Table number
(ii) Title
(iii) Head (main) notes
(iv) Captions and stubs
(v) Body of the table
(vi) Foot-note
(vii) Source-note
We discuss these in brief.

(i) Table number
* If the data contain more than one table, all the tables should be numbered in a logical sequence, for proper identification and easy access for further reference.

(ii) Title
* Every table must be given a suitable title. A title must be self-explanatory.
* It must describe the contents of the table in brief and concise form.
* It should describe the nature of the data, the place (i.e. the geographical region or area to which the data relate), the time (i.e. the period to which the data relate) and the source of the data.
* The title should be brief, but not incomplete and not at the cost of clarity. Sometimes it is necessary to use a long title for the sake of clarity.
* In such a case a 'brief note' may be given above the main title.

(iii) Head notes
* If so required, a head note is given just below the title in prominent type, usually centred and enclosed in brackets, for further description of the contents of the table.
* It provides an explanation concerning the entire table or its major parts.

(iv) Captions and stubs
* Captions are the headings or designations for the vertical columns, and stubs are the headings or designations for the horizontal rows.
* They should be brief and self-explanatory. Captions are written in the middle of the columns in small letters.
* Each column and row must be given a number for reference.
* If two or more columns or rows correspond to similar classifications (or carry the same headings), they may be grouped together under a common heading to avoid repetition.

(v) Body of the table
* The arrangement of the data according to the descriptions given in the columns (captions) and rows (stubs) forms the body of the table.
* The numerical information forms the most important part of the table.
* To increase the usefulness of the table, totals must be given for each separate class, below the columns or against the rows.

(vi) Foot note
* Foot notes are used when some characteristic or feature of an item of the table needs elaborate explanation.
* Foot notes, if any, are placed below the body of the table.
* Foot notes are identified by symbols such as *, **, ***, @ etc.

(vii) Source note
* The source note is required if secondary data are used.
* The source note is given at the bottom of the table.
* If the data are taken from a research journal or periodical, the source note should give the name of the journal or periodical along with the date of publication, its volume number and page number, so that anybody who uses the data may verify the accuracy of the figures by referring to the original source.
* A table should have an attractive get-up which is appealing to the eye and the mind, so that the reader may grasp it without any strain.
* Hence special attention must be given to the size of the table and the proper spacing of rows and columns.

1.5.2 Types of Tabulation

* Statistical tables are generally classified on the following bases :
  (i) Objectives and scope of the enquiry.
  (ii) Nature of the enquiry (primary or secondary).
  (iii) Extent of coverage given in the enquiry.
* We mention below a diagrammatic scheme that displays the various forms of tables commonly used in practice.

1.5.3 Types of Table

[Diagram: classification of tables. Surviving branch: on the basis of nature of enquiry - Original table, or Derived (Derivative) table]

1.5.4 Solved Examples on Preparation of Tables

Ex. 1.5.1 : Present the following information in a suitable tabular form, supplying the missing figures.
In 1995, out of a total of 2000 workers in a factory, 1550 were members of a trade union. The number of women workers employed was 250, out of which 200 did not belong to any trade union. In 2000, the number of union workers was 1725, of which 1600 were men. The number of non-union workers was 380, among which 155 were women.

Soln. :
Comparative study of the membership of trade union in a factory in 1995 and 2000
Trade union | 1995 Males | 1995 Females | 1995 Total | 2000 Males | 2000 Females | 2000 Total
Members | 1550 - 50 = 1500 | 250 - 200 = 50 | 1550 | 1600 | 1725 - 1600 = 125 | 1725
Non-members | 1750 - 1500 = 250 | 200 | 450 | 380 - 155 = 225 | 155 | 380
Total | 2000 - 250 = 1750 | 250 | 2000 | 1600 + 225 = 1825 | 125 + 155 = 280 | 1725 + 380 = 2105

Here, we have presented the comparative study of the membership of the trade union in a factory in 1995 and 2000.

Ex. 1.5.2 : Out of a total number of 10,000 candidates who applied for jobs in a government department, 6854 were males, 3146 were graduates and the others non-graduates. The number of candidates with some experience was 2623, of whom 1860 were males. The number of male graduates was 2012. The number of graduates with experience was 1093, which includes 323 females.

Soln. : First we find the data. We have,
Total number of applicants = 10000
Males = 6854
Graduates = 3146
Experienced = 2623
Total number of females = 10000 - 6854 = 3146
Total number of non-graduates = 10000 - 3146 = 6854
Total number of inexperienced persons = 10000 - 2623 = 7377
We summarise the information in the table.

Distribution of candidates for Government jobs, sex-wise, education-wise and experience-wise
(Exp. = Experienced, Inexp. = Inexperienced)
Sex | Graduates Exp. | Graduates Inexp. | Graduates Total | Non-graduates Exp. | Non-graduates Inexp. | Non-graduates Total | Total Exp. | Total Inexp. | Total
Male | 770 | 1242 | 2012 | 1090 | 3752 | 4842 | 1860 | 4994 | 6854
Female | 323 | 811 | 1134 | 440 | 1572 | 2012 | 763 | 2383 | 3146
Total | 1093 | 2053 | 3146 | 1530 | 5324 | 6854 | 2623 | 7377 | 10000

Ex. 1.5.3 : In 1990, out of a total of 2000 students in a college, 1400 were for Graduation and the rest for Post-Graduation (P.G.). Out of 1400 Graduate students, 100 were girls. However, in all there were 600 girls in the college. In 1995, the number of graduate students increased to 1700, out of which 250 were girls, but the number of P.G. students fell to 500, of which only 50 were boys. In 2000, out of 800 girls, 650 were for Graduation, whereas the total number of graduates was 2200. The number of boys and girls in P.G. classes was equal. Represent the above information in tabular form.
Also calculate the percentage increase in the number of graduate students in 2000 as compared to 1990.

Soln. : The distribution of the number of students with respect to level of education and sex is obtained as follows :

Year 1990 :
 | Graduation | Post-Graduation | Total
Girls | 100 | 600 - 100 = 500 | 600
Boys | 1400 - 100 = 1300 | 600 - 500 = 100 | 2000 - 600 = 1400
Total | 1400 | 2000 - 1400 = 600 | 2000

Year 1995 :
 | Graduation | Post-Graduation | Total
Girls | 250 | 500 - 50 = 450 | 250 + 450 = 700
Boys | 1700 - 250 = 1450 | 50 | 1450 + 50 = 1500
Total | 1700 | 500 | 1700 + 500 = 2200

Year 2000 :
 | Graduation | Post-Graduation | Total
Girls | 650 | 800 - 650 = 150 | 800
Boys | 2200 - 650 = 1550 | 150 | 1550 + 150 = 1700
Total | 2200 | 150 + 150 = 300 | 2500

Distribution of students according to degree and sex for the years 1990 to 2000
Year | Graduation Boys | Graduation Girls | Graduation Total (a) | P.G. Boys | P.G. Girls | P.G. Total (b) | Total (a) + (b)
1990 | 1300 | 100 | 1400 | 100 | 500 | 600 | 2000
1995 | 1450 | 250 | 1700 | 50 | 450 | 500 | 2200
2000 | 1550 | 650 | 2200 | 150 | 150 | 300 | 2500
Total | 4300 | 1000 | 5300 | 300 | 1100 | 1400 | 6700

Percentage increase in the number of graduate students in 2000 as compared to 1990 is :
(2200 - 1400) / 1400 × 100 = 57.14 %

Ex. 1.5.4 : Represent the following information in suitable tabular form with proper rulings and headings.
The annual report of a public library reveals the following points regarding the reading habits of its members. Out of the total 3713 books issued to the members in the month of June 2000, 2100 were fictions. There were 467 members of the library during the period and they were classified into five classes A, B, C, D and E.
The number of members belonging to the first four classes were respectively 15, 176, 98 and 129, and the number of fictions issued to them were 103, 1187, 647 and 58 respectively. The number of books, other than text books and fictions, issued to these four classes of members were respectively 4, 390, 217 and 341. Text books were issued only to members belonging to the classes C, D and E, and the number of text books issued to them were respectively 3, 317 and 160. During the same period, 1246 periodicals were issued. These included 396 technical journals, of which 36 were issued to members of class B, 45 to class D and 315 to class E. To members of the classes B, C, D and E the number of other journals issued were 419, 26, 231 and 99 respectively. The report however showed an increase of 3.9 % in the number of books issued over last month, though there was a corresponding decrease of 6.1 % in the number of periodicals and journals issued to members.

Soln. :
Step I : Figures for the month of June 2000 :
(i) Total number of members = 467
Number of members belonging to class E = 467 - (15 + 176 + 98 + 129) = 49
(ii) Total number of books (fiction, text books and other books) = 3713
Total number of text books issued = 3 + 317 + 160 = 480
Total number of fictions issued = 2100
Number of fictions issued to the members of class E = 2100 - (103 + 1187 + 647 + 58) = 105
∴ Total number of other books issued = 3713 - (2100 + 480) = 1133
Hence, the number of other books issued to the members of class E is obtained on subtraction as 1133 - (4 + 390 + 217 + 341) = 181
(iii) Total number of periodicals (technical journals and other journals) = 1246
Number of technical journals = 396
Number of other journals = 1246 - 396 = 850
Now, the other journals issued to the members of class A are obtained as : 850 - (419 + 26 + 231 + 99) = 850 - 775 = 75

Step II : Figures for last month (May)
(iv) Let N be the figure for the month of June and M the corresponding figure for the last month, i.e. May 2000.
Books : Since it is reported that there is an increase of 3.9 % in the number of books issued in June over the last month, we have,
\[ N = M + \frac{3.9}{100} M = M \left( 1 + \frac{39}{1000} \right) = \frac{1039}{1000} M \quad \Rightarrow \quad M = \frac{N \times 1000}{1039} \]
Thus, the total number of books issued in the last month = 3713 × 1000 / 1039 = 3573.63 ≈ 3574
Similarly, total number of fictions issued last month = 2100 × 1000 / 1039 = 2021.17 ≈ 2021
Total number of text books issued last month = 480 × 1000 / 1039 = 461.98 ≈ 462
Total number of other books issued last month = 3574 - (2021 + 462) = 1091

Step III : Periodicals
Since a decrease of 6.1 % is reported in the issue of periodicals to its members in June over the last month, we have
\[ M - \frac{6.1}{100} M = N \quad \Rightarrow \quad M \left( 1 - \frac{61}{1000} \right) = N \quad \Rightarrow \quad M = \frac{N \times 1000}{939} \]
Total periodicals issued last month = 1246 × 1000 / 939 = 1326.94 ≈ 1327

Step IV : Number of technical journals issued last month = 396 × 1000 / 939 = 421.73 ≈ 422
Number of other journals issued last month = 1327 - 422 = 905
Now, we can represent the given data in a tabular form as shown.

Step V : Public Library - Annual Report - June 2000 on Reading Habits of its Members
 | Class A | Class B | Class C | Class D | Class E | Total This Month | Total Last Month
Total number of members | 15 | 176 | 98 | 129 | 49 | 467 |
Books issued : | | | | | | |
Fictions | 103 | 1187 | 647 | 58 | 105 | 2100 | 2021
Text books | - | - | 3 | 317 | 160 | 480 | 462
Other books | 4 | 390 | 217 | 341 | 181 | 1133 | 1091
Total books | 107 | 1577 | 867 | 716 | 446 | 3713 | 3574
Periodicals issued : | | | | | | |
Technical journals | - | 36 | - | 45 | 315 | 396 | 422
Other journals | 75 | 419 | 26 | 231 | 99 | 850 | 905
Total periodicals | 75 | 455 | 26 | 276 | 414 | 1246 | 1327

1.6 DIAGRAMMATIC AND GRAPHIC REPRESENTATION OF DATA

* Diagrammatic and graphic representation has a number of advantages.
* We mention some of them :
(1) Diagrams and graphs are visual aids. They present the data in a simple and comprehensible form.
(2) Diagrams are generally more attractive, impressive and fascinating than a set of numerical data. A person who has no statistical background can understand the diagrams clearly and can derive the conclusions easily.
(3) They register a meaningful impression on the mind. They also save a lot of time, and it becomes easy to draw inferences from the diagrams. One may not like to go through the numerical data, but one can easily grasp the inferences from the figures.
(4) Graphs exhibit trends directly, which cannot easily be exhibited by numerical data.

1.6.1 Difference between Diagrams and Graphs

(1) A graph is constructed so that the mathematical relation (sometimes a function) between the two variables can be understood. A diagram, on the other hand, is presented on plain paper and is used for making comparisons.
(2) In the construction of a graph, points and lines are drawn to draw inferences from the given data, while in diagrams we draw line diagrams, histograms, bar charts, pie-diagrams and so on.
(3) A diagram gives only approximate information about the given data, whereas a graph gives more precise and accurate information.
(4) Construction of graphs is easier and less time-consuming than construction of diagrams.

1.6.2 General Rules for Constructing Diagrams

(i) Neatness : Appropriate devices and colours should be used. The diagram must be clear and attractive and should exhibit the data accurately.
(ii) Title and footnotes : Each diagram should be given a suitable title to indicate the subject-matter and the various facts depicted in the diagram. The title should be brief, self-explanatory and clear. If so required, foot notes should be given at the left-hand bottom of the diagram to explain certain points and facts not otherwise covered in the title.
(iii) Selection of scale : One of the most important points in constructing a diagram is the selection of the scale. As a guiding principle, the scale should be selected according to the size of the paper and the size of the observations to be displayed, so that the diagram is neither too small nor too large.
(iv) Choice of a diagram : A large number of diagrams are used to represent data. The choice primarily depends on the nature of the data, the magnitude of the observations and the type of people for whom the diagrams are meant, and it requires a great amount of expertise, skill and intelligence. An inappropriate choice of diagram for a given set of data may give a distorted picture. Hence, the choice of a diagram to present the given data should be made with utmost care and caution.
(v) Index : A brief index explaining the various types of shades, colours, lines and designs used in the construction of the diagram should be given for clear understanding.

1.6.3 Univariate Analysis (U.A.)

Univariate analysis is the simplest form of analysing data. 'Uni' means 'one', so in other words, the data has only one variable. It does not deal with causes or relationships (unlike regression), and its major purpose is to take data, summarise that data and find patterns in that data.

1.6.4 Steps to be Followed for U.A.

1. Prepare your dataset.
2. Choose the analysis (type) : descriptive statistics or frequencies.
3. Click statistics, analyse the required data and then click continue.
4. Click the chart.
5. Choose the expected chart and then click continue.
6. Click OK and finish the analysis.
7. See and interpret the output.

Univariate analysis means analysis of one variable or one feature.
Univariate analysis basically tells us how the data in each feature are distributed, and also tells us about the central tendencies such as mean, median and mode. U.A. is characterised by its dependence on only one random variable. In a dataset, it explores each variable separately.

1.7 METHODS OF UNIVARIATE DISTRIBUTION

A univariate distribution can be described by :
1. Frequency distribution
2. Bar-charts
3. Bar graph
4. Histogram
5. Pie-diagram
6. Frequency polygon

1.7.1 Frequency Distribution (F.D.)

A frequency distribution reflects how often an occurrence has taken place in the data. It gives a brief idea of the data and makes it easier to find patterns.
Consider an experiment of taking the IQ test of 17 students in an engineering college. The scores obtained are :
118, 139, 141, 142, 144, 147, 148, 149, 157, 152, 154, 157;
The frequency representation is :
Score | Frequency
118-138 | 1
139-144 | 7
145-149 | 4
152-157 | 5

Line Diagram

Ex. 1.7.1 : The following data show the number of accidents sustained by 314 drivers of a public utility company over a period of five years.
Number of accidents : 0 1 2 3 4 5 6 7 8 9 10 11
Number of drivers : 82 44 68 41 25 20 13 7 5 4 3 2
Represent the data by a line-diagram.

Soln. :
[Figure: line diagram of the number of drivers against the number of accidents]
Fig. Ex. 1.7.1

1.7.2 Bar-Charts

* The bar-chart is very convenient for comparing categories of data or different groups of data. It also helps to track changes over time.
* It is well suited to visualising data. For example, data relating to sales, profits, production, population etc. for different periods may be presented by a bar diagram.

Remark : If there are a large number of items or values of the variable under study, then a line diagram may be drawn instead of a bar diagram.

1.7.3 Bar-Graph

A bar graph (also known as a bar chart or bar diagram) is a visual tool that uses bars to compare data among categories.
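As a minimal sketch of this idea, assuming Python with only its standard library, the following prints a horizontal text bar chart for the accident data of Ex. 1.7.1. The helper name `text_bars` and the bar width of 40 characters are illustrative choices, not part of the text, and the seventh frequency is taken as 13 so that the counts add up to the stated 314 drivers.

```python
# Number of accidents -> number of drivers, as in Ex. 1.7.1.
# Assumption: the seventh frequency is 13, so that the counts total 314.
accidents = list(range(12))
drivers = [82, 44, 68, 41, 25, 20, 13, 7, 5, 4, 3, 2]


def text_bars(labels, values, width=40):
    """Return one text line per category; bar length is proportional to the value."""
    largest = max(values)
    lines = []
    for label, value in zip(labels, values):
        # Scale so that the largest value gets a bar of `width` characters.
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:>2} | {bar} {value}")
    return lines


chart = text_bars(accidents, drivers)
print("\n".join(chart))
```

Each printed row is one category (a number of accidents), and the length of its bar is proportional to the number of drivers in that category, which is the comparison a bar graph is meant to make visible.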
* A bar graph may be horizontal or vertical. The longer the bar, the greater its value.
* Bar graphs consist of two axes. On a vertical bar graph, the horizontal axis, or X-axis, shows the data categories. The vertical axis is the scale.

Attributes of Bar Graphs
(i) A bar diagram makes it easy to compare sets of data between different groups.
(ii) The graph represents categories on one axis and a discrete value on the other; it represents the relationship between the two axes.
(iii) Bar charts can also show big changes in data over time.
(iv) Bar graphs are an effective way to compare items between different groups.
(v) They are effective visuals in presentations and reports.
(vi) From a bar graph, one can recognise patterns or trends far more easily than from a table of numerical data.

Types of Bar Graph
(i) Vertical bar graph
The most common type is the vertical bar graph. It is useful when presenting a series of data over time. One disadvantage is that it does not leave much space at the bottom if long labels are required.
(ii) Horizontal bar graph
Here there is plenty of room for long labels along the vertical axis. An example is the graph of the best performing S&P 500 stocks of the decade (Fig. 1.7.2).

Example
The following data relate to the strength of the Indian merchant shipping fleet, giving the Gross Registered Tonnage (GRT) as on 31st December for the different years.
Year | 1961 | 1966 | 1971 | 1975 | 1976
GRT in '000 | 901 | 1,792 | 2,500 | 4,464 | 5,115
Represent the data by a suitable bar-diagram.

[Figure: vertical bar diagram - Strength of Indian merchant shipping fleet (gross registered tonnage in '000)]

[Figure: horizontal bar graph - Best performing S&P 500 stocks of the decade. Bars, from top to bottom: Netflix (NFLX), MarketAxess Holdings (MKTX), Abiomed (ABMD), Broadcom (AVGO), Regeneron Pharmaceuticals (REGN), United Rentals (URI), Take-Two Interactive Software (TTWO), TransDigm Group (TDG), Align Technology (ALGN), NVIDIA (NVDA); horizontal axis from 0 to 4,000]
Fig. 1.7.2

Bar diagram problem

Ex. 1.7.2 : The data below give the yearly profits (in thousands of rupees) of two companies A and B.
Profits (in '000 rupees)
Year | Company A | Company B
1994-95 | 120 | 90
1995-96 | 135 | 95
1996-97 | 140 | 108
1997-98 | 160 | 120
1998-99 | 175 | 130
Represent the data by means of a suitable diagram.

Soln. : We represent the data by a multiple bar diagram as shown below :
[Figure: multiple bar diagram of the yearly profits of Company A and Company B]
Fig. Ex. 1.7.2

1.7.4 Histogram

* A histogram is the most commonly used graph to exhibit frequency distributions.
* It looks very much like a bar chart, but there are important differences between them.

A histogram is to be used :
1. When the data are numerical.
2. To see the shape of the data's distribution, especially to determine whether the output of a process is distributed approximately normally.
3. To check whether a process can satisfy the customer's requirements.
4. To analyse the output from a supplier's process.
5. To observe whether a process change has occurred from one time period to another.
6. To check whether the outputs of two or more processes are different.
7. To communicate the distribution of data quickly and easily to others.

[Figure: Histogram of Quality Defects]
Fig. 1.7.3 : Histogram Example

To create a Histogram
1. Collect at least 50 consecutive data points from a process.
2. Draw the X-axis and Y-axis on graph paper. Mark and label the Y-axis for counting data values.
   Mark and label the X-axis with the values of the data. The spaces between these numbers will be the bars of the histogram. There is no space between bars.

Typical Histogram shapes and what they mean

(1) Normal Distribution
* It is a bell-shaped curve known as the "normal distribution".
* In a normal or "typical" distribution, points are as likely to occur on one side of the average as on the other.
[Figure: normal distribution]
Fig. 1.7.4

(2) Skewed Distribution
* The skewed distribution is asymmetrical because a natural limit prevents outcomes on one side.
* The distribution's peak is off centre toward the limit, and a tail stretches away from it.
* These distributions are called right-skewed or left-skewed according to the direction of the tail.
[Figure: right-skewed distribution]
Fig. 1.7.5

(3) Double-peaked or Bimodal Distribution
* Here the outcomes of two processes with different distributions are combined in one set of data.
* The bimodal distribution looks like the back of a two-humped camel.
[Figure: bimodal (double-peaked) distribution]
Fig. 1.7.6

* The histogram is one of the most popular and commonly used devices for charting a continuous frequency distribution.
* It consists in erecting a series of adjacent vertical rectangles on the sections of the horizontal axis (X-axis), with bases (sections) equal to the widths of the corresponding class intervals and heights so taken that the areas of the rectangles are equal to the frequencies of the corresponding classes.

Ex. 1.7.3 : Represent the adjoining distribution of marks of 100 students in an examination by a histogram.
Marks obtained | No. of students
Less than 10 | 4
Less than 20 | 6
Less than 30 | 24
Less than 40 | 46
Less than 50 | 67
Less than 60 | 86
Less than 70 | 96
Less than 80 | 99
Less than 90 | 100

Soln. : The given cumulative frequencies are first converted into class frequencies by successive subtraction :
Marks | No. of students
0-10 | 4
10-20 | 6 - 4 = 2
20-30 | 24 - 6 = 18
30-40 | 46 - 24 = 22
40-50 | 67 - 46 = 21
50-60 | 86 - 67 = 19
60-70 | 96 - 86 = 10
70-80 | 99 - 96 = 3
80-90 | 100 - 99 = 1

[Figure: histogram of the number of students (vertical axis) against marks from 0 to 90 (horizontal axis)]
Fig. Ex. 1.7.3

1.7.5 Pie-Chart

What is a pie-chart
* A pie chart is a type of chart that displays data in a circular graph. It is one of the most commonly used graphs, and it uses the attributes of circles, spheres and angular data to represent real-world information.
* A pie-chart is a pictorial representation of data in the form of a circular chart or pie, where the slices of the pie show the size of the data. A list of numerical variables along with categorical variables is needed to represent data in the form of a pie-chart.
* The arc length of each slice, and consequently the area and the central angle it forms, is proportional to the quantity it represents.
* Pie charts, also known as pie diagrams, help to interpret and represent the data more clearly. They are also used to compare the given data.

Pie-chart advantages
(1) It is a straightforward and easy-to-understand illustration.
(2) It visually portrays data as a fraction of a whole, and is an important communication tool even for an inexperienced audience.
(3) It allows the viewer to do an immediate analysis or quickly comprehend details.
(4) One can manipulate data in the pie-chart to highlight the points one wants to make.
(5) Pie-charts are pleasing to the eye, and are therefore great for gaining the attention of the viewers.

Disadvantages of the pie-chart
(1) When there are many data points in a pie-chart, it loses its effectiveness.
(2) If there are many pieces of data, they can become confusing and difficult to read.
(3) Since the chart reflects one data set, a series of pie-charts is needed to compare different data sets.
(4) It is not easy to compare data slices, because the reader has to account for angles and compare non-adjacent pieces.
(5) Where there is negative data, a pie-chart is not a good choice.

1.7.6 Pie-diagram

Steps of construction of pie-diagram
(1) Express each of the component values as a percentage of the respective total.
(2) The angle at the centre of a circle is 360°, and each component part is to be expressed proportionately in degrees. Since 1 percent of the total value is equal to 360°/100 = 3.6°, the percentages of the component values from step 1 can be converted to degrees by multiplying each of them by 3.6.

1.7.7 Comparison Table (Bar Chart v/s Histogram)

 | Bar chart | Histogram chart
Usage | To compare different categories of data | To display the frequency of occurrences
Indicates | Discrete values | Non-discrete values
Data | Categorical data | Quantitative data
Rendering | Each data point is rendered as a separate bar | The data points are grouped and rendered based on the bin value
Space between bars | Can have space | No space
Reordering bars | Can be reordered | Cannot be reordered
Axis label placement | Axis labels can be placed on or between the ticks | Axis labels are placed on the ticks
Required values | x and y | Only y

1.7.8 Comparison Between Histogram and Bar Graph

Basis for comparison | Histogram | Bar graph
Meaning | A histogram is a graphical representation that displays data by way of bars to show the frequency of numerical data | A bar graph is a pictorial representation of data that uses bars to compare different categories of data
Indicates | Distribution of non-discrete variables | Comparison of discrete variables
Presents | Quantitative data | Categorical data
Spaces | Bars touch each other, hence there are no spaces between bars | Bars do not touch each other, hence there are spaces between bars
Elements | Elements are grouped together, so that they are considered as ranges | Elements are taken as individual entities
Can bars be reordered | No | Yes
Width of bars | Need not be same | Same

Ex. 1.7.4 : The following data represent the proposed expenditure by a state government for the year 1997-98. Draw a pie-diagram.

Items | Agriculture and rural development | Industries and urban development | Health and education | Miscellaneous
Proposed expenditure (in million Rs.) | 4,200 | 1,500 | 1,000 | 500

Soln. : Calculation for pie-chart

Items | Proposed expenditure (in million Rs.) | Angle at the centre
Agriculture and rural development | 4,200 | 4200/7200 × 360° = 210°
Industries and urban development | 1,500 | 1500/7200 × 360° = 75°
Health and education | 1,000 | 1000/7200 × 360° = 50°
Miscellaneous | 500 | 500/7200 × 360° = 25°
Total | 7,200 | 360°

Fig. Ex. 1.7.4 : Pie-diagram representing proposed expenditure by the state government on different items for 1997-98 (sectors: Agriculture and rural development, Industries and urban development, Health and education, Miscellaneous)

Ex. 1.7.5 : The following table shows the area in millions of sq. km. of the oceans of the world.

Ocean | Area (million sq. km.)
Pacific | 70.8
Atlantic | 41.2
Indian | 28.5
Antarctic | 7.6
Arctic | 4.8

Draw a pie-diagram to represent the data.

Soln. :
Step I : Calculation for pie-diagram

Sr. No. | Ocean | Area (million sq. km.) | Angle at the centre
1. | Pacific | 70.8 | 70.8/152.9 × 360° = 166.7°
2. | Atlantic | 41.2 | 41.2/152.9 × 360° = 97.0°
3. | Indian | 28.5 | 28.5/152.9 × 360° = 67.1°
4. | Antarctic | 7.6 | 7.6/152.9 × 360° = 17.9°
5. | Arctic | 4.8 | 4.8/152.9 × 360° = 11.3°
 | Total | 152.9 | 360°

Fig. Ex. 1.7.5 : Pie diagram showing the area (in millions of square km) of the oceans of the world

Ex. 1.7.6 : The following data show the expenditure on various heads in the first three five year plans (in crores of rupees). Represent the data by angular (pie) diagrams.

Subject | First plan | Second plan | Third plan
Agriculture and C.D. | 361 | 529 | 1068
Irrigation and power | 561 | 865 | 1662
Village and small industries | 173 | 176 | 264
Industry and minerals | 292 | 900 | 1520
Transport and communications | 497 | 1300 | 1486
Social services and miscellaneous | 477 | 830 | 1500
Total | 2361 | 4600 | 7500

Soln. : Calculation for pie-diagram (expenditure in crores of rupees)

Subject | First plan Rs. | Degrees | Second plan Rs. | Degrees | Third plan Rs. | Degrees
Agriculture and C.D. | 361 | 361/2361 × 360° = 55.1° | 529 | 529/4600 × 360° = 41.4° | 1068 | 1068/7500 × 360° = 51.2°
Irrigation and power | 561 | 561/2361 × 360° = 85.5° | 865 | 865/4600 × 360° = 67.7° | 1662 | 1662/7500 × 360° = 79.8°
Village and small industries | 173 | 173/2361 × 360° = 26.4° | 176 | 176/4600 × 360° = 13.8° | 264 | 264/7500 × 360° = 12.7°
Industry and minerals | 292 | 292/2361 × 360° = 44.5° | 900 | 900/4600 × 360° = 70.4° | 1520 | 1520/7500 × 360° = 73.0°
Transport and communications | 497 | 497/2361 × 360° = 75.8° | 1300 | 1300/4600 × 360° = 101.7° | 1486 | 1486/7500 × 360° = 71.3°
Social services and miscellaneous | 477 | 477/2361 × 360° = 72.7° | 830 | 830/4600 × 360° = 65.0° | 1500 | 1500/7500 × 360° = 72.0°
Total | 2361 | 360° | 4600 | 360° | 7500 | 360°
Sq. root of total | 48.59 | | 67.82 | | 86.60 |

The square roots of the totals give the relative radii of the three circles, so that the areas of the circles are proportional to the total expenditures.

Fig. Ex. 1.7.6 : Pie diagrams for the three plans (sectors: Agriculture and C.D., Irrigation and power, Village and small industries, Industry and minerals, Transport and communications, Social services and miscellaneous)
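The two construction steps of the pie-diagram (percentage of the total, then multiplication by 3.6) can be sketched in a few lines of Python. This is only a minimal illustration, not part of the original text: the function name `pie_angles` is hypothetical, and the figures are those of Ex. 1.7.4.

```python
def pie_angles(values):
    """Return the central angle in degrees for each component value."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("the total of the component values must be positive")
    angles = {}
    for label, value in values.items():
        if value < 0:
            # A pie diagram is not suitable for negative data.
            raise ValueError("component values must not be negative")
        percentage = value / total * 100   # Step 1: percentage of the total
        angles[label] = percentage * 3.6   # Step 2: 1 percent corresponds to 3.6 degrees
    return angles


# Proposed expenditure in million Rs. (data of Ex. 1.7.4)
expenditure = {
    "Agriculture and rural development": 4200,
    "Industries and urban development": 1500,
    "Health and education": 1000,
    "Miscellaneous": 500,
}

angles = pie_angles(expenditure)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The angles always add up to 360°, which is a convenient check on a hand calculation.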
1.7.9 Frequency Polygon

* Frequency polygons are a graphical representation of a data distribution that helps in understanding the data through a specific shape. Frequency polygons are very similar to histograms, but are especially useful when comparing two or more data sets.

Definition
A frequency polygon is a form of graph, widely used in statistics, that is used to interpret information or data. This visual form of data representation helps in depicting the shape and trend of the data in an organised and systematic manner. Through the shape of the graph, a frequency polygon depicts the number of occurrences in each class interval. While a histogram is a graph with rectangular bars without spaces, a frequency polygon is a line graph that represents frequency distribution data.

Steps to construct a frequency polygon
The curve in a frequency polygon is drawn against an X-axis and a Y-axis. The X-axis represents the values in the data set, while the Y-axis shows the number of occurrences of each category.
Step 1 : Mark the class intervals for each class on the X-axis; the curve is plotted against the Y-axis.
Step 2 : Calculate the midpoint of each class interval, which is the class mark.
Step 3 : Mark the class marks on the X-axis.
Step 4 : Plot the frequency against each class mark.
Step 5 : Once the points are marked, join them with line segments, as in a line graph. The curve obtained from these line segments is the frequency polygon.

Fig. 1.7.7 : Frequency polygons

Ex. 1.7.7 : The following data show the number of accidents sustained by 313 drivers of a public utility company over a period of 5 years. Draw the frequency polygon.

No. of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11
No. of drivers | 80 | 44 | 68 | 41 | 25 | 20 | 13 | 7 | 5 | 4 | 3 | 2

Soln. : Frequency polygon

Fig. Ex. 1.7.7 : Number of accidents (X-axis: number of accidents; Y-axis: number of drivers)
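Steps 2 to 4 of the construction (class marks and the points to be plotted) can be sketched as follows. This is an illustrative sketch only: the function name `frequency_polygon_points` is hypothetical, and the class intervals and frequencies are made-up sample values rather than data from the examples above.

```python
def frequency_polygon_points(class_intervals, frequencies):
    """Return the (class mark, frequency) points of a frequency polygon."""
    if len(class_intervals) != len(frequencies):
        raise ValueError("each class interval needs exactly one frequency")
    points = []
    for (lower, upper), frequency in zip(class_intervals, frequencies):
        class_mark = (lower + upper) / 2       # Step 2: midpoint of the class interval
        points.append((class_mark, frequency)) # Steps 3 and 4: point to plot
    return points


# Hypothetical grouped data: marks of students
intervals = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [4, 2, 18, 22]

points = frequency_polygon_points(intervals, frequencies)
for class_mark, frequency in points:
    print(class_mark, frequency)
```

Joining consecutive points with straight line segments (Step 5) then gives the polygon; any plotting tool can be used for that last step.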
1.8 IMPORTANT QUESTIONS FOR EXAM

Q.1 Give a brief introduction to statistics. Explain the concept of statistics.
Q.2 Mention the functions of statistics.
Q.3 Mention the importance of statistics and the uses of statistics in (i) planning (ii) state (iii) mathematics (iv) economics (v) industry (vi) astronomy (vii) war.
Q.4 What are the limitations of statistics ?
Q.5 Explain data classification. Also mention the functions of and reasons for data classification.
Q.6 What are the different types of data classification ?
Q.7 What are the different types of data risk ?
Q.8 Explain in detail the applications and rules of classification.
Q.9 Explain in detail, with examples, the bases of classification.
Q.10 Explain the meaning and importance of tabulation.
Q.11 Explain the types of tabulation.
Q.12 What are the advantages of diagrammatic and graphic representation of data, and what is the difference between them ?
Q.13 Mention the rules for constructing diagrams.
Q.14 Explain univariate analysis, and discuss frequency distribution with an example.
Q.15 Explain bar-charts and mention their types.
Q.16 Explain the histogram. When is a histogram to be used ? How is a histogram created ?
Q.17 Mention histogram shapes, with diagrams and their meanings, for : (i) normal distribution (ii) skewed distribution (iii) bimodal distribution.
Q.18 What is a pie-chart ? Mention the advantages and disadvantages of the pie-chart.
Q.19 Mention the method of construction of a pie-diagram.
Q.20 Compare bar-chart v/s histogram.
Q.21 Compare histogram and bar-graph.
Q.22 Explain the frequency polygon and the method of constructing a frequency polygon.

...Chapter Ends

CHAPTER 2
Data Collection and Sampling Methods

Primary & Secondary data, Sources of data, Methods of collecting data.
Sampling - Census & Sample methods - Methods of sampling, Probability Sampling and Non-Probability Sampling.

2.1 Primary and Secondary Data
  2.1.1 Data may be Collected from the Following Two Sources
  2.1.2 Definition of Primary Data and Secondary Data
  2.1.3 Difference between Primary and Secondary Data
  2.1.4 Internal and External Data
2.2 Sources of Data
  2.2.1 Types of Sources of Data
  2.2.2 Types of Data and the Methods of Collecting the Data
  2.2.3 We Elaborate the Above-Mentioned Point
  2.2.4 Sources of Secondary Data
2.3 Methods of Collecting Data
  2.3.1 Primary Data Collection Methods
  2.3.2 Quantitative Methods
  2.3.3 Time Series Analysis
  2.3.4 Barometric Method
  2.3.5 Qualitative Methods
  2.3.6 Surveys
  2.3.7 Polls
  2.3.8 Interviews
  2.3.9 Delphi Technique
  2.3.10 Focus Groups
  2.3.11 Questionnaire
  2.3.12 Secondary Data Collection Method
2.4 Sampling
  2.4.1 Definitions
  2.4.2 The Purpose of Sampling
2.5 Census
  2.5.1 Census-Method
  2.5.2 Census Counts
  2.5.3 Merits and Demerits of Census Method
  2.5.4 Advantages and Disadvantages of Census Method
2.6 Probability Sampling
  2.6.1 Types of Probability Sampling
  2.6.2 Example
  2.6.3 Stratified Random Sampling
  2.6.4 Random Cluster Sampling
  2.6.5 Systematic Sampling
  2.6.6 Steps Involved in Probability Sampling
  2.6.7 When to use Probability Sampling
  2.6.8 Advantages of Probability Sampling
2.7 Non-Probability (techniques) Sampling
  2.7.1 Definition
  2.7.2 Types of Non-probability Sampling
  2.7.3 Non-probability Sampling Examples
  2.7.4 When to use Non-probability Sampling ?
  2.7.5 Advantages of Non-probability Sampling
2.8 Important Questions for Exam
Chapter Ends

2.1 PRIMARY AND SECONDARY DATA

* The most important requirement of any statistical investigation is that the originally collected data must be correct and proper.
If there are shortcomings at the very source of the data, no useful and valid conclusions can be drawn even after applying the best techniques of data analysis.
* In this context, we quote the remarks made by a judge on Indian statistics : "Cox, when you are a bit older you will not quote Indian statistics with that assurance. The governments are very keen on amassing statistics - they collect them, add them, raise them to the nth power, take the cube root and prepare wonderful diagrams. But what you must never forget is that every one of those figures comes in the first instance from the chowkidar (i.e. the village watchman), who just puts down what he damn pleases."

2.1.1 Data may be Collected from the Following Two Sources

(i) The investigator or the organising agency may conduct the enquiry originally, or
(ii) The necessary data for the enquiry may be obtained from some other sources or agencies that have already collected data on that subject.

2.1.2 Definition of Primary Data and Secondary Data

* Primary data : The data collected by an agency for the first time for a statistical investigation and used for statistical analysis are called primary data.
* Secondary data : Published or unpublished data that (i) were collected by some agency, (ii) were processed by some agency, and (iii) are taken from there and used for statistical work, are called secondary data.
* The second agency, which publishes these data, becomes the secondary source.
* Thus a secondary source is an agency that publishes data for use by others, although the data were not originally collected and processed by it.
* The distinction between primary and secondary data is as follows :
(i) The distinction is a matter of degree only.
(ii) The same data may be secondary in the hands of one and primary in the hands of others.
(iii) In general, the data are primary to the source that collects and processes them for the first time, and secondary to those who later use these data.
(iv) For example, the data relating to mortality (death rates) and fertility (birth rates) published by the office of the Registrar General of India are primary. The same data reproduced by the United Nations Organisation (UNO) in its Statistical Abstract become secondary.

2.1.3 Difference between Primary and Secondary Data

(1) In the case of primary data, the entire scheme of the plan has to be formulated, beginning with the definitions of the various terms used, the units to be employed, the type of enquiry to be conducted, the extent of accuracy aimed at, etc. In the case of secondary data, only the compilation of the existing data is to be done.
(2) The proper choice between primary and secondary data is to be made keeping in view the nature, objective and scope of the enquiry, the time and finances at the disposal of the agency, the degree of precision aimed at and the status of the agency.
(3) It is best to obtain the secondary data from the primary source as far as possible. By doing so, we can save ourselves from errors of transcription.
(4) The primary source also provides a detailed discussion of the terminology used, the statistical units employed, the size of the sample, the technique of sampling, and the methods of data collection and analysis of results. We can thus assure ourselves whether these suit our purpose.
(5) Primary data are used if no secondary data exist on the subject under study. In some cases, both primary and secondary data are used.

2.1.4 Internal and External Data

* Internal data are collected by an organisation, business or economic concern or firm from its own internal operations, such as production, sales, profits, loans, imports and exports, capital employed etc., and are used by it for its own purposes.
* External data are those which are obtained from the publications of some other agencies, such as governments, international bodies, private research institutions etc., for use by the given organisation.

2.2 SOURCES OF DATA

* The sources of data can be classified into two types : (i) statistical and (ii) non-statistical.
* Statistical sources refer to data that are gathered for some official purpose, and they include censuses and officially administered surveys.
* Non-statistical sources refer to the collection of data for other administrative purposes or for the private sector.

2.2.1 Types of Sources of Data

(1) Internal sources
* When data are collected from the reports and records of the organisation itself, they are called internal sources.
* For example, a company publishes its annual reports on profit and loss, total sales, loans, wages etc.
(2) External sources
* When data are collected from sources outside the organisation, they are known as external sources.
* For example, if a tour and travel company gets information on Kashmir tourism from Kashmir Transport Corporation, it is known as an external source of data.

2.2.2 Types of Data and the Methods of Collecting the Data

* There are two types of data : (i) primary data (ii) secondary data.
* We have already discussed these two types in Section 2.1.2.
* We now study the methods of collecting these data.

(A) Methods of collecting primary data
(i) Direct personal investigation
(ii) Indirect oral investigation
(iii) Information through correspondents
(iv) Telephonic interview
(v) Mailed questionnaire
(vi) Questionnaire filled by enumerators

Remarks
(a) One who conducts a statistical enquiry and seeks information is known as an investigator. It can be an individual person or an organisation.
(b) An enumerator is a person who helps investigators in the collection of data.
(c) An informant is the respondent who supplies the information to the investigators.

2.2.3 We Elaborate the Above-Mentioned Points

(i) Direct Personal Investigation
(a) Under this method, the investigator obtains first-hand information from the respondents themselves.
(b) He personally visits the respondents to collect the data (information).
(c) Merits of direct personal investigation
1. The data collected are first-hand and original in nature, so they are more reliable and accurate.
2. In this method, the questions can be modified according to the level of the respondent or other situations.
3. Some additional information can also be obtained along with the relevant information.
4. This additional information can be used in further investigations.
(d) Demerits of direct personal investigation
1. It is not suitable if the coverage area is considerably wide.
2. It is a time-consuming method, since the investigator personally visits various places and meets different people to collect information.
3. This method is expensive, particularly when the field of investigation is large.
4. The data collected by this method are subject to personal bias.

(ii) Indirect Oral Investigation
(a) The investigator interviews several other persons who are directly or indirectly in touch with the informants. There is no direct approach to the informants.
(b) Merits of indirect oral investigation
1. Through this method, a large area can be brought under investigation.
2. It is economical in terms of time, money and manpower.
(c) Demerits of indirect oral investigation
1. Since the information is not collected directly from the party concerned, there is a possibility that it may not be completely true.
2. As compared to direct personal investigation, the degree of accuracy of the data is likely to be lower.
3. Information collected from different persons about the same party may not be homogeneous and comparable.
4. A respondent or witness can modify the information according to his personal interest.

(iii) Information through Correspondents
(a) In this method, local agents or correspondents are appointed and trained to collect the information from the respondents.
(b) This method is useful if the field of investigation is large and the information is to be collected from different parts of the country.
(c) This method is economical and time-saving.
(d) The method is convenient for some special-purpose investigations.
(e) It is very useful for collecting information on a regular basis.
(f) Demerits of information through correspondents
1. The information supplied by different correspondents often lacks homogeneity, hence it is not comparable.
2. Data obtained by this method may not be very reliable because of the possibility of personal bias and prejudice on the part of the enumerator.
3. This method is not very useful where a high degree of accuracy is required.
4. A lot of time and money is spent in collecting the information through correspondents.

(iv) Telephonic Interviews
(a) In this method, data are collected through interviews over the telephone.
(b) Merits of telephonic interviews
1. This method is quite economical and time-saving.
2. This method is useful where the field of investigation is very wide and the information is to be collected from different parts of the country.
3. The data are reliable since they are obtained directly from the party concerned.
(c) Demerits of telephonic interviews
1. Accessibility is limited : one cannot reach people who do not have a telephone or mobile.
2. Telephone interviews hide the visual reactions of the respondents. These reactions are helpful in obtaining information on sensitive issues.
(v) Mailed Questionnaire Method
(a) In this method, a questionnaire containing a number of questions related to the investigation is prepared thoroughly.
(b) The questionnaire is sent by post to the informants.
(c) The informants, after filling up the questionnaire, send it back to the investigator.
(d) Merits of the mailed questionnaire method
1. This method is useful where the field of investigation is very large and the information is to be collected from different parts of the country.
2. This method is economical, as it requires less money and labour.
3. Since the informants are directly involved in the collection of data, the data are very much original.
4. Every question is interpreted by the respondent in his own way. Hence, the method is free from the personal bias of the investigator.
5. The method is very convenient for sensitive questions and maintains the anonymity of respondents.
(e) Demerits of the mailed questionnaire method
1. This method is applicable only where the respondents have good knowledge about the questions that are sent.
2. Many of the informants do not return the questionnaire. The informants are least interested in the investigation, hence there is a lack of response from their side.
3. Sometimes, informants may provide vague and ambiguous answers.
4. Informants may fail to understand the correct sense of some questions, and may not answer them.
5. The process is time-consuming, particularly when the information is to be obtained by post.

(vi) Questionnaire and its Qualities
(a) A questionnaire is a list or set of printed questions which is filled in by the informants. If it is filled in by enumerators, it is known as a schedule.
(b) Characteristics of a good questionnaire
1. Questions should be short, simple and straightforward.
2. The number of questions should be limited and they should be in a logical order.
3. To assist the informants, clear instructions should be given.
4. To find the shortcomings of a questionnaire, it should first be tried on a small selected group.
5. Questions containing mathematical calculations should be completely avoided.
6. Personal questions affecting sentiments, and controversial questions related to religion, politics etc., should be avoided.
7. Respondents should be given an assurance that their responses will not be shared with anyone.
8. A precise cover letter should be enclosed to convey the purpose of the enquiry and how it will help the parties involved.
(c) Method of filling the questionnaire
Under this method, an enumerator personally visits the informants with a questionnaire, asks the questions, and notes down their responses in the questionnaire in his own language.
(d) Merits of questionnaires filled by enumerators
1. Since the investigator has direct contact with the respondents, he can obtain accurate and reliable information.
2. The presence of the enumerator may induce the respondents to give information. Hence the chances of non-response are very low, whereas in the mailed questionnaire method non-response is a real possibility.
3. This method can be used even when the respondents are not educated, which is not possible with a mailed questionnaire.
(e) Demerits of questionnaires filled by enumerators
1. This method is expensive, as the expenditure on training, remuneration and conveyance is to be borne by the investigator.
2. This method is very time-consuming, as the enumerator has to visit the informants personally.
3. If the enumerators are not properly trained, or hold biased views, they become inefficient and unable to carry out the enquiry properly. This adversely affects the results of the enquiry.
2.2.4 Sources of Secondary Data

(a) Secondary data refers to data that have already been collected by some other person or agency and are used by us.
(b) Sources of secondary data can be classified under two categories : (i) published sources (ii) unpublished sources.

(i) Published sources
Published sources mean data available in printed form. They include :
(a) Magazines, journals and periodicals published by various government and private organisations, containing data related to birth, death, education etc., and data regarding prices, production etc. published by Economic Times, Financial Express etc.
(b) Reports of various committees or commissions, such as reports of the pay commission, the finance commission etc.
(c) Reports of international agencies that are regularly published by agencies like UNO, WHO, IMF etc.

(ii) Unpublished sources
(a) Not all statistical material is published.
(b) This category includes the records maintained by various government and private offices.
(c) It includes the research done by scholars, students or some institutions.
(d) Sources like reports prepared by private investigation companies can also be used, depending upon the need.

Precautions to be taken while using secondary data
(i) We must assure ourselves of the reliability of the data that have been published.
(ii) The investigator must ensure that the data are suitable for the purpose of the present enquiry.
(iii) The suitability of the data can be determined from the nature, objectives, time of collection etc. of the secondary data.
(iv) Adequate data are to be used, so that biases and prejudices leading to incorrect conclusions can be avoided.
(v) The investigator must find out the method that was used in collecting the data.
(vi) Depending upon the mode of selection of samples and the sampling methods used, the data may be biased.
(vii) One must satisfy oneself on these points before making use of the secondary data.

2.3 METHODS OF COLLECTING DATA

Data is a collection of facts, figures, objects, symbols and events gathered from different sources. Organisations collect data to make better decisions. Without data, it would be very difficult for organisations to make appropriate decisions, and so data are collected at various points in time from different audiences.

For example, before launching a new product an organisation needs to collect data on product demand, customer preferences, competitors etc. If the data are not collected beforehand, the newly launched product may fail for many reasons, such as low demand and inability to meet customer needs.

Collected data must be analysed or processed to get the required result, otherwise they will not serve any purpose.

Data collection can be categorised into :
1. Primary methods of data collection, and
2. Secondary methods of data collection.

2.3.1 Primary Data Collection Methods

Primary data are collected from first-hand experience and have not been used in the past. The data gathered by primary collection methods are specific to the motive of the research and highly accurate.

Primary data collection methods can be divided into two categories :
1. Quantitative methods
2. Qualitative methods

2.3.2 Quantitative Methods

Quantitative techniques for demand forecasting and market research make use of statistical tools. Demand is forecast on the basis of historical data. These methods of primary data collection are generally used to make long-term forecasts.
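As one hedged illustration of forecasting demand from historical data, the sketch below computes a simple moving average and a weighted moving average, two smoothing methods commonly used for this purpose. It is not taken from the original text: the function names are hypothetical, and the demand figures, the window length and the weights are made-up values chosen only for the example.

```python
def simple_moving_average(demand, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if window <= 0 or window > len(demand):
        raise ValueError("window must be between 1 and the number of observations")
    return sum(demand[-window:]) / window


def weighted_moving_average(demand, weights):
    """Forecast the next period as a weighted mean of the most recent observations.

    `weights` is listed from the oldest to the newest of the periods used.
    """
    n = len(weights)
    if n == 0 or n > len(demand):
        raise ValueError("need between 1 and len(demand) weights")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("the weights must add up to a positive number")
    recent = demand[-n:]
    return sum(w * d for w, d in zip(weights, recent)) / total_weight


# Hypothetical demand (units sold) in five consecutive periods
demand = [120, 130, 125, 140, 150]

sma_forecast = simple_moving_average(demand, window=3)
wma_forecast = weighted_moving_average(demand, weights=[1, 2, 3])
print(round(sma_forecast, 2), round(wma_forecast, 2))
```

Giving the largest weight to the most recent period makes the weighted forecast react faster to recent changes than the simple average does.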
2.3.3 Time Series Analysis

* The term time series refers to a sequential order of values of a variable, known as a trend, at equal time intervals. Using these patterns, an organisation can predict the demand for its products and services for the projected time.
* If a time series lacks significant trends, smoothing techniques are used. They eliminate random variation from the historical demand. This helps in identifying patterns and demand levels to estimate future demand.
* The most common smoothing techniques used in demand forecasting are the simple moving average method and the weighted moving average method.

2.3.4 Barometric Method

* This method is also known as the leading indicators approach. Researchers use this method to speculate on future trends based on current developments.
* When past events are considered to predict future events, they act as leading indicators.

2.3.5 Qualitative Methods

* Qualitative methods are used when historical data are not available. Here there is no need for numbers or mathematical calculations.
* Qualitative research is closely associated with words, sounds, feelings, emotions, colours and other elements that are non-quantifiable. These techniques are based on experience, judgement, intuition, conjecture, emotion etc.
* Quantitative methods do not provide the motive behind participants' responses, often do not reach under-represented populations, and need long periods to collect the data. Hence, it is best to combine quantitative methods with qualitative methods.

2.3.6 Surveys

* Surveys are used to collect data from the target audience and to gather insights into their preferences, opinions, choices and feedback related to products and services.
* One can also use a ready-made survey template to save time and effort.
* Online surveys can be customised to match the brand of the business by changing the theme, logo etc.
* They can be distributed through several distribution channels, such as email, website, offline app, QR code, social media etc. One can select the channel depending on the type and source of the audience.
* Once the data are collected, survey software can generate various reports and run analytics algorithms to discover hidden insights.
* A survey dashboard can give you statistics related to response rate, completion rate, filters based on demographics, export and sharing options etc.
* One can make the most of the effort spent on online data collection by integrating the survey builder with third-party apps.

2.3.7 Polls

* Polls consist of a single question or of multiple-choice questions. When a quick pulse of the audience's sentiments is required, one can go for polls. Because they are short, it is easier to get responses from people.
* Like surveys, online polls can be embedded into various platforms. Once the respondents answer the question, they can also be shown how they stand compared to the responses of others.

2.3.8 Interviews

* In this method, the interviewer asks the respondents questions either face-to-face or over the telephone. In face-to-face interviews, the interviewer asks a series of questions and notes down the responses.
* If it is not possible to meet the person, the interviewer can go for a telephonic interview.
* This form of data collection is suitable when there are only a few respondents. It is too time-consuming and tedious to repeat the same process if there are many participants.

2.3.9 Delphi Technique

* In this method, market experts are provided with the estimates and assumptions of forecasts made by other experts in the industry.
* Experts may revise or reconsider their own estimates, depending upon the information provided by the other experts.
* The consensus of all experts on demand forecasts constitutes the final demand forecast.

2.3.10 Focus Groups

* In a focus group, a small group of people, around 8-10 members, discuss the common areas of the problem. Each individual gives his insights on the issue concerned.
* A moderator regulates the discussion among the group members.
* At the end of the discussion, the group reaches a consensus.

2.3.11 Questionnaire

* A questionnaire is a printed set of questions. It is either open-ended or closed-ended.
* The respondents are expected to answer the questions based on their knowledge of and experience with the issue concerned.
* The questionnaire is a part of the survey, whereas the questionnaire's end-goal may or may not be a survey.

2.3.12 Secondary Data Collection Method

Secondary data is data that has been used in the past. The researcher can obtain such data from sources both internal and external to the organisation :

(i) Internal sources of secondary data
(a) The organisation's health and safety records
(b) Mission and vision statements
(c) Financial statements
(d) Magazines
(e) Sales reports
(f) CRM software
(g) Executive summaries

(ii) External sources of secondary data
(a) Government reports
(b) Press releases
(c) Business journals
(d) Libraries
(e) Internet

* The secondary data collection methods also involve both quantitative and qualitative techniques.
* Secondary data are easily available, and hence they are less time-consuming and less expensive to obtain than primary data.
* However, with secondary data collection methods, the authenticity of the data collected cannot be verified.
