2. Biostatistics and Research Methodology

Uploaded by

jssortho2022
BIOSTATISTICS AND RESEARCH METHODOLOGY - PART 1

Presented by-
Rochana Mahesh
1st MDS
BIOSTATISTICS
Contents:
• Introduction
• Definition
• History
• Applications
• Sources and collection of data
• Data sampling
• Presentation of data
• Measures of statistical average
• Types of variability
• Measures of dispersion
• Normal distribution
• Tests of significance
• Analysis and interpretation
• Correlation and Regression
• Conclusion
Introduction:
• Any science needs precision for its development.
• For precision, facts, observations or measurements have to be expressed in figures.
• Similarly, in medicine, be it diagnosis, treatment or research, everything depends on measurement.
• E.g. to measure a cause-effect relationship, we need statistics.
• Hence, training in statistics has been termed “indispensable” for students of medical science.
Definition:
Statistics: It is the science of compiling, classifying and tabulating numerical data and expressing the results in a mathematical and graphical form.

Biostatistics: It is that branch of statistics concerned with the mathematical facts and data related to biological events.

It is the development and application of statistical reasoning and methods in addressing, analyzing and solving problems in public health, health care and biomedical, clinical and population-based research.
Constant: Quantities that do not vary.
For e.g., in biostatistics the mean and standard deviation are considered constant for a population.

Variable: A characteristic which takes different values for different persons, places or things, such as height, weight, blood pressure.

Parameter: It is the summary value or constant of the variable that describes a population.
For e.g., in an institution there are 25% men. This describes the population, hence it is a parameter.

Attribute: A characteristic based on which the population can be divided into categories or classes.
History:
• The science of statistics is said to have originated from two main
sources:
1) Government records
2) Mathematics
• It developed from registration of heads of families in ancient Egypt to the Roman census on military strength, births and deaths, etc., and gradually found its application in the field of health and medicine.
• John Graunt, who was neither a physician nor a mathematician, is the “Father of Health Statistics.”
What is statistics?
• Statistics is not merely a compilation of computational techniques

• Statistics is:
- a way of learning from data
- concerned with all elements of study design, data collection and analysis of numerical data
- a discipline that requires judgement
• Biostatistics is statistics applied to biological and health problems.


Why statistics:
• Variability in measurement can be handled using statistics.
For e.g., investigator makes observations according to his judgement of
the situation, depending on his skills, knowledge and experience.
• Epidemiology and biostatistics are sister sciences or disciplines.
• Epidemiology collects facts relating to groups of the population in places, times and situations.
• Biostatistics converts all facts into figures and at the end translates them back into facts, interpreting the significance of the results.
• Epidemiology and Biostatistics both deal with the facts-figures-facts sequence.
Quantitative Methodology
Applications of Biostatistics:
• Dentistry

• Pharmacology

• Physiology and Anatomy

• Community medicine

• Community dentistry

• Public health

• In the field of research


Collection of Data
• The collective recording of observations either numerical or descriptive of
things is called data.
• Demographic data comprises details of population size, distribution, geographic
distribution, ethnic group, socio-economic factors and their trends over time.
• It is obtained from census and other public service reports.
• Depending on the nature of the variable, data is classified into:
1) Qualitative data- attributes or qualities:
a) nominal
b) ordinal
2) Quantitative data- obtained through measurements, such as arch length and arch width (using calipers), fluoride concentration in water supply, etc.
a) discrete
b) continuous
Sources of data

Experiments: Performed to collect data for investigations and research by one or more workers.

Surveys: Carried out for epidemiological studies in the field by trained teams to find incidence or prevalence of health or disease in a community.

Records: Records are maintained as a routine in registers and books over a long period of time, providing readymade data.
Collection of Data

Primary Source: Data obtained by the investigator himself.

Secondary Source: Already recorded data. Eg. Hospital records, records from OPD.

Primary data can be obtained by:

Direct personal interviews:
- Face to face contact with the person.
- Subjective phenomena
- Accurate, and ambiguity can be clarified
- Cannot be used in extensive studies

Oral health examination:
- When information is needed on health status
- Cannot be used in any extensive studies
- Includes treatment

Questionnaire method:
- List of questions pertaining to the survey (“questionnaire”) is prepared.
- Various informants are requested to supply the information.
Sampling and Sample design:
• Population: The group of all individuals who are the focus of the investigation is known as a population.
• Sample: A sample is a set of individuals or objects collected or selected from a statistical population by a defined procedure.
• Sample units: The elements of the sample are known as sample points or sample units.
Sampling:
• Sampling can be explained as a specific principle used to select members of population
to be included in the study. It has been rightly noted that “because many populations of
interest are too large to work with directly, techniques of statistical sampling have been
devised to obtain samples taken from larger populations.”

• In other words, due to the large size of the target population, researchers have no choice but to study a number of cases or elements within the population to represent the population and to reach conclusions about the population.
The Process of Sampling in Primary Data Collection
• The process of sampling in primary data collection involves the following stages:
1. Defining target population. The target population represents the specific segment within the wider population that is best positioned to serve as a primary data source for the research. For example, for a dissertation entitled ‘Impact of social networking sites on time management practices amongst university students in the UK’, the target population would consist of individuals residing in the UK.
2. Choosing sampling frame. The sampling frame can be explained as a list of people within the target population who can contribute to the research. For the sample dissertation named above, the sampling frame would be an extensive list of UK university students.
3. Determining sample size. This is the number of individuals from the sampling frame who will participate in the primary data collection process. The following observations need to be taken into account when determining sample size:
a) The magnitude of sampling error can be diminished by increasing the sample size.
b) There are greater sample size requirements in survey-based studies than in experimental
studies.
c) Large initial sample size has to be provisioned for mailed questionnaires, because
the percentage of responses can be as low as 20 to 30 per cent.
d) The most important factors in determining the sample size include subject
availability and cost factors
• For example, for the same research of ‘Impact of social networking sites on time
management practices amongst university students in the UK’ sample size could
be determined to include 200 respondents.
4. Selecting a sampling method. This relates to a specific method according to
which 200 university students in the UK are going to be selected to participate in
research named above.
5. Applying the chosen sampling method in practice.
Types of Sampling:
Sampling methods are broadly divided into two categories: probability and non-probability.

• In probability sampling every member of the population has a known chance of participating in the study. Probability sampling methods include simple random, stratified, systematic, multistage, and cluster sampling methods.

• In non-probability sampling, on the other hand, sampling group members are selected in a non-random manner, therefore not every population member has a chance to participate in the study. Non-probability sampling methods include purposive, quota, convenience and snowball sampling methods.
• PROBABILITY SAMPLING
SIMPLE RANDOM SAMPLING : Sample group members are selected in a random manner
Randomness is ensured by-
a) Lottery method b) Table of random numbers

STRATIFIED RANDOM SAMPLING : The population is divided into specific subgroups or strata, and a random sample is drawn from each so that every stratum is represented.
EX: To determine prevalence of DMF teeth in different age groups, the age groups form the strata and random sampling is done within each.

SYSTEMATIC SAMPLING : Including every Nth member of the population in the study.
EX: To obtain a sample of patients attending a dental clinic.
If there are 15 patients in a clinic and the sample size is decided to be 5, then 15/5 gives a sampling interval of 3. A random starting number between 1 and 3 is selected, say 2; the sample then consists of patients 2, 5, 8, 11 and 14, at which point the sample size of 5 is reached.
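
The systematic sampling example can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample` and the numbering of patients from 1 to 15 are assumptions made for the sketch, not part of the original slides.

```python
import random

def systematic_sample(population, sample_size, start=None):
    """Pick every k-th unit, where k = population size // sample size."""
    k = len(population) // sample_size           # sampling interval (15 // 5 = 3)
    if start is None:
        start = random.randint(1, k)             # random start between 1 and k
    # Positions are 1-based to match the patient numbering in the example.
    positions = range(start, start + k * sample_size, k)
    return [population[pos - 1] for pos in positions]

patients = list(range(1, 16))                    # hypothetical patients 1..15
print(systematic_sample(patients, 5, start=2))   # [2, 5, 8, 11, 14]
```

Leaving `start` unset draws the starting number at random, which is what keeps the method a probability sample.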

CLUSTER SAMPLING : Clusters of participants representing the population are identified as sample group members, and simple random sampling is done for groups or clusters and not for individual subjects.
MULTISTAGE SAMPLING : Sampling conducted on several stages. The first stage is to select
the groups or clusters. Then subsamples are taken in as many subsequent stages as necessary to
obtain the desired sample size.
EX: 1st STAGE : Choice of states within country
2nd STAGE : Choice of cities within each state
3rd STAGE : Choice of colleges within each city
• NON PROBABILITY SAMPLING
QUOTA SAMPLING : Sample group members are selected on the basis of specific criteria
EX: A researcher is interested in the attitudes of members of different states towards the ban of
smoking in public. In Karnataka, a random sample might miss Kashmiris or Rajasthanis because
they are less in number. To be sure of their inclusion, a researcher could set a quota of 3%
Kashmiris and Rajasthanis for the sample. This will guarantee that the views of Kashmiris and
Rajasthanis are represented in the survey
PURPOSIVE SAMPLING: It is a non-representative subset of some larger population and is constructed to serve a very specific need or purpose. A subset of a purposive sample is a SNOWBALL SAMPLE or chain referral sample, in which sample group members nominate additional members to participate in the study.
EX: Hard-to-track populations, such as drug users (whose behaviour is illegal) or homeless people.

CONVENIENCE SAMPLING : Obtaining participants conveniently with no requirements whatsoever.
Data presentation:

There are 2 main types of data presentation:

• Tabulation

• Graphic representation
Tabulation:
• Tables are simple devices used for the presentation of statistical data.
• Principles:
1. They should be as simple as possible.
2. Data should be presented according to size or importance, chronologically or alphabetically.
3. They should be self-explanatory.
• They can be simple or complex depending upon the number or measurement of a single or multiple set of items.
• 3 types:
Master table: combines all the data obtained from a survey.
Simple table: one-way table which supplies the answer to questions about one characteristic of data only.
Frequency distribution table: the data is split into convenient groups (class intervals) and the number of items (frequency) which occurs in each group is shown in the adjacent column.
[Figure: example layouts of a master table (columns: Sl no, Age, Sex, Education, D, M, F, DMF, PI), a simple table and a frequency distribution table]
Charts and Diagrams
• It is the presentation/ representation of data as diagrams and graphs.
• It is the most convincing and appealing way of representing data.
Advantages:
• Gives better insight and understanding of the data
• Makes the presentation eye-catching
• The comparison becomes easy
• The data becomes more logical/clear.
Disadvantages:
• It cannot represent all details of the variables
• Very difficult to include and study the small differences in small measurements.
• They are only a supplement to the tabular representation of data.
• They usually show only approximate figures.
Bar chart
• Lengths of bars drawn vertical or horizontal in proportion to the
frequency of the variable.
• A suitable scale is chosen.
• Bars are usually equally spaced.
• Types:
1. Simple bar chart
2. Multiple bar chart- two or more variables are grouped together.
3. Component bar chart- bars are divided into two or more parts, with each part representing a certain item in proportion to the magnitude of that item.
[Figure: simple, multiple and component bar graphs; bar diagram showing the place of origin of 500 hostellers]

Pie Chart
• The frequencies of the groups are shown as segments of a circle.
• The angle of each segment denotes the frequency.
• Angle is calculated by: (class frequency / total observations) x 360°
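
As a small illustration of the angle rule, the sketch below converts class frequencies into segment angles. The category labels and counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def pie_angles(frequencies):
    """Angle of each segment = class frequency / total observations x 360."""
    total = sum(frequencies.values())
    return {label: f * 360 / total for label, f in frequencies.items()}

# Hypothetical counts for three categories (total = 500).
counts = {"Group A": 250, "Group B": 150, "Group C": 100}
angles = pie_angles(counts)
print(angles)                 # {'Group A': 180.0, 'Group B': 108.0, 'Group C': 72.0}
print(sum(angles.values()))   # 360.0
```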
Line Diagram:
• Useful to study the changes of values in the variable over time and is
the simplest type of diagram.
Histogram:
• Pictorial presentation of
frequency distribution
• No space between the cells
• Class interval to be given
• Area of the rectangle is
proportional to the frequency
Frequency Polygon:
• Obtained by joining midpoints of
histogram blocks at the height of
frequency by straight lines
usually forming a polygon
Frequency Curve
• When the number of
observations is very large and
class interval is reduced, the
frequency polygon loses its
angulations becoming a smooth
curve known as frequency curve.
Pictogram:
• Popular method of presenting
data to the common man through
small pictures or symbols.
Spot map/ Shaded map/ Cartogram
• These maps are prepared to show
geographic distribution of
frequencies of characteristics.
Measures of Statistical Average or Central
Tendency
• Central value around which all the other observations are distributed.
• Main objective is to condense the entire mass of data and to facilitate
the comparison.
• Most common measures of central tendency are:
1. Mean
2. Median
3. Mode
Mean
• Refers to the arithmetic mean
• It is obtained by adding the individual observations, divided by the
total number of observations.
• Advantages:
1. It is easy to calculate
2. Most useful of all the averages.
• Disadvantages:
1. Influenced by abnormal values
Median:
• When all the observations are arranged in either ascending or descending order, the middle observation is known as the median.
• In case of an even number of observations, the average of the two middle values is taken.
• The median is a better indicator of central value than the mean when extreme values are present, as it is not affected by them.
Mode:
• The most frequently occurring observation in a data set is called the mode.
• It is not often used in medical statistics.

• For example:
• Number of decayed teeth in 10 children:
2,2,4,1,3,0,10,2,3,8
• Mean = 35/10 = 3.5
• Median (0,1,2,2,2,3,3,4,8,10) = (2+3)/2 = 2.5
• Mode = 2 (3 times)
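
The three averages for the decayed-teeth data can be checked with Python's standard `statistics` module. This is just a quick verification sketch; the variable name `decayed` is an assumption. The ten values add up to 35, so the mean is 35/10.

```python
from statistics import mean, median, mode

# Number of decayed teeth in the 10 children from the example.
decayed = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

print(sum(decayed))     # 35
print(mean(decayed))    # 3.5
print(median(decayed))  # 2.5 (average of the 5th and 6th sorted values, 2 and 3)
print(mode(decayed))    # 2   (occurs three times)
```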
Errors
• Errors are the difference between a value obtained from a data collection process and
the ‘true’ value for the population.

• Three types:
1. Observer error – the investigator may alter some information or not record the
measurement correctly.
2. Instrumental error - this is due to defects in the measuring instrument. Both the
observer and the instrument error are called non sampling error.
3. Sampling error or error of bias – this occurs when the samples are not chosen at random from a population. A sample must be representative of the whole population.
Measures of Dispersion
• Dispersion is the degree of spread or variation of the variable, about a
central value.
• Helps to know how widely the observations are spread on either side
of the average.
• Most common measures of dispersion are:
1. Range
2. Mean deviation
3. Standard deviation
Range: Defined as the difference between the value of the largest item and the smallest item. Gives no information about the values that lie in-between the extreme values.

Mean Deviation: It is the average of the deviations from the arithmetic mean, taken without regard to sign.
$\text{M.D.} = \dfrac{\sum |X_i - \bar{X}|}{n}$
$\bar{X}$- arithmetic mean
$X_i$- value of each observation in the data
n- number of observations in the data

Standard Deviation: Most important and widely used measure of studying dispersion. Greater the S.D, greater will be the magnitude of dispersion from the mean. Smaller S.D means a higher degree of uniformity of the observations.
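
A minimal sketch of the three measures, reusing the decayed-teeth counts from the earlier mode example. One assumption to note: the slides do not say whether the standard deviation divides by n or by n - 1, so both the population form (`pstdev`) and the sample form (`stdev`) are computed here.

```python
from statistics import mean, pstdev, stdev

values = [2, 2, 4, 1, 3, 0, 10, 2, 3, 8]

value_range = max(values) - min(values)            # largest minus smallest item
x_bar = mean(values)
# Mean deviation: average absolute deviation from the arithmetic mean.
mean_deviation = sum(abs(x - x_bar) for x in values) / len(values)

print(value_range)      # 10
print(mean_deviation)   # 2.3
print(pstdev(values))   # population S.D. (divides by n)
print(stdev(values))    # sample S.D. (divides by n - 1), slightly larger
```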
Coefficient of variation
• It is used to compare characteristics having two different units of measurement. Eg. Height and weight.
• Denoted by CV
• CV for a population = (S.D x 100) / Mean
• It is expressed as a percentage.
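
To show why the CV is useful for comparing measurements in different units, here is a short sketch. The height and weight figures are hypothetical values invented for the illustration.

```python
def coefficient_of_variation(sd, mean):
    """CV (%) = S.D. x 100 / mean."""
    return sd * 100 / mean

# Hypothetical summary values: height in cm, weight in kg.
height_cv = coefficient_of_variation(sd=8, mean=160)
weight_cv = coefficient_of_variation(sd=6, mean=60)

print(height_cv)  # 5.0
print(weight_cv)  # 10.0
```

In this made-up case weight is relatively more variable than height, even though its standard deviation is the smaller number.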
Normal Curve/ Normal Distribution/
Gaussian Distribution
• When the data is collected
from a very large number of
people and a frequency
distribution is made with
narrow class intervals, the
resulting curve is smooth and
symmetrical- Normal curve.
• The limits on either side of the mean are called confidence limits.
Standard Normal Curve
• There may be many normal curves but only one standard normal curve.
• Characteristics:
Bell shaped
Perfectly symmetrical
Frequency increases from one side, reaches its highest and decreases
exactly the way it had increased.
Total area of the curve is one, its mean is zero and standard deviation
is one.
The highest point denotes mean, median and mode which coincide.
Tests of Significance
• A statistical procedure by which one can conclude whether the observed results from the sample are due to chance or not.
• When different samples are drawn from the same population, the estimates might differ- sampling variability.
• It deals with techniques to know how far the difference between the estimates of different samples is due to sampling variation.
Standard error of mean
Standard error of proportion
Standard error of difference between two means
Standard error of difference between two proportions
Standard error of mean
• Gives the standard deviation of the means of several samples from the same population.

$\text{S.E.} = \dfrac{\text{S.D.}}{\sqrt{n}}$ (n- number of observations in the sample)
Standard error of proportion
• It may be defined as a unit that measures variation which occurs by chance in the proportions of a character from sample to sample or from sample to population or vice versa in qualitative data.

Standard error of proportion = $\sqrt{\dfrac{pq}{n}}$

p- population proportion
q- 1-p
n- number of observations in the sample
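
The two standard errors can be sketched as follows, using the usual textbook formulas S.E. of mean = S.D. / √n and S.E. of proportion = √(pq / n). The function names and the input values are hypothetical.

```python
import math

def se_mean(sd, n):
    """Standard error of the mean: S.D. divided by the square root of n."""
    return sd / math.sqrt(n)

def se_proportion(p, n):
    """Standard error of a proportion: sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

print(se_mean(sd=10, n=100))                    # 1.0
print(round(se_proportion(p=0.5, n=100), 4))    # 0.05
```

Both shrink as n grows, which is why larger samples give more stable estimates.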
Standard error of difference between two means
• Used to find out whether the difference between the means of two groups is
significant to indicate that the samples represent two different universes.
Standard error of difference between proportions
• Used to find out whether the difference between the proportions of two groups is significant or has occurred by chance.
• A null hypothesis (H0), or hypothesis of no difference, asserts that there is no real difference between the sample and the population in the particular matter under consideration, and that the difference found is accidental and arises out of sampling variation.
• The alternative hypothesis (H1) of significant difference states that there is a difference between the two groups compared.
• A test of significance such as the Z test is performed to accept the null hypothesis or to reject it and accept the alternative hypothesis.
• To minimise error in rejecting or accepting the null hypothesis, we divide the sampling distribution, or the area under the normal curve, into two regions or zones, namely:
- a zone of acceptance
- a zone of rejection
• The distance from the mean at which H0 is rejected is called the level of significance.
• It falls in the zone of rejection for H0 (the shaded areas under the curve) and is denoted by the letter P, which indicates the probability or relative frequency of occurrence of the difference by chance.
• Greater the Z value, lesser will be the P.

1. Zone of acceptance: if the result of a sample falls in the plain area, i.e. within the chosen limits on either side of the mean, then the null hypothesis is accepted. Hence this area is known as the zone of acceptance for the null hypothesis.
2. Zone of rejection: if the result of a sample falls in the shaded area, i.e. beyond the chosen limits on either side of the mean, it is significantly different from the universe value. Hence H0 is rejected and the alternative H1 is accepted. This area is therefore called the zone of rejection for the null hypothesis.
Z- Test
• Used to test the significance in difference in means for large samples.
• Criteria:
Sample must be randomly selected
Data must be quantitative
The variable is assumed to follow a normal distribution in the
population.
Samples should be larger than 30.
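
A minimal sketch of a Z test for the difference between two large-sample means, assuming the common form Z = (mean1 - mean2) / S.E. of the difference, with S.E. = √(SD1²/n1 + SD2²/n2). The group summaries below are hypothetical, and `NormalDist` from the standard library supplies the area under the standard normal curve.

```python
import math
from statistics import NormalDist

def z_test_two_means(mean1, sd1, n1, mean2, sd2, n2):
    """Return (z, two-sided P) for the difference between two sample means."""
    # Standard error of the difference between two means.
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se_diff
    # Two-sided P: area in both tails of the standard normal curve beyond |z|.
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical summaries for two groups of 100 subjects each.
z, p = z_test_two_means(52, 5, 100, 50, 5, 100)
print(round(z, 2))   # 2.83
print(p < 0.05)      # True: the difference falls in the zone of rejection at 5%
```

The larger the Z value, the smaller the P, as the slides state.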
Tests of significance are of 2 types:
• Parametric tests: used when the data are normally distributed (quantitative).
• Non-parametric tests: used when the data are not normally distributed (qualitative).
Student’s t-Test (Parametric Test)
• Small samples, or their Z values, do not follow the normal distribution as large ones do.
• So, the Z value based on the normal distribution will not give the correct level of significance or probability of a small-sample result occurring by chance.
• In case of small samples, the t-test is applied instead of the Z-test.
• It was designed by W. S. Gosset, whose pen name was Student. Hence, this test is also called Student’s t-test.
• Two types:
Paired t test
Unpaired t test
Criteria for applying t test

• Random samples

• Quantitative data

• Variable normally distributed

• Sample size less than 30


Paired t test

• A paired t-test (also known as a dependent or correlated t-test) is a statistical test that compares the averages/means and standard deviations of two related groups to determine if there is a significant difference between the two groups.
• A significant difference occurs when the differences between groups are unlikely to be due to sampling error or chance.
What are the hypotheses of a paired t-test?
• There are two possible hypotheses in a paired t-test.
• The null hypothesis (H0) states that there is no significant difference between the means of the two groups.
• The alternative hypothesis (H1) states that there is a significant difference between the two population means, and that this difference is unlikely to be caused by sampling error or chance.
When to use a paired t-test?

• Paired t-tests are used when the same item or group is tested twice, which is known as a repeated measures t-test. Some examples of instances for which a paired t-test is appropriate include:
• The before and after effect of a pharmaceutical treatment on the same group of people.
• Body temperature using two different thermometers on the same group of participants.
• Standardized test results of a group of students before and after a study prep course.
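
A before-and-after comparison of this kind can be sketched as below, assuming the standard paired form t = mean of differences / (S.D. of differences / √n) with n - 1 degrees of freedom. The readings are hypothetical, not data from the slides.

```python
import math
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees of freedom) for paired observations."""
    diffs = [a - b for b, a in zip(before, after)]   # after minus before
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)                 # standard error of mean difference
    return mean(diffs) / se, n - 1

# Hypothetical readings for the same five people before and after treatment.
before = [120, 130, 125, 140, 135]
after = [115, 128, 120, 135, 133]
t, df = paired_t(before, after)
print(round(t, 2), df)   # -5.17 4
```

The absolute t value is then compared with the tabulated t for 4 degrees of freedom (2.776 at the 5% level, two-sided), so in this made-up example the null hypothesis would be rejected.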
Unpaired t test
• An unpaired t-test (also known as an independent t-test) is a statistical procedure that compares
the averages/means of two independent or unrelated groups to determine if there is a significant
difference between the two.

What are the hypotheses of an unpaired t-test?

• The hypotheses of an unpaired t-test are the same as those for a paired t-test. The two
hypotheses are:

• The null hypothesis (H0) states that there is no significant difference between the means of the
two groups.

• The alternative hypothesis (H1) states that there is a significant difference between the two
population means, and that this difference is unlikely to be caused by sampling error or chance.
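
For two independent groups, one common form of the unpaired t statistic pools the two sample variances. The sketch below assumes that pooled (equal-variance) form, which the slides do not spell out, and uses hypothetical scores.

```python
import math
from statistics import mean, variance

def unpaired_t(group1, group2):
    """Return (t, degrees of freedom) using the pooled-variance formula."""
    n1, n2 = len(group1), len(group2)
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / se_diff, n1 + n2 - 2

# Hypothetical scores for two unrelated groups.
t, df = unpaired_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(round(t, 2), df)   # 4.0 8
```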
Analysis of Variance (ANOVA) test
• When a comparison of more than two independent groups on a continuous outcome is required, ANOVA is used.
• It is the best way to test the equality of three or more group means.
• One way ANOVA- where only one factor affects the result between groups.
• Two way ANOVA- where 2 factors affect the result or outcome.
• Multiway ANOVA- where three or more factors affect the result or outcome between groups.
The CHI SQUARE test for qualitative data (χ² test) – Non Parametric Test
• Developed by Karl Pearson
• Chi-square test offers an alternate method of testing the significance of difference between two proportions. It has the advantage that it can also be used when more than 2 groups are to be compared.
• It is most commonly used when data are in frequencies, such as the number of responses in two or more categories.
• Important applications:
1. Test of proportions: as an alternate test to find the significance of difference in 2 or more than 2 proportions.
2. Test of association: the test of association between 2 events in binomial or multinomial samples is the most important application of the test in statistical methods. Two events can be studied for their association, such as smoking and cancer, etc.
3. Test of goodness of fit: chi-square test is also applied as a test of goodness of fit, to determine if actual numbers are similar to the expected or theoretical numbers- goodness of fit to a theory.
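
As an illustration of the test of association, the sketch below computes χ² = Σ (O - E)² / E for a contingency table, where each expected count E is row total x column total / grand total. The 2 x 2 counts are hypothetical and do not come from any real study.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = exposed / not exposed, columns = diseased / healthy.
chi2, df = chi_square([[30, 70], [10, 90]])
print(chi2, df)   # 12.5 1
```

With 1 degree of freedom the tabulated χ² at the 5% level is 3.84, so a value of 12.5 would indicate a significant association in this made-up table.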
Correlation and Regression

• Correlation: when dealing with measurements of 2 variables in the same person, one variable may be related to the other in some way (i.e. a change in one variable may be accompanied by a change in the value of the other variable).
• It is the relationship between two sets of variables.
• Correlation coefficient (r) is the magnitude or degree of relationship between 2 variables.
• The relationship can be examined by plotting a scatter diagram.
• Perfect positive correlation: the 2 variables, denoted by X and Y, are directly proportional and fully correlated with each other, (r) = +1, i.e. both rise and fall in the same proportion.
• Perfect negative correlation: values are inversely proportional to each other, i.e. when one rises, the other falls in the same proportion, (r) = -1.
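
The two extreme cases can be reproduced with a short sketch of Pearson's correlation coefficient, r = Σ(x - x̄)(y - ȳ) / √(Σ(x - x̄)² Σ(y - ȳ)²). The slides do not name the formula, so treating r as Pearson's coefficient is an assumption, and the data are hypothetical.

```python
import math
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # 1.0  (perfect positive correlation)
print(pearson_r(x, [10, 8, 6, 4, 2]))   # -1.0 (perfect negative correlation)
```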
Types of correlation
Regression
• It is a statistical method for studying the relationship between a single dependent variable and one or more independent variables.
• It is customary to denote the independent variate by x and the dependent variate by y.
• For a straight-line relationship the regression equation is $y = a + bx$, where a is the intercept.
• The value of b is called the regression coefficient of y upon x. Similarly, we can obtain the regression of x upon y.
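
A least-squares sketch of the regression of y upon x, assuming the straight-line form y = a + bx with b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b x̄. The function name and data points are hypothetical.

```python
from statistics import mean

def regression_y_on_x(x, y):
    """Return (a, b) for the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx          # regression coefficient of y upon x
    a = my - b * mx        # intercept
    return a, b

a, b = regression_y_on_x([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
print(a, b)   # 1.0 2.0, i.e. y = 1 + 2x
```

Swapping the two arguments gives the regression of x upon y.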
CONCLUSION

Research and scientific methods may be considered a course of critical enquiry leading to the discovery of fact or information, which increases our understanding of human health and disease.
THANK YOU
SAMPLING ERRORS
• Errors are the difference between a value obtained from a data collection
process and the ‘true’ value for the population.

• Three types:
1. Observer error – the investigator may alter some information or not record
the measurement correctly.
2. Instrumental error - this is due to defects in the measuring instrument.
Both the observer and the instrument error are called non sampling error.
3. Sampling error or error of bias – this occurs when the samples are not
chosen at random from a population. A sample must be a representative of
the whole population.
• In statistics, a Type I error is a false positive conclusion, while a Type II
error is a false negative conclusion.

• The probability of making a Type I error is the significance level, or alpha (α),
while the probability of making a Type II error is beta (β)

• Using hypothesis testing, you can make decisions about whether your data support or refute your research predictions with null and alternative hypotheses.
• Hypothesis testing starts with the assumption of no difference between groups or no relationship between variables in the population; this is the null hypothesis. It’s always paired with an alternative hypothesis, which is your research prediction of an actual difference between groups or a true relationship between variables.
• For example, you test whether a new drug intervention can alleviate symptoms of an autoimmune disease. In this case:
The null hypothesis (H0) is that the new drug has no effect on symptoms of the disease.
The alternative hypothesis (H1) is that the drug is effective for alleviating symptoms of the disease.
• Then, you decide whether the null hypothesis can be rejected based on your data and the results of a statistical test. Since these decisions are based on probabilities, there is always a risk of making the wrong conclusion.
TYPE 1 ERROR
• A Type I error means rejecting the null hypothesis when it’s actually true. It
means concluding that results are statistically significant when, in reality, they
came about purely by chance or because of unrelated factors.

• Ex: You decide to get tested for COVID-19 based on mild symptoms. A Type I error (false positive) occurs if the test result says you have coronavirus, but you actually don’t.
• The risk of committing this error is the significance level (alpha or α) you choose. That’s a value that you set at the beginning of your study, against which the probability of obtaining your results (p value) is assessed.
• Statistical significance means that it is unlikely that the observations could have occurred under the null hypothesis of a statistical test. It is usually assessed with a p-value, or probability value.
• The significance level is usually set at 0.05 or 5%. This means that your results only have a 5% chance of occurring, or less, if the null hypothesis is actually true.
• If the p value of your test is lower than the significance level, it means your
results are statistically significant and consistent with the alternative hypothesis.
If your p value is higher than the significance level, then your results are
considered statistically non-significant.
• For Example : In your clinical study, you compare the symptoms of patients who
received the new drug intervention or a control treatment. Using a t test, you
obtain a p value of .035. This p value is lower than your alpha of .05, so you
consider your results statistically significant and reject the null hypothesis.

• However, the p value means that there is a 3.5% chance of results at least this extreme occurring if the null hypothesis is true. Therefore, there is still a risk of making a Type I error.

• To reduce the Type I error probability, you can simply set a lower significance
level.
TYPE 2 ERROR
• A Type II error means not rejecting the null hypothesis when it’s actually false.
This is not quite the same as “accepting” the null hypothesis, because hypothesis
testing can only tell you whether to reject the null hypothesis.
• A Type II error means failing to conclude there was an effect when there actually
was. In reality, your study may not have had enough statistical power to detect
an effect of a certain size.
• Power is the extent to which a test can correctly detect a real effect when there is
one. A power level of 80% or higher is usually considered acceptable.
• The risk of a Type II error is inversely related to the statistical power of a study.
The higher the statistical power, the lower the probability of making a Type II
error.
• The Type II error rate is beta (β)
• For Example: When preparing your clinical study, you complete a power
analysis and determine that with your sample size, you have an 80% chance of
detecting an effect size of 20% or greater. An effect size of 20% means that the
drug intervention reduces symptoms by 20% more than the control treatment.

• However, a Type II error may occur if the true effect is smaller than this size. A smaller
effect size is unlikely to be detected in your study due to inadequate statistical
power.
Statistical power is determined by:

• Size of the effect: Larger effects are more easily detected.

• Measurement error: Systematic and random errors in recorded data reduce power.

• Sample size: Larger samples reduce sampling error and increase power.

• Significance level: Increasing the significance level increases power.

• To (indirectly) reduce the risk of a Type II error, you can increase the sample
size or the significance level.
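How effect size, sample size and significance level drive power can be seen in a short calculation. This is a minimal sketch under stated assumptions: it uses the closed-form power of a two-sided one-sample z test with known standard deviation, and the function name and the example numbers are hypothetical, not taken from the clinical study above.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z test to detect a true mean shift `effect`."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # critical value for the chosen alpha
    shift = effect * n ** 0.5 / sigma            # true effect in standard-error units
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(-z_crit + shift) + std_normal.cdf(-z_crit - shift)


# Larger samples increase power for the same (hypothetical) effect size.
for n in (10, 20, 40, 80):
    print(n, round(z_test_power(effect=0.5, sigma=1.0, n=n), 3))
```

The Type II error rate is then β = 1 − power, so anything that raises the returned value lowers β.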
Analysis of Variance (ANOVA) test
• When more than two independent groups need to be compared on a continuous
outcome, ANOVA is used.
• It is used to test the equality of the means of three or more groups.
• One way ANOVA- where only one factor affects the result between groups.
• Two way ANOVA- where we have 2 factors that affect the result or outcome.
• Multiway ANOVA- three or more factors affect the result or outcomes between
groups.
• One way ANOVA: Suppose we want to know
whether or not three different exam prep programs
lead to different mean scores on a certain exam. To
test this, we recruit 30 students to participate in a
study and split them into three groups. The
students in each group are randomly assigned to
use one of the three exam prep programs for the
next three weeks to prepare for an exam. At the
end of the three weeks, all of the students take the
same exam.
The exam scores for each group are shown :
• Two way ANOVA: You are researching which type of fertilizer and planting
density produces the greatest crop yield in a field experiment. You assign different
plots in a field to a combination of fertilizer type (1, 2, or 3) and planting density
(1=low density, 2=high density), and measure the final crop yield in bushels per
acre at harvest time.

• You can use a two-way ANOVA to find out if fertilizer type and planting density
have an effect on average crop yield.
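The one-way ANOVA for the exam prep example above can be sketched by computing the F statistic by hand. The scores below are hypothetical placeholders (the original table is not available here), and the function name is an assumption made for illustration. The sketch stops at the F statistic and its degrees of freedom; the p value would come from an F distribution table or a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical exam scores for the three prep programs (10 students each).
program_a = [85, 86, 88, 75, 78, 94, 98, 79, 71, 80]
program_b = [91, 92, 93, 85, 87, 84, 82, 88, 95, 96]
program_c = [79, 78, 88, 94, 92, 85, 83, 85, 82, 81]

print(one_way_anova_f([program_a, program_b, program_c]))
```

A large F means the group means differ more than the spread within groups would explain by chance.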
BIOSTATISTICS AND
RESEARCH
METHODOLOGY-PART 2
RESEARCH
METHODOLOGY
Contents: RESEARCH METHODOLOGY
• Introduction
• Types of research
• Objectives of research
• Steps involved in research
• Conclusion
• References
WHAT IS RESEARCH?
• Research is a logical and systematic search for new useful information on a
particular topic.
• Research is planned activity leading to the generation of information that will help
in answering a specific question.
• Research is a quest for knowledge through diligent search or investigation or
experimentation aimed at the discovery and interpretation of new knowledge.
-Health research methodology, WHO.
• Research is a systematized effort to gain new knowledge
- Redman and Mory
Types of Research

• Conventional research includes descriptive studies and analytical studies.

• Unconventional research, which is gaining more importance nowadays,
includes operational research, evaluation of health systems, economic studies
(cost benefit, cost effectiveness etc.), qualitative research and research synthesis
(reviews and meta-analysis).
All research can be broadly classified into

1. BASIC Vs APPLIED

2. OBSERVATIONAL Vs EXPERIMENTAL

3. QUALITATIVE Vs QUANTITATIVE

4. CONCEPTUAL Vs EMPIRICAL
Basic VS Applied
• Basic research is also called fundamental research.
It is a search for knowledge without a defined goal
of utility or purpose.
Ex: A study searching for the causative factors of malocclusion

• Applied research is problem oriented and it is directed
toward a defined and purposeful end. It is done based on a
perceived need and helps in solving an existing problem.
Ex: A study on how to treat patients with obstructive sleep
apnoea
Observational VS Experimental
Fundamentally there are two ways in which research questions can
be answered:
Observational: We can observe what naturally happens in the real
world without interfering with it.
Ex: Root resorption during orthodontic treatment at different time
intervals

Experimental: We can manipulate some aspect of the environment
and observe its effects.
Ex: Prescribing fluoridated toothpaste during orthodontic treatment
Qualitative VS Quantitative

• Qualitative Research deals with subjective aspects, that is, qualities
which are difficult to quantify.
Ex: Qualitative analysis in dental photography in
orthodontics

• Quantitative Research is based on the measurement of
quantity or amount. It deals with objective aspects.
Ex: Dental Material Research- orthodontic bonding
cements
Conceptual VS Empirical
• Conceptual Research is that related to some abstract idea
or theory. It is generally used by philosophers and thinkers
to develop new concepts or reinterpret the existing ones.
Ex: Self ligating brackets have a concept of non extraction

• Empirical research is that in which experience or
observation alone is the tool of research. It is data-based
research and it can be further verified by observation or
experimentation.
Ex: Begg's theory of attritional occlusion
What are the Objectives of Research?

The prime objectives of research are:


• To discover new facts.
• To verify and test important facts.
• To analyze an event or process or phenomenon to identify the cause
and effect relationship.
• To develop new scientific tools, concepts and theories to solve and
understand scientific and nonscientific problems.
• To find solutions to scientific, non-scientific and social problems.
• To overcome or solve the problems occurring in our everyday life.
Steps in Dental research

Identify the problem


Formulating a research question
Refining the research question: Literature Review
Formulate hypothesis and research objectives
Decide the study population and setting
Decide on the study design and methodology
Writing the protocol
Collecting the data
Analyze the data and apply statistical significance
Write the report
Identify the problem
• Interest and expertise: The topic should be interesting to the
investigator, funding agency and the medical community.
• Relevance and Applicability: Research should add more information to
the scientific community, or its results should be able to modify clinical decisions in future.
• Feasibility: It should be feasible in terms of time, manpower and
money.
Ex: Incidence of white spot lesions in patients who have undergone
orthodontic treatment
Formulating a research topic
• It is a formal statement of the goal of the study.

• Foremost among the criteria for a good question is whether it is interesting.

• It is important that the investigator is truly curious about the question, so
that he or she can remain motivated till the successful completion of the
study.

• Curiosity is also an asset in terms of stimulating questions for future studies.


• The study should have the potential to contribute something new to the
knowledge base.
• It should add more knowledge, guide future studies or have implications
for clinical practice, education or health care policy.
• Finally the topic must be ethical. Studies that invade someone’s
privacy or create possible psychological or physical risks are ethically
not acceptable.
Ex: To compare the effectiveness of different remineralizing agents on
white spot lesions
• A good research question must follow the acronym: FINER
F: FEASIBLE
I: INTERESTING TO THE INVESTIGATOR
N: NOVEL
E: ETHICAL
R: RELEVANT
PICOT Model
P = Population or Problem or patient
What are the characteristics of the patient or population?- 15-20 years age group
What is the condition or disease?- White spot lesions
I = Intervention or exposure
What do you want to do with/for the patient or population?- Remineralize the white spot lesions
C = Comparison
What is the alternative to the intervention?- Control
O = Outcome
What are the relevant outcomes?- Remineralization
T = Time- 3 months
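One way to keep the five PICOT elements together while drafting a protocol is a small record type. The sketch below is only an illustration: the class name, field names and sentence template are hypothetical, and the values are paraphrased from the white spot lesion example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PicotQuestion:
    """Holds the five PICOT elements of a research question."""
    population: str
    intervention: str
    comparison: str
    outcome: str
    time: str

    def as_sentence(self) -> str:
        # Hypothetical template for turning the elements into one question.
        return (f"In {self.population}, does {self.intervention}, compared with "
                f"{self.comparison}, lead to {self.outcome} within {self.time}?")


question = PicotQuestion(
    population="patients aged 15-20 years with white spot lesions",
    intervention="a remineralizing agent",
    comparison="a control",
    outcome="remineralization",
    time="3 months",
)
print(question.as_sentence())
```

Writing the question this way makes it obvious when one of the five elements has been left undefined.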
Refining the Research Question
• Once the question or problem is specified, the next step is to collect as
much information as possible.
• It will help to:
1. Determine to what extent the research question or issue has been researched.
2. Identify the past relevant studies or research methods used.
3. Refine the research question.
4. Put the project and methodology into a relevant context.
Formulate hypothesis and research objectives
• The research hypothesis is developed from the research question.
• For example, in a research study comparing treatment X versus
treatment Y in patients with pneumonia, the experimental group would
receive treatment X and the control group would receive treatment Y.
• The investigative team would first state the research hypothesis.
• This could be expressed as a single outcome. E.g. Treatment X leads to
improved functional outcome.
Ex: Application of the agents CPP-ACP and bioactive glass will
remineralize white spot lesions
Decide the study population and setting

• The definition of the subject of study and the target population should be
clearly spelt out.
• The inclusion and exclusion criteria should be decided in the beginning.
• Sample size is very important.
• The smaller the sample, the more the uncertainty.
• Sample size should be chosen in such a way that findings in the study should
reflect what is going on in the population.
• A well designed but poorly analyzed study can be rescued by reanalysis, but
a poorly designed study, however well analyzed, is beyond the redemption of even
sophisticated statistics.
• To get valid and reliable results, an appropriate research design and
methodology is a prerequisite.
• Study design is the framework in which investigation is planned and
carried out.
• Selection of design is based on the type of research question.
Ex: Treated orthodontic patients of age group 15-20 years
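The point that smaller samples carry more uncertainty can be made concrete with the margin of error of an estimated proportion. This is a minimal sketch under assumptions not stated in the slides: it uses the normal approximation for a proportion, and the function name and the 50% prevalence figure are hypothetical.

```python
from statistics import NormalDist


def margin_of_error(p, n, confidence=0.95):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * (p * (1 - p) / n) ** 0.5


# Hypothetical prevalence of 50%; the interval narrows as the sample grows.
for n in (25, 100, 400, 1600):
    print(n, round(margin_of_error(0.5, n), 3))
```

Because the margin shrinks with the square root of n, quadrupling the sample only halves the uncertainty.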
Decide on study design and methodology
1. Observational:
• Studies in which subjects are observed without intervention, including:
Descriptive study
Analytical study
1. Case report
2. Case study/ case series
3. Case control studies
4. Cross sectional
5. Cohort/ longitudinal
2. Experimental: Studies in which the effect of an intervention
is observed
• Randomized controlled trials
• Field trials
• Community trials
Research study designs
• Case reports
• Case series
• Analysis of secular trends
• Case-control study
• Cohort studies
• Randomized clinical trials
Case report
• Reports of events in a single
patient
• Useful for raising hypotheses
about drug effects, which can then
be tested with more rigorous
study designs.
Case series
• Collection of patients, all of whom have had a single exposure, whose
outcomes are then evaluated and described.
• They can also be a collection of patients with a single outcome,
looking at their antecedent exposure.
• Useful for quantifying the incidence of an adverse reaction or whether
it occurs in a larger population.
• Just provides a clinical description of a disease or of patients who
receive an exposure.
Analysis of secular trends
• Also known as ecological studies.
• Examines trends in an exposure that is a presumed cause and trends
in a disease that is a presumed effect and tests whether the trends
coincide.
• Vital statistics and record linkages are often used in these studies.
• Useful for rapidly providing evidence for or against a hypothesis.
• Unable to control confounding variables.
E.g., a rise in lung cancer might be attributed to cigarettes, but
occupational hazards cannot be ruled out.
Case-Control studies- retrospective study
• Compare cases with the disease to controls without the disease, looking for differences in
exposure.
• Multiple causes of a single disease can be studied.
• Helps in studying relatively rare diseases and requires a smaller sample size.
• Information is generally obtained retrospectively from hospital records, questionnaires or
interviews.
• Limitations are the validity of retrospective information and the challenging task of
selecting controls. Inappropriate control selection will result in incorrect conclusions.
Ex: Incidence of white spot lesion in patients who have undergone orthodontic treatment
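The "difference in exposure" between cases and controls is commonly summarized as an odds ratio. The slides do not name this measure, so treat the sketch below as an assumption about how the comparison might be quantified; the function name and the counts are hypothetical.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    if unexposed_cases == 0 or exposed_controls == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


# Hypothetical 2x2 table: 40 of 100 cases and 20 of 100 controls were exposed.
print(round(odds_ratio(40, 60, 20, 80), 2))
```

An odds ratio above 1 suggests the exposure is more common among cases than among controls.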
Cohort studies
• Identify subsets of a defined population and follow them over time,
looking for differences in their outcome.
• Used to compare exposed patients to unexposed patients; can also be
used to compare one exposure to another or when multiple outcomes
from a single exposure are to be studied.
• Can be done prospectively or retrospectively.
• Requires large sample size and can require prolonged time period to
study delayed outcomes.
EXPOSURE → DISEASE
Ex: A study on incidence of dental caries in patients undergoing
orthodontic treatment
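In a cohort study the outcome in exposed and unexposed groups is often compared as a relative risk. The slides do not specify this measure, so the sketch below is an assumption for illustration only, with a hypothetical function name and made-up counts.

```python
def relative_risk(exposed_with_outcome, exposed_total,
                  unexposed_with_outcome, unexposed_total):
    """Incidence in the exposed group divided by incidence in the unexposed group."""
    risk_exposed = exposed_with_outcome / exposed_total
    risk_unexposed = unexposed_with_outcome / unexposed_total
    return risk_exposed / risk_unexposed


# Hypothetical cohort: caries in 30 of 100 orthodontic patients
# versus 10 of 100 untreated subjects over the same follow-up.
print(round(relative_risk(30, 100, 10, 100), 2))
```

A relative risk above 1 indicates a higher incidence of the outcome in the exposed group.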
Randomized clinical trials
• An experimental study- the investigator controls the therapy that is to
be administered to the participants.
• Major strength is the randomization.
• Disadvantages: ethical issues and could be expensive.
Ex: Intraligamentous injections of a Vitamin D metabolite caused an
increase in the number of osteoclasts, which led to an increase in the
amount of tooth movement
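Randomization, named above as the major strength of this design, can be sketched as a simple random allocation into two equal arms. This is a minimal illustration, not a full trial procedure: the function name, arm labels and seed are hypothetical, and real trials often use block or stratified schemes instead.

```python
import random


def randomize(participant_ids, seed=None):
    """Shuffle participants and split them into two arms of (nearly) equal size."""
    rng = random.Random(seed)          # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


allocation = randomize(range(1, 21), seed=42)
print(allocation["intervention"])
print(allocation["control"])
```

Because the split depends only on the shuffle, neither the investigator nor the participant chooses who receives the therapy.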
Meta Analysis study
• Definition: Statistical analysis of a collection of analytical results for the
purpose of integrating the findings.
• Uses:
Identify sources of variation among study findings.
Provide an overall measure of effect as a summary of those findings.
• Most often used to assess the clinical effectiveness of healthcare
interventions. It does this by combining data from two or more randomized
controlled trials.
• It provides a precise estimate of treatment effect, giving due weight to
the size of the different studies involved.
• Studies chosen for inclusion in a meta analysis must be sufficiently similar
in a number of characteristics for their results to be combined accurately.
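"Giving due weight" to each study can be sketched with inverse-variance weighting, one common way of pooling results in a fixed-effect meta analysis. The slides do not say which weighting scheme is meant, so this choice is an assumption; the function name and the study figures are hypothetical.

```python
def pooled_effect(effects, standard_errors):
    """Fixed-effect pooled estimate and its standard error (inverse-variance weights)."""
    # More precise studies (smaller standard error) receive larger weights.
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total_weight
    pooled_se = (1 / total_weight) ** 0.5
    return estimate, pooled_se


# Three hypothetical trials: effect estimates and their standard errors.
estimate, pooled_se = pooled_effect([0.30, 0.10, 0.25], [0.10, 0.20, 0.05])
print(round(estimate, 3), round(pooled_se, 3))
```

The pooled standard error is smaller than that of any single study, which is the sense in which the combined estimate is more precise.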
Evidence Based Dentistry
• It is an approach to dental practice that uses the results of patient care
research and other available objective evidence as a component of
clinical decision making.
Need for Evidence based Dentistry
• Daunting number of diseases
• Availability of a broad number of therapeutic options
• To keep ourselves updated in our field of expertise.
• Growing number of information sources
• To remain competent throughout our careers.
Writing the Protocol
• All the efforts put into the preceding steps culminate in the draft of the
research protocol that incorporates all the information regarding the
research in a concise manner.
• The protocol should contain background information on the study,
objectives, ethical aspects, study design, study procedures, methods of
assessment, statistics and evaluation, administrative issues and
references.
• Once the protocol is ready, approval from the Ethics committee should be
obtained before the start of the study.
• Along with the protocol, the informed consent form and other documents
required should be submitted to the ethics committee for approval.
Collecting the data
• Once the protocol is finalized, the data should be collected.
• The data forms should be legibly filled, and they should be fully
completed.
• Ethical issues must be taken care of from the beginning to the end of the
study.
• In drug trials care must be taken to document the details of adverse
events if any.
• Proper documentation throughout the study is important to ensure
credibility of data.
Ex: Patients with white spot lesions post orthodontic treatment
Analyze the data and apply statistical significance
• The data should be scrutinized for internal consistency and external
validity.
• Data should be analyzed using the already decided data management
plan
Write the report
• The report should be sufficiently detailed to remove any doubt a
reader might have about any aspect of the results.
• It should be properly worded and adequately illustrated by
charts, diagrams or tables which enhance the clarity.
• All the limitations need to be described openly.
Conclusion
• Research is a scientific method used to collect and analyze information
to increase our understanding or solve issues in a particular area.
• The research topic should be feasible, interesting, novel, ethical and
relevant.
• Ethical considerations should be taken care of while conducting the
research.
• The research result should not be biased; both negative and
positive results should be reported/published.
References
• Soben Peter. Essentials of Preventive Community Dentistry, Third Edition.
• Soben Peter. Essentials of Preventive Community Dentistry, Fourth Edition.
• Park’s Textbook of Preventive and Social Medicine, 22nd Edition.
• Health Research Methodology, WHO publication, 1993.
• T Bhaskara Rao. Methods of Biostatistics.
THANK YOU
