Homework1029
2. Let $T^- = \sum_{i=1}^{n} (1 - \psi_i) R_i$. Verify directly, or illustrate using the data of the above table, the equation
$T^+ + T^- = n(n + 1)/2$.
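The identity in item 2 can be checked numerically. The sketch below is only an illustration: the data are hypothetical (the table referred to above is not reproduced here), the helper name `signed_rank_sums` is made up, and it assumes no zeros and no ties among the absolute values.

```python
def signed_rank_sums(z):
    """Return (T_plus, T_minus) for data with no zeros and no tied |z| values."""
    # Rank the observations by absolute value (rank 1 = smallest |z|).
    order = sorted(range(len(z)), key=lambda i: abs(z[i]))
    rank = {i: r for r, i in enumerate(order, start=1)}
    t_plus = sum(rank[i] for i in range(len(z)) if z[i] > 0)
    t_minus = sum(rank[i] for i in range(len(z)) if z[i] < 0)
    return t_plus, t_minus


# Hypothetical data, NOT taken from the assignment's table.
z = [1.2, -0.4, 3.1, -2.2, 0.7]
n = len(z)
t_plus, t_minus = signed_rank_sums(z)
print(t_plus, t_minus, n * (n + 1) // 2)  # 10 5 15
```

Every rank from 1 to n is counted exactly once, either in $T^+$ or in $T^-$, which is why the two sums add up to 1 + 2 + … + n.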
3. For an arbitrary number of observations n, what are the smallest and largest possible values of $T^+$?
Justify your answers.
4. Consider the test of H0: θ = 0 versus H1: θ > 0 based on $T^+$ for the following n = 10 Z observations:
Z1 = 2.5, Z2 = 3.7, Z3 = 0, Z4 = −0.6, Z5 = 4.7, Z6 = 0, Z7 = 1.4, Z8 = 0, Z9 = 1.9, Z10 = 5.2.
Compute the P-values for the competing $T^+$ procedures based on discarding the zero Z values and
reducing n accordingly, as recommended in the Ties portion of this section.
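For the "discard the zeros" approach in item 4, the exact upper-tail P-value can be obtained by enumerating all sign assignments. This is a sketch under stated assumptions: the function name is hypothetical, and the code assumes no ties among the nonzero absolute values, which holds for the data above.

```python
from itertools import product


def signed_rank_exact_pvalue(z):
    """Exact P(T+ >= observed) after discarding zeros; assumes no tied |z|."""
    nonzero = [v for v in z if v != 0]
    n = len(nonzero)
    order = sorted(range(n), key=lambda i: abs(nonzero[i]))
    ranks = [0] * n
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    t_plus = sum(r for r, v in zip(ranks, nonzero) if v > 0)
    # Under H0 each rank 1..n carries a positive sign independently with prob. 1/2.
    count = sum(
        1
        for signs in product((0, 1), repeat=n)
        if sum(s * r for s, r in zip(signs, range(1, n + 1))) >= t_plus
    )
    return n, t_plus, count / 2 ** n


z = [2.5, 3.7, 0, -0.6, 4.7, 0, 1.4, 0, 1.9, 5.2]  # data from item 4
print(signed_rank_exact_pvalue(z))  # (7, 27, 0.015625)
```

With the three zeros removed, n = 7 and only the smallest absolute value is negative, so $T^+$ = 27 and just two of the 128 sign patterns give a value at least that large.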
5. In an investigation to determine the effect of aspirin on bleeding time and platelet adhesion, Bick,
Adams, and Schmalhorst (1976) studied the reactions of normal subjects to aspirin. A subset of
their data is presented in the following table, where the X observation for each subject is the
bleeding time (in seconds) before ingestion of 600 mg of aspirin and the Y observation is the
bleeding time (again in seconds) 2 h after administration of the aspirin. Analyze these data using the sign test.
6. Consider the test of H0: θ = 0 versus H1 : θ > 0 based on B for the following n = 15 Z observations:
Z1 = 2.5, Z2 = 0, Z3 = 3.7, Z4 = −0.6, Z5 = 1.7, Z6 = 0, Z7 = 5.9, Z8 = 4.6, Z9 = 0, Z10 = −1.4,
Z11 = 5.4, Z12 = 4.6, Z13 = 3.1, Z14 = −2.0, and Z15 = 6.3. Compute the P-values for the competing
B procedures based on discarding the zero Z values and reducing n accordingly, as recommended
in the Ties portion.
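For item 6, the sign statistic B and its exact binomial upper-tail P-value after discarding zeros can be sketched as follows. The function name is hypothetical; the only modelling assumption is the usual one that, under H0, B is Binomial(n, 1/2) on the reduced sample.

```python
from math import comb


def sign_test_pvalue(z):
    """Upper-tail sign test after discarding zeros: returns (n, B, P(B >= b))."""
    nonzero = [v for v in z if v != 0]
    n = len(nonzero)
    b = sum(1 for v in nonzero if v > 0)
    p = sum(comb(n, k) for k in range(b, n + 1)) / 2 ** n
    return n, b, p


z = [2.5, 0, 3.7, -0.6, 1.7, 0, 5.9, 4.6, 0, -1.4,
     5.4, 4.6, 3.1, -2.0, 6.3]  # data from item 6
print(sign_test_pvalue(z))  # (12, 9, 0.072998046875)
```

Here 12 nonzero observations remain, 9 of them positive, so the P-value is (220 + 66 + 12 + 1)/4096 = 299/4096.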
7. The data in the following table are a subset of the data obtained by Thomas and Simmons (1969), who
investigated the relation of sputum histamine levels to inhaled irritants or allergens. The histamine
content was reported in micrograms per gram of dry weight of sputum. The subjects for this portion
of the study consisted of 22 smokers; 9 of them were allergics and the remaining 13 were
asymptomatic (nonallergic) individuals. Care was taken to avoid people who carried out part of
their daily work in an atmosphere of noxious gases or other respiratory toxicants. The table gives the
ordered sputum histamine levels for the 22 individuals in the study. Test the hypothesis of equal
levels versus the alternative that allergic smokers have higher sputum histamine levels than
nonallergic smokers. Use the large-sample approximation.
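A possible way to carry out the large-sample approximation in item 7 is sketched below. The histamine table is not reproduced in this text, so the data shown are hypothetical placeholders; the function names are made up. The code uses midranks and the usual tie-corrected null variance, with W taken as the sum of the ranks of the Y sample (here, the group expected to be larger).

```python
from math import erfc, sqrt


def midranks(values):
    """Ranks 1..N, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_upper_pvalue(x, y):
    """Normal approximation to P(W >= w), W = sum of the Y ranks."""
    m, n = len(x), len(y)
    total = m + n
    pooled = list(x) + list(y)
    w = sum(midranks(pooled)[m:])
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t * (t - 1) * (t + 1) for t in counts.values()) / (total * (total - 1))
    var = m * n / 12 * (total + 1 - tie_term)
    z = (w - n * (total + 1) / 2) / sqrt(var)
    return w, z, 0.5 * erfc(z / sqrt(2))  # upper-tail standard normal probability


# Hypothetical values only; substitute the 13 nonallergic (x) and 9 allergic (y) levels.
x = [4.7, 12.3, 18.0, 20.5, 27.3]
y = [31.0, 64.7, 18.0, 102.4]
print(rank_sum_upper_pvalue(x, y))
```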
8. Let $W'$ be the sum of the ranks of the X observations. Verify the equation $W + W' = (m + n)(m + n + 1)/2$.
9. Let $U'$ denote the number of $(X_i, Y_j)$ pairs for which $X_i > Y_j$. Assume that there are no X = Y ties,
and establish directly the relation $U + U' = mn$.
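The identities in items 8 and 9 can be illustrated on a small example before proving them. The sketch assumes that W is the sum of the Y ranks and U is the number of pairs with $X_i < Y_j$ (the counterparts of $W'$ and $U'$ as defined above); the samples are hypothetical and contain no ties.

```python
def rank_sums_and_counts(x, y):
    """Return (W, W', U, U') for two samples with no ties at all."""
    combined = sorted(x + y)
    rank = {v: r for r, v in enumerate(combined, start=1)}  # valid only without ties
    w_prime = sum(rank[v] for v in x)  # sum of X ranks
    w = sum(rank[v] for v in y)        # sum of Y ranks
    u = sum(1 for xi in x for yj in y if xi < yj)
    u_prime = sum(1 for xi in x for yj in y if xi > yj)
    return w, w_prime, u, u_prime


x = [1.1, 3.4, 2.0]       # hypothetical X sample, m = 3
y = [2.7, 0.5, 4.2, 5.0]  # hypothetical Y sample, n = 4
m, n = len(x), len(y)
w, w_prime, u, u_prime = rank_sums_and_counts(x, y)
print(w + w_prime, (m + n) * (m + n + 1) // 2)  # 28 28
print(u + u_prime, m * n)                       # 12 12
```

Without ties, each of the mn pairs is counted in exactly one of U and $U'$, and each of the ranks 1, …, m + n is counted in exactly one of W and $W'$.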
10. The platelet count (per cubic millimeter) data for the mothers are given in the following table. Find
the P-value for an appropriate nonparametric test procedure to assess whether there are any significant
location or dispersion differences in the predelivery maternal platelet counts for the control and
prednisone-treated groups.
11. Consider the serum iron data. Use the Kolmogorov–Smirnov test procedure to assess whether there
are significant differences of any kind between the distribution of serum iron values obtained by
the Ramsay method and the distribution of serum iron values obtained by the Jung–Parekh method.
Find the P-value for the test and compare it with the results discussed in Example Serum Iron
Determination.
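As a rough sketch for item 11, the two-sample Kolmogorov–Smirnov statistic is the largest absolute gap between the two empirical distribution functions. The serum iron data are not reproduced here, so the samples below are hypothetical. The P-value uses the asymptotic Kolmogorov series, which is an approximation and may differ from exact small-sample tables; the function name and the cutoff for very small statistics are choices made for this illustration.

```python
from math import exp, sqrt


def ks_two_sample(x, y):
    """Return (D, approximate two-sided P-value) for the two-sample KS test."""
    m, n = len(x), len(y)
    d = 0.0
    for t in sorted(set(x) | set(y)):
        f = sum(1 for v in x if v <= t) / m  # empirical CDF of x at t
        g = sum(1 for v in y if v <= t) / n  # empirical CDF of y at t
        d = max(d, abs(f - g))
    lam = sqrt(m * n / (m + n)) * d
    if lam < 0.3:
        return d, 1.0  # the asymptotic tail probability is essentially 1 here
    p = 2 * sum((-1) ** (k - 1) * exp(-2 * k * k * lam * lam) for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Hypothetical measurements, NOT the serum iron data.
ramsay = [96, 108, 99, 105, 111, 100, 103]
jung_parekh = [104, 112, 118, 109, 115, 121, 107]
print(ks_two_sample(ramsay, jung_parekh))
```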
12. Pretherapy training of clients has been shown to have beneficial effects on the process and outcome
of counseling and psychotherapy. Sauber (1971) investigated four different approaches to
pretherapy training: (1) control (no treatment); (2) therapeutic reading (TR) (indirect learning); (3)
vicarious therapy pretraining (VTP) (videotaped, vicarious learning); and (4) group, role induction
interview (RII) (direct learning).
Treatment conditions 2–4 were expected to enhance the outcome of counseling and psychotherapy
as compared with a control group, those subjects receiving no prior set of structuring procedures.
One of the major variables of the study was that of “psychotherapeutic attraction.” The basic data
in the following table consist of the raw scores for this measure according to each of the four
experimental conditions. Apply the Kruskal–Wallis procedure with the correction for ties.
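The tie-corrected Kruskal–Wallis statistic for item 12 can be computed along the following lines. The raw scores are not reproduced in this text, so the four groups below are hypothetical; `kruskal_wallis` and `midranks` are illustrative names. The function returns the uncorrected H, the corrected H' = H / (1 − Σ(t³ − t)/(N³ − N)), and the degrees of freedom k − 1 for the chi-square reference distribution.

```python
def midranks(values):
    """Ranks 1..N, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, tie-corrected H', degrees of freedom)."""
    values = [v for g in groups for v in g]
    n_total = len(values)
    ranks = midranks(values)
    h, start = 0.0, 0
    for g in groups:
        r_sum = sum(ranks[start:start + len(g)])  # rank sum of this group
        h += r_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n_total ** 3 - n_total)
    return h, h / correction, len(groups) - 1


# Hypothetical scores for the four conditions, NOT Sauber's data.
groups = [[0, 1, 3, 3, 5], [0, 6, 7, 9, 11], [2, 5, 6, 10, 12], [3, 8, 9, 13, 14]]
print(kruskal_wallis(groups))
```

The corrected statistic is then compared with a chi-square distribution on k − 1 = 3 degrees of freedom.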
13. Consider the fasting metabolic rate (FMR) data on white-tailed deer. Test the hypothesis of no
difference in FMR over the 2-month periods against a general umbrella alternative. Use an
approximate significance level of .01. Compare your result with that obtained in Example fasting
metabolic rate.
14. The data in the following table are a subset of the data obtained by Sylvester (1969) in a study
concerned with the anatomical and pathological status of the corticospinal and somatosensory
tracts and parietal lobes of patients who had had cerebral palsy. Among other things, he was
interested in the relationship between brain weights and large fiber (>7.5 μ in diameter) counts in
the medullary pyramid. The table gives the mean brain weights (g) and medullary pyramid large
fiber counts for 11 cerebral palsy subjects. Test the hypothesis of independence versus the general
alternative that brain weight and large fiber count in the medullary pyramid are correlated in
subjects who have had cerebral palsy.
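One distribution-free option for item 14 is a test based on the Kendall statistic K (the number of concordant pairs minus the number of discordant pairs). The sketch below assumes that choice and uses the large-sample normal approximation without a tie correction; the brain weight table is not reproduced here, so the data are hypothetical.

```python
from math import erfc, sqrt


def kendall_test(x, y):
    """Return (K, tau, z, two-sided P-value); no correction for ties."""
    n = len(x)
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[j] - x[i]) * (y[j] - y[i])
            k += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tied
    tau = 2 * k / (n * (n - 1))
    z = k / sqrt(n * (n - 1) * (2 * n + 5) / 18)  # null variance of K without ties
    return k, tau, z, erfc(abs(z) / sqrt(2))


# Hypothetical (brain weight, fiber count) pairs, NOT Sylvester's data.
weight = [515, 286, 469, 410, 461, 436]
fibers = [32500, 26800, 11410, 14850, 23640, 23820]
print(kendall_test(weight, fibers))
```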
15. The following table summarizes responses of 91 married couples in Arizona to a question. Find
and interpret a measure of association between wife’s response and husband’s response.
16. Refer to the following. Construct and interpret a 95% confidence interval for the odds ratio.
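If the table referred to in item 16 is a 2×2 table with cell counts a, b (first row) and c, d (second row), a common large-sample interval is the Wald interval on the log odds ratio. The sketch below assumes that layout, which is a guess because the table is not reproduced here; the counts are hypothetical, and all four must be positive.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Sample odds ratio ad/(bc) and Wald CI; z = 1.96 gives about 95% coverage."""
    or_hat = a * d / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(or_hat)
    return or_hat, exp(log(or_hat) - z * se), exp(log(or_hat) + z * se)


# Hypothetical counts, NOT taken from the assignment.
print(odds_ratio_ci(20, 10, 10, 20))
```

An interval that excludes 1 suggests an association between the row and column variables; an interval containing 1 is consistent with independence.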
Homework #3
Due on Dec. 24th, Tuesday
1
3. We have a data set with measurements of the rainfall from 26 clouds (for the purposes of this assignment,
ignore the “Seeded” column). Most clouds gave off very little precipitation, so the density near 0 is high.
Standard kernel approaches produce an estimate of the density $\hat{f}$ that is high near zero and, problematically, positive
below zero. Obviously, rainfall cannot be negative, so this estimate is unappealing. For (a)-(c)
below, estimate the density according to the described approach, and plot the resulting estimate (you may
overlay all your answers into a single plot or keep them separate; either way is fine).
(a) Estimate the density using the standard approach, ignoring the boundary problem.
(b) Estimate the density by taking a log transformation of the data, fitting the density on this scale, then
transforming the estimated density back to the original scale.
(c) Estimate the density by taking the standard approach and reflecting the estimated density that lies in
$(-\infty, 0)$ about 0.
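The three estimators in (a)-(c) can be sketched with a Gaussian kernel as follows. Everything here is an assumption made for illustration: the rainfall values are hypothetical (the data set is not reproduced in this text), the bandwidth is a simple normal-reference rule, and the function names are made up. Plotting is left out; each function returns a callable that can be evaluated on a grid.

```python
from math import exp, log, pi, sqrt
from statistics import stdev


def rule_of_thumb(data):
    """Normal-reference bandwidth; one of many possible choices."""
    return 1.06 * stdev(data) * len(data) ** (-0.2)


def gauss_kde(data, h):
    """Gaussian kernel density estimate with bandwidth h."""
    n = len(data)

    def f(t):
        return sum(exp(-0.5 * ((t - v) / h) ** 2) for v in data) / (n * h * sqrt(2 * pi))

    return f


def density_standard(data):
    """(a) Ordinary KDE, which leaks mass below zero."""
    return gauss_kde(data, rule_of_thumb(data))


def density_log_transform(data):
    """(b) KDE of log(data), mapped back via f_X(t) = f_Y(log t) / t."""
    logs = [log(v) for v in data]  # requires strictly positive data
    g = gauss_kde(logs, rule_of_thumb(logs))
    return lambda t: g(log(t)) / t if t > 0 else 0.0


def density_reflected(data):
    """(c) Fold the mass on (-inf, 0) back onto the positive axis."""
    f = density_standard(data)
    return lambda t: f(t) + f(-t) if t >= 0 else 0.0


# Hypothetical rainfall amounts, NOT the cloud data.
rain = [0.1, 0.4, 0.9, 1.5, 2.2, 4.1, 7.8, 12.0, 30.5]
for name, est in [("standard", density_standard(rain)),
                  ("log transform", density_log_transform(rain)),
                  ("reflection", density_reflected(rain))]:
    print(name, [round(est(t), 4) for t in (-1.0, 0.5, 5.0)])
```

The division by t in (b) is the Jacobian of the log transformation; without it the back-transformed curve would not integrate to 1.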
4
5. Suppose $x_i \overset{iid}{\sim} \mathrm{Unif}(-3, 3)$ for i = 1, 2, …, 100 and that $y_i = f(x_i) + \varepsilon_i$, where $\varepsilon_i$ follows a standard normal
distribution and $f(x) = -x^2$. Conduct a simulation study comparing two methods: polynomial regression
(with linear and quadratic terms) and local linear regression. NOTE: Some implementations refuse to
predict outside the observed range. To avoid this, set two values of $\{x_i\}$ equal to -3 and 3 and let the other
98 be uniformly distributed.
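One way to organize the simulation in item 5 is sketched below using only the standard library. The Gaussian kernel, the bandwidth h = 0.5, the evaluation grid, the number of replications, and the use of average squared error against the true f are all assumptions made for this illustration; the assignment does not prescribe them.

```python
import random
from math import exp


def true_f(x):
    return -x ** 2


def simulate_data(rng, n=100):
    """Two design points pinned at -3 and 3, the remaining n - 2 uniform on (-3, 3)."""
    x = [-3.0, 3.0] + [rng.uniform(-3, 3) for _ in range(n - 2)]
    y = [true_f(v) + rng.gauss(0, 1) for v in x]
    return x, y


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small linear system."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, k):
            factor = m[r][c] / m[c][c]
            for j in range(c, k + 1):
                m[r][j] -= factor * m[c][j]
    beta = [0.0] * k
    for r in reversed(range(k)):
        beta[r] = (m[r][k] - sum(m[r][j] * beta[j] for j in range(r + 1, k))) / m[r][r]
    return beta


def fit_quadratic(x, y):
    """Least squares fit of y on 1, x, x^2 via the normal equations."""
    rows = [[1.0, v, v * v] for v in x]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    b0, b1, b2 = solve(xtx, xty)
    return lambda t: b0 + b1 * t + b2 * t * t


def fit_local_linear(x, y, h=0.5):
    """Local linear regression with a Gaussian kernel (closed-form intercept)."""
    def predict(t):
        w = [exp(-0.5 * ((v - t) / h) ** 2) for v in x]
        s0 = sum(w)
        s1 = sum(wi * (v - t) for wi, v in zip(w, x))
        s2 = sum(wi * (v - t) ** 2 for wi, v in zip(w, x))
        t0 = sum(wi * yi for wi, yi in zip(w, y))
        t1 = sum(wi * (v - t) * yi for wi, v, yi in zip(w, x, y))
        return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 * s1)
    return predict


def run_study(n_rep=50, seed=0, h=0.5):
    """Average squared error against the true f on a grid, for both methods."""
    rng = random.Random(seed)
    grid = [-3 + 6 * i / 60 for i in range(61)]
    mse = {"quadratic": 0.0, "local linear": 0.0}
    for _ in range(n_rep):
        x, y = simulate_data(rng)
        fits = {"quadratic": fit_quadratic(x, y), "local linear": fit_local_linear(x, y, h)}
        for name, fit in fits.items():
            mse[name] += sum((fit(t) - true_f(t)) ** 2 for t in grid) / len(grid)
    return {name: total / n_rep for name, total in mse.items()}


print(run_study())
```

Because the quadratic model is correctly specified here, it would be expected to do well; a fuller study might also vary h, report bias and variance separately, and look at behavior near the boundaries ±3.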
6. Show that the geometric distribution belongs to the exponential dispersion family. Suppose we have a
sample $(x_i, y_i)$, i = 1, …, n, where $y_i$ is assumed to follow a geometric distribution whose parameter $p_i$
depends on $x_i$. Set up a generalized linear model and a generalized nonparametric regression
model, respectively, and report the corresponding log-likelihood function and local likelihood function.
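As a starting point for item 6, the derivation below sketches one possible setup. It assumes the parameterization $P(Y = y) = p(1 - p)^y$ for y = 0, 1, 2, … (the "number of failures" convention), and it uses a logit link for $p_i$, a kernel $K_h$, and a local linear expansion around a target point $x_0$. These are illustrative choices; other conventions and links are equally valid.

```latex
% Exponential dispersion form, assuming P(Y = y) = p (1 - p)^y, y = 0, 1, 2, ...
\begin{align*}
f(y; p) &= \exp\{\, y \log(1 - p) + \log p \,\}
         = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\}, \\
\theta &= \log(1 - p), \qquad b(\theta) = -\log\bigl(1 - e^{\theta}\bigr), \qquad
a(\phi) = 1, \qquad c(y, \phi) = 0 .
\end{align*}

% GLM with a logit link (an assumed choice): logit(p_i) = \eta_i = x_i^{\top} \beta
\begin{align*}
\ell(\beta) &= \sum_{i=1}^{n} \bigl[\, y_i \log(1 - p_i) + \log p_i \,\bigr]
             = \sum_{i=1}^{n} \bigl[\, \eta_i - (y_i + 1) \log\bigl(1 + e^{\eta_i}\bigr) \bigr].
\end{align*}

% Local likelihood at x_0, with \eta_i = \beta_0 + \beta_1 (x_i - x_0)
\begin{align*}
\ell(\beta_0, \beta_1; x_0) &= \sum_{i=1}^{n} K_h(x_i - x_0)
  \bigl[\, \eta_i - (y_i + 1) \log\bigl(1 + e^{\eta_i}\bigr) \bigr],
\qquad \hat{p}(x_0) = \frac{e^{\hat\beta_0}}{1 + e^{\hat\beta_0}} .
\end{align*}
```

As a check on the first display, $b'(\theta) = e^{\theta}/(1 - e^{\theta}) = (1 - p)/p$, which is the mean of this geometric distribution.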