STAT 516 hw 7
Students in several classes were asked to measure on their left hands the distances A and B.
It is of interest to see whether our (humans’) fingers grow according to the golden ratio, such that the mean of the ratio B/A equals (1 + sqrt(5))/2, approximately 1.618.
A comma-separated-values file containing the data can be downloaded from here. It can be read into R with the following code:
gr <- read.csv(pathtofile, # replace pathtofile
               colClasses = c("numeric","numeric","factor"))
gr$r <- gr$B / gr$A # compute ratio B / A
There were different numbers of students in each class, so the design is unbalanced. In the first part of this homework we will make a balanced version of the data set; in the second part we will work with the full data set.
Part I
Construct a balanced data set by running the code below, which draws (without replacement) a subsample of size n = 15 from each class:
# fix the random seed so the same subsets
# are drawn every time the code runs
set.seed(1)
classes <- levels(gr$class)
a <- length(classes)
n <- 15
gr_bal <- data.frame()
for (i in 1:a) {
  ind <- which(gr$class == classes[i])
  gr_bal <- rbind(gr_bal, gr[sample(ind, n), ])
}
head(gr_bal)
table(gr_bal$class)
1. Using the balanced data, report the mean of the ratio
2. Produce side-by-side boxplots of the observed responses in the different classes.
3. Obtain method of moments estimators for
4. Check whether these are the same as the restricted maximum likelihood estimates obtained with lmer().
5. Report the test statistic and
6. Obtain for each class a prediction (a guess) of the realized value of the class random effect
7. Construct a
8. Remove the set.seed(1) command (so that new subsamples of data are drawn in the construction of the balanced data set) and run your analysis again: Do you get the same results? How stable is the analysis? Do you have any concerns?
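The balanced-case calculations behind items 3 and 6 can be sketched on simulated data. This is only an illustration under assumptions that are not stated in the handout: the model is written as y_ij = mu + A_i + e_ij with variance components sigma_A^2 and sigma_eps^2 (notation assumed here), the names grp, A_true, shrink and A_pred are hypothetical, and every numerical value in the simulation is made up rather than taken from the finger data.

```r
# Simulated balanced data (hypothetical values, NOT the finger measurements)
set.seed(516)
a <- 6                                   # number of classes (assumed)
n <- 15                                  # observations per class
grp <- factor(rep(seq_len(a), each = n))
A_true <- rnorm(a, mean = 0, sd = 0.05)  # simulated class effects
y <- 1.6 + A_true[as.integer(grp)] + rnorm(a * n, sd = 0.15)

# ANOVA mean squares for the balanced one-way layout
ybar_i <- tapply(y, grp, mean)           # class means
ybar <- mean(y)                          # grand mean
MSA <- n * sum((ybar_i - ybar)^2) / (a - 1)
MSE <- sum((y - ybar_i[as.integer(grp)])^2) / (a * (n - 1))

# Method of moments: match the mean squares to their expectations,
# E[MSE] = sigma_eps^2 and E[MSA] = sigma_eps^2 + n * sigma_A^2
sigma2_eps_hat <- MSE
sigma2_A_hat <- (MSA - MSE) / n          # note: this can come out negative

# Plug-in shrinkage predictor of each class effect (balanced case),
# truncating a negative variance estimate at zero
shrink <- n * max(sigma2_A_hat, 0) / (MSE + n * max(sigma2_A_hat, 0))
A_pred <- shrink * (ybar_i - ybar)

# Cross-check the hand-computed mean squares against anova()
tab <- anova(lm(y ~ grp))
c(sigma2_A_hat = sigma2_A_hat, sigma2_eps_hat = sigma2_eps_hat)
```

With the real data, gr_bal$r and gr_bal$class would take the place of y and grp. The predictions in A_pred are the centered class means pulled toward zero by the factor shrink, which is why they are never larger in absolute value than the raw deviations.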
Part II
Now use the entire data set.
1. Produce side-by-side boxplots of the responses in the treatment groups.
2. Report the number of observations in each class.
3. Check whether the assumptions of the one-way random effects model hold for these data.
4. Obtain a
5. Consider the interval computed by the code below. Explain the strategy behind the construction of this interval and carefully explain whether you think it is appropriate or inappropriate (trustworthy or untrustworthy).
y <- gr$r
y.. <- mean(y)
sn <- sd(y)
N <- length(y)
alpha <- 0.05
tval <- qt(1-alpha/2,N-1)
lo <- y.. - tval * sn / sqrt(N)
up <- y.. + tval * sn / sqrt(N)
c(lo,up)
[1] 1.568001 1.635472
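One way to probe an interval of this kind is a small simulation that generates data from a one-way random effects model and records how often the interval contains the true mean. The sketch below is an illustration only: the number of classes, the class size, the mean, and both standard deviations are hypothetical values chosen for the example, not estimates from the finger data, and a balanced layout is assumed for simplicity.

```r
# Empirical coverage of the pooled one-sample t interval when the data
# actually come from a one-way random effects model.
# All parameter values below are hypothetical.
set.seed(1)
a <- 8              # number of classes
n <- 15             # observations per class
mu <- 1.6           # true overall mean
sigma_A <- 0.05     # sd of the class random effects
sigma_eps <- 0.15   # sd of the within-class errors
alpha <- 0.05

covers <- replicate(2000, {
  A_i <- rnorm(a, sd = sigma_A)                      # one effect per class
  y <- mu + rep(A_i, each = n) + rnorm(a * n, sd = sigma_eps)
  N <- length(y)
  half <- qt(1 - alpha / 2, N - 1) * sd(y) / sqrt(N) # same formula as above
  abs(mean(y) - mu) <= half                          # TRUE if mu is covered
})

coverage <- mean(covers)  # proportion of intervals containing mu
coverage
```

Comparing coverage with the nominal level 1 - alpha, and rerunning with sigma_A set to 0, shows how much the result depends on the assumed class-to-class variation.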