STAT 516 hw 8

Author

Karl Gregory

Students in two different classes were asked to measure their heart rates. Each student counted heartbeats during two 30-second periods separated by several minutes and also recorded his or her sex. It is of interest to see whether the mean resting heart rate differs between men and women, as claimed. Since the data were collected in more than one class, the following model is considered: Assume

$$Y_{ijk} = \mu + \tau_{i} + B_{j} + \varepsilon_{ijk},$$

where $Y_{ijk}$ represents the heart rate (the sum of the two 30-second counts) for student $k$ of sex $i$ in class $j$, $\tau_{i}$ is the effect of sex $i$, the $B_{j}$ are class effects assumed to be independent $N(0, \sigma_{B}^{2})$ random variables, and the $\varepsilon_{ijk}$ are independent error terms having the $N(0, \sigma_{\varepsilon}^{2})$ distribution for $i = 1, 2$, $j = 1, 2$, and $k = 1, \dots, n_{ij}$.

This is a randomized complete block design with replication, where “with replication” means that there is more than one experimental unit at each combination of block and factor level.

No interaction term is included, because this data set has $a = 2$ sexes and $b = 2$ classes, so that the interaction term would have $(a - 1)(b - 1) = 1$ degree of freedom. This means that we would have to estimate the variance of the random interaction effect based on a single random realization; as it is not advisable to estimate a variance based on a single observation, we leave the interaction term out.
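To make the model and the degrees-of-freedom argument concrete, the following sketch simulates one balanced data set from the model and reads off the degrees of freedom with and without an interaction term. All parameter values (`mu`, `tau`, `sigma_B`, `sigma_eps`) and the object name `sim` are made up for illustration and have nothing to do with the heart rate data.

```r
# Simulate from Y_ijk = mu + tau_i + B_j + eps_ijk.
# All numbers below are hypothetical and chosen only for illustration.
set.seed(1)
a <- 2; b <- 2; n <- 8      # 2 sexes, 2 classes, 8 students per cell
mu        <- 70             # overall mean (assumed)
tau       <- c(2, -2)       # fixed sex effects (assumed), summing to zero
sigma_B   <- 3              # sd of the random class effects (assumed)
sigma_eps <- 10             # sd of the error terms (assumed)

B <- rnorm(b, mean = 0, sd = sigma_B)   # one random draw per class

sim <- expand.grid(k = 1:n, i = 1:a, j = 1:b)
sim$y <- mu + tau[sim$i] + B[sim$j] +
  rnorm(nrow(sim), mean = 0, sd = sigma_eps)
sim$sex   <- factor(sim$i)
sim$class <- factor(sim$j)

# Every sex-by-class cell holds n observations (replication)
table(sim$sex, sim$class)

# Degrees of freedom: sex, class, residual
anova(lm(y ~ sex + class, data = sim))$Df   # 1 1 29

# With an interaction the extra term has (a - 1)(b - 1) = 1 df
anova(lm(y ~ sex * class, data = sim))$Df   # 1 1 1 28
```

Note that only `b = 2` values of `B` are ever drawn, and an interaction effect would contribute just a single degree of freedom, which is the situation described above.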

The code below reads in the data set HR_sex_combined.csv, which can be downloaded here.

hr0 <- read.csv(pathtofile, # must edit pathtofile
                skip = 1,
                col.names = c("first","second","sex","class"),
                colClasses = c("numeric","numeric","factor","factor"))

hr0$hr <- hr0$first + hr0$second

To make this homework simpler, work with a balanced version of the data created by the following code, which draws sub-samples of size $n = 8$ from each class and sex combination.

# fix the random seed so the same subsets 
# are drawn every time the code runs
set.seed(2) 

hr0$class_sex <- as.factor(paste(hr0$class,hr0$sex))
classes <- levels(hr0$class_sex)
ab <- length(classes)
n <- 8
hr <- data.frame()
for(i in 1:ab){
  
  ind <- which(hr0$class_sex == classes[i])
  hr <- rbind(hr,hr0[sample(ind,n),])  
  
}
head(hr)

   first second sex            class hr          class_sex
24    32     31   f STAT_515_sp_2026 63 STAT_515_sp_2026 f
7     44     44   f STAT_515_sp_2026 88 STAT_515_sp_2026 f
28    40     33   f STAT_515_sp_2026 73 STAT_515_sp_2026 f
9     46     39   f STAT_515_sp_2026 85 STAT_515_sp_2026 f
2     38     34   f STAT_515_sp_2026 72 STAT_515_sp_2026 f
19    44     43   f STAT_515_sp_2026 87 STAT_515_sp_2026 f
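Before continuing, it may be worth confirming that the sub-sampling step really produces a balanced design. The sketch below repeats the same idea on a small made-up stand-in for `hr0` (the names `toy0`, `toy`, `class_A`, `class_B` and all values are hypothetical) so that it runs without the data file.

```r
# Made-up stand-in for hr0 with unequal cell sizes (hypothetical values)
set.seed(2)
cell_sizes <- c(10, 12, 9, 15)
toy0 <- data.frame(
  sex   = factor(rep(c("f", "m", "f", "m"), times = cell_sizes)),
  class = factor(rep(c("class_A", "class_A", "class_B", "class_B"),
                     times = cell_sizes))
)
toy0$hr <- round(rnorm(nrow(toy0), mean = 70, sd = 10))
toy0$class_sex <- as.factor(paste(toy0$class, toy0$sex))

# Draw n rows without replacement from each class and sex combination
n <- 8
toy <- do.call(rbind, lapply(levels(toy0$class_sex), function(cs) {
  ind <- which(toy0$class_sex == cs)
  toy0[sample(ind, n), ]
}))

# A balanced design has the same count in every cell
table(toy$class, toy$sex)
```

With the real data, the analogous check would be `table(hr$class, hr$sex)`, which should show 8 in every cell.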

1.

Make boxplots of the heart rate measurements for all class and sex combinations.

2.

Give the values of the $F$ test statistics and $p$ values for testing the null hypotheses of i) no difference in mean heart rate between the sexes and ii) no variation between classes, that is, $\sigma_{B}^{2} = 0$. Interpret your findings for i) and ii), assuming that the assumptions of the model are met.

3.

Obtain the method of moments estimators of the variance components in the model. Report the estimated standard deviations as well as any issues encountered.

4.

Give a 95% confidence interval for the difference (women minus men) in mean resting heart rate between the sexes. Construct the interval without using the lmer() function. Give an interpretation of your interval.

5.

Obtain the REML estimators of the variance components in the model. Report the estimated standard deviations.

6.

Give a 95% confidence interval for the difference in mean resting heart rate between men and women (women minus men) using the output of the lmer() function. Give an interpretation of your interval.

7.

Check the assumptions of the model.