Linear Mixed Effect Models

  • When building mixed-effect models, starting with simple models such as the global intercept model can check to see if problems exist with either the data or code. A global intercept  assumes a single intercept can describe all of the variability in the data. One way to view a global intercept is that you cannot do any better modeling that data than to only model the mean without including any other predictor variables.
    • Can be done as a first step (Simple lmer with global intercept) - this is assuming that slopes and intercepts are correlated within each group for random-effect estimates
    • # Build a lmer with State as a random effect birth_rate_state_model <- lmer(BirthRate ~ (1 | State), data = county_births_data) # Plot the predicted values from the model on top of the plot shown during the video county_births_data$birthPredictState <- predict(birth_rate_state_model, county_births_data) ggplot() + theme_minimal() + geom_point(data = county_births_data, aes(x = TotalPopulation, y = BirthRate)) + geom_point(data = county_births_data, aes(x = TotalPopulation, y = birthPredictState), color = 'blue', alpha = 0.5)
      Fixed effects: Estimate Std. Error t value (Intercept) 27.57033 1.81801 15.165 AverageAgeofMother -0.53549 0.06349 -8.434 Correlation of Fixed Effects: (Intr) AvrgAgfMthr -0.994
      The estimate is negative, suggesting counties with older mothers have lower birth rates on average.
       
  • If we do not want to assume that random-effects are not correlated:
    • use || rather than | with lmer() syntax
    • a perfect negative correlation means that random-effect slope and random-effect intercept cannot be estimated independent of each other
  • Same predictor can be both - a fixed effect as well as random-effect
 
 

The output from lmer

  • lmer fits with REML - restricted maximum likelihood method
    • numerically fitting a mixed effect model can be difficult
  • extracting outputs
    • The fixed-effects estimates can be called directly using the fixef() function.
    • The random-effects estimates can be called directly using the ranef()  function.
    • extract confidence intervals for the fixed-effects using the function confint()
  • plotting the results
    • plotting the coefficients - geom_point()
    • 95% confidence intervals - geom_linerange()
    • visualize where zero is located - geom_hline()
    • if the 95% confidence intervals do not include zero, the coefficient's estimate differs from zero
# Extract out the parameter estimates and confidence intervals coef_estimates <- tidy(out, conf.int = TRUE) %>% filter(effect == "fixed") # Print the new dataframe print(coef_estimates) # Plot the results using ggplot2 ggplot(coef_estimates, aes(x = term, y = estimate, ymin = conf.low, ymax = conf.high)) + geom_hline( yintercept = 0, color = 'red' ) + geom_linerange() + geom_point() + coord_flip() + theme_minimal()
notion image