Linear Mixed Effect Models

When building mixed-effect models, starting with simple models such as the global intercept model can check to see if problems exist with either the data or code. A global intercept assumes a single intercept can describe all of the variability in the data. One way to view a global intercept is that you cannot do any better modeling that data than to only model the mean without including any other predictor variables.

Can be done as a first step (Simple lmer with global intercept) - this is assuming that slopes and intercepts are correlated within each group for random-effect estimates


# Build a lmer with State as a random effect
birth_rate_state_model <- lmer(BirthRate ~ (1 | State),
                            data = county_births_data)
# Plot the predicted values from the model on top of the plot shown during the video
county_births_data$birthPredictState <-
	predict(birth_rate_state_model, county_births_data)
ggplot() + 
    theme_minimal() +
    geom_point(data = county_births_data,
               aes(x = TotalPopulation, y = BirthRate)) + 
    geom_point(data = county_births_data,
               aes(x = TotalPopulation, y = birthPredictState),
               color = 'blue', alpha = 0.5)


Fixed effects:
                   Estimate Std. Error t value
(Intercept)        27.57033    1.81801  15.165
AverageAgeofMother -0.53549    0.06349  -8.434

Correlation of Fixed Effects:
            (Intr)
AvrgAgfMthr -0.994

The estimate is negative, suggesting counties with older mothers have lower birth rates on average.

If we do not want to assume that random-effects are not correlated:

use || rather than | with lmer() syntax
a perfect negative correlation means that random-effect slope and random-effect intercept cannot be estimated independent of each other

Same predictor can be both - a fixed effect as well as random-effect

The output from lmer

lmer fits with REML - restricted maximum likelihood method

numerically fitting a mixed effect model can be difficult

extracting outputs

The fixed-effects estimates can be called directly using the fixef() function.
The random-effects estimates can be called directly using the ranef() function.
extract confidence intervals for the fixed-effects using the function confint()

plotting the results

plotting the coefficients - geom_point()
95% confidence intervals - geom_linerange()
visualize where zero is located - geom_hline()
if the 95% confidence intervals do not include zero, the coefficient's estimate differs from zero


# Extract out the parameter estimates and confidence intervals
coef_estimates <-
	tidy(out, conf.int = TRUE) %>%
	filter(effect == "fixed")

# Print the new dataframe
print(coef_estimates)

# Plot the results using ggplot2
ggplot(coef_estimates, aes(x = term, y = estimate,
                     ymin = conf.low, ymax = conf.high)) +
    geom_hline( yintercept = 0, color = 'red' ) +
    geom_linerange() + geom_point() + coord_flip() + theme_minimal()