For me, teaching stats this semester has turned into a journey of discovering what the distributional and ggdist packages can do for me. The way I make illustrative figures will never be the same. So I thought I’d revisit my post about hierarchical variance priors, this time implementing the figures using these two packages.

library(tibble)
library(dplyr)
library(ggplot2)
library(scico)
library(ggdist)
library(distributional)

Custom y theme and scale

theme_no_y <- function(){

  out_theme <- theme(
    axis.text.y = element_blank(),
    axis.ticks.y = element_blank(),
    axis.title.y = element_blank()
  )

  out_theme  
}

scale_y_tight <- function(...) {
  scale_y_continuous(expand = expansion(0), ...)
}

Random effect variance priors

When fitting a model with a random intercept, the group-level random effects (let’s say, \(\gamma_i\)) are sampled from a normal distribution

\[ \gamma_i \sim \mathcal{N}(0,\sigma) \]

If you look at the default prior brms uses for \(\sigma\), it’s a truncated student-t:

student_t(3, 0, 2.5)

We can make this distribution!

full_t <- dist_student_t(
  df = 3, 
  mu = 0, 
  sigma = 2.5
)

half_t <- dist_truncated(
  full_t, 
  lower = 0
)

We can get things like the mean and variance of this truncated distribution, and generate random samples from it

mean(half_t)

[1] 2.756644

variance(half_t)

[1] 11.15091

generate(half_t, 10)

[[1]]
 [1] 5.1531979 1.2497857 1.0370256 3.0964036 4.6862006 1.1580219 0.2008173
 [8] 8.3719751 2.6385236 0.2936411

And we can plot it

ggplot()+
  stat_slab(
    aes(
      xdist = half_t
    ),
    fill = "#EE6677"
  )+
  scale_y_tight()+
  theme_no_y()+
  scale_thickness_shared()

Let’s just use the expected value of this distribution as \(\sigma\) for now.

init_sigma <- mean(half_t)

Random effects in probability space

Now let’s make the normal distribution for the group-level random effects.

ranef_dist <- dist_normal(
  mu = 0, 
  sigma = init_sigma
)

ggplot()+
  stat_slab(
    aes(
      xdist = ranef_dist
    ),
    fill = "#EE6677"
  )+
  scale_y_tight()+
  theme_no_y()

Ok, great! But what if the model I’m fitting is a logistic regression? This random effects distribution is in the logit space. But what does it look like if we transform it to the probability space?

ranef_prob_dist <- dist_transformed(
  ranef_dist,
  plogis,
  qlogis
)

ggplot()+
  stat_slab(
    aes(
      xdist = ranef_prob_dist
    ),
    fill = "#EE6677"
  )+
  scale_y_tight()+
  theme_no_y()

Rather than having a random effects distribution where groups are broadly distributed across the probability space, we actually have a random effects distribution where groups are pretty strongly bifurcated. And this is at the expected value for our prior over \(\sigma\).

Looking for a more neutral distribution

Let’s see what different \(\sigma\)s look like in the probability space.

possible_dists <- tibble(
  sigma = seq(
    1, 2.1, by = 0.1
  ),
  dist = dist_normal(0, sigma),
  p_dist = dist_transformed(
    dist, plogis, qlogis
  )
)

ggplot(
  possible_dists,
  aes(
    xdist = p_dist,
    fill = sigma
  )
)+
  stat_slab(
    color = "black",
    linewidth = 0.5
  )+
  scale_fill_scico(
    palette = "devon",
    guide = "none"
  )+
  scale_y_tight()+
  facet_wrap(~sigma)+
  theme_no_y()+
  theme_no_x()

It looks like somewhere between 1.3 and 1.4 is the sweet spot for a maximally flat random effects distribution in the probability space.

Really honing in on it

I can try getting even more precise by looking at a vectorized version of these distributions, and finding the largest sigma what still has its density peak at 0.5.

library(purrr)

# a vector of sigmas
sigmas = seq(1.3, 1.5, length = 100)

# a vectorized normal
vec_dist <- dist_normal(
  mu = 0,
  sigma = sigmas
) 

# a vectorized ilogit(normal)
vec_p_dist <- dist_transformed(
  vec_dist,
  plogis,
  qlogis
)

# the density function
# from 0 to 0.5
p_densities <- density(
  vec_p_dist, 
  seq(0, 0.5, length = 100)
)

# The index of the max
# density
where_is_max <- p_densities |> 
  map_vec(
    which.max
  ) 

# if where_is_max == 100
# peak density was at 0.5
flat_idx <- (where_is_max == 100) |> 
  which() |> 
  max()

flattest_sigma <- sigmas[flat_idx]

flattest_sigma

[1] 1.413131

Let’s take a look at it:

flat_pdist <- dist_normal(0, flattest_sigma) |> 
  dist_transformed(plogis,qlogis)
  
ggplot()+
  stat_slab(
    aes(
      xdist = flat_pdist
    ),
    fill = "#EE6677"
  )+
  scale_y_tight()+
  theme_no_y()

What about a prior?

In a logistic regression I think I would usually like a prior over \(\sigma\) for random effects to have its expected value right about here, with about as close as we can get to a uniform prior in probability space. Then, the data can pull it towards being more bifurcated, or more focused, depending.

I’m not sure how to solve that, but I can do a grid search!

tibble(
  prior_sigma = seq(1,1.5,length = 100),
  half_student = dist_student_t(3,0,prior_sigma) |> 
    dist_truncated(lower = 0),
  expected = mean(half_student)
)  |> 
  slice(
    which.min(abs(expected - flattest_sigma))
  )

# A tibble: 1 × 3
  prior_sigma        half_student expected
        <dbl>              <dist>    <dbl>
1        1.28 t(3, 0, 1.3)[0,Inf]     1.41

Let’s call it 1.28.

ranef_sigma_prior <- dist_student_t(3, 0, 1.28) |> 
  dist_truncated(lower = 0)

set.seed(2024)
ranef_sigmas <- generate(ranef_sigma_prior, 9)[[1]]


tibble(
  sigma = ranef_sigmas,
  rounded_sigma = round(sigma, digits = 2),
  dist = dist_normal(0, sigma),
  p_dist = dist_transformed(
    dist, plogis, qlogis
  )
) |> 
  ggplot(
    aes(
      xdist = p_dist
    )
  )+
  stat_slab(
    aes(
      fill = sigma
    ),
    normalize = "panels",
    color = "black",
    linewidth = 0.5
  )+ 
  scale_y_tight()+
  scale_fill_scico(
    palette = "devon",
    guide = "none"
  )+
  scale_x_continuous(
    breaks = c(0,1)
  )+
  facet_wrap(~rounded_sigma)+
  theme_no_y()

Looks good to me!

Reuse

CC-BY-SA 4.0

Citation

BibTeX citation:

@online{fruehwald2024,
  author = {Fruehwald, Josef},
  title = {Random Effect Priors, Redo},
  series = {Væl Space},
  date = {2024-11-19},
  url = {https://jofrhwld.github.io/blog/posts/2024/11/2024-11-19_logit-priors-again/},
  langid = {en}
}

For attribution, please cite this work as:

Fruehwald, Josef. 2024. “Random Effect Priors, Redo.” Væl Space. November 19, 2024. https://jofrhwld.github.io/blog/posts/2024/11/2024-11-19_logit-priors-again/.