Dataviz

Visualize an interaction with ggplot

I've had to do this enough times (and have to look it up each time) that I decided to memorialize it here.

The issue: I have a two-way repeated measures design and I want to visualize all four cells. I'd like one plot to contain the individual responses as well as the cell means. I also want to connect each individual's responses across conditions.

The solution: Put one factor on the x-axis and facet by the other. Within each facet, draw a separate line for each subject, plus an additional line for the cell means.

Here's a simple demo (with a bonus example of how to simulate such a dataset).

library(tidyverse)

# simulate 2x2 interaction with 40 subjects
n <- 40

# define effects for intercept, A, B, and A*B
a  <-  5.0 # intercept
b1 <-  0.3 # main effect of A
b2 <- -0.5 # main effect of B
b3 <-  1.0 # interaction

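# simulate outcomes (each subject contributes one observation per cell)
A <- rep(c(0, 1), each = n * 2)         # factor A: first 2n rows are 0, last 2n rows are 1
B <- rep(c(0, 1), each = n, times = 2)  # factor B: blocks of n zeros and n ones within each level of A
e <- rnorm(n * 4)                       # independent standard-normal noise
id <- rep(1:n, times = 4)               # subject ids, repeated once per cell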

y <- a + b1*A + b2*B + b3*A*B + e

d <- data.frame(A=A, B=B, y=y, id=id)
d$id <- factor(id)

# use group_by to create a new data frame with the cell means
g <- d %>%
  group_by(A, B) %>%
  summarize(y = mean(y), .groups = "drop")  # drop grouping so g is a plain 4-row table

ggplot(d, aes(x=A, y=y)) +
  geom_line(aes(group=id), alpha=.3) +       # plot by individual
  geom_point(data=g, size=2) +               # plot cell means as points
  geom_path(data=g, aes(group=B), linewidth=1) +  # plot mean lines (linewidth needs ggplot2 >= 3.4.0; use size in older versions)
  facet_wrap(~B)                             # split plots by Factor B
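
Before trusting the plot, it can help to confirm that the simulated data really have the intended repeated measures structure. The following is a minimal sketch in base R, not part of the original demo: it rebuilds the same design, checks that every subject appears exactly once in each of the four cells, and prints the cell means so they can be compared with the values implied by the effects. The seed value and the object names `cell_counts`, `subject_by_cell`, and `cell_means` are assumptions chosen for illustration.

```r
# Self-contained re-creation of the simulated design (base R only).
set.seed(123)  # assumption: arbitrary seed, added only for reproducibility
n <- 40

A  <- rep(c(0, 1), each = n * 2)
B  <- rep(c(0, 1), each = n, times = 2)
id <- factor(rep(seq_len(n), times = 4))
y  <- 5.0 + 0.3 * A - 0.5 * B + 1.0 * A * B + rnorm(n * 4)
d  <- data.frame(A = A, B = B, y = y, id = id)

# Each of the four cells should hold n observations.
cell_counts <- table(A = d$A, B = d$B)
print(cell_counts)

# Each subject should appear exactly once in every cell.
subject_by_cell <- table(d$id, interaction(d$A, d$B))
stopifnot(all(subject_by_cell == 1))

# Cell means, for comparison with the values implied by the effects:
# 5.0 (A=0, B=0), 5.3 (A=1, B=0), 4.5 (A=0, B=1), 5.8 (A=1, B=1)
cell_means <- aggregate(y ~ A + B, data = d, FUN = mean)
print(cell_means)
```

With 40 subjects per cell, the observed means will wander around the implied values because of the noise term, so only rough agreement should be expected.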