The logo for TidyTuesday
Dataviz, R

My first #TidyTuesday

I've enjoyed lurking on the #tidytuesday hashtag on Twitter. For those unfamiliar: every Tuesday a new dataset is provided, and folks are encouraged to practice their data visualization skills, especially within the tidyverse.

For Black History Month, the goal is to recreate some of the iconic images that W.E.B. Du Bois created for the 1900 Paris Exposition. This week's target is “Valuation of Town and City Property Owned by Georgia Negroes” (plate 21).

Du Bois' plate 21

Overall, I'm pretty happy with how this turned out. Here's a sneak peek at the final product. You can find all of the code for these plots here.

Final version of #tidytuesday for 2022-02-15

First steps

The first thing I did after downloading the dataset was add columns to separate the white and black line segments, including X and Y locations for the question marks.

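library(tidyverse)  # dplyr for the data steps, ggplot2 for the plots

d <- d %>%
  # Backticks are required because the column name contains a space
  mutate(val = `Property Valuation`) %>%
  # KKK: values for years before 1875 and after 1898 (drawn in white in the plot)
  mutate(KKK = case_when(
    Year < 1875 ~ val,
    Year > 1898 ~ val,
    TRUE ~ NA_real_
  )) %>%
  # postKKK: values for 1874 through 1898 (drawn in black in the plot)
  mutate(postKKK = case_when(
    Year >= 1874 & Year < 1899 ~ val,
    TRUE ~ NA_real_
  )) %>%
  # Question mark label only on rows where KKK has a value
  mutate(KKKmark = case_when(
    is.na(KKK) ~ "",
    TRUE ~ "?"
  ))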

The next step was to get the basic layout.

  • I used geom_line() several times, once to lay out a thicker black line through all the points, then thinner white lines.
  • I initially tried to create a separate plot for the separated y-axis, but the sizing never looked right. Instead, I just expanded the x-axis to the left with coord_cartesian() and drew a rectangle over the section I didn't want using annotate('rect').
d %>%
  ggplot(aes(Year)) +
  # Thick black line through every point, with thinner lines layered on top.
  # (In ggplot2 >= 3.4.0, `linewidth` replaces `size` for lines.)
  geom_line(aes(y = val), size = 2.5) +
  geom_line(aes(y = KKK), size = 2, color = 'white') +
  geom_line(aes(y = postKKK), size = 2) +
  geom_text(aes(y = KKK, label = KKKmark), size = 1.5) +
  scale_x_continuous(breaks = seq(min(d$Year), max(d$Year), 5),
                     minor_breaks = seq(min(d$Year), max(d$Year), 1)) +
  scale_y_continuous(minor_breaks = seq(0, 4.8*1e6, 1e5),
                     breaks = c(1e6, 2*1e6, 3*1e6, 4*1e6),
                     labels = c("1,000,000", "2,000,000",
                                "3,000,000", "4,000,000")) +
  # NULL removes the axis titles
  labs(x = NULL, y = NULL) +
  # Extend the x-axis to the left to make room for the separated y-axis
  coord_cartesian(xlim = c(1861, 1900), ylim = c(0, 4.8*1e6), expand = FALSE) +
  theme(axis.ticks.y = element_blank()) +
  # White rectangle covers the part of the grid that should not be visible
  annotate('rect', xmin = 1865, xmax = 1870,
           ymin = 0, ymax = 4.8*1e6,
           fill = 'white')
Initial attempt. Includes the question marks over the white lines and a separated y-axis.
  • The next step was to force the grids to be square by setting theme(aspect.ratio=1) and to add a main title.
  • I also resized the y-axis labels and moved them inside the separated grid with axis.text.y = element_text(margin=margin(l=10, r=-36), size=7).
Second step. Y-axis labels now properly aligned.
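
As a rough sketch of those two theme tweaks, the snippet below applies them to a toy plot. The `toy` data frame, its values, and the title string are made up for illustration; only the `theme()` arguments come from the steps above.

```r
library(ggplot2)

# Hypothetical stand-in data, not the real property valuations
toy <- data.frame(Year = c(1870, 1880, 1890, 1900),
                  val  = c(1e6, 2e6, 3e6, 4e6))

p <- ggplot(toy, aes(Year, val)) +
  geom_line() +
  ggtitle("PLACEHOLDER TITLE") +  # placeholder, not the plate's real title
  theme(
    # Make the plotting panel as tall as it is wide
    aspect.ratio = 1,
    # The negative right margin pulls the labels to the right, into the plot
    axis.text.y = element_text(margin = margin(l = 10, r = -36), size = 7)
  )
```

The right margin of -36 is what moves the labels inward, so it will likely need retuning if the plot size or font size changes.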

Adjusting the color

  • I adopted Du Bois' color palette (thanks to Katie Press for the codes).
  • I added horizontal lines at every $1 million.
  • I added gray transparent boxes around the two plot sections. Getting the transparency working was a little tricky, but I had success using color="#131211", fill = alpha('gray', 0). An alpha of 0 makes the fill fully transparent, so only the outline is drawn.
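
Here is one way those three changes might fit together, sketched on toy data. The two hex codes stored in `bg_col` and `grid_col` are placeholders I picked for illustration, not the palette codes credited above, and the `toy` data frame is invented.

```r
library(ggplot2)

# Hypothetical stand-in data, not the real property valuations
toy <- data.frame(Year = c(1870, 1880, 1890, 1900),
                  val  = c(1e6, 2e6, 3e6, 4e6))

bg_col   <- "#e4d2c1"  # placeholder background color (an assumption)
grid_col <- "#dc143c"  # placeholder red for the $1 million lines (an assumption)

p <- ggplot(toy, aes(Year, val)) +
  # One horizontal line at every $1 million
  geom_hline(yintercept = seq(1e6, 4e6, by = 1e6), color = grid_col) +
  geom_line() +
  # Outlined box with a fully transparent fill around the main plot section
  annotate("rect", xmin = 1870, xmax = 1900, ymin = 0, ymax = 4.8e6,
           color = "#131211", fill = alpha("gray", 0)) +
  coord_cartesian(xlim = c(1861, 1900), ylim = c(0, 4.8e6), expand = FALSE) +
  theme(panel.background = element_rect(fill = bg_col),
        plot.background  = element_rect(fill = bg_col))
```

A second `annotate("rect", ...)` with different `xmin` and `xmax` values would outline the separated y-axis section in the same way.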

At this point, I realized the question marks still didn't look right. Du Bois actually plotted them at the midpoints between years. To replicate that, I added some new columns to the data to calculate that midpoint.

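d <- d %>%
  # lead() returns the next row's value, so the last row gets NA
  mutate(Year2 = lead(Year)) %>%
  mutate(val2 = lead(val)) %>%
  # Midpoint of the segment between each point and the next one
  mutate(MarkX = Year + (Year2 - Year)/2) %>%
  mutate(MarkY = val + (val2 - val)/2) %>%
  # The last row has no following point, so it gets no label
  mutate(MarkLabel = case_when(
    is.na(MarkX) ~ "",
    TRUE ~ "?"
  ))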
Question marks at correct location (midpoints along line segments)
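
To make the midpoint arithmetic concrete, here is a small worked example on made-up numbers. The `toy` tibble and its values are invented; the formulas are the same as in the code above, with the `lead()` calls inlined.

```r
library(dplyr)

# Three invented observations, five years apart
toy <- tibble(Year = c(1870, 1875, 1880),
              val  = c(100, 300, 200))

mid <- toy %>%
  mutate(MarkX = Year + (lead(Year) - Year) / 2,
         MarkY = val + (lead(val) - val) / 2)

# MarkX is 1872.5, 1877.5, NA and MarkY is 200, 250, NA:
# each label sits halfway along its segment, and the last row
# has no following point, so its midpoint is NA.
print(mid)
```

Because the last row's midpoint is NA, the `is.na(MarkX)` check is what keeps a stray question mark from being drawn there.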

Adding annotations

Now to place the text notations, including the dollar signs on the y-axis.

So close! Text annotations in place, but a little difficult to read on top of red gridlines
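
A minimal sketch of how text like this can be placed with `annotate('text')`. The coordinates, sizes, note wording, and `toy` data are all placeholders for illustration, not the values used in the final plot.

```r
library(ggplot2)

# Hypothetical stand-in data, not the real property valuations
toy <- data.frame(Year = c(1870, 1880, 1890, 1900),
                  val  = c(1e6, 2e6, 3e6, 4e6))

p <- ggplot(toy, aes(Year, val)) +
  geom_line() +
  # One "$" at each million; the single x value is recycled across the four y values
  annotate("text", x = 1862, y = seq(1e6, 4e6, by = 1e6),
           label = "$", size = 2.5) +
  # A single note placed by hand; wording and position are made up
  annotate("text", x = 1885, y = 3.5e6,
           label = "PLACEHOLDER NOTE", size = 2) +
  coord_cartesian(xlim = c(1861, 1900), ylim = c(0, 4.8e6), expand = FALSE)
```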

Finally, I added rectangles behind the text annotations that were the same color as the background. That matches Du Bois' original image pretty well:

Final version
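
For reference, the background-rectangle trick can be sketched like this. The rectangle bounds, the colors, the note text, and the `toy` data are assumptions chosen for illustration; the point is only the layer order, since ggplot2 draws later layers on top of earlier ones.

```r
library(ggplot2)

# Hypothetical stand-in data, not the real property valuations
toy <- data.frame(Year = c(1870, 1880, 1890, 1900),
                  val  = c(1e6, 2e6, 3e6, 4e6))

bg_col <- "#e4d2c1"  # placeholder background color (an assumption)

p <- ggplot(toy, aes(Year, val)) +
  geom_hline(yintercept = seq(1e6, 4e6, by = 1e6), color = "#dc143c") +
  geom_line() +
  # Rectangle in the background color hides the gridline behind the note
  annotate("rect", xmin = 1880, xmax = 1890, ymin = 2.9e6, ymax = 3.1e6,
           fill = bg_col) +
  # The text is added after the rectangle, so it is drawn on top of it
  annotate("text", x = 1885, y = 3e6, label = "PLACEHOLDER NOTE", size = 2) +
  theme(panel.background = element_rect(fill = bg_col))
```

Note that in this toy plot the rectangle also hides the data line where the two overlap, so the rectangle bounds need to be chosen to clear the line.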