I've enjoyed lurking on the #tidytuesday hashtag on Twitter. For those unfamiliar - every Tuesday a new dataset is provided, and folks are encouraged to practice their data visualization skills, especially within the tidyverse.
For Black History Month, the goal is to recreate some of the iconic images that W.E.B. Du Bois created for the 1900 Paris Exposition. For this week, the goal is to recreate “Valuation of Town and City Property Owned by Georgia Negroes” (plate 21).
Overall, I'm pretty happy with how this turned out. Here's a sneak peek at the final product. You can find all of the code for these plots here.
First steps
The first thing I did after downloading the dataset was add columns to separate the white and black line segments, including X and Y locations for the question marks.
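d <- d %>%
  # backticks are needed because the column name contains a space
  mutate(val = `Property Valuation`) %>%
  # values for the white segments (start and end of the series)
  mutate(KKK = case_when(
    Year < 1875 ~ val,
    Year > 1898 ~ val,
    TRUE ~ NA_real_
  )) %>%
  # values for the black segment in between
  mutate(postKKK = case_when(
    Year >= 1874 & Year < 1899 ~ val,
    TRUE ~ NA_real_
  )) %>%
  # question-mark labels wherever the white segment has a value
  mutate(KKKmark = case_when(
    is.na(KKK) ~ "",
    TRUE ~ "?"
  ))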
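To see which rows each column keeps, here is a small check on made-up data. The years and values below are invented for illustration and are not the TidyTuesday dataset; only the case_when() conditions are taken from the code above.

```r
library(dplyr)

# Toy data: invented values, only to show which rows each column keeps
toy <- tibble(Year = c(1870, 1874, 1875, 1898, 1899),
              val  = c(1, 2, 3, 4, 5) * 1e6)

toy <- toy %>%
  mutate(KKK = case_when(
    Year < 1875 ~ val,
    Year > 1898 ~ val,
    TRUE ~ NA_real_
  )) %>%
  mutate(postKKK = case_when(
    Year >= 1874 & Year < 1899 ~ val,
    TRUE ~ NA_real_
  ))

# 1874 is non-missing in both columns, so the two segments meet at that year
toy
```

The overlap at 1874 is what lets the white and black lines join without a gap.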
The next step was to get the basic layout.
- I used geom_line() several times, once to lay out a thicker black line through all the points, then thinner white lines.
- I initially tried to create a separate plot for the separated y-axis, but the sizing never looked right. Instead, I just expanded the x-axis to the left with coord_cartesian() and drew a rectangle over the section I didn't want using annotate('rect').
d %>%
  ggplot(aes(Year)) +
  geom_line(aes(y=val), size=2.5) +
  geom_line(aes(y=KKK), size=2, color='white') +
  geom_line(aes(y=postKKK), size=2) +
  geom_text(aes(y=KKK, label=KKKmark), size=1.5) +
  scale_x_continuous(breaks = seq(min(d$Year), max(d$Year), 5),
                     minor_breaks = seq(min(d$Year), max(d$Year), 1)) +
  scale_y_continuous(minor_breaks = seq(0, 4.8*1e6, 1e5),
                     breaks = c(1e6, 2*1e6, 3*1e6, 4*1e6),
                     labels = c("1,000,000", "2,000,000",
                                "3,000,000", "4,000,000")) +
  labs(x = element_blank(), y = element_blank()) +
  coord_cartesian(xlim=c(1861, 1900), ylim=c(0, 4.8*1e6), expand=FALSE) +
  theme(axis.ticks.y = element_blank()) +
  annotate('rect', xmin=1865, xmax=1870,
           ymin=0, ymax=4.8*1e6,
           fill='white')
- The next step was to force the grid to be square by setting theme(aspect.ratio=1) and to add a main title.
- I also resized the y-axis labels and moved them inside the separated grid:
axis.text.y = element_text(margin=margin(l=10, r=-36), size=7)
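Put together, those theme tweaks could look something like the sketch below. The toy data frame, the use of ggtitle(), and the centred title are assumptions for illustration; the aspect ratio and margin values are the ones quoted above.

```r
library(ggplot2)

# Toy data: invented values standing in for the real dataset
toy <- data.frame(Year = c(1870, 1880, 1890, 1900),
                  val  = c(1e6, 2e6, 3e6, 4e6))

p <- ggplot(toy, aes(Year, val)) +
  geom_line() +
  ggtitle("Valuation of Town and City Property Owned by Georgia Negroes") +
  theme(aspect.ratio = 1,                        # square plotting panel
        plot.title = element_text(hjust = 0.5),  # centred title (assumption)
        # negative right margin pulls the labels into the panel area
        axis.text.y = element_text(margin = margin(l = 10, r = -36), size = 7))
```

Printing p draws the plot. The negative right margin only looks right if there is blank space inside the panel for the labels to land in, which is what the widened x-axis provides.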
Adjusting the color
- adopt Du Bois' color palette (thanks to Katie Press for the codes).
- add horizontal lines at every $1 million
- add gray transparent boxes around the two plot sections. Getting the transparency working was a little tricky, but I had success using:
color="#131211", fill = alpha('gray', 0)
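As a rough sketch of how the horizontal lines and an outlined, see-through box might be layered, assuming geom_hline() for the lines and annotate('rect') for the box: the toy data and the box coordinates here are made up, and only the color and fill arguments come from the snippet above.

```r
library(ggplot2)

# Toy data: invented values standing in for the real dataset
toy <- data.frame(Year = c(1870, 1900), val = c(1e6, 4e6))

p <- ggplot(toy, aes(Year, val)) +
  geom_line() +
  # one horizontal line at every $1 million
  geom_hline(yintercept = seq(1e6, 4e6, by = 1e6)) +
  # alpha('gray', 0) makes the fill fully transparent, so only the outline shows
  annotate('rect', xmin = 1870, xmax = 1900, ymin = 0, ymax = 4.8 * 1e6,
           color = "#131211", fill = alpha('gray', 0))
```

Setting the fill through alpha() rather than the layer's alpha argument keeps the outline opaque while the interior stays clear.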
At this point, I realized the question marks still didn't look right. Du Bois actually plotted them at the midpoints between years. To replicate that, I added some new columns to the data to calculate that midpoint.
d <- d %>%
  mutate(Year2 = lead(Year)) %>%
  mutate(val2 = lead(val)) %>%
  mutate(MarkX = Year + (Year2 - Year)/2) %>%
  mutate(MarkY = val + (val2 - val)/2) %>%
  mutate(MarkLabel = case_when(
    is.na(MarkX) ~ "",
    TRUE ~ "?"
  ))
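The midpoint arithmetic is easy to check on a few invented rows. The values below are made up for illustration; the formulas are the same as above, just written with lead() inline.

```r
library(dplyr)

# Toy data: invented values, three consecutive observations
toy <- tibble(Year = c(1870, 1875, 1880),
              val  = c(1e6, 3e6, 2e6)) %>%
  mutate(MarkX = Year + (lead(Year) - Year) / 2,
         MarkY = val + (lead(val) - val) / 2)

toy$MarkX  # 1872.5, 1877.5, NA
toy$MarkY  # 2000000, 2500000, NA
```

The last row has no following observation, so lead() returns NA there, which is why MarkLabel is left blank whenever MarkX is missing.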
Adding annotations
Now to place the text annotations, including the dollar signs on the y-axis.
Finally, I added rectangles behind the text annotations that were the same color as the background. That matches Du Bois' original image pretty well.
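A minimal sketch of the rectangle-behind-text idea, assuming annotate() for both layers. The background color, label text, and coordinates are all hypothetical placeholders, not the values used in the final plot.

```r
library(ggplot2)

bg <- "#e5d4c3"  # hypothetical background color; swap in the palette's value

# Toy data: invented values standing in for the real dataset
toy <- data.frame(Year = c(1870, 1900), val = c(1e6, 4e6))

p <- ggplot(toy, aes(Year, val)) +
  geom_line() +
  # rectangle in the background color hides whatever is drawn beneath the label
  annotate('rect', xmin = 1874, xmax = 1886, ymin = 2.6e6, ymax = 3.0e6,
           fill = bg, color = NA) +
  # the text is added after the rectangle so it sits on top
  annotate('text', x = 1880, y = 2.8e6, label = "EXAMPLE LABEL", size = 2.5) +
  theme(panel.background = element_rect(fill = bg),
        plot.background  = element_rect(fill = bg))
```

Layer order matters here: the rectangle has to come after the lines it should cover and before the text it sits behind.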