• The logo for TidyTuesday

    TidyTuesday: season vignette formats

    For the 2022-03-15 #tidytuesday, we're working with data compiled by Robert Flight. The data reflect vignette uploads to CRAN and Bioconductor. I wanted to focus on the seasonal nature of uploads, so I used a spiral plot. This was a great opportunity to use the spiralize and ComplexHeatmap packages by Zuguang Gu. I had to rely heavily on grid functionality to add the title, subtitles, and caption. I found these posts particularly helpful. Note: I used the zoo package to calculate the 7-day rolling averages. All code is available on GitHub.

  • The logo for TidyTuesday

    2 TidyTuesdays

    The last two weeks of #tidytuesday have both involved data that can be spatially mapped. They were a good opportunity to get more familiar with showing information on states in the US or countries in Europe.

    Alternative fuel sources in the US

    The data for 2022-03-01 are fueling stations throughout the US that offer alternatives to gasoline or diesel. I used the usmap package to help plot this one. Code for this graphic is here.

    Erasmus exchange program

    The data for 2022-03-08 come from the Erasmus+ exchange program, which allows students to travel to other countries. I decided to look at which countries received more students than they sent away.…

  • Picture of small brains with arms and legs in random colors.

    Introducing MobNet

    I have been running a homebrew (i.e., designed from scratch) Dungeons & Dragons game for the last five years. This past New Year's Eve, my five players were victorious, saving their multiverse from certain annihilation. I'm excited about starting a second campaign, but was struggling to come up with new creatures to challenge and surprise them. Then I realized I could use artificial intelligence to help (special thanks to Jacqueline Nolis at SaturnCloud for a demonstration using a neural net to generate pet names). More specifically, I trained a neural network on a list of 1,368 names from existing creatures. MobNet produces names like Orze, Garez, or Wartus. (Header image is…

  • The logo for TidyTuesday

    My first #TidyTuesday

    I've enjoyed lurking on the #tidytuesday hashtag on Twitter. For those unfamiliar: every Tuesday a new dataset is provided, and folks are encouraged to practice their data visualization skills, especially within the tidyverse. For Black History Month, the goal is to recreate some of the iconic images that W.E.B. Du Bois created for the 1900 Paris Exposition. For this week, the goal is to recreate “Valuation of Town and City Property Owned by Georgia Negroes” (plate 21). Overall, I'm pretty happy with how this turned out. Here's a sneak peek at the final product. You can find all of the code for these plots

  • Advent Of Code 2021

    This was my first year trying Advent of Code. I originally intended to do all the solutions in both R and Python. However, as they got more complicated, I decided to prioritize R, sticking with the tidyverse as much as possible. I put all of the RMarkdown and Jupyter notebooks in a public git repo. Overall, it was a great learning experience. I didn't finish all of the days, both because of travel and because of difficulties working with 3D matrices in the tidyverse. I think the best part was finally having an excuse to work with classes in R. Obviously, R's approach is very different from Python's, but you can…

  • Cluster mean centering in tidyverse

    I'm re-analyzing some old datasets (e.g. from pilots for my dissertation that I ran in 2015) and find myself wanting to re-run some multilevel models. However, the first time I did this, I used grand mean centering. That means I combined the within-cluster effects and between-cluster effects into a single parameter estimate (Curran and Bauer have a great summary). Instead, I want to cluster mean center. That means calculating the mean of the variable within each cluster, then subtracting the mean of each cluster from the individual scores in that cluster. Then you include both the cluster means and the cluster mean centered scores in the regression. The coefficient on…
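The centering step described in this excerpt (compute each cluster's mean, then subtract it from every score in that cluster) can be sketched as follows. This is a minimal illustration in plain Python rather than the tidyverse code the post itself uses. The function name `cluster_mean_center`, the `_cm` and `_cwc` column suffixes, and the classroom data are all hypothetical, chosen only for the example.

```python
from collections import defaultdict
from statistics import mean


def cluster_mean_center(rows, cluster_key, value_key):
    """Add the cluster mean and the cluster mean centered score to each row.

    Hypothetical helper for illustration; not the post's actual code.
    """
    # Collect the raw scores belonging to each cluster.
    scores_by_cluster = defaultdict(list)
    for row in rows:
        scores_by_cluster[row[cluster_key]].append(row[value_key])

    # One mean per cluster (the between-cluster predictor).
    cluster_means = {c: mean(v) for c, v in scores_by_cluster.items()}

    centered_rows = []
    for row in rows:
        m = cluster_means[row[cluster_key]]
        centered_rows.append({
            **row,
            f"{value_key}_cm": m,                    # cluster mean
            f"{value_key}_cwc": row[value_key] - m,  # centered within cluster
        })
    return centered_rows


# Made-up data: two classrooms with two scores each.
rows = [
    {"classroom": "a", "score": 2.0},
    {"classroom": "a", "score": 4.0},
    {"classroom": "b", "score": 10.0},
    {"classroom": "b", "score": 14.0},
]
centered = cluster_mean_center(rows, "classroom", "score")
```

Both new columns would then go into the regression: the `_cm` column carries the between-cluster information and the `_cwc` column the within-cluster information.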

  • Formatted summary tables R

    One of the slightly annoying issues I've had with using the summary() function in R is the multiple steps it takes to get the parameters from a fitted model into a format that's useful for pasting into a manuscript. After a lot of trial and error, the workflow I settled on used Daniel Lüdecke's sjstats and sjPlot packages. sjPlot has the very helpful plot_model() and tab_model() functions, which make it very easy to plot marginal effects and interactions, or render HTML tables that can be cut and pasted. However, sjstats also used to contain std_beta(), a function that could be passed a fitted model object (e.g. from lmer() )…