I've spent most of the summer as an intern with the Texas Policy Lab, working primarily on data science tasks such as data cleaning and visualization. Most recently, I set out to create a custom ggplot2 theme for TPL.
The project was my first experience developing my own R package. Before this project, my only familiarity with packages came from the install.packages() and library() commands.
Hadley Wickham's book R Packages was enormously helpful in introducing me to package development. I ran into (a lot of) issues while building the package, particularly with local file paths and logo placement on plots.
Creating your own package is a great exercise in trial and error, and it taught me a lot about programming in R that I wouldn't have learned otherwise. I was also struck by how remarkably easy it was to create one's own package (seriously, it requires the same number of clicks as starting a new R project), and by how thorough the online resources were.
The package also serves as another reminder of how accessible R is. I don't believe my package is without error (trust me, I've found many even after I "finished" the package), but it is pretty crazy that an undergraduate with nearly no formal or academic experience with R can create his own package in under a month.
The catalyst for creating this package was coming across the Urban Institute's urbnthemes package on GitHub. I also gathered a lot of inspiration (and borrowed some code) from ggthemes (Jeffrey Arnold), bbplot (BBC News), and hrbrthemes (Bob Rudis). I was impressed by the fact that these organizations were able to use R to create publication-ready plots despite the fact that base ggplot figures can look rather ugly (if we're being honest).
Because the organization I intern with is still in its infancy, I thought it would be a perfect time to create a standardized theme for figures made in the future. So long as future employees adopt the theme, this package has the potential to create figures specific to our publications, lending TPL organizational credibility and creating cross-report consistency.
I thought a lot about some basic tenets of design, such as font readability, text size, and color contrast. Along the way, I learned a lot about visual and aesthetic design that I wouldn't have known otherwise. Kieran Healy's section on how graphs can deceive the reader, intentionally or not, opened my eyes to a lot of important visual concepts.
I also put a lot of time into creating a color palette that was both aesthetically pleasing and accessible to colorblind viewers. This was somewhat difficult because there are quite a few types of colorblindness. Thankfully, my boss is colorblind, which made testing the palettes a lot easier!
The final color palettes I created look like this (the latter image depicts what those palettes look like for viewers with deuteranopia, one of the most common forms of red-green colorblindness):
The diverging and sequential color palettes are from http://colorbrewer2.org and the categorical palette is composed of a variety of colors from https://coolors.co/ and the TPL website.
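For anyone curious how palettes like these can be wired into a plot, here is a rough sketch. It is not the tpltheme source: the ColorBrewer schemes ("Blues" and "RdBu") are my guesses at reasonable colorblind-friendly choices, and the Okabe-Ito palette that ships with base R (4.0 or later) stands in for the TPL categorical colors, which aren't listed in this post. It assumes ggplot2 and RColorBrewer are installed.

```r
library(ggplot2)

# Sequential and diverging palettes pulled from ColorBrewer.
# The scheme names are assumptions, not necessarily the ones TPL uses.
sequential <- RColorBrewer::brewer.pal(5, "Blues")
diverging  <- RColorBrewer::brewer.pal(7, "RdBu")

# Stand-in categorical palette: Okabe-Ito, a well-known colorblind-safe set.
# unname() drops the color names so scale_fill_manual() matches by position.
categorical <- unname(grDevices::palette.colors(5, palette = "Okabe-Ito"))

# Apply the categorical colors to a bar chart of the built-in mpg data.
# The first Okabe-Ito color is black, so it is skipped for the fills.
p <- ggplot(mpg, aes(class, fill = drv)) +
  geom_bar() +
  scale_fill_manual(values = categorical[2:4])
```

Printing `p` draws the chart. To check a palette against a particular form of colorblindness, a simulator (in R or in the browser) is still the safest bet, alongside a colorblind coworker if you have one.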
In action, the color palette looks like this:
The primary function in my package is set_tpl_theme(). It takes two arguments: 1) `style`, which can be specified to create normal plots (barplots, scatterplots, etc.) or Texas-specific plots which drop gridlines and adjust legend sizing, and 2) `font`, wherein the user can specify whether the plots should use Lato or Adobe Caslon Pro.
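To give a feel for how a theme-setting function with a `style` and a `font` argument can work, here is a minimal sketch. It is a simplified stand-in, not the actual set_tpl_theme() code: the name set_demo_theme(), the style values "print" and "map", and the specific theme settings are all made up for illustration, and the default font is "sans" because Lato may not be installed on your machine.

```r
library(ggplot2)

# Hypothetical stand-in for a theme setter; NOT the tpltheme implementation.
set_demo_theme <- function(style = c("print", "map"), font = "sans") {
  style <- match.arg(style)  # errors on anything other than "print" or "map"

  # Shared settings for every plot.
  base <- theme_minimal(base_family = font) +
    theme(
      plot.title = element_text(face = "bold"),
      legend.position = "bottom"
    )

  # Map-style plots drop gridlines and axis labels and shrink the legend keys.
  if (style == "map") {
    base <- base +
      theme(
        panel.grid = element_blank(),
        axis.text = element_blank(),
        axis.title = element_blank(),
        legend.key.size = unit(0.8, "lines")
      )
  }

  # theme_set() makes this the default for all later plots and
  # returns the previous theme, which is passed back invisibly.
  invisible(theme_set(base))
}

old_theme <- set_demo_theme(style = "print", font = "sans")
```

Because theme_set() changes the session-wide default, one call at the top of a script is enough. Keeping the returned `old_theme` lets you restore the earlier look with theme_set(old_theme).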
Calling set_tpl_theme() causes a plot to go from this:
The user is also able to specify a variety of other options such as axes to drop with drop_axis(), or whether to add the TPL logo with add_tpl_logo().
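As a rough idea of what an axis-dropping helper can look like, here is a small sketch. The function drop_demo_axis() and its `axis` argument are hypothetical and only illustrate the general approach of returning a theme() object that gets added to a plot. The real drop_axis() may take different arguments and do more.

```r
library(ggplot2)

# Hypothetical helper; the real drop_axis() may be implemented differently.
drop_demo_axis <- function(axis = c("x", "y", "both")) {
  axis <- match.arg(axis)
  blank <- element_blank()

  # Theme fragments that blank out every part of one axis.
  drop_x <- theme(axis.line.x = blank, axis.ticks.x = blank,
                  axis.text.x = blank, axis.title.x = blank)
  drop_y <- theme(axis.line.y = blank, axis.ticks.y = blank,
                  axis.text.y = blank, axis.title.y = blank)

  switch(axis, x = drop_x, y = drop_y, both = drop_x + drop_y)
}

# Add the helper to a plot like any other theme layer.
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  drop_demo_axis("y")
```

Returning a theme object rather than modifying the plot directly means the helper composes with `+` like everything else in ggplot2.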
Here's a gallery of sample plots made in the tpltheme package to illustrate the power of a few commands:
You can find documentation and a more thorough defense of why I made the decisions I did on my GitHub. Thanks for reading!