Recently, for class notes on probability/the central limit theorem, I wanted to recreate the table of 2d6 values that I made here. A really cool thing I found between that blog post and now is that gt has afmt_icon()
operation that will replace text with its fontawesome icon.
And font-awesome has icons for each face of a 6 sided die!
Here’s how I got to a result that I liked. If anyone has suggestions for how to do this more cleanly, I’d love to hear about it!
Building the Data
The actual die rolls
Getting the actual die rolls and their total is a simple expand_grid()
.
expand_grid(
die_a = 1:6,
die_b = 1:6
) |>
mutate(
total = die_a + die_b
)->
rolls_df
head(rolls_df)
# A tibble: 6 × 3
die_a die_b total
<int> <int> <int>
1 1 1 2
2 1 2 3
3 1 3 4
4 1 4 5
5 1 5 6
6 1 6 7
Injecting the fontawesome icon names
In my original code, I used some joins here, but I just found the {english}
package, which will let me mutate the die_
columns directly.
rolls_df |>
mutate(
across(
starts_with("die_"),
~ str_glue(
"dice-{word}",
word = as.english(.x)
)
)
) ->
rolls_df
head(rolls_df)
# A tibble: 6 × 3
die_a die_b total
<glue> <glue> <int>
1 dice-one dice-one 2
2 dice-one dice-two 3
3 dice-one dice-three 4
4 dice-one dice-four 5
5 dice-one dice-five 6
6 dice-one dice-six 7
Identifying combos
I’m going to start pivoting a bunch, so now is the best time to give an id to each unique combo, as well as the total number of combos per total.
rolls_df |>
mutate(
.by = total,
id = row_number(),
n = n()
) ->
rolls_df
head(rolls_df)
# A tibble: 6 × 5
die_a die_b total id n
<glue> <glue> <int> <int> <int>
1 dice-one dice-one 2 1 1
2 dice-one dice-two 3 1 2
3 dice-one dice-three 4 1 3
4 dice-one dice-four 5 1 4
5 dice-one dice-five 6 1 5
6 dice-one dice-six 7 1 6
Getting wide
I’ll eventually want one column per die, with its combination id appended to it, which means pivoting long, merging the columns together, then pivoting wide.
Pivoting long
rolls_df |>
pivot_longer(
starts_with("die"),
names_to = "die",
values_to = "icon"
) |>
unite(
c(die, id),
col = "die_id"
)->
rolls_long
head(rolls_long)
# A tibble: 6 × 4
total die_id n icon
<int> <chr> <int> <glue>
1 2 die_a_1 1 dice-one
2 2 die_b_1 1 dice-one
3 3 die_a_1 2 dice-one
4 3 die_b_1 2 dice-two
5 4 die_a_1 3 dice-one
6 4 die_b_1 3 dice-three
Pivoting wide
rolls_long |>
pivot_wider(
names_from = die_id,
values_from = icon
) ->
rolls_wide
head(rolls_wide)
# A tibble: 6 × 14
total n die_a_1 die_b_1 die_a_2 die_b_2 die_a_3 die_b_3 die_a_4 die_b_4
<int> <int> <glue> <glue> <glue> <glue> <glue> <glue> <glue> <glue>
1 2 1 dice-one dice-one <NA> <NA> <NA> <NA> <NA> <NA>
2 3 2 dice-one dice-two dice-t… dice-o… <NA> <NA> <NA> <NA>
3 4 3 dice-one dice-thr… dice-t… dice-t… dice-t… dice-o… <NA> <NA>
4 5 4 dice-one dice-four dice-t… dice-t… dice-t… dice-t… dice-f… dice-o…
5 6 5 dice-one dice-five dice-t… dice-f… dice-t… dice-t… dice-f… dice-t…
6 7 6 dice-one dice-six dice-t… dice-f… dice-t… dice-f… dice-f… dice-t…
# ℹ 4 more variables: die_a_5 <glue>, die_b_5 <glue>, die_a_6 <glue>,
# die_b_6 <glue>
Now, I’ve got some well named columns identifying die a and die b, as well as numeric ids for each unique combination. I’ll use these for coloring the dice icons and merging columns.
But, I also want to move the n
column, and add a proportion
column.
Building the table.
Formatting the icons
To make sure it’s clear I’m working with two die, I want die_a and die_b to be different colors, which I can make happen with two uses of fmt_icon()
.
rolls_wide |>
gt() |>
fmt_icon(
starts_with("die_a"),
fill_color = "#CC6677"
) |>
fmt_icon(
starts_with("die_b"),
fill_color = "#4477AA"
) ->
out_tbl
out_tbl
total | die_a_1 | die_b_1 | die_a_2 | die_b_2 | die_a_3 | die_b_3 | die_a_4 | die_b_4 | die_a_5 | die_b_5 | die_a_6 | die_b_6 | n | prop |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 1 | 0.02777778 | ||
3 | NA | NA | NA | NA | NA | NA | NA | NA | 2 | 0.05555556 | ||||
4 | NA | NA | NA | NA | NA | NA | 3 | 0.08333333 | ||||||
5 | NA | NA | NA | NA | 4 | 0.11111111 | ||||||||
6 | NA | NA | 5 | 0.13888889 | ||||||||||
7 | 6 | 0.16666667 | ||||||||||||
8 | NA | NA | 5 | 0.13888889 | ||||||||||
9 | NA | NA | NA | NA | 4 | 0.11111111 | ||||||||
10 | NA | NA | NA | NA | NA | NA | 3 | 0.08333333 | ||||||
11 | NA | NA | NA | NA | NA | NA | NA | NA | 2 | 0.05555556 | ||||
12 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 1 | 0.02777778 |
Dropping the missing values
I want to drop out all of the missing values. I found that if I replace them with just ""
, for some reason the row with no NA
s winds up being narrower than the rest, but if I replace them with a zero-width space, it turns out more compact.
out_tbl |>
sub_missing(missing_text = html("​")) ->
out_tbl
The ugliest part
Now, I need to merge the columns together with cols_merge()
. This is where the code gets a little ugly, what I want to be able to say is
- Merge two columns if they match in the last two characters
Maybe there’s a way to express this with tidyselect verbs that I’m just not good enough with. In the original code, I just used cols_merge()
6 times, which would look like:
out_tbl |>
cols_merge(ends_with("1")) |>
cols_merge(ends_with("2")) |>
cols_merge(ends_with("3")) |>
cols_merge(ends_with("4")) |>
cols_merge(ends_with("5")) |>
cols_merge(ends_with("6"))
It just occurred to me that something from purrr
might be the right tool, and refreshed myself on purrr::reduce()
.
reduce(
as.character(1:6),
\(acc, nxt){
acc |>
cols_merge(
ends_with(nxt)
)
},
.init = out_tbl
) ->
out_tbl
out_tbl
total | die_a_1 | die_a_2 | die_a_3 | die_a_4 | die_a_5 | die_a_6 | n | prop |
---|---|---|---|---|---|---|---|---|
2 | | | | | | 1 | 0.02777778 | |
3 | | | | | 2 | 0.05555556 | ||
4 | | | | 3 | 0.08333333 | |||
5 | | | 4 | 0.11111111 | ||||
6 | | 5 | 0.13888889 | |||||
7 | 6 | 0.16666667 | ||||||
8 | | 5 | 0.13888889 | |||||
9 | | | 4 | 0.11111111 | ||||
10 | | | | 3 | 0.08333333 | |||
11 | | | | | 2 | 0.05555556 | ||
12 | | | | | | 1 | 0.02777778 |
To be honest, even though the reduce()
approach is more programmery, writing out each cols_merge()
individually is more readable…
Maybe if I wanted to expand this out to 3d6, the reduce()
approach would be better. But at that point, I’d also be creating a table of 27 columns, and at that point the illustrative nature of the table would probably be lost.
Final table finessing
Now, I want to
- remove the column names from the dice columns
- add a grand summary row
- format the probabilities down to 2 digits
- add some css so that the table will match lightmode/darkmode settings
out_tbl |>
# no dice column label
cols_label(
starts_with("die") ~ ""
) |>
# grand summary
grand_summary_rows(
columns = c(n, prop),
fns = list(total ~ sum(.)),
missing_text = ""
) |>
# two digits max
fmt_number(
columns = prop,
decimals = 2
) |>
# font setting
opt_table_font(
font = list(
google_font(name = "Public Sans"),
default_fonts()
)
) |>
# light/darkmode matching
tab_style(
style = "
background-color: var(--bs-body-bg);
color: var(--bs-body-color)
",
locations = list(
cells_column_labels(),
cells_column_spanners(),
cells_row_groups(),
cells_body(),
cells_grand_summary(),
cells_stub_grand_summary(),
cells_stub(),
cells_stubhead()
)
) ->
out_tbl
out_tbl
total | n | prop | |||||||
---|---|---|---|---|---|---|---|---|---|
2 | | | | | | 1 | 0.03 | ||
3 | | | | | 2 | 0.06 | |||
4 | | | | 3 | 0.08 | ||||
5 | | | 4 | 0.11 | |||||
6 | | 5 | 0.14 | ||||||
7 | 6 | 0.17 | |||||||
8 | | 5 | 0.14 | ||||||
9 | | | 4 | 0.11 | |||||
10 | | | | 3 | 0.08 | ||||
11 | | | | | 2 | 0.06 | |||
12 | | | | | | 1 | 0.03 | ||
total | 36 | 1 |
Final thoughts
I’m really happy with the final result! The thought I’ve had while working with gt is that there are just a lot more steps involved in building a good table, compared to building a good plot in ggplot2. I don’t mean that as a knock on the package. I think it’s possible that building up a good table is just more complicated!
Reuse
Citation
@online{fruehwald2024,
author = {Fruehwald, Josef},
title = {Building up a Complex `\{Gt\}` Table.},
series = {Væl Space},
date = {2024-09-15},
url = {https://jofrhwld.github.io/blog/posts/2024/09/2024-09-15_gt-table/},
langid = {en}
}