cols_width() not working when using cols_merge() for PDF output #1837

Closed
snhansen opened this issue Aug 16, 2024 · 6 comments

Comments

@snhansen
Contributor

Description

Widths set with cols_width() aren't respected when cols_merge() is also used and the output is PDF.

Reproducible example

Consider this Quarto document:

---
format: pdf
---

```{r}
#| echo: false
library(gt)
sp500 |>
  dplyr::slice(50:55) |>
  dplyr::select(-volume, -adj_close) |>
  gt() |>
  cols_align(
    columns = everything(),
    align = "right"
  ) |>
  cols_merge(
    columns = c(open, close),
    pattern = "{1}-{2}"
  ) |>
  cols_merge(
    columns = c(low, high),
    pattern = "{1}-{2}"
  ) |>
  cols_width(
    date ~ px(100),
    c("open", "low") ~ px(200)
  )
```

Expected result

The columns should have the widths set with cols_width(), but the table created in the PDF doesn't respect them:

image

When output as html, everything looks fine:

image

Session info

> sessionInfo()
R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)

Matrix products: default


locale:
[1] LC_COLLATE=Danish_Denmark.utf8     LC_CTYPE=Danish_Denmark.utf8       LC_MONETARY=Danish_Denmark.utf8    LC_NUMERIC=C                      
[5] LC_TIME=English_United States.1252

time zone: Europe/Copenhagen
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] marginaleffects_0.20.1 patchwork_1.2.0        glue_1.7.0             parameters_0.21.7      ggpomological_0.1.2    epoxy_1.0.0           
 [7] gt_0.11.0              lubridate_1.9.3        forcats_1.0.0          stringr_1.5.1          dplyr_1.1.4            purrr_1.0.2           
[13] readr_2.1.5            tidyr_1.3.1            tibble_3.2.1           ggplot2_3.5.1          tidyverse_2.0.0       

loaded via a namespace (and not attached):
 [1] sass_0.4.9         utf8_1.2.4         generics_0.1.3     xml2_1.3.6         lattice_0.22-6     stringi_1.8.4      hms_1.1.3          digest_0.6.35     
 [9] magrittr_2.0.3     evaluate_0.23      grid_4.4.0         timechange_0.3.0   estimability_1.5.1 mvtnorm_1.2-5      fastmap_1.2.0      processx_3.8.4    
[17] ps_1.7.6           fansi_1.0.6        scales_1.3.0       cli_3.6.2          rlang_1.1.4        reprex_2.1.0       munsell_0.5.1      commonmark_1.9.1  
[25] yaml_2.3.8         withr_3.0.0        tools_4.4.0        datawizard_0.11.0  tzdb_0.4.0         coda_0.19-4.1      colorspace_2.1-0   bayestestR_0.13.2 
[33] vctrs_0.6.5        R6_2.5.1           lifecycle_1.0.4    emmeans_1.10.2     fs_1.6.4           insight_0.20.0     callr_3.7.6        clipr_0.8.0       
[41] pkgconfig_2.0.3    pillar_1.9.0       gtable_0.3.5       Rcpp_1.0.12        data.table_1.15.4  xfun_0.44          tidyselect_1.2.1   knitr_1.47        
[49] rstudioapi_0.16.0  xtable_1.8-4       htmltools_0.5.8.1  rmarkdown_2.27     compiler_4.4.0     markdown_1.13  
@nielsbock
Contributor

Just want to add that cols_width is respected in pdf (but not html) output in the above example if "low" is replaced with "high" like this:

  cols_width(
    date ~ px(100),
    c("open", "high") ~ px(200)
  )

So it seems that the label is misattributed in LaTeX. Maybe this can help with debugging.

I've had a related issue with LaTeX output when creating tables using gtsummary and converting them to gt with as_gt(). I couldn't modify the column widths until I realised, after using tab_info(), that the labels did not match the column names in the table. Specifying the column width eventually worked after some trial and error with the labels listed by tab_info().

@snhansen
Contributor Author

Good catch. I looked a bit more into it, and from what I can see the issue is with the create_table_start_l() function in utils_render_latex.R.

Indeed, if we call the example table ex_table, we get the following:

data <- gt:::build_data(data = ex_table, context = "latex")
colwidth_df <- gt:::create_colwidth_df_l(data = data)
colwidth_df
#>      type unspec lw  pt tbl_width
#> 1 default      0  0  75        NA
#> 2 default      0  0 150        NA
#> 3  hidden      1  0   0        NA
#> 4 default      0  0 150        NA
#> 5  hidden      1  0   0        NA

so it calculates the widths correctly for the correct columns, but create_table_start_l() translates them into LaTeX code incorrectly: the third visible column (low) gets a plain r specification instead of a 150pt p{} one.

gt:::create_table_start_l(data = data, colwidth_df = colwidth_df)
#> [1] "\\begin{longtable}{>{\\raggedleft\\arraybackslash}p{\\dimexpr 75.00pt -2\\tabcolsep-1.5\\arrayrulewidth}>{\\raggedleft\\arraybackslash}p{\\dimexpr 150.00pt -2\\tabcolsep-1.5\\arrayrulewidth}r}\n"

@snhansen
Contributor Author

snhansen commented Aug 19, 2024

Did a bit more investigating, and it's this part of the code that's not working properly (lines 179-214):

  if (any(colwidth_df$unspec < 1L)) {

    col_defs <- NULL

    for (i in seq_along(col_alignment)) {

      if (colwidth_df$unspec[i] == 1L) {
        col_defs_i <- substr(col_alignment[i], 1, 1)
      } else {

        align <-
          switch(
            col_alignment[i],
            left = ">{\\raggedright\\arraybackslash}",
            right = ">{\\raggedleft\\arraybackslash}",
            center = ">{\\centering\\arraybackslash}",
            ">{\\raggedright\\arraybackslash}"
          )

        col_defs_i <-
          paste0(
            align,
            "p{",
            create_singlecolumn_width_text_l(pt = colwidth_df$pt[i], lw = colwidth_df$lw[i]),
            "}"
          )

      }

      col_defs <- c(col_defs, col_defs_i)
    }

  } else {

    col_defs <- substr(col_alignment, 1, 1)
  }

because col_alignment only contains visible columns whereas colwidth_df contains both visible and invisible columns. A fix would be to get rid of the invisible columns of colwidth_df:

  if (any(colwidth_df$unspec < 1L)) {
    
    col_defs <- NULL
    colwidth_df_visible <- colwidth_df[colwidth_df$type != "hidden", ]
	
    for (i in seq_along(col_alignment)) {
      
      if (colwidth_df_visible$unspec[i] == 1L) {
        col_defs_i <- substr(col_alignment[i], 1, 1)
      } else {
        
        align <-
          switch(
            col_alignment[i],
            left = ">{\\raggedright\\arraybackslash}",
            right = ">{\\raggedleft\\arraybackslash}",
            center = ">{\\centering\\arraybackslash}",
            ">{\\raggedright\\arraybackslash}"
          )
        
        col_defs_i <-
          paste0(
            align,
            "p{",
            create_singlecolumn_width_text_l(pt = colwidth_df_visible$pt[i], lw = colwidth_df_visible$lw[i]),
            "}"
          )
        
      }
	  
      col_defs <- c(col_defs, col_defs_i)
    }
    
  } else {
    
    col_defs <- substr(col_alignment, 1, 1)
  }
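
To make the mismatch easier to see, here is a minimal sketch in base R that doesn't need gt at all. The data frame and alignment vector are toy stand-ins for the values shown earlier in this thread, and build_col_defs() is a hypothetical helper that only mimics the positional lookup in the loop (it is not gt's actual code, and the p{} text is simplified).

```r
# Toy stand-ins for the values discussed above (not gt's real objects).
colwidth_df <- data.frame(
  type   = c("default", "default", "hidden", "default", "hidden"),
  unspec = c(0L, 0L, 1L, 0L, 1L),
  pt     = c(75, 150, 0, 150, 0)
)

# One alignment per *visible* column: date, open, low.
col_alignment <- c("right", "right", "right")

# Hypothetical helper: one column spec per alignment entry, with the
# width looked up by position in `widths` (as the loop above does).
build_col_defs <- function(col_alignment, widths) {
  vapply(
    seq_along(col_alignment),
    function(i) {
      if (widths$unspec[i] == 1L) {
        # No width specified: fall back to the alignment letter.
        substr(col_alignment[i], 1, 1)
      } else {
        # Simplified width text; gt's real output uses \dimexpr.
        sprintf("p{%.2fpt}", widths$pt[i])
      }
    },
    character(1)
  )
}

# Position 3 of the unfiltered data frame is a hidden column.
buggy <- build_col_defs(col_alignment, colwidth_df)

# Dropping hidden rows first lines the positions up again.
fixed <- build_col_defs(
  col_alignment,
  colwidth_df[colwidth_df$type != "hidden", ]
)

print(buggy)
print(fixed)
```

With the unfiltered data frame the third entry falls back to the alignment letter, which matches the trailing r in the longtable specification; after filtering, all three visible columns get a width.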

EDIT: The above doesn't work with stubs/row_groups introduced with the rowname_col and groupname_col options:

data <- mtcars[1:4, c("am", "gear", "mpg", "cyl", "disp")] |>
  gt(rowname_col = "gear", groupname_col = "am") |>
  cols_merge(
    columns = c("mpg", "cyl")
  ) |>
  cols_width(
    "gear" ~ px(50),
    c("mpg", "disp") ~ px(150)
  )

yields

> colwidth_df
       type unspec lw    pt tbl_width
1 row_group      1  0   0.0        NA
2      stub      0  0  37.5        NA
3   default      0  0 112.5        NA
4    hidden      1  0   0.0        NA
5   default      0  0 112.5        NA

@olivroy
Collaborator

olivroy commented Aug 26, 2024

Thanks both for your investigation!

Basically, if you use cols_merge(c(col1, col2)), under the hood, gt does the merging, and keeps col1 as the new column and calls cols_hide() on col2.

We'd happily accept a PR for this with tests, clear explanations, and before and after screenshots of the result.

@snhansen
Contributor Author

@olivroy: I'm taking a look at this and currently trying to understand stub layouts. Could you give me an example where the result of get_stub_layout() has length 2? I can't think of such an example, but from the code it seems possible, and I'd like to handle all cases properly.

@olivroy
Collaborator

olivroy commented Aug 29, 2024

get_stub_layout() returns a result of length 2 if you have both row_group_as_column = TRUE and the data has row names.
