Convert a geojson column to sf, use tibble printing in package

Hi all! I'm struggling to figure out the best way (or a way that... works) to convert a geojson column of a data frame to an sf geometry list column, then convert the result into a tibble and store it as data (that prints with tibble printing) within a package. A lot of layers, I know!

I have a small example package on github showing what I've attempted that isn't working (mostly in the data-raw folder).

If I convert the column using sf::st_as_sfc() and keep it as a data.frame as an object in the package, all is good (code is in df_sf.R):

library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0

df_sf <- data.frame(
  a = 1,
  b = 2,
  geometry = "{\"type\": \"Polygon\", \"coordinates\": [[[-79.4359157087306, 43.6801533947749], [-79.4359157087306, 43.6801533947749]]]}"
)

df_sf
#>   a b
#> 1 1 2
#>                                                                                                               geometry
#> 1 {"type": "Polygon", "coordinates": [[[-79.4359157087306, 43.6801533947749], [-79.4359157087306, 43.6801533947749]]]}

df_sf[["geometry"]] <- st_as_sfc(df_sf[["geometry"]], GeoJSON = TRUE, crs = 4326)

df_sf
#>   a b                       geometry
#> 1 1 2 POLYGON ((-79.43592 43.6801...

# Save as an object in package
usethis::use_data(df_sf, overwrite = TRUE)

And I can access it just fine:

geojsontosftest::df_sf
#>   a b                                 geometry
#> 1 1 2 -79.43592, -79.43592, 43.68015, 43.68015

But if I convert the column and then convert the data frame to a tibble, things seem fine at first. It prints as a tibble and all looks good. I also have tibble printing enabled, from using usethis::use_tibble(), and documented that it returns a tibble. Code is in df_sf_tibble.R.

library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
library(tibble)

df_sf <- data.frame(
  a = 1,
  b = 2,
  geometry = "{\"type\": \"Polygon\", \"coordinates\": [[[-79.4359157087306, 43.6801533947749], [-79.4359157087306, 43.6801533947749]]]}"
)

df_sf[["geometry"]] <- st_as_sfc(df_sf[["geometry"]], GeoJSON = TRUE, crs = 4326)

# First convert to tibble, then save as object in package

df_sf_tibble <- as_tibble(df_sf)

df_sf_tibble
#> # A tibble: 1 x 3
#>       a     b                                   geometry
#>   <dbl> <dbl>                              <POLYGON [°]>
#> 1     1     2 ((-79.43592 43.68015, -79.43592 43.68015))

usethis::use_data(df_sf_tibble, overwrite = TRUE)

But when I try to access it from the package, I get an error:

geojsontosftest::df_sf_tibble
#> Error: Input must be a vector, not a `sfc_POLYGON/sfc` object.

Here is my session info and package versions:

Session info
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 3.6.1 (2019-07-05)
#>  os       macOS Sierra 10.12.1        
#>  system   x86_64, darwin15.6.0        
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_CA.UTF-8                 
#>  ctype    en_CA.UTF-8                 
#>  tz       America/Toronto             
#>  date     2020-02-29                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package         * version      date       lib
#>  assertthat        0.2.1        2019-03-21 [1]
#>  backports         1.1.5        2019-10-02 [1]
#>  callr             3.3.2        2019-09-22 [1]
#>  cli               2.0.1        2020-01-08 [1]
#>  crayon            1.3.4        2017-09-16 [1]
#>  desc              1.2.0        2018-05-01 [1]
#>  devtools          2.2.1        2019-09-24 [1]
#>  digest            0.6.22       2019-10-21 [1]
#>  ellipsis          0.3.0        2019-09-20 [1]
#>  evaluate          0.14         2019-05-28 [1]
#>  fansi             0.4.0        2018-10-05 [1]
#>  fs                1.3.1        2019-05-06 [1]
#>  geojsontosftest   0.0.0.9000   2020-02-29 [1]
#>  glue              1.3.1        2019-03-12 [1]
#>  highr             0.8          2019-03-20 [1]
#>  htmltools         0.4.0        2019-10-04 [1]
#>  knitr             1.28         2020-02-06 [1]
#>  magrittr          1.5          2014-11-22 [1]
#>  memoise           1.1.0        2017-04-21 [1]
#>  pillar            1.4.3.9000   2020-02-08 [1]
#>  pkgbuild          1.0.6        2019-10-09 [1]
#>  pkgconfig         2.0.3        2019-09-22 [1]
#>  pkgload           1.0.2        2018-10-29 [1]
#>  prettyunits       1.0.2        2015-07-13 [1]
#>  processx          3.4.1        2019-07-18 [1]
#>  ps                1.3.0        2018-12-21 [1]
#>  R6                2.4.0        2019-02-14 [1]
#>  Rcpp              1.0.2        2019-07-25 [1]
#>  remotes           2.1.0        2019-06-24 [1]
#>  rlang             0.4.4.9001   2020-02-29 [1]
#>  rmarkdown         1.16         2019-10-01 [1]
#>  rprojroot         1.3-2        2018-01-03 [1]
#>  sessioninfo       1.1.1        2018-11-05 [1]
#>  stringi           1.4.3        2019-03-12 [1]
#>  stringr           1.4.0        2019-02-10 [1]
#>  testthat          2.2.1        2019-07-25 [1]
#>  tibble            2.99.99.9014 2020-02-29 [1]
#>  usethis           1.5.1.9000   2020-02-29 [1]
#>  vctrs             0.2.99.9006  2020-02-29 [1]
#>  withr             2.1.2        2018-03-15 [1]
#>  xfun              0.10         2019-10-01 [1]
#>  yaml              2.2.0        2018-07-25 [1]
#>  source                           
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.1)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  local                            
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  Github (r-lib/pillar@8f5918c)    
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  Github (r-lib/rlang@5af0b7f)     
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  Github (tidyverse/tibble@45603eb)
#>  Github (r-lib/usethis@2a3d134)   
#>  Github (r-lib/vctrs@1ea8454)     
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#>  CRAN (R 3.6.0)                   
#> 
#> [1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library

Most of the resources I've seen for converting geojson -> sf are for a vector, not a column within a data frame/tibble. Is my approach to converting wrong? Is the tibble combination not meant to be? Thoughts appreciated!

You might also need to import sf so that the methods tibble uses to tell whether a column is a vector are loaded. A traceback (preferably with rlang::last_trace()) would also be useful for diagnosing the exact source of the problem.

thanks!

would I need to do something besides just listing sf under Imports? doing so doesn't seem to help (with traceback, ty, now I know!):

geojsontosftest::df_sf_tibble
# Error: Input must be a vector, not a `sfc_POLYGON/sfc` object.
# Run `rlang::last_error()` to see where the error occurred.

rlang::last_error()
# <error/vctrs_error_scalar_type>
#   Input must be a vector, not a `sfc_POLYGON/sfc` object.
# Backtrace:
#   1. (function (x, ...) ...
#  17. vctrs:::stop_scalar_type(...)
#  18. vctrs:::stop_vctrs(msg, "vctrs_error_scalar_type", actual = x)
#  Run `rlang::last_trace()` to see the full context.

rlang::last_trace()
# <error/vctrs_error_scalar_type>
#   Input must be a vector, not a `sfc_POLYGON/sfc` object.
# Backtrace:
#   █
#  1. ├─(function (x, ...) ...
#  2. ├─tibble:::print.tbl(x)
#  3. │ ├─tibble:::cat_line(format(x, ..., n = n, width = width, n_extra = n_extra))
#  4. │ │ ├─base::cat(paste0(..., "\n"), sep = "")
#  5. │ │ └─base::paste0(..., "\n")
#  6. │ ├─base::format(x, ..., n = n, width = width, n_extra = n_extra)
#  7. │ └─tibble:::format.tbl(x, ..., n = n, width = width, n_extra = n_extra)
#  8. │   └─tibble::trunc_mat(x, n = n, width = width, n_extra = n_extra)
#  9. │     ├─base::as.data.frame(head(x, n))
# 10. │     ├─utils::head(x, n)
# 11. │     └─utils:::head.data.frame(x, n)
# 12. │       ├─x[seq_len(n), , drop = FALSE]
# 13. │       └─tibble:::`[.tbl_df`(x, seq_len(n), , drop = FALSE)
# 14. │         └─tibble:::tbl_subset_row(xo, i = i)
# 15. │           └─base::lapply(unclass(x), vec_slice, i = i)
# 16. │             └─vctrs:::FUN(X[[i]], ...)
# 17. └─vctrs:::stop_scalar_type(...)
# 18.   └─vctrs:::stop_vctrs(msg, "vctrs_error_scalar_type", actual = x)

but loading the sf package does work, so it's definitely something there:

library(sf)
#> Linking to GEOS 3.7.2, GDAL 2.4.2, PROJ 5.2.0
geojsontosftest::df_sf_tibble
#> # A tibble: 1 x 3
#>       a     b                                   geometry
#>   <dbl> <dbl>                              <POLYGON [°]>
#> 1     1     2 ((-79.43592 43.68015, -79.43592 43.68015))

Have you considered moving {sf} from the Imports to the Depends field of your package? It is not a common practice (and in fact sometimes discouraged), but I have found it a helpful pattern for packages handling spatial data.

When I declare {sf} in Depends your package seems to work as intended.

> library(geojsontosftest)
Loading required package: sf
Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.0
> geojsontosftest::df_sf_tibble
# A tibble: 1 x 3
      a     b                                   geometry
  <dbl> <dbl>                              <POLYGON [°]>
1     1     2 ((-79.43592 43.68015, -79.43592 43.68015))
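
For reference, a rough sketch of one way to make that change without editing DESCRIPTION by hand. It assumes {usethis} is installed and that the call is run from the package root; it is worth checking DESCRIPTION afterwards, in case an old Imports entry for sf needs tidying up.

```r
# Assumption: the working directory is the root of the package project.
# Declares sf under the Depends field of DESCRIPTION.
usethis::use_package("sf", type = "Depends")
```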

Works for me...

Here is the session info:

> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods  
[7] base     

other attached packages:
[1] geojsontosftest_0.0.0.9000

loaded via a namespace (and not attached):
 [1] compiler_3.6.2   assertthat_0.2.1 cli_2.0.1       
 [4] tools_3.6.2      pillar_1.4.3     glue_1.3.1      
 [7] rstudioapi_0.10  tibble_2.1.3     crayon_1.3.4    
[10] utf8_1.1.4       fansi_0.4.1      vctrs_0.2.3     
[13] packrat_0.5.0    pkgconfig_2.0.3  rlang_0.4.4     

Oh you need to add to imports and @importFrom one function (it doesn't matter which)
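
A minimal sketch of what that could look like with roxygen2, assuming sf is already listed under Imports in DESCRIPTION. The file name and the choice of `st_as_sfc` are only illustrative, since any exported sf function would do. The resulting NAMESPACE directive means sf's namespace gets loaded whenever the package is loaded, which is what registers its methods.

```r
# R/geojsontosftest-package.R
# (file name is an assumption; the tag can live in any file under R/)

#' @importFrom sf st_as_sfc
NULL
```

After re-running devtools::document(), NAMESPACE should contain the line importFrom(sf,st_as_sfc).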

i'm going with Hadley's suggestion of Imports + importing one function, but i'll keep in mind that putting it in Depends is a pattern for spatial data in the future - definitely something i'm newer to, thanks!

I am not one to argue with Hadley :)

Just a comment: the Depends path is not something entirely settled in the spatial package development crowd; some people prefer the "import everything" way of @import sf. Again, not a best practice in general, but a reasonable choice in a spatial context.

The reason is that {sf} has a significant number of classes (sf, sfc, sfg, you name it...) which in turn have methods and for your data object / package to work as expected you have to have all methods available.
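
To illustrate the mechanism, here is a small base-R sketch. It uses a made-up class, `toy_geom`, standing in for an sf geometry column, and it does not reproduce the vctrs error from earlier in the thread. It only shows the general point that the same object behaves differently depending on whether its S3 methods are visible.

```r
# Toy stand-in for a geometry list column; "toy_geom" is a hypothetical class.
toy <- structure(list(c(0, 0), c(1, 1)), class = "toy_geom")

# No format method for toy_geom is visible yet, so R falls back to
# format.default(), which knows nothing about coordinates.
before <- format(toy)

# Defining the method here plays the role of loading the namespace
# that owns the class (sf, in the real case).
format.toy_geom <- function(x, ...) {
  vapply(
    unclass(x),
    function(p) sprintf("POINT (%s)", paste(p, collapse = " ")),
    character(1)
  )
}
after <- format(toy)

print(before)
print(after)
```

The data.frame output near the top of the thread, where the geometry column prints as a flat run of numbers, looks like the same kind of fallback.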

There is little chance your spatial package will be a little self-contained island, and in fact this is often not the aim - numerous packages are designed to plug into a broader workflow built on {sf} by providing a specific spatial object.

For an example of a package built on "let's import the whole hog", consider the popular tigris; for an example of "makes no sense without sf, so might as well depend on it", consider bcmaps.

A fun fact: the changelog of {bcmaps} states that the developers moved {sf} from Imports to Depends specifically to take advantage of {sf} print methods; of course that was 2 years ago, so I cannot be certain the comment is still valid (but the package is actively maintained, so they likely don't see an issue with it).

this is really great background, thank you! i didn't even realize that the reason for all this was because of the methods, of course it is.

Yup. You know how the users are - you resolve print and somebody tries plot; you get that done and they try merge... Don't even get me started on {dplyr} pipelines!

{sf} may have simple in its name, but it is a beast of a package and there is little chance to keep up with all the methods by importing them by name.

library(sf)  # sf must be loaded for its methods to be counted
length(methods(class = "sf"))
[1] 67
length(methods(class = "sfc"))
[1] 62
length(methods(class = "sfg"))
[1] 41