STRING_PTR() can only be applied to a 'character', not a 'list'

dplyr
purrr
#1

I have now run into this cryptic error in several contexts while using purrrlyr, and I can't find any documentation about it (except https://github.com/tidyverse/tidyr/pull/362, which wasn't much help).

So I'd like to know: what am I doing wrong here, or what does this error mean?

library(tidyverse)
library(purrrlyr)

# dataframe
dat <-
  structure(
    list(
      vars = c("var_1", "var_2"),
      data = list(
        structure(
          list(time = 1:10, value = c(1:10)),
          row.names = c(NA, -10L),
          class = c("tbl_df", "tbl", "data.frame")
        ),
        structure(
          list(time = 1:10, value = c(11:20)),
          row.names = c(NA, -10L),
          class = c("tbl_df", "tbl", "data.frame")
        )
      ),
      mu = c(1, 2),
      stdev = c(1, 2)
    ),
    class = c("tbl_df", "tbl", "data.frame"),
    row.names = c(NA, -2L)
  )

# applying operation row-wise
dat %>%
  purrrlyr::by_row(
    .d = .,
    ..f = ~dnorm(x = .$data[[1]]$value[[1]], mean = .$mu[[1]], sd = .$stdev[[1]]),
    collate = "rows"
  )
#> Error in purrrlyr::by_row(.d = ., ..f = ~dnorm(x = .$data[[1]]$value[[1]], : STRING_PTR() can only be applied to a 'character', not a 'list'

Created on 2018-11-12 by the reprex package (v0.2.1)


#2

This does not resolve the issue with purrrlyr, but given how purrr has evolved, I would now do something like this without purrrlyr:

library(tidyverse)

# dataframe
dat <- tibble::tribble(
  ~vars,                            ~data, ~mu, ~stdev,
  "var_1",  list(time = 1:10, value = 1:10),   1,      1,
  "var_2", list(time = 1:10, value = 11:20),   2,      2
)

# applying operation row-wise using column name as argument
custom_dnorm <- function(data, mu, stdev, ...) {
  dnorm(x = data$value[[1]], mean = mu, sd = stdev)
}

# applying row-wise using pmap() on the data frame
dat %>%
  mutate(new_col = pmap_dbl(., custom_dnorm))
#> # A tibble: 2 x 5
#>   vars  data          mu stdev    new_col
#>   <chr> <list>     <dbl> <dbl>      <dbl>
#> 1 var_1 <list [2]>     1     1 0.399     
#> 2 var_2 <list [2]>     2     2 0.00000799

Created on 2018-11-13 by the reprex package (v0.2.1)

I believe this is the result you want. If not, please tell me what it should be; I believe purrr can handle it.

You might be interested in these resources:




#3

Thanks!

But what I am actually curious about is the error I am getting, not an alternative way to achieve what I want here. I find purrrlyr functions much more intuitive, so I often use them instead of purrr, and recently I've started seeing this error (with STRING_PTR()), which I find hard to decipher and to get rid of. So I wanted to know if anybody knew what was going on here.


#4

So, please note that purrrlyr isn't actually part of the tidyverse and is in the "questioning" lifecycle phase (hence the "Please see Jenny Bryan's webinar on row-oriented workflows for some alternative approaches." in the README). purrrlyr was always a somewhat experimental package and, to my knowledge, was never part of the tidyverse. So, from a tidyverse standpoint, I think @cderv's advice is on point.

That said, I believe the STRING_PTR() error is coming from somewhere in the fast-copy.cpp code.

Since it's in the .cpp code (and I don't know C++), I'm not sure I can be much help beyond that.

Hopefully someone else will be able to help you out! :slightly_smiling_face:


#5

Thanks, Mara! That's super-helpful.

Didn't notice the "questioning" lifecycle phase badge. I hope it is not retired/archived because that's going to break the back of a lot of my package functions. Maybe I should proactively remove purrrlyr from dependencies and completely replace all instances of its use with the purrr alternatives.


#6

Glad you have your answer. Sorry I misunderstood the question in my first answer. Thanks @mara for the additional reply with all the necessary information! :slight_smile:

@IndrajeetPatil If your question's been answered (even by you!), would you mind choosing a solution? It helps other people see which questions still need help, or find solutions if they have similar problems. Here’s how to do it:


#7

It's not being archived, but it's not a priority in terms of active development. We're certainly trying to make the outcomes achieved through purrrlyr happen, just with a different approach (honestly, I don't know the package that well).


#8

Thanks, but this is still not resolved. Mara pointed to where the error is coming from, but it is still unclear to me what the error means. Unfortunately, I don't know C++, so I can't read that code and figure out on my own what the error is and how to get rid of it. I know that I can do this with purrr, but I'd still like to know how to do it with purrrlyr without triggering this error.


#9

I don't think* that this has much (or anything) to do with what you are doing in your code. I believe that the STRING_PTR error is coming from R itself, due to a change in R's internals that happened between R 3.4.4 and R 3.5.0 (this is what people are discussing in the tidyr pull request you linked to in the first post).

STRING_PTR is part of R's internals (it lives in memory.c). Previously, it was not as picky as it might have been about checking for valid types of things passed to it, which (I gather!) allowed for a bit of a hack that proved useful to tidyverse package authors in some circumstances. In Sep 2017, a member of the R Core Team committed a change that made STRING_PTR check objects passed to it more carefully, which meant that code that relied on its previous un-pickiness started throwing errors. These errors are coming from deep inside R. In purrrlyr's case, I do not think there is anything you can do differently in your code to avoid these errors. It's a conflict between how the package is implemented and how R works as of version 3.5.0.
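To make the wording of the error message concrete: 'character' and 'list' are R's names for two different internal vector types, and typeof() reports them. The sketch below only illustrates that distinction on columns shaped like the ones in the original example. It assumes, without having confirmed it in the purrrlyr source, that the list-column is the object that ends up being handed to STRING_PTR().

```r
# Columns shaped like those in the original example, held in a plain list
# (a simplified stand-in, not the original tibble).
cols <- list(
  vars  = c("var_1", "var_2"),   # character vector
  data  = list(1:10, 11:20),     # list-column: one element per row
  mu    = c(1, 2),
  stdev = c(1, 2)
)

# typeof() reports the internal type that the error message refers to.
col_types <- vapply(cols, typeof, character(1))
print(col_types)
#>        vars        data          mu       stdev
#> "character"      "list"    "double"    "double"
```

If that assumption holds, the 'list' in the message would be the data column, and the 'character' would be the only type STRING_PTR() accepts as of R 3.5.0.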

purrrlyr was created as a container for functions that had been removed from purrr and dplyr, and this happened before the STRING_PTR change was made to (what became) R 3.5.0. The NEWS for purrrlyr 0.0.1 (which appeared on CRAN in April 2017) says:

All data-frame based mappers have been moved to this package. These functions are not technically deprecated (so you can move to this package as easily as possible), but these functions are unlikely to be changed in the future (i.e. there will be no bug fixes) and are likely to go away in the near future, so we highly recommend updating to new approaches.

  • Mapping a function to each column of a data frame should now be handled with the colwise mutating and summarising operations in dplyr instead of dmap(). These are the verbs with suffix _all(), _at() and _if(), such as mutate_all() or summarise_if(). Note that this means the output of .f should conform to the requirements of dplyr operations: same length as the input for mutating operations, and length 1 for summarising operations.
  • Invoking a function row by row with the columns of a data frame as arguments should be done with pmap() followed by dplyr::as_dataframe() instead of map_rows().
  • Mapping rowwise slices of a data frame with by_row() is deprecated in favour of a combination of tidyverse functions. First use tidyr::nest() to create a list-column containing groupwise data frames. Then use dplyr::mutate() to operate on this list-column. Typically you will want to apply a function on each element (nested data frame) of this list-column with purrr::map().

(emphasis mine)
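For the example in this thread, the last recommendation in that NEWS entry could look roughly like the sketch below. It is one possible translation, not an official replacement for by_row(): the list-column already exists here, so there is no tidyr::nest() step, and pmap_dbl() is used instead of map() because three columns are consumed at once. The output column name new_col is simply borrowed from post #2.

```r
library(dplyr)
library(purrr)
library(tibble)

# Same shape as the original data: one nested tibble per row.
dat <- tibble(
  vars  = c("var_1", "var_2"),
  data  = list(
    tibble(time = 1:10, value = 1:10),
    tibble(time = 1:10, value = 11:20)
  ),
  mu    = c(1, 2),
  stdev = c(1, 2)
)

# Operate on the list-column inside mutate(): pmap_dbl() walks the three
# columns in parallel, so each call sees the values of a single row.
res <- dat %>%
  mutate(
    new_col = pmap_dbl(
      list(data, mu, stdev),
      function(d, m, s) dnorm(x = d$value[[1]], mean = m, sd = s)
    )
  )

res$new_col
```

Unlike the by_row() output shown for R 3.4.4, where .out is a list-column, this returns a plain double column, which should agree with the values in post #2 (about 0.399 and 0.00000799).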

What I take from all this is that, while the STRING_PTR problem has been fixed in other packages, it has probably not yet been fixed in purrrlyr because purrrlyr isn't currently a development priority. You may as well file your example above (or an even more minimal version) in the purrrlyr issue tracker, but I wouldn't hold out hope for a bug fix anytime soon. :slightly_frowning_face:

Some options if you (or anybody else!) want to keep using purrrlyr for tasks that are throwing this STRING_PTR error:

  • Use it with R < 3.5 (see below). But, this is obviously unappealing as a long-term plan.
  • Fork purrrlyr and fix the problem (maybe looking to the similar work that was done on tidyr or other tidyverse packages for inspiration). But, this requires acquiring the necessary programming skills, or teaming up with somebody else who has them.

Or you can choose to start moving to the alternative methods outlined in the NEWS file or all the other links earlier in this thread. I'm sorry that's not better news!
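If you go with the first option for a while, a small version guard can make the constraint explicit. This is only a sketch: string_ptr_check_active() is a hypothetical helper name, and the 3.5.0 cutoff is the hypothesis from this post rather than something confirmed in the purrrlyr source.

```r
# Hypothetical helper (not part of purrrlyr or base R): TRUE when the given
# R version is at or above 3.5.0, the version this thread associates with
# the stricter STRING_PTR() check.
string_ptr_check_active <- function(r_version = getRversion()) {
  numeric_version(as.character(r_version)) >= numeric_version("3.5.0")
}

if (string_ptr_check_active()) {
  message(
    "Running R ", as.character(getRversion()),
    ": purrrlyr::by_row() on list-columns may fail with the STRING_PTR() error."
  )
}
```

A package function could use such a check to switch to a purrr-based code path instead of calling by_row().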

* I have not dug super deep into this, so all of the above should be understood as a hypothesis :sweat_smile:

purrrlyr still works with R < 3.5

Your example in R 3.5.0 (throws error)
library(purrrlyr)

# dataframe
dat <- structure( list( vars = c("var_1", "var_2"), data = list( structure(
  list(time = 1:10, value = c(1:10)), row.names = c(NA,-10L), class =
  c("tbl_df", "tbl", "data.frame") ), structure( list(time = 1:10, value =
  c(11:20)), row.names = c(NA,-10L), class = c("tbl_df", "tbl", "data.frame")
  ) ), mu = c(1, 2), stdev = c(1, 2) ), class = c("tbl_df", "tbl",
  "data.frame"), row.names = c(NA,-2L) )

# applying operation row-wise
dat %>%
  purrrlyr::by_row(
    .d = .,
    ..f = ~dnorm(x = .$data[[1]]$value[[1]], mean = .$mu[[1]], sd = .$stdev[[1]]),
    collate = "rows"
  )
#> Error in purrrlyr::by_row(.d = ., ..f = ~dnorm(x = .$data[[1]]$value[[1]], : STRING_PTR() can only be applied to a 'character', not a 'list'

Created on 2018-11-14 by the reprex package (v0.2.1)

Session info
sessionInfo()
#> R version 3.5.0 (2018-04-23)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 16.04.5 LTS
#> 
#> Matrix products: default
#> BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
#> LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] purrrlyr_0.0.3
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.0       knitr_1.20       bindr_0.1.1      magrittr_1.5    
#>  [5] tidyselect_0.2.5 R6_2.3.0         rlang_0.3.0.1    stringr_1.3.1   
#>  [9] dplyr_0.7.8      tools_3.5.0      htmltools_0.3.6  yaml_2.2.0      
#> [13] rprojroot_1.3-2  digest_0.6.18    assertthat_0.2.0 tibble_1.4.2    
#> [17] crayon_1.3.4     bindrcpp_0.2.2   purrr_0.2.5      glue_1.3.0      
#> [21] evaluate_0.12    rmarkdown_1.10   stringi_1.2.4    compiler_3.5.0  
#> [25] pillar_1.3.0     backports_1.1.2  pkgconfig_2.0.2

Your example in R 3.4.4 (runs successfully)
library(purrrlyr)

# dataframe
dat <- structure( list( vars = c("var_1", "var_2"), data = list( structure(
  list(time = 1:10, value = c(1:10)), row.names = c(NA,-10L), class =
  c("tbl_df", "tbl", "data.frame") ), structure( list(time = 1:10, value =
  c(11:20)), row.names = c(NA,-10L), class = c("tbl_df", "tbl", "data.frame")
  ) ), mu = c(1, 2), stdev = c(1, 2) ), class = c("tbl_df", "tbl",
  "data.frame"), row.names = c(NA,-2L) )

# applying operation row-wise
dat %>%
  purrrlyr::by_row(
    .d = .,
    ..f = ~dnorm(x = .$data[[1]]$value[[1]], mean = .$mu[[1]], sd = .$stdev[[1]]),
    collate = "rows"
  )
#> # tibble [2 × 5]
#>   vars  data                 mu stdev .out     
#>   <chr> <list>            <dbl> <dbl> <list>   
#> 1 var_1 <tibble [10 × 2]>     1     1 <dbl [1]>
#> 2 var_2 <tibble [10 × 2]>     2     2 <dbl [1]>

Created on 2018-11-15 by the reprex package (v0.2.1)

Session info
sessionInfo()
#> R version 3.4.4 (2018-03-15)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Ubuntu 16.04.5 LTS
#> 
#> Matrix products: default
#> BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
#> LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0
#> 
#> locale:
#>  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
#>  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#>  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
#> [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] purrrlyr_0.0.3
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.0       knitr_1.20       bindr_0.1.1      magrittr_1.5    
#>  [5] tidyselect_0.2.5 R6_2.3.0         rlang_0.3.0.1    fansi_0.4.0     
#>  [9] stringr_1.3.1    dplyr_0.7.8      tools_3.4.4      utf8_1.1.4      
#> [13] cli_1.0.1        htmltools_0.3.6  yaml_2.2.0       rprojroot_1.3-2 
#> [17] digest_0.6.18    assertthat_0.2.0 tibble_1.4.2     crayon_1.3.4    
#> [21] bindrcpp_0.2.2   purrr_0.2.5      glue_1.3.0       evaluate_0.12   
#> [25] rmarkdown_1.10   stringi_1.2.4    compiler_3.4.4   pillar_1.3.0    
#> [29] backports_1.1.2  pkgconfig_2.0.2

#10

Thanks for this incredible detective work! Now I am satisfied with the error :slight_smile:

I will file an issue on purrrlyr repo with no expectation that this will be resolved any time soon.


closed #11

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.
