Issue with unnest_tokens

Hi there,

I am having some trouble with unnest_tokens. I get the following error:

Error:
! Must extract column with a single valid subscript.
✖ Subscript `!!enquo(var)` has the wrong type `function`.
ℹ It must be numeric or character.

which I don't understand as 'Description.of.the.call' is a character field.

Here is my reprex (hopefully done correctly)

install.packages("tidytext")
#> 
#> The downloaded binary packages are in
#>  /var/folders/n4/v7p943711hgcdq67m0w2p2hm0000gp/T//Rtmp9YfBDv/downloaded_packages
install.packages("textdata")
#> 
#> The downloaded binary packages are in
#>  /var/folders/n4/v7p943711hgcdq67m0w2p2hm0000gp/T//Rtmp9YfBDv/downloaded_packages
install.packages("datapasta")
#> 
#> The downloaded binary packages are in
#>  /var/folders/n4/v7p943711hgcdq67m0w2p2hm0000gp/T//Rtmp9YfBDv/downloaded_packages

library(tidytext)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(stringr)
library(textdata)
library(reprex)
library(datapasta)

sa <- read.csv("reprex_call_data.csv")
#> Warning in file(file, "rt"): cannot open file 'reprex_call_data.csv': No such
#> file or directory
#> Error in file(file, "rt"): cannot open the connection

sa<-tibble::tribble(
  ~ID,                  ~Location,                                         ~Uncontactable.reason,    ~Date.of.call,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         ~Description.of.the.call,        ~Areas.of.concern,
  1L,               "Gold Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:30",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  2L,               "Gold Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:29",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  3L,               "Gold Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:29",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  4L,               "Gold Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:27",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  5L,              "Western Qld", "Left message for them to call back if interested in talking", "5/3/2021 16:26",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  6L, "Darling Downs South West", "Left message for them to call back if interested in talking", "5/3/2021 16:25",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  7L,           "Sunshine Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:25",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  8L,               "Gold Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:24",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  9L,               "Gold Coast", "Left message for them to call back if interested in talking", "5/3/2021 16:23",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               NA,                       NA,
  10L,           "Sunshine Coast",                                                            NA,       "2/3/2021", "Surrounded affronting favourable no mr. Lain knew like half she yet joy. Be than dull as seen very shot. Attachment ye so am travelling estimating projecting is. Off fat address attacks his besides. Suitable settling mr attended no doubtful feelings. Any over for say bore such sold five but hung. Demesne far hearted suppose venture excited see had has. Dependent on so extremely delivered by. Yet no jokes worse her why. Bed one supposing breakfast day fulfilled off depending questions. Whatever boy her exertion his extended. Ecstatic followed handsome drawings entirely mrs one yet outweigh. Of acceptance insipidity remarkably is invitation. Full he none no side. Uncommonly surrounded considered for him are its. It we is read good soon. My to considered delightful invitation announcing of no decisively boisterous. Did add dashwoods deficient man concluded additions resources. Or landlord packages overcame distance smallest in recurred. Wrong maids or be asked no on enjoy. Household few sometimes out attending described. Lain just fact four of am meet high. Can curiosity may end shameless explained. True high on said mr on come. An do mr design at little myself wholly entire though. Attended of on stronger or mr pleasure. Rich four like real yet west get. Felicity in dwelling to drawings. His pleasure new steepest for reserved formerly disposed jennings.",          "Lorem; Ipsum;",
  11L,                 "Brisbane",                                                            NA,       "3/2/2021",                                                                                                                                                                                                                                               "May musical arrival beloved luckily adapted him. Shyness mention married son she his started now. Rose if as past near were. To graceful he elegance oh moderate attended entrance pleasure. Vulgar saw fat sudden edward way played either. Thoughts smallest at or peculiar relation breeding produced an. At depart spirit on stairs. She the either are wisdom praise things she before. Be mother itself vanity favour do me of. Begin sex was power joy after had walls miles. Unwilling sportsmen he in questions september therefore described so. Attacks may set few believe moments was. Reasonably how possession shy way introduced age inquietude. Missed he engage no exeter of. Still tried means we aware order among on. Eldest father can design tastes did joy settle. Roused future he ye an marked. Arose mr rapid in so vexed words. Gay welcome led add lasting chiefly say looking. Unfeeling so rapturous discovery he exquisite. Reasonably so middletons or impression by terminated. Old pleasure required removing elegance him had. Down she bore sing saw calm high. Of an or game gate west face shed. no great but music too old found arose.",       "Married Moderate",
  12L,                 "Brisbane",                                                            NA,       "4/2/2021",                                                                                                                                                                                                                     "Guest it he tears aware as. Make my no cold of need. He been past in by my hard. Warmly thrown oh he common future. Otherwise concealed favourite frankness on be at dashwoods defective at. Sympathize interested simplicity at do projecting increasing terminated. As edward settle limits at in. Sing long her way size. Waited end mutual missed myself the little sister one. So in pointed or chicken cheered neither spirits invited. Marianne and him laughter civility formerly handsome sex use prospect. Hence we doors is given rapid scale above am. Difficult ye mr delivered behaviour by an. If their woman could do wound on. You folly taste hoped their above are and but. Little afraid its eat looked now. Very ye lady girl them good me make. It hardly cousin me always. An shortly village is raising we shewing replied. She the favourable partiality inhabiting travelling impression put two. His six are entreaties instrument acceptance unsatiable her. Amongst as or on herself chapter entered carried no. Sold old ten are quit lose deal his sent. You correct how sex several far distant believe journey parties. We shyness enquire uncivil affixed it carried to.", "Terminated; sympathize",
  13L,                 "Brisbane",                                                            NA,       "5/2/2021",                                                                                                                                                                                                                                                                                                                                                                                                                        "Wrote water woman of heart it total other. By in entirely securing suitable graceful at families improved. Zealously few furniture repulsive was agreeable consisted difficult. Collected breakfast estimable questions in to favourite it. Known he place worth words it as to. Spoke now noise off smart her ready. Sudden looked elinor off gay estate nor silent. Son read such next see the rest two. Was use extent old entire sussex. Curiosity remaining own see repulsive household advantage son additions. Supposing exquisite daughters eagerness why repulsive for. Praise turned it lovers be warmly by. Little do it eldest former be if. Uneasy barton seeing remark happen his has. Am possible offering at contempt mr distance stronger an. Attachment excellence announcing or reasonable am on if indulgence. Exeter talked in agreed spirit no he unable do. Betrayed shutters in vicinity it unpacked in. In so impossible appearance considered mr. Mrs him left find are good.",   "breakfast; questions",
  14L,                  "Ipswich",                                                            NA,       "6/2/2021",                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   "Are sentiments apartments decisively the especially alteration. Thrown shy denote ten ladies though ask saw. Or by to he going think order event music. Incommode so intention defective at convinced. Led income months itself and houses you. After nor you leave might share court balls.",                  "music"
)
head(sa)
#> # A tibble: 6 × 6
#>      ID Location                 Uncontactable.reason    Date.…¹ Descr…² Areas…³
#>   <int> <chr>                    <chr>                   <chr>   <chr>   <chr>  
#> 1     1 Gold Coast               Left message for them … 5/3/20… <NA>    <NA>   
#> 2     2 Gold Coast               Left message for them … 5/3/20… <NA>    <NA>   
#> 3     3 Gold Coast               Left message for them … 5/3/20… <NA>    <NA>   
#> 4     4 Gold Coast               Left message for them … 5/3/20… <NA>    <NA>   
#> 5     5 Western Qld              Left message for them … 5/3/20… <NA>    <NA>   
#> 6     6 Darling Downs South West Left message for them … 5/3/20… <NA>    <NA>   
#> # … with abbreviated variable names ¹​Date.of.call, ²​Description.of.the.call,
#> #   ³​Areas.of.concern

sa<- sa %>%
  mutate(Description.of.the.call = str_replace_all(Description.of.the.call, "\\#", "") %>% 
           str_squish() %>%
           str_replace_all("\\_", "") %>%
           str_replace_all("\\n", "")
         )

sa<- sa %>%
  mutate(Areas.of.concern = str_replace_all(Areas.of.concern, "\\#", "") %>% 
           str_squish() %>%
           str_replace_all("\\_", "") %>%
           str_replace_all("\\n", "")
  )

sa <- tibble(sa)

remove_reg <- "&amp;|&lt;|&gt;"
tidy_sa <- sa %>% 
  select(Description.of.the.call) %>%
  unnest_tokens(word, text) %>%
  filter(!word %in% stop_words$word,
         !word %in% str_remove_all(stop_words$word, "'"),
         str_detect(word, "[a-z]"))
#> Error:
#> ! Must extract column with a single valid subscript.
#> ✖ Subscript `!!enquo(var)` has the wrong type `function`.
#> ℹ It must be numeric or character.

#> Backtrace:
#>      ▆
#>   1. ├─... %>% ...
#>   2. ├─dplyr::filter(...)
#>   3. └─tidytext::unnest_tokens(., word, text)
#>   4.   ├─dplyr::pull(tbl, !!input)
#>   5.   └─dplyr:::pull.data.frame(tbl, !!input)
#>   6.     └─tidyselect::vars_pull(names(.data), !!enquo(var))
#>   7.       └─tidyselect:::pull_as_location2(...)
#>   8.         ├─tidyselect:::with_subscript_errors(...)
#>   9.         │ └─rlang::try_fetch(...)
#>  10.         │   └─base::withCallingHandlers(...)
#>  11.         └─vctrs::vec_as_subscript2(...)
#>  12.           └─vctrs:::result_get(...)
#>  13.             └─rlang::cnd_signal(x$err)
Created on 2022-10-28 with reprex v2.0.2

Any help or advice would be much appreciated.

I guess the problem is that unnest_tokens needs three arguments:
1. tbl: the data frame, which the pipe passes in ("sa" in your case),
2. output: the name of the column to create, which is "word" in your case, and
3. input: the column to tokenize, which is "text" in your case. However, you want to tokenize the column "Description.of.the.call". If you pass this column to unnest_tokens instead of "text", I guess your code will work.

unnest_tokens(
  tbl,
  output,
  input,
  token = "words",
  format = c("text", "man", "latex", "html", "xml"),
  to_lower = TRUE,
  drop = TRUE,
  collapse = NULL,
  ...
)
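
As for why the error mentions a `function`: `sa` has no column named `text`, so the name most likely falls through to R's graphics function `text()`, which `pull()` cannot use as a column subscript. Here is a minimal sketch of both the failure and the fix, using a hypothetical two-row tibble `calls` as a stand-in for the original data:

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for `sa`; only the column names mirror the original
calls <- tibble::tibble(
  ID = 1:2,
  Description.of.the.call = c("Left a message", "Caller asked about music")
)

# `text` is not a column of `calls`, so the lookup finds a function instead
# and unnest_tokens() fails; tryCatch() captures the message for inspection
err <- tryCatch(
  unnest_tokens(calls, word, text),
  error = function(e) conditionMessage(e)
)
is.character(err)

# Naming a column that actually exists tokenizes one word per row
tokens <- unnest_tokens(calls, word, Description.of.the.call)
tokens
```

With the default `drop = TRUE`, the input column is removed and the new `word` column is added alongside the remaining columns such as `ID`.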

Thanks @melih_guven - that worked.

I tokenised Description.of.the.call instead of text, i.e.

tidy_sa <- sa %>% 
  unnest_tokens(word, Description.of.the.call) %>%
  mutate(linenumber = row_number()) %>%
  filter(!word %in% stop_words$word,
         !word %in% str_remove_all(stop_words$word, "'"),
         str_detect(word, "[a-z]"))
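
One thing to note for anyone adapting this: because `row_number()` runs after `unnest_tokens()`, `linenumber` here counts tokens rather than source rows. If the goal is to trace each word back to the call it came from, a sketch along these lines numbers the rows first (again using a hypothetical tibble `calls`, not the original data):

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for `sa`
calls <- tibble::tibble(
  ID = 1:2,
  Description.of.the.call = c("Left a message", "Asked about music")
)

tidy_calls <- calls %>%
  mutate(linenumber = row_number()) %>%           # number the source rows first
  unnest_tokens(word, Description.of.the.call)    # then split into one word per row

tidy_calls
```

Each word then carries the `linenumber` (and `ID`) of the row it was extracted from, so later filtering on stop words does not lose that link.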

Many thanks!
