Problem with unnest_tokens function.

Hello. I have a problem with this code.

library(ggplot2)
library(magrittr)
library(dplyr)
library(tidytext)
library(gutenbergr)
TTLutS_.ted=gutenberg_download(164, mirror = "something")
Ulysses_.ted=gutenberg_download(4300, mirror = "something")
TTLutS=tibble(lines = 1:12135, text = TTLutS_.ted)
Ulysses=tibble(lines = 1:32834, text = Ulysses_.ted)
TTLutS.words <- TTLutS %>% unnest_tokens(word, text)
Ulysses.words <- Ulysses %>% unnest_tokens(word, text)
TTLutS.words %>% count(word, sort = TRUE)
Ulysses.words %>% count(word, sort = TRUE)

When I try to run this, I get this error.

Error in 'is_corpus_df(corpus)':
all(names(corpus)[1L:2L] == c("doc_id", "text")) is not TRUE

I don't know how to fix this.

suppressPackageStartupMessages({
  library(dplyr)
  library(tidytext)
  library(gutenbergr)
})


TTLutS_.ted <- gutenberg_download(164, mirror = "http://mirrors.xmission.com/gutenberg/")
Ulysses_.ted <- gutenberg_download(4300, mirror = "http://mirrors.xmission.com/gutenberg/")
TTLutS <- tibble(TTLutS_.ted)
Ulysses <- tibble(Ulysses_.ted)
TTLutS.words <- TTLutS %>% unnest_tokens(word, text)
Ulysses.words <- Ulysses %>% unnest_tokens(word, text)
TTLutS.words %>% count(word, sort = TRUE)
#> # A tibble: 8,780 x 2
#>    word      n
#>    <chr> <int>
#>  1 the    8425
#>  2 of     4011
#>  3 and    2580
#>  4 to     2527
#>  5 i      2177
#>  6 a      2056
#>  7 in     1549
#>  8 was    1328
#>  9 it     1291
#> 10 that   1028
#> # … with 8,770 more rows
Ulysses.words %>% count(word, sort = TRUE)
#> # A tibble: 30,645 x 2
#>    word      n
#>    <chr> <int>
#>  1 the   14902
#>  2 of     8141
#>  3 and    7207
#>  4 a      6493
#>  5 to     4954
#>  6 in     4930
#>  7 he     4029
#>  8 his    3328
#>  9 i      2682
#> 10 that   2603
#> # … with 30,635 more rows
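
As for why the original version fails, my reading is this: gutenberg_download() returns a data frame (with gutenberg_id and text columns), so tibble(lines = ..., text = TTLutS_.ted) stores that whole data frame inside a single column called text. unnest_tokens() then hands a data frame to the tokenizer instead of a character vector, and the tokenizer checks whether its first two columns are named doc_id and text, which they are not. Here is a small sketch that shows the difference without downloading anything. The book object is a made-up stand-in for the downloaded data, and adding lines with row_number() is just one way to keep a line counter.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the result of gutenberg_download():
# a data frame with a gutenberg_id column and a character text column.
book <- tibble(
  gutenberg_id = c(164L, 164L),
  text = c("The sea was calm", "and the sea was deep")
)

# Wrapping the whole data frame in a new column nests it:
# nested$text is a data frame here, not a character vector.
nested <- tibble(lines = 1:2, text = book)
is.data.frame(nested$text)

# Tokenizing the original data frame works, because its text
# column is a character vector. row_number() adds the line counter.
words <- book %>%
  mutate(lines = row_number()) %>%
  unnest_tokens(word, text)

words %>% count(word, sort = TRUE)
```

If you want the lines column from your original code, adding it with mutate() before unnest_tokens() as above also avoids hard-coding the row counts.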

Thank you very much. It really works.
