Problem on visualizing the structure (str())


#1

Hi all, I have a problem with the output of str(fin) that is annoying me.
Why does my str() output always look like this, with attributes and all that confusing stuff at the bottom?

str(fin)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 500 obs. of 11 variables:
 $ ID       : Factor w/ 500 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Name     : Factor w/ 500 levels "Abstractedchocolat",..: 297 451 168 40 485 199 435 339 242 395 ...
 $ Industry : Factor w/ 7 levels "Construction",..: 7 5 6 5 7 5 2 1 5 2 ...
 $ Inception: Factor w/ 16 levels "1999","2000",..: 8 11 14 13 15 15 11 15 11 12 ...
 $ Employees: Factor w/ 201 levels "1","2","3","5",..: 24 35 NA 64 44 58 98 68 54 24 ...
 $ State    : Factor w/ 42 levels "AL","AZ","CA",..: 36 33 35 3 41 27 22 29 3 8 ...
 $ City     : Factor w/ 297 levels "Addison","Alexandria",..: 94 181 105 195 151 154 53 295 232 26 ...
 $ Revenue  : Factor w/ 498 levels "$1,614,585","1,835,717",..: 479 194 485 246 402 141 308 NA 96 117 ...
 $ Expenses : Factor w/ 497 levels "1,026,548 Dollars",..: 6 485 3 248 227 247 57 NA 402 495 ...
 $ Profit   : Factor w/ 498 levels "12434","46851",..: 342 476 348 420 150 321 125 NA 195 446 ...
 $ Growth   : Factor w/ 32 levels "-2%","-3%","0%",..: 14 16 11 14 14 18 12 NA 26 16 ...

 - attr(*, "spec")=List of 2
  ..$ cols   :List of 11
  .. ..$ ID       : list()
  .. .. ..- attr(*, "class")= chr "collector_integer" "collector"
  .. ..$ Name     : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  .. ..$ Industry : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  .. ..$ Inception: list()
  .. .. ..- attr(*, "class")= chr "collector_integer" "collector"
  .. ..$ Employees: list()
  .. .. ..- attr(*, "class")= chr "collector_integer" "collector"
  .. ..$ State    : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  .. ..$ City     : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  .. ..$ Revenue  : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  .. ..$ Expenses : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  .. ..$ Profit   : list()
  .. .. ..- attr(*, "class")= chr "collector_integer" "collector"
  .. ..$ Growth   : list()
  .. .. ..- attr(*, "class")= chr "collector_character" "collector"
  ..$ default: list()
  .. ..- attr(*, "class")= chr "collector_guess" "collector"
  ..- attr(*, "class")= chr "col_spec"

In all the tutorials, the str() output is clean and contains only the info needed.

How can I solve this problem?


#2

That certainly is noisy :open_mouth: str takes a give.attr parameter whose default is TRUE, so you can call str(fin, give.attr = FALSE) to cut down the noise. If you do this a lot, it's probably worth making a snippet in your editor of choice to save some typing! (here's how to do that in the RStudio IDE)
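
To make the difference concrete, here is a minimal sketch in base R. The data frame fin_demo and its "spec" attribute are made up for illustration (they only stand in for fin and the readr column spec); give.attr is the real argument of str().

```r
# Hypothetical stand-in for `fin`: a small data frame with an extra attribute
fin_demo <- data.frame(ID = 1:3, Profit = c(12434, 46851, 8701))
attr(fin_demo, "spec") <- list(cols = list(ID = "integer", Profit = "double"))

# Default (give.attr = TRUE): the "spec" attribute is printed after the columns
str(fin_demo)

# give.attr = FALSE: only the column summary is printed
str(fin_demo, give.attr = FALSE)
```

The second call should show just the header line and one line per column, which is closer to what the tutorials display.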

Your average data frame doesn't have as many attributes as the one you're working with, and most tutorials use deliberately simplified datasets that also may not have a lot (or any) info stored as attributes.

If you're mainly using str() to get a compact look at your data, you might also try tibble::glimpse, which does something similar and doesn't print attributes. It has methods for both tibbles and plain data frames; for any other kind of object, glimpse falls back to the output of str().
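
A quick sketch of the glimpse route, assuming the tibble package is installed (the call is skipped otherwise). The data frame glimpse_demo and its attribute are invented for the example.

```r
# Hypothetical data frame carrying an extra attribute, as readr output would
glimpse_demo <- data.frame(ID = 1:3, State = c("CA", "NY", "TX"))
attr(glimpse_demo, "spec") <- "parser metadata"

# glimpse() prints one line per column and leaves attributes out
if (requireNamespace("tibble", quietly = TRUE)) {
  tibble::glimpse(glimpse_demo)
}
```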


#3

Thanks, you are amazing, I resolved it.


#4

Those particular attributes are created by readr. By design, they (and all attributes of a tibble) are automatically dropped after you modify the data, e.g. with mutate.

Reprex:

library(readr)

cars1 <- read_csv(readr_example("mtcars.csv"), 
                  col_types = cols_only(mpg = "d", cyl = "i"))

str(cars1)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    32 obs. of  2 variables:
#>  $ mpg: num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#>  $ cyl: int  6 6 4 6 8 6 8 4 4 6 ...
#>  - attr(*, "spec")=
#>   .. cols_only(
#>   ..   mpg = col_double(),
#>   ..   cyl = col_integer(),
#>   ..   disp = col_skip(),
#>   ..   hp = col_skip(),
#>   ..   drat = col_skip(),
#>   ..   wt = col_skip(),
#>   ..   qsec = col_skip(),
#>   ..   vs = col_skip(),
#>   ..   am = col_skip(),
#>   ..   gear = col_skip(),
#>   ..   carb = col_skip()
#>   .. )

cars2 <- tibble::rowid_to_column(cars1) 

str(cars2)
#> Classes 'tbl_df', 'tbl' and 'data.frame':    32 obs. of  3 variables:
#>  $ rowid: int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ mpg  : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#>  $ cyl  : int  6 6 4 6 8 6 8 4 4 6 ...

In your case, they're actually giving you useful information, though. All your data has been turned into factors (which is a problem—factors of numbers are a frequent source of bugs), but the attributes list correct types for most of the columns. Something strange is afoot.
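
For anyone wondering why factors of numbers cause bugs, here is a small base R sketch. The vectors are hypothetical, loosely modelled on the Profit and Revenue columns above, and the gsub() pattern is just one assumed way to strip currency formatting, not necessarily the right cleanup for the real file.

```r
# Hypothetical factor resembling the Profit column
profit <- factor(c("12434", "46851", "12434", NA))

# Pitfall: as.numeric() on a factor returns the internal level codes,
# not the numbers shown in the labels
profit_codes <- as.numeric(profit)

# Going through as.character() first recovers the actual values
profit_num <- as.numeric(as.character(profit))

# Hypothetical factor resembling the Revenue column; drop everything
# except digits, dots and minus signs before converting
revenue <- factor(c("$1,614,585", "1,835,717"))
revenue_num <- as.numeric(gsub("[^0-9.-]", "", as.character(revenue)))

print(profit_codes)
print(profit_num)
print(revenue_num)
```

Comparing profit_codes with profit_num shows how silently the wrong numbers can slip into later calculations.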