[Help] I want to export a summary table of all of my datasets

Dear all,

Is there anyone who can help me, please? :frowning:
I have created a large number of datasets in RStudio, as you can see in the right part of the image below, and I would like to build a simple summary table that contains only the name, number of observations, and number of variables of each dataset. (I want the table to look just like the RIGHT PART OF THE IMAGE below.)
Ideally the table would be in Excel or CSV form.

PLEASE HELP!

Thank you all!

Here is an example of one approach, using two standard datasets:

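library(tidyverse)
library(purrr)
cars <- mtcars
chicks <- chickwts
# Avoid calling the object `list`, which would shadow the name of base::list()
datasets <- list(cars, chicks)

# Build one row per dataset. map_dfc(datasets, dim) would instead put each
# dataset's dimensions in its own column, so the rows/columns labels end up swapped.
map_dfr(datasets, ~ tibble(rows = nrow(.x), columns = ncol(.x)))
#> # A tibble: 2 x 2
#>    rows columns
#>   <int>   <int>
#> 1    32      11
#> 2    71       2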

Created on 2020-05-16 by the reprex package (v0.3.0)

Here is another approach. The lines where I load the data sets mtcars, iris and iris3 are only there to provide some objects for the reproducible example. You can write the object TBL as a csv or an Excel file.

library(tibble)
library(purrr)
M <- mtcars
I <- iris
I3 <- iris3 #not a data frame
DataNames <- ls()
GetInfo <- function(x) {
  Obj <- get(x)
  if (grepl("data.frame", class(Obj))) {
    Name <- x
    Obs <-  nrow(Obj)
    Vars <- ncol(Obj)
  } else {
    Name <-  x
    Obs <-  NA
    Vars <-  NA
  }
  tibble(Name = Name, Observations = Obs, Variables = Vars)
  
}
TBL <- map_dfr(DataNames, GetInfo )
TBL
#> # A tibble: 3 x 3
#>   Name  Observations Variables
#>   <chr>        <int>     <int>
#> 1 I              150         5
#> 2 I3              NA        NA
#> 3 M               32        11

Created on 2020-05-16 by the reprex package (v0.3.0)
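To make the last step concrete, here is a minimal sketch of writing such a table to a CSV file with base R. The TBL below is a hand-built stand-in with the values from the printed output above, and the file name dataset_summary.csv is only an illustrative assumption.

```r
# Stand-in for the TBL object built above (values copied from the printed output).
TBL <- data.frame(
  Name = c("I", "I3", "M"),
  Observations = c(150L, NA, 32L),
  Variables = c(5L, NA, 11L)
)

# Hypothetical output path; tempdir() keeps the sketch out of the working directory.
out_file <- file.path(tempdir(), "dataset_summary.csv")

# row.names = FALSE leaves the 1, 2, 3 row labels out of the file.
write.csv(TBL, out_file, row.names = FALSE)

# Read the file back to check what was written.
check <- read.csv(out_file)
print(check)
```

For an Excel file, one option is writexl::write_xlsx(TBL, "dataset_summary.xlsx"), assuming the writexl package is installed.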

Thank you for your help. It runs, but it shows all of the values as NA :frowning:
In case it is a useful hint: all of the datasets I made in RStudio were created with the dplyr package (tbl_df).

Please don't use images to communicate code and console output. It's all text in RStudio, so copy and paste the text here. Put three backticks on their own line before and after the code to format it as code, so it's nice to read.

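OK. With that out of the way, I think it's a mistake to grepl the result of class. Rather, use the is.data.frame function to test for that more directly. Tibbles have 3 classes after all (tbl_df, tbl and data.frame), so grepl returns one value per class, the first of which is FALSE, and if() does not get a single TRUE to work with.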
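A small sketch of the difference, for illustration. To keep it runnable without loading tibble, the three-class vector is written out by hand and attached to mtcars; the object names are made up for the example.

```r
# A tibble carries three classes; written out by hand here so the
# sketch runs without the tibble package.
tibble_classes <- c("tbl_df", "tbl", "data.frame")

plain_df <- mtcars
tbl_like <- structure(mtcars, class = tibble_classes)

# One class, so grepl gives a single value that if() can use.
plain_check <- grepl("data.frame", class(plain_df))

# Three classes, so grepl gives three values and the first one is FALSE.
tbl_check <- grepl("data.frame", class(tbl_like))

print(plain_check)
print(tbl_check)

# is.data.frame always returns a single TRUE or FALSE.
print(is.data.frame(plain_df))
print(is.data.frame(tbl_like))
```

Older versions of R only looked at the first element of a longer condition in if() (with a warning), which would send a tibble down the NA branch; since R 4.2.0 a condition of length greater than one is an error.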

You are right. I should have used copy and paste to ask and share with people here. Thank you for letting me know.

I created all of the datasets with the code below.

original_data <- read.csv("00000.csv",header = TRUE, sep = ",", na.strings = "NA")
str(original_data)
library(dplyr)
data <- tbl_df(original_data)
data

TV_Yes <- filter(data, A10>="2")
TV_Yes <- TV_Yes[,-1:-496]

> is.data.frame(TV_Yes)
[1] TRUE

Since all of the datasets were made in the same way,
they are all data frames.

I don't know why it doesn't work even though they are all data frames, as you can see below.

> library(tibble)
> library(purrr)
> DataNames <- ls()
> GetInfo <- function(x) {
+   Obj <- get(x)
+   if (grepl("data.frame", class(Obj))) {
+     Name <- x
+     Obs <-  nrow(Obj)
+     Vars <- ncol(Obj)
+   } else {
+     Name <-  x
+     Obs <-  NA
+     Vars <-  NA
+   }
+   tibble(Name = Name, Observations = Obs, Variables = Vars)
+   
+ }
> TBL <- map_dfr(DataNames, GetInfo )
> TBL
# A tibble: 160 x 3
   Name                    Observations Variables
   <chr>                   <lgl>        <lgl>    
 1 art_No                  NA           NA       
 2 art_Yes                 NA           NA       
 3 Before_sleep_music      NA           NA       
 4 Before_sleep_nothing    NA           NA       
 5 Before_sleep_PC         NA           NA       
 6 Before_sleep_smartphone NA           NA       
 7 Before_sleep_TV         NA           NA       
 8 CableTV_No              NA           NA       
 9 CableTV_Yes             NA           NA       
10 cafe_No                 NA           NA       
# ... with 150 more rows

Change the line if (grepl("data.frame", class(Obj))) { to

if (is.data.frame(Obj)){
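If the non-data-frame objects are not needed in the table at all, the same test can be used to drop them up front. This is only a sketch in base R under that assumption; the environment env and the object names in it are placeholders standing in for the real workspace.

```r
# Example objects standing in for the real datasets (names are illustrative).
env <- new.env()
assign("cars", mtcars, envir = env)
assign("flowers", iris, envir = env)
assign("not_a_df", iris3, envir = env)  # a 3-d array, should be skipped

# Fetch every object in the environment, then keep only the data frames.
objs <- mget(ls(env), envir = env)
dfs <- Filter(is.data.frame, objs)

# One row per data frame: its name, number of rows and number of columns.
summary_tbl <- data.frame(
  Name = names(dfs),
  Observations = vapply(dfs, nrow, integer(1)),
  Variables = vapply(dfs, ncol, integer(1)),
  row.names = NULL
)
print(summary_tbl)
```

In an interactive session, mget(ls(), envir = globalenv()) would pick up the objects in the workspace instead of the placeholder environment.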

Wow, it works!
This is exactly what I wanted!!

Thank you so much for all of your help!

> library(tibble)
> library(purrr)
> 
> DataNames <- ls()
> GetInfo <- function(x) {
+   Obj <- get(x)
+   if (is.data.frame(Obj)){
+     Name <- x
+     Obs <-  nrow(Obj)
+     Vars <- ncol(Obj)
+   } else {
+     Name <-  x
+     Obs <-  NA
+     Vars <-  NA
+   }
+   tibble(Name = Name, Observations = Obs, Variables = Vars)
+   
+ }
> TBL <- map_dfr(DataNames, GetInfo )
> TBL
# A tibble: 160 x 3
   Name                    Observations Variables
   <chr>                          <int>     <int>
 1 art_No                           946        18
 2 art_Yes                          254        18
 3 Before_sleep_music                32        18
 4 Before_sleep_nothing              44        18
 5 Before_sleep_PC                   99        18
 6 Before_sleep_smartphone          715        18
 7 Before_sleep_TV                  235        18
 8 CableTV_No                       135        18
 9 CableTV_Yes                      968        18
10 cafe_No                          794        18
# ... with 150 more rows