Check if all variables of a data frame are converted to text

I have many columns, and they can hold many kinds of values: names, dates, numbers, emails, accented words. So the data can have many variable types.

I want to check whether all variables have been converted to text.
Is there a function for that in R?

If your data frame is named DF, this will return TRUE if all of the columns are of the class character.

all("character" == sapply(DF, class))
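One caveat, offered as a minimal sketch rather than a definitive fix: class() can return more than one value for a column (a date-time column, for example, has class c("POSIXct", "POSIXt")), in which case sapply(DF, class) returns a list and the comparison is harder to interpret. Testing each column with is.character() through vapply() avoids that. The data frame DF and its columns below are made-up example data, not the poster's.

```r
# Hypothetical example data: one character column, one numeric, one date-time
DF <- data.frame(
  name = c("a", "b"),
  age = c(22, 45),
  seen = as.POSIXct(c("2020-09-01", "2020-09-02"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# One TRUE/FALSE per column: is the column a character vector?
is_chr <- vapply(DF, is.character, logical(1))
is_chr

# TRUE only if every column is character
all(is_chr)

# Names of the columns that are not (yet) character
names(DF)[!is_chr]
```

The last line is handy when the overall check is FALSE and you want to know which columns still need converting.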

OK, but this gives TRUE or FALSE for the whole data frame. What if I want a summary of all columns to check their data types?

sapply(DF, class) returns a vector showing the class of each column. For a visual check of the structure of the data frame, you might find the output of str(DF) and summary(DF) to be useful.

OK, but that gives an overall summary, like mean and median.
I just want a summary of each column's class, so that I can then change it to character or text, because some of the column names can have accents or be alphanumeric.

df11 <- data.frame(uniq_id=c(9143,2357,4339,8927,9149,4285,2683,8217,3702,7857,3255,4262,8501,7111,2681,6970),
                   name=c("xly,mnn","xab,Lan","mhy,mun","vgtu,mmc","ftu,sdh","kull,nnhu","hula,njam","mund,jiha","htfy,ntha","sghu,njui","sgyu,hytb","vdti,kula","mftyu,huta","mhuk,ghul","cday,bhsue","ajtu,nudj"),
                   city=c("A","B","C","C","D","F","S","C","E","S","A","B","W","S","C","A"),
                   age=c(22,45,67,34,43,22,34,43,34,52,37,44,41,40,39,30),
                   country=c("usa","USA","AUSTRALI","AUSTRALIA","uk","UK","SPAIN","SPAIN","CHINA","CHINA","BRAZIL","BRASIL","CHILE","USA","CANADA","UK"),
                   lang=c("ENGLISH(US)","ENGLISH(US)","ENGLISH","ENGLISH","ENGLISH(UK)","ENGLISH(UK)","SPANISH","SPANISH","CHINESE","CHINESE","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH(US)"),
                   gender=c("MALE","FEMALE","male","m","f","MALE","FEMALE","f","FEMALE","MALE","MALE","MALE","FEMALE","FEMALE","MALE","MALE"))

In this case the result should be a summary like the following, which I can export to Excel:

Column_Name Class
uniq_id numeric
city character
name character
age numeric
gender character

df11 <- data.frame(uniq_id=c(9143,2357,4339,8927,9149,4285,2683,8217,3702,7857,3255,4262,8501,7111,2681,6970),
                   name=c("xly,mnn","xab,Lan","mhy,mun","vgtu,mmc","ftu,sdh","kull,nnhu","hula,njam","mund,jiha","htfy,ntha","sghu,njui","sgyu,hytb","vdti,kula","mftyu,huta","mhuk,ghul","cday,bhsue","ajtu,nudj"),
                   city=c("A","B","C","C","D","F","S","C","E","S","A","B","W","S","C","A"),
                   age=c(22,45,67,34,43,22,34,43,34,52,37,44,41,40,39,30),
                   country=c("usa","USA","AUSTRALI","AUSTRALIA","uk","UK","SPAIN","SPAIN","CHINA","CHINA","BRAZIL","BRASIL","CHILE","USA","CANADA","UK"),
                   lang=c("ENGLISH(US)","ENGLISH(US)","ENGLISH","ENGLISH","ENGLISH(UK)","ENGLISH(UK)","SPANISH","SPANISH","CHINESE","CHINESE","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH(US)"),
                   gender=c("MALE","FEMALE","male","m","f","MALE","FEMALE","f","FEMALE","MALE","MALE","MALE","FEMALE","FEMALE","MALE","MALE"))

ClassSumm <- sapply(df11, class)
ClassSumm
#>     uniq_id        name        city         age     country        lang 
#>   "numeric" "character" "character"   "numeric" "character" "character" 
#>      gender 
#> "character"
library(tibble)
SummaryDF <- tibble(ColumnName = colnames(df11), Class = ClassSumm)
SummaryDF
#> # A tibble: 7 x 2
#>   ColumnName Class    
#>   <chr>      <chr>    
#> 1 uniq_id    numeric  
#> 2 name       character
#> 3 city       character
#> 4 age        numeric  
#> 5 country    character
#> 6 lang       character
#> 7 gender     character

# If you want that in a function
ColSumFunc <- function(DF) {
  # Use the function argument DF, not the global df11
  ClassSumm <- sapply(DF, class)
  tibble::tibble(ColumnName = colnames(DF), Class = ClassSumm)
}

SummaryDF2 <- ColSumFunc(df11)
SummaryDF2
#> # A tibble: 7 x 2
#>   ColumnName Class    
#>   <chr>      <chr>    
#> 1 uniq_id    numeric  
#> 2 name       character
#> 3 city       character
#> 4 age        numeric  
#> 5 country    character
#> 6 lang       character
#> 7 gender     character

Created on 2020-09-23 by the reprex package (v0.3.0)
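Since the question also mentions changing columns to character and getting the summary into Excel, here is a rough sketch of both steps in base R. It assumes a CSV file is an acceptable way to get the table into Excel; the small data frame DF, the name class_summary, and the output path are made up for illustration.

```r
# Small made-up data frame standing in for df11
DF <- data.frame(
  uniq_id = c(9143, 2357),
  name = c("xly,mnn", "xab,Lan"),
  age = c(22, 45),
  stringsAsFactors = FALSE
)

# Convert every column to character; assigning into DF[] keeps it a data frame
DF[] <- lapply(DF, as.character)

# One row per column; paste() collapses multi-class columns into a single string
class_summary <- data.frame(
  Column_Name = names(DF),
  Class = vapply(DF, function(x) paste(class(x), collapse = "/"), character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
class_summary

# Write the summary to a CSV file, which Excel can open
out_file <- file.path(tempdir(), "class_summary.csv")  # hypothetical location
write.csv(class_summary, out_file, row.names = FALSE)
```

After the conversion, every row of class_summary should report "character". Writing a native .xlsx file would need an add-on package, which this sketch deliberately leaves out.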
