I have many columns, and they can hold many kinds of values: names, dates, numbers, emails, accented words. So there can be many different variable types.
I want to check whether all variables have been converted to text.
Is there a function for that in R?
If your data frame is named DF, this will return TRUE if all of the columns are of the class character.
all("character" == sapply(DF, class))
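If it helps, here is a minimal sketch of a per-column version of the same check, using is.character() instead of comparing class names. The data frame and its column names (id, name) are made up for illustration.

```r
# Hypothetical example data; the column names are illustrative only.
DF <- data.frame(
  id = c(1, 2, 3),
  name = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# One logical value per column: TRUE where the column is character.
is_chr <- vapply(DF, is.character, logical(1))
is_chr

# TRUE only when every column is character.
all(is_chr)

# Names of the columns that are not character yet.
names(DF)[!is_chr]
```

vapply() is used here only because it always returns a logical vector of the declared shape; sapply() would work the same way on this data.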
OK, but this gives TRUE or FALSE for the whole data frame. What if I want a summary of all columns, to check what their data types are?
sapply(DF, class)
returns a vector showing the class of each column. For a visual check of the structure of the data frame, you might find the output of str(DF)
and summary(DF)
to be useful.
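One caveat worth keeping in mind: a column can carry more than one class (date-time columns of class POSIXct do), and then sapply(DF, class) returns a list rather than a plain vector. A rough sketch of one way around that, assuming you are happy to paste the classes together into one string per column. The example data and column names are invented.

```r
# Hypothetical example data; "when" has two classes (POSIXct and POSIXt).
DF <- data.frame(
  when = as.POSIXct(c("2020-09-23 10:00:00", "2020-09-24 11:30:00"), tz = "UTC"),
  amount = c(1.5, 2.5),
  label = c("x", "y"),
  stringsAsFactors = FALSE
)

# Collapse each column's class vector into a single string,
# so the result is always a named character vector.
col_classes <- vapply(
  DF,
  function(x) paste(class(x), collapse = "/"),
  character(1)
)
col_classes
```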
OK, but that will give an overall summary, like mean and median.
I just want a summary of the classes, so that I know the class of each column and can then change it to character or text, because some of the column names can have accents or be alphanumeric.
df11 <- data.frame(uniq_id=c(9143,2357,4339,8927,9149,4285,2683,8217,3702,7857,3255,4262,8501,7111,2681,6970),
name=c("xly,mnn","xab,Lan","mhy,mun","vgtu,mmc","ftu,sdh","kull,nnhu","hula,njam","mund,jiha","htfy,ntha","sghu,njui","sgyu,hytb","vdti,kula","mftyu,huta","mhuk,ghul","cday,bhsue","ajtu,nudj"),
city=c("A","B","C","C","D","F","S","C","E","S","A","B","W","S","C","A"),
age=c(22,45,67,34,43,22,34,43,34,52,37,44,41,40,39,30),
country=c("usa","USA","AUSTRALI","AUSTRALIA","uk","UK","SPAIN","SPAIN","CHINA","CHINA","BRAZIL","BRASIL","CHILE","USA","CANADA","UK"),
lang=c("ENGLISH(US)","ENGLISH(US)","ENGLISH","ENGLISH","ENGLISH(UK)","ENGLISH(UK)","SPANISH","SPANISH","CHINESE","CHINESE","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH(US)"),
gender=c("MALE","FEMALE","male","m","f","MALE","FEMALE","f","FEMALE","MALE","MALE","MALE","FEMALE","FEMALE","MALE","MALE"))
In this case it should be a summary like this, exported to Excel:
Column_Name | Class |
---|---|
uniq_id | numeric |
city | character |
name | character |
age | numeric |
gender | character |
df11 <- data.frame(uniq_id=c(9143,2357,4339,8927,9149,4285,2683,8217,3702,7857,3255,4262,8501,7111,2681,6970),
name=c("xly,mnn","xab,Lan","mhy,mun","vgtu,mmc","ftu,sdh","kull,nnhu","hula,njam","mund,jiha","htfy,ntha","sghu,njui","sgyu,hytb","vdti,kula","mftyu,huta","mhuk,ghul","cday,bhsue","ajtu,nudj"),
city=c("A","B","C","C","D","F","S","C","E","S","A","B","W","S","C","A"),
age=c(22,45,67,34,43,22,34,43,34,52,37,44,41,40,39,30),
country=c("usa","USA","AUSTRALI","AUSTRALIA","uk","UK","SPAIN","SPAIN","CHINA","CHINA","BRAZIL","BRASIL","CHILE","USA","CANADA","UK"),
lang=c("ENGLISH(US)","ENGLISH(US)","ENGLISH","ENGLISH","ENGLISH(UK)","ENGLISH(UK)","SPANISH","SPANISH","CHINESE","CHINESE","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH","ENGLISH(US)"),
gender=c("MALE","FEMALE","male","m","f","MALE","FEMALE","f","FEMALE","MALE","MALE","MALE","FEMALE","FEMALE","MALE","MALE"))
ClassSumm <- sapply(df11, class)
ClassSumm
#> uniq_id name city age country lang
#> "numeric" "character" "character" "numeric" "character" "character"
#> gender
#> "character"
library(tibble)
SummaryDF <- tibble(ColumnName = colnames(df11), Class = ClassSumm)
SummaryDF
#> # A tibble: 7 x 2
#> ColumnName Class
#> <chr> <chr>
#> 1 uniq_id numeric
#> 2 name character
#> 3 city character
#> 4 age numeric
#> 5 country character
#> 6 lang character
#> 7 gender character
# If you want that in a function
ColSumFunc <- function(DF) {
  # Use the DF argument, not the global df11, so the function works on any data frame
  ClassSumm <- sapply(DF, class)
  tibble::tibble(ColumnName = colnames(DF), Class = ClassSumm)
}
SummaryDF2 <- ColSumFunc(df11)
SummaryDF2
#> # A tibble: 7 x 2
#> ColumnName Class
#> <chr> <chr>
#> 1 uniq_id numeric
#> 2 name character
#> 3 city character
#> 4 age numeric
#> 5 country character
#> 6 lang character
#> 7 gender character
Created on 2020-09-23 by the reprex package (v0.3.0)
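Since the question also mentions exporting the summary and then changing the columns to text, here is a rough base-R sketch of those two follow-up steps. It assumes a CSV file is acceptable for opening in Excel; the small data frame, the file name column_classes.csv, and the use of tempdir() are all placeholders, not part of the original example.

```r
# Small hypothetical data frame, modelled loosely on df11.
df <- data.frame(
  uniq_id = c(9143, 2357),
  name = c("xly,mnn", "xab,Lan"),
  age = c(22, 45),
  stringsAsFactors = FALSE
)

# Build the column/class summary as a plain data frame.
class_summary <- data.frame(
  ColumnName = names(df),
  Class = vapply(df, function(x) paste(class(x), collapse = "/"), character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# Write the summary to a CSV file, which Excel can open.
# The path is a placeholder; point it wherever you need.
out_file <- file.path(tempdir(), "column_classes.csv")
write.csv(class_summary, out_file, row.names = FALSE)

# Convert every column to character, keeping the data frame structure.
df[] <- lapply(df, as.character)

# Confirm that all columns are now character.
all(vapply(df, is.character, logical(1)))
```

The df[] <- lapply(...) form is used so that the result stays a data frame with the same column names, rather than becoming a plain list.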