Hello,
I have a data in excel file (xlsx). I use "readxl" library to read it. Where variable names are in the first column and not in the first row.
raw.data <-
as.data.frame(
read_excel(
"data.xlsx",
range = "B7:NA265",
col_names = FALSE,
na = c("", "NA")
)
)
output is like this:
some.varnames <- c("1.1.varname", "1.2.varname", "1.3.varname")
some.obervation1 <- c("1", "2", "3")
some.obervation2 <- c("4", "5", "6")
df <- data.frame(X__1 = some.varnames, X__2 = some.obervation1, X__3 = some.obervation2, stringsAsFactors = FALSE)
print(df)
> X__1 X__2 X__3
> 1 1.1.varname 1 4
> 2 1.2.varname 2 5
> 3 1.3.varname 3 6
str(df)
> 'data.frame': 3 obs. of 3 variables:
> $ X__1: chr "1.1.varname" "1.2.varname" "1.3.varname"
> $ X__2: chr "1" "2" "3"
> $ X__3: chr "4" "5" "6"
now you can see, that in this data all variable names are in the first column. But I want to structe they normally into first row. So column names must have variable names. Therefore I need to transpose this dataframe.
My code:
# transpose df without first column (varnames), to assign they as column names
df.transposed <- as.data.frame(t(df[, -1]))
print(df.transposed)
> V1 V2 V3
> some.obervation1 1 2 3
> some.obervation2 4 5 6
# assign variable names to column names
colnames(df.transposed) <- some.varnames
print(df.transposed)
> 1.1.varname 1.2.varname 1.3.varname
> some.obervation1 1 2 3
> some.obervation2 4 5 6
and that is form that I need. But str() is not that what I need.
str(df.transposed)
> 'data.frame': 2 obs. of 3 variables:
> $ 1.1.varname: Factor w/ 2 levels "1","4": 1 2
> ..- attr(*, "names")= chr "some.obervation1" "some.obervation2"
> $ 1.2.varname: Factor w/ 2 levels "2","5": 1 2
> ..- attr(*, "names")= chr "some.obervation1" "some.obervation2"
> $ 1.3.varname: Factor w/ 2 levels "3","6": 1 2
> ..- attr(*, "names")= chr "some.obervation1" "some.obervation2"
I can't understand, why it outputs this. I used transpose() function of "data.table" library, which worked, but it has some problems with mac and I need only transponse. Therefore I need some easy method that does my job and works on every OS good.
I solved it with an argument in as.data.frame "stringsAsFactors = FALSE"
df.transposed <- as.data.frame(t(df[, -1]), **stringsAsFactors = FALSE**)
colnames(df.transposed) <- some.varnames
str(df.transposed)
> 'data.frame': 2 obs. of 3 variables:
> $ 1.1.varname: chr "1" "4"
> $ 1.2.varname: chr "2" "5"
> $ 1.3.varname: chr "3" "6"
but I can't really understand how and why works it this way. Have you any suggestions?