Transponse DataFrames

Hello,

I have a data in excel file (xlsx). I use "readxl" library to read it. Where variable names are in the first column and not in the first row.

raw.data <-
  as.data.frame(
    read_excel(
      "data.xlsx",
      range = "B7:NA265",
      col_names = FALSE,
      na = c("", "NA")
    )
  )

output is like this:

some.varnames <- c("1.1.varname", "1.2.varname", "1.3.varname")
some.obervation1 <- c("1", "2", "3")
some.obervation2 <- c("4", "5", "6")
df <- data.frame(X__1 = some.varnames, X__2 = some.obervation1, X__3 = some.obervation2, stringsAsFactors = FALSE)
print(df)

>          X__1 X__2 X__3
> 1 1.1.varname    1    4
> 2 1.2.varname    2    5
> 3 1.3.varname    3    6
str(df)

> 'data.frame':	3 obs. of  3 variables:
>  $ X__1: chr  "1.1.varname" "1.2.varname" "1.3.varname"
>  $ X__2: chr  "1" "2" "3"
>  $ X__3: chr  "4" "5" "6"

now you can see, that in this data all variable names are in the first column. But I want to structe they normally into first row. So column names must have variable names. Therefore I need to transpose this dataframe.
My code:

# transpose df without first column (varnames), to assign they as column names
df.transposed <- as.data.frame(t(df[, -1]))
print(df.transposed)

>                   V1 V2 V3
> some.obervation1   1  2  3
> some.obervation2   4  5  6

# assign variable names to column names
colnames(df.transposed) <- some.varnames
print(df.transposed)

>                  1.1.varname 1.2.varname 1.3.varname
> some.obervation1           1           2           3
> some.obervation2           4           5           6

and that is form that I need. But str() is not that what I need.

str(df.transposed)

> 'data.frame':	2 obs. of  3 variables:
>  $ 1.1.varname: Factor w/ 2 levels "1","4": 1 2
>   ..- attr(*, "names")= chr  "some.obervation1" "some.obervation2"
>  $ 1.2.varname: Factor w/ 2 levels "2","5": 1 2
>   ..- attr(*, "names")= chr  "some.obervation1" "some.obervation2"
>  $ 1.3.varname: Factor w/ 2 levels "3","6": 1 2
>   ..- attr(*, "names")= chr  "some.obervation1" "some.obervation2"

I can't understand, why it outputs this. I used transpose() function of "data.table" library, which worked, but it has some problems with mac and I need only transponse. Therefore I need some easy method that does my job and works on every OS good.

I solved it with an argument in as.data.frame "stringsAsFactors = FALSE"

df.transposed <- as.data.frame(t(df[, -1]), **stringsAsFactors = FALSE**)
colnames(df.transposed) <- some.varnames

str(df.transposed)

> 'data.frame':	2 obs. of  3 variables:
>  $ 1.1.varname: chr  "1" "4"
>  $ 1.2.varname: chr  "2" "5"
>  $ 1.3.varname: chr  "3" "6"

but I can't really understand how and why works it this way. Have you any suggestions?

Welcome to the community!

I think the problem is occurring in this part t(df[, -1]).

When you're skipping the 1st column and transposing it, all elements of the resultant matrix are characters:

> t(df[, -1])
     [,1] [,2] [,3]
X__2 "1"  "2"  "3" 
X__3 "4"  "5"  "6"

Hence, when you're using as.data.frame, all of them are converted to factors, as stringsAsFactors is TRUE by default. Thus, you have to specify it as FALSE yourself.

Hope this helps.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.