Base scale not working with Tibble

I have a dataset called GSMA that I imported from Excel using readxl. Checking the class of the object returns:

class(GSMA)
[1] "tbl_df"     "tbl"        "data.frame"

I want to standardise columns 2 through 4 using base R's scale(). I try running:

GSMA[2:4] <- scale(GSMA[2:4])

This results in an incorrectly scaled dataframe, with each row having the same value for all columns.

A potential clue to the problem: When I attempt to sort the incorrectly scaled dataframe, this error is returned:

Error in xj[i, , drop = FALSE] : subscript out of bounds

When I re-import the same dataset, and then run:

GSMA <- as.data.frame(GSMA)
GSMA[2:4] <- scale(GSMA[2:4])

The dataframe columns scale correctly.

What is going on? Why is base scale not working in the first instance?



dput(head(GSMA))

structure(list(Country = c("GBR", "CHE", "DEU", "ROU", "LUX", 
"KAZ"), entry = c(98.4974384307861, 95.6549962361654, 91.4044539133708, 
90.8518393834432, 90.4088099797567, 88.0471547444662), medium = c(86.0081672668457, 
93.0372142791748, 91.2993144989014, 100, 96.7348480224609, 100
), high = c(74.6774760159579, 84.1793060302734, 79.542350769043, 
99.6931856328791, 97.031680020419, 92.5396745855158)), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

scale() works for me. Might you have a package loaded that has its own scale function, which is being used instead of base::scale()? Try calling base::scale() explicitly.

DF <- structure(list(Country = c("GBR", "CHE", "DEU", "ROU", "LUX", "KAZ"), 
                     entry = c(98.4974384307861, 95.6549962361654, 91.4044539133708, 90.8518393834432, 
                               90.4088099797567, 88.0471547444662), 
                     medium = c(86.0081672668457, 93.0372142791748, 91.2993144989014, 100, 96.7348480224609, 100), 
                     high = c(74.6774760159579, 84.1793060302734, 79.542350769043, 99.6931856328791, 
                              97.031680020419, 92.5396745855158)), 
                row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
summary(DF)
#>    Country              entry           medium            high      
#>  Length:6           Min.   :88.05   Min.   : 86.01   Min.   :74.68  
#>  Class :character   1st Qu.:90.52   1st Qu.: 91.73   1st Qu.:80.70  
#>  Mode  :character   Median :91.13   Median : 94.89   Median :88.36  
#>                     Mean   :92.48   Mean   : 94.51   Mean   :87.94  
#>                     3rd Qu.:94.59   3rd Qu.: 99.18   3rd Qu.:95.91  
#>                     Max.   :98.50   Max.   :100.00   Max.   :99.69
sd(DF$entry)
#> [1] 3.848059
sd(DF$medium)
#> [1] 5.477022
sd(DF$high)
#> [1] 10.02508
DF[2:4] <- scale(DF[2:4])
DF
#>   Country      entry     medium       high
#> 1     GBR  1.5644225 -1.5528676 -1.3233285
#> 2     CHE  0.8257534 -0.2694974 -0.3755223
#> 3     DEU -0.2788406 -0.5868048 -0.8380579
#> 4     ROU -0.4224492  1.0017748  1.1719851
#> 5     LUX -0.5375798  0.4056202  0.9065003
#> 6     KAZ -1.1513063  1.0017748  0.4584233
summary(DF)
#>    Country              entry             medium        
#>  Length:6           Min.   :-1.1513   Min.   :-1.55287  
#>  Class :character   1st Qu.:-0.5088   1st Qu.:-0.50748  
#>  Mode  :character   Median :-0.3506   Median : 0.06806  
#>                     Mean   : 0.0000   Mean   : 0.00000  
#>                     3rd Qu.: 0.5496   3rd Qu.: 0.85274  
#>                     Max.   : 1.5644   Max.   : 1.00177  
#>       high         
#>  Min.   :-1.32333  
#>  1st Qu.:-0.72242  
#>  Median : 0.04145  
#>  Mean   : 0.00000  
#>  3rd Qu.: 0.79448  
#>  Max.   : 1.17199
sd(DF$entry)
#> [1] 1
sd(DF$medium)
#> [1] 1
sd(DF$high)
#> [1] 1

Created on 2020-04-14 by the reprex package (v0.2.1)
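
One way to test the masking idea directly, rather than by comparing results, is to ask R which `scale` it resolves. This is a minimal sketch using only base and utils functions; it assumes it is run in the same session where the problem occurs.

```r
# List every attached package or environment that defines `scale`,
# in search-path order. The first entry is the one a bare scale() call uses.
find("scale")

# Name of the environment the resolved function was defined in.
environmentName(environment(scale))

# TRUE when the bare name resolves to the base implementation.
identical(scale, base::scale)
```

If nothing is masking it, `find("scale")` should list only `package:base` and the last line should return `TRUE`. In that case masking can be ruled out and the cause lies elsewhere.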

base::scale returns the same result. Not sure where to go from here.

Have you tried running the exact code I posted? Just grasping in the dark on this end!

> DF
# A tibble: 6 x 4
  Country entry[,"entry"] [,"medium"] [,"high"] medium[,"entry"] [,"medium"] [,"high"] high[,"entry"] [,"medium"] [,"high"]
  <chr>             <dbl>       <dbl>     <dbl>            <dbl>       <dbl>     <dbl>          <dbl>       <dbl>     <dbl>
1 GBR               1.56       -1.55     -1.32             1.56       -1.55     -1.32           1.56       -1.55     -1.32 
2 CHE               0.826      -0.269    -0.376            0.826      -0.269    -0.376          0.826      -0.269    -0.376
3 DEU              -0.279      -0.587    -0.838           -0.279      -0.587    -0.838         -0.279      -0.587    -0.838
4 ROU              -0.422       1.00      1.17            -0.422       1.00      1.17          -0.422       1.00      1.17 
5 LUX              -0.538       0.406     0.907           -0.538       0.406     0.907         -0.538       0.406     0.907
6 KAZ              -1.15        1.00      0.458           -1.15        1.00      0.458         -1.15        1.00      0.458
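
The printout above suggests that each of the three columns ended up holding the whole three-column matrix returned by scale(), rather than one column apiece. Why that happens is not established in this thread; it may depend on the installed tibble version. A possible workaround, sketched below, is to avoid assigning a matrix altogether: either convert the result to a data frame first, or scale one column at a time. The small `DF` here is made-up illustrative data, and a plain data.frame is used so the sketch has no dependencies; the assumption is that the same two assignments behave the same way on a tibble.

```r
# Illustrative data only (rounded values, three rows)
DF <- data.frame(
  Country = c("GBR", "CHE", "DEU"),
  entry   = c(98.5, 95.7, 91.4),
  medium  = c(86.0, 93.0, 91.3),
  high    = c(74.7, 84.2, 79.5)
)

# scale() returns a numeric matrix, not a data frame
scaled <- scale(DF[2:4])
is.matrix(scaled)

# Option 1: turn the matrix into a data frame before assigning
DF1 <- DF
DF1[2:4] <- as.data.frame(scaled)

# Option 2: scale each column separately and drop the matrix dimensions
DF2 <- DF
DF2[2:4] <- lapply(DF2[2:4], function(x) as.numeric(scale(x)))

# Diagnostic: no column should itself be a matrix after either option
vapply(DF1, is.matrix, logical(1))
vapply(DF2, is.matrix, logical(1))
```

Running the same `vapply(..., is.matrix, logical(1))` check on the incorrectly scaled tibble would confirm whether its columns are matrices, which would be consistent with the `entry[,"entry"]` style headers in the printout.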
