Having an issue with sorting data in rows due to a conversion error

LanceAki1 · October 7, 2022, 1:41am

# Load base packages manually
# library(datasets)  # For example datasets

# Install pacman ("package manager") if needed
if (!require("pacman")) install.packages("pacman")

# pacman must already be installed; then load contributed
# packages (including pacman) with pacman
pacman::p_load(magrittr, pacman, rio, tidyverse)
# pacman: for loading/unloading packages
# magrittr: for pipes
# rio: for importing data
# tidyverse: for so many reasons

# This script makes use of functions from the forcats
# package, which is installed as part of the tidyverse

# LOAD AND PREPARE DATA ####################################

# Import data into tibble "df"
df <- import("Documents/Job_Stuff/Active_LinkedIn_Learning_Courses_Files/Ex_Files_R_Visualizing_Data/Exercise_Files/data/MobileOS_US.xlsx") %>%
  as_tibble() %>%
  print()

# A tibble: 18 × 126
   MobileOS      2009-…¹ 2009-…² 2009-…³ 2009-…⁴ 2009-…⁵ 2009-…⁶ 2009-…⁷
   <chr>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 Android          1.75    5.47    6.12    5.56    5.47    0.3     5.19
 2 BlackBerry OS    2.16   12.4     4.74    5.52   13.0    20.1    20.1
 3 Brew             0       0       0       0       0       0       0
 4 iOS             59.2    60.5    68.1    69.8    63.1    60.0    56.6
 5 LG               0       0       0       0       0       0       0.14
 6 Linux            3.33    0       0       0       0       0       0
 7 Nintendo         0       0       0       0       0       0       0.17
 8 Nintendo 3DS     0       0       0       0       0       0       0
 9 Nokia Unknown    0       0       0       0       0       0       0
10 Other            1.61    1.85    1.74    1.65    1.56    1.45    1.34
11 Playstation      7.51    6.39    6.35    5.4     5.02    5.04    4.43
12 Samsung          0       0       0       0       0       0       0.58
13 Series 40        0       0       0       0       0       0       0
14 Sony Ericsson    0       0       0       0       0       0       0.11
15 SymbianOS        5.61    3.05    2.74    2.47    2.64    2.95    1.1
16 Unknown         14.3     5.9     6.48    6.35    6.36    7.37    7.38
17 webOS            0       0       0       0       0       0       0.22
18 Windows          4.58    4.5     3.77    3.25    2.87    2.84    2.56
# … with 118 more variables: `2009-08` <dbl>, `2009-09` <dbl>,
#   `2009-10` <dbl>, `2009-11` <dbl>, `2009-12` <dbl>, `2010-01` <dbl>,
#   `2010-02` <dbl>, `2010-03` <dbl>, `2010-04` <dbl>, `2010-05` <dbl>,
#   `2010-06` <dbl>, `2010-07` <dbl>, `2010-08` <dbl>, `2010-09` <dbl>,
#   `2010-10` <dbl>, `2010-11` <dbl>, `2010-12` <dbl>, `2011-01` <dbl>,
#   `2011-02` <dbl>, `2011-03` <dbl>, `2011-04` <dbl>, `2011-05` <dbl>,
#   `2011-06` <dbl>, `2011-07` <dbl>, `2011-08` <dbl>, …
# ℹ Use `colnames()` to see all variable names

# Define "MobileOS" as factor
df %<>%
  mutate(MobileOS = as.factor(MobileOS)) %>%
  print()

# A tibble: 18 × 126
   MobileOS      2009-…¹ 2009-…² 2009-…³ 2009-…⁴ 2009-…⁵ 2009-…⁶ 2009-…⁷
   <fct>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 Android          1.75    5.47    6.12    5.56    5.47    0.3     5.19
 2 BlackBerry OS    2.16   12.4     4.74    5.52   13.0    20.1    20.1
 3 Brew             0       0       0       0       0       0       0
 4 iOS             59.2    60.5    68.1    69.8    63.1    60.0    56.6
 5 LG               0       0       0       0       0       0       0.14
 6 Linux            3.33    0       0       0       0       0       0
 7 Nintendo         0       0       0       0       0       0       0.17
 8 Nintendo 3DS     0       0       0       0       0       0       0
 9 Nokia Unknown    0       0       0       0       0       0       0
10 Other            1.61    1.85    1.74    1.65    1.56    1.45    1.34
11 Playstation      7.51    6.39    6.35    5.4     5.02    5.04    4.43
12 Samsung          0       0       0       0       0       0       0.58
13 Series 40        0       0       0       0       0       0       0
14 Sony Ericsson    0       0       0       0       0       0       0.11
15 SymbianOS        5.61    3.05    2.74    2.47    2.64    2.95    1.1
16 Unknown         14.3     5.9     6.48    6.35    6.36    7.37    7.38
17 webOS            0       0       0       0       0       0       0.22
18 Windows          4.58    4.5     3.77    3.25    2.87    2.84    2.56
# … with 118 more variables: `2009-08` <dbl>, `2009-09` <dbl>,
#   `2009-10` <dbl>, `2009-11` <dbl>, `2009-12` <dbl>, `2010-01` <dbl>,
#   `2010-02` <dbl>, `2010-03` <dbl>, `2010-04` <dbl>, `2010-05` <dbl>,
#   `2010-06` <dbl>, `2010-07` <dbl>, `2010-08` <dbl>, `2010-09` <dbl>,
#   `2010-10` <dbl>, `2010-11` <dbl>, `2010-12` <dbl>, `2011-01` <dbl>,
#   `2011-02` <dbl>, `2011-03` <dbl>, `2011-04` <dbl>, `2011-05` <dbl>,
#   `2011-06` <dbl>, `2011-07` <dbl>, `2011-08` <dbl>, …
# ℹ Use `colnames()` to see all variable names

# Select 2010 variable, convert to whole numbers
df %<>%
  mutate(OS_2010 = `2010-01` * 100) %>%
  select(MobileOS, OS_2010) %>%
  print()

# A tibble: 18 × 2
   MobileOS      OS_2010
   <fct>           <dbl>
 1 Android          1190
 2 BlackBerry OS    2013
 3 Brew                0
 4 iOS              5358
 5 LG                 52
 6 Linux               0
 7 Nintendo           79
 8 Nintendo 3DS        0
 9 Nokia Unknown       0
10 Other              77
11 Playstation       218
12 Samsung           174
13 Series 40           0
14 Sony Ericsson      43
15 SymbianOS         141
16 Unknown           433
17 webOS             117
18 Windows           105

# Check for outliers
df %>%
  select(OS_2010) %>%
  boxplot(horizontal = TRUE)

# Convert to rows
df %<>%
  uncount(OS_2010) %>%
  print()
Error in `uncount()`:
! Can't convert from `weights` <double> to <integer> due to loss of precision.
• Locations: 11
Run `rlang::last_error()` to see where the error occurred.

rlang::last_error()
<error/vctrs_error_cast_lossy>
Error in `uncount()`:
! Can't convert from `weights` <double> to <integer> due to loss of precision.
• Locations: 11
---
Backtrace:
df %<>% uncount(OS_2010) %>% print()
tidyr::uncount(., OS_2010)
Run `rlang::last_trace()` to see the full context.

rlang::last_trace()
<error/vctrs_error_cast_lossy>
Error in `uncount()`:
! Can't convert from `weights` <double> to <integer> due to loss of precision.
• Locations: 11
---
Backtrace:
▆
├─df %<>% uncount(OS_2010) %>% print()
├─base::print(.)
└─tidyr::uncount(., OS_2010)
  └─vctrs::vec_cast(w, integer(), x_arg = "weights")
    └─vctrs (local) `<fn>`()
      └─vctrs:::vec_cast.integer.double(...)
        └─vctrs::maybe_lossy_cast(...)
          ├─base::withRestarts(...)
          │ └─base (local) withOneRestart(expr, restarts[[1L]])
          │   └─base (local) doWithOneRestart(return(expr), restart)
          └─vctrs:::stop_lossy_cast(...)
            └─vctrs::stop_incompatible_cast(...)
              └─vctrs::stop_incompatible_type(...)
                └─vctrs:::stop_incompatible(...)
                  └─vctrs:::stop_vctrs(...)
                    └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))
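
The `Locations: 11` line in the error refers to the eleventh weight, which is the Playstation row (printed as 218). A minimal sketch of how one might check whether a value that prints as a whole number is really stored as one. The vector `os_2010` and its inputs are made up for illustration; in particular, 2.18 is only an assumed stand-in for the Playstation share, which the thread does not show:

```r
# Hypothetical stand-in for df$OS_2010 (percentages multiplied by 100).
# 1.75 and 4.5 are exactly representable in binary; 2.18 is not.
# None of these values are read from the actual workbook.
os_2010 <- c(1.75, 2.18, 4.5) * 100

# Default printing uses 7 significant digits, which can hide tiny errors.
print(os_2010)

# Printing with more digits exposes any floating-point residue.
print(os_2010, digits = 17)

# Positions whose stored value is not exactly a whole number.
not_whole <- which(os_2010 != round(os_2010))
print(not_whole)
```

On the real data, the same check would be `which(df$OS_2010 != round(df$OS_2010))`, which should presumably report 11 to match the error message.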

jonesey441 · October 7, 2022, 2:22am

Hi @LanceAki1 ,

Welcome to the RStudio Community!

First off, I'd like to introduce the concept of a reproducible example. A reproducible example (or "reprex" for short) is the minimum amount of data and code needed to reproduce the error. Please read more on how to produce a reprex.

Second, aside from the reproducible example, please pose a direct question to answer. This will help the community understand what it is you are trying to solve.

Now, from what I can discern from some of the code, there is an error while trying to use the uncount() function from tidyr. If this is the error to solve in this post, I think the problem is that the OS_2010 variable isn't saved in the df dataset. It has simply been printed as temporary output in the console in the couple of steps prior to the error.

I think the uncount() function here is unable to use a variable that technically doesn't exist. Try passing the creation of the OS_2010 variable into a new dataset like so:

# create the new dataset
new_df <- df %<>%
  mutate(OS_2010 = `2010-01` * 100) %>%
  select(MobileOS, OS_2010) %>%
  print()

# then call your new dataset when using the uncount
another_df <- new_df %>%
  uncount(OS_2010) %>%
  print()
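
For reference, a reprex for this thread might look something like the sketch below. It assumes the tibble and tidyr packages are installed and replaces the spreadsheet with a made-up two-row tibble; the value 2.18 is an assumption chosen so that multiplying by 100 may leave a tiny floating-point residue. Whether `uncount()` actually fails here can depend on the installed tidyr and vctrs versions, so the call is wrapped in `tryCatch()`:

```r
library(tibble)
library(tidyr)

# Made-up stand-in for the imported spreadsheet (not the real data).
df <- tibble(
  MobileOS = factor(c("Windows", "Playstation")),
  OS_2010  = c(4.5, 2.18) * 100
)

# Capture the condition instead of stopping, so the script runs to the end.
result <- tryCatch(
  uncount(df, OS_2010),
  error = function(e) e
)

if (inherits(result, "error")) {
  message("uncount() failed: ", conditionMessage(result))
} else {
  print(result)
}
```

A handful of lines like this, with no dependency on a local file, is usually enough for others to run the code and see the same behaviour.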

LanceAki1 · October 8, 2022, 10:25pm

I tried your method, but ran into the same problem again.

> # INSTALL AND LOAD PACKAGES ################################
> 
> # Load base packages manually
> # library(datasets)  # For example datasets
> 
> # Install pacman ("package manager") if needed
> if (!require("pacman")) install.packages("pacman")
Loading required package: pacman
> 
> # pacman must already be installed; then load contributed
> # packages (including pacman) with pacman
> pacman::p_load(magrittr, pacman, rio, tidyverse)
> # pacman: for loading/unloading packages
> # magrittr: for pipes
> # rio: for importing data
> # tidyverse: for so many reasons
> 
> # This script makes use of functions from the forcats
> # package, which is installed as part of the tidyverse
> # LOAD AND PREPARE DATA ####################################
> 
> # Import data into tibble "df"
> df <- import("Ex_Files_R_Visualizing_Data/Exercise_Files/data/MobileOS_US.xlsx") %>%
+   as_tibble() %>%
+   print()
# A tibble: 18 × 126                                                  
   MobileOS      2009-…¹ 2009-…² 2009-…³ 2009-…⁴ 2009-…⁵ 2009-…⁶ 2009-…⁷
   <chr>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 Android          1.75    5.47    6.12    5.56    5.47    0.3     5.19
 2 BlackBerry OS    2.16   12.4     4.74    5.52   13.0    20.1    20.1 
 3 Brew             0       0       0       0       0       0       0   
 4 iOS             59.2    60.5    68.1    69.8    63.1    60.0    56.6 
 5 LG               0       0       0       0       0       0       0.14
 6 Linux            3.33    0       0       0       0       0       0   
 7 Nintendo         0       0       0       0       0       0       0.17
 8 Nintendo 3DS     0       0       0       0       0       0       0   
 9 Nokia Unknown    0       0       0       0       0       0       0   
10 Other            1.61    1.85    1.74    1.65    1.56    1.45    1.34
11 Playstation      7.51    6.39    6.35    5.4     5.02    5.04    4.43
12 Samsung          0       0       0       0       0       0       0.58
13 Series 40        0       0       0       0       0       0       0   
14 Sony Ericsson    0       0       0       0       0       0       0.11
15 SymbianOS        5.61    3.05    2.74    2.47    2.64    2.95    1.1 
16 Unknown         14.3     5.9     6.48    6.35    6.36    7.37    7.38
17 webOS            0       0       0       0       0       0       0.22
18 Windows          4.58    4.5     3.77    3.25    2.87    2.84    2.56
# … with 118 more variables: `2009-08` <dbl>, `2009-09` <dbl>,
#   `2009-10` <dbl>, `2009-11` <dbl>, `2009-12` <dbl>, `2010-01` <dbl>,
#   `2010-02` <dbl>, `2010-03` <dbl>, `2010-04` <dbl>, `2010-05` <dbl>,
#   `2010-06` <dbl>, `2010-07` <dbl>, `2010-08` <dbl>, `2010-09` <dbl>,
#   `2010-10` <dbl>, `2010-11` <dbl>, `2010-12` <dbl>, `2011-01` <dbl>,
#   `2011-02` <dbl>, `2011-03` <dbl>, `2011-04` <dbl>, `2011-05` <dbl>,
#   `2011-06` <dbl>, `2011-07` <dbl>, `2011-08` <dbl>, …
# ℹ Use `colnames()` to see all variable names
> # Define "MobileOS" as factor
> df %<>%
+   mutate(MobileOS = as.factor(MobileOS)) %>%
+   print()
# A tibble: 18 × 126
   MobileOS      2009-…¹ 2009-…² 2009-…³ 2009-…⁴ 2009-…⁵ 2009-…⁶ 2009-…⁷
   <fct>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 Android          1.75    5.47    6.12    5.56    5.47    0.3     5.19
 2 BlackBerry OS    2.16   12.4     4.74    5.52   13.0    20.1    20.1 
 3 Brew             0       0       0       0       0       0       0   
 4 iOS             59.2    60.5    68.1    69.8    63.1    60.0    56.6 
 5 LG               0       0       0       0       0       0       0.14
 6 Linux            3.33    0       0       0       0       0       0   
 7 Nintendo         0       0       0       0       0       0       0.17
 8 Nintendo 3DS     0       0       0       0       0       0       0   
 9 Nokia Unknown    0       0       0       0       0       0       0   
10 Other            1.61    1.85    1.74    1.65    1.56    1.45    1.34
11 Playstation      7.51    6.39    6.35    5.4     5.02    5.04    4.43
12 Samsung          0       0       0       0       0       0       0.58
13 Series 40        0       0       0       0       0       0       0   
14 Sony Ericsson    0       0       0       0       0       0       0.11
15 SymbianOS        5.61    3.05    2.74    2.47    2.64    2.95    1.1 
16 Unknown         14.3     5.9     6.48    6.35    6.36    7.37    7.38
17 webOS            0       0       0       0       0       0       0.22
18 Windows          4.58    4.5     3.77    3.25    2.87    2.84    2.56
# … with 118 more variables: `2009-08` <dbl>, `2009-09` <dbl>,
#   `2009-10` <dbl>, `2009-11` <dbl>, `2009-12` <dbl>, `2010-01` <dbl>,
#   `2010-02` <dbl>, `2010-03` <dbl>, `2010-04` <dbl>, `2010-05` <dbl>,
#   `2010-06` <dbl>, `2010-07` <dbl>, `2010-08` <dbl>, `2010-09` <dbl>,
#   `2010-10` <dbl>, `2010-11` <dbl>, `2010-12` <dbl>, `2011-01` <dbl>,
#   `2011-02` <dbl>, `2011-03` <dbl>, `2011-04` <dbl>, `2011-05` <dbl>,
#   `2011-06` <dbl>, `2011-07` <dbl>, `2011-08` <dbl>, …
# ℹ Use `colnames()` to see all variable names
> 
> # create the new dataset
> new_df <- df %<>%
+   mutate(OS_2010 = `2010-01` * 100) %>%
+   select(MobileOS, OS_2010) %>%
+   print()
# A tibble: 18 × 2
   MobileOS      OS_2010
   <fct>           <dbl>
 1 Android          1190
 2 BlackBerry OS    2013
 3 Brew                0
 4 iOS              5358
 5 LG                 52
 6 Linux               0
 7 Nintendo           79
 8 Nintendo 3DS        0
 9 Nokia Unknown       0
10 Other              77
11 Playstation       218
12 Samsung           174
13 Series 40           0
14 Sony Ericsson      43
15 SymbianOS         141
16 Unknown           433
17 webOS             117
18 Windows           105
> # Select 2010 variable, convert to whole numbers
> # df %<>%
> #   mutate(OS_2010 = `2010-01` * 100) %>%
> #   select(MobileOS, OS_2010) %>%
> #   print()
> 
> # Check for outliers
> df %>%
+   select(OS_2010) %>%
+   boxplot(horizontal = TRUE)
> # Convert to rows
> # df %<>%
> #   uncount(OS_2010) %>%
> #   print()
> 
> # then call your new dataset when using the uncount
> another_df <- new_df %>%
+   uncount(OS_2010) %>%
+   print()
Error in `uncount()`:
! Can't convert from `weights` <double> to <integer> due to loss of precision.
• Locations: 11
Run `rlang::last_error()` to see where the error occurred.
> rlang::last_error()
<error/vctrs_error_cast_lossy>
Error in `uncount()`:
! Can't convert from `weights` <double> to <integer> due to loss of precision.
• Locations: 11
---
Backtrace:
 1. new_df %>% uncount(OS_2010) %>% print()
 3. tidyr::uncount(., OS_2010)
Run `rlang::last_trace()` to see the full context.
> rlang::last_trace()
<error/vctrs_error_cast_lossy>
Error in `uncount()`:
! Can't convert from `weights` <double> to <integer> due to loss of precision.
• Locations: 11
---
Backtrace:
     ▆
  1. ├─new_df %>% uncount(OS_2010) %>% print()
  2. ├─base::print(.)
  3. └─tidyr::uncount(., OS_2010)
  4.   └─vctrs::vec_cast(w, integer(), x_arg = "weights")
  5.     └─vctrs (local) `<fn>`()
  6.       └─vctrs:::vec_cast.integer.double(...)
  7.         └─vctrs::maybe_lossy_cast(...)
  8.           ├─base::withRestarts(...)
  9.           │ └─base (local) withOneRestart(expr, restarts[[1L]])
 10.           │   └─base (local) doWithOneRestart(return(expr), restart)
 11.           └─vctrs:::stop_lossy_cast(...)
 12.             └─vctrs::stop_incompatible_cast(...)
 13.               └─vctrs::stop_incompatible_type(...)
 14.                 └─vctrs:::stop_incompatible(...)
 15.                   └─vctrs:::stop_vctrs(...)
 16.                     └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = vctrs_error_call(call))
>

LanceAki1 · October 8, 2022, 10:28pm

I tried the method, but ran into the same problem.


LanceAki1 · October 9, 2022, 12:28am

I found a solution to my problem.

> # Title:    Recoding categorical data
> # File:     06_01_RecodeCategoricalData.R
> # Project:  R_EssT_1; R Essential Training, Part 1:
> #           Wrangling and Visualizing Data
> 
> # INSTALL AND LOAD PACKAGES ################################
> 
> # Load base packages manually
> # library(datasets)  # For example datasets
> 
> # Install pacman ("package manager") if needed
> if (!require("pacman")) install.packages("pacman")
> 
> # pacman must already be installed; then load contributed
> # packages (including pacman) with pacman
> pacman::p_load(magrittr, pacman, rio, tidyverse)
> # pacman: for loading/unloading packages
> # magrittr: for pipes
> # rio: for importing data
> # tidyverse: for so many reasons
> 
> # This script makes use of functions from the forcats
> # package, which is installed as part of the tidyverse
> 
> # LOAD AND PREPARE DATA ####################################
> # Import data into tibble "df"
> df <- import("Documents/Job_Stuff/Active_LinkedIn_Learning_Courses_Files/Ex_Files_R_Visualizing_Data/Exercise_Files/data/MobileOS_US.xlsx") %>%
+   as_tibble() %>%
+   print()
# A tibble: 18 × 126                                                  
   MobileOS      2009-…¹ 2009-…² 2009-…³ 2009-…⁴ 2009-…⁵ 2009-…⁶ 2009-…⁷
   <chr>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 Android          1.75    5.47    6.12    5.56    5.47    0.3     5.19
 2 BlackBerry OS    2.16   12.4     4.74    5.52   13.0    20.1    20.1 
 3 Brew             0       0       0       0       0       0       0   
 4 iOS             59.2    60.5    68.1    69.8    63.1    60.0    56.6 
 5 LG               0       0       0       0       0       0       0.14
 6 Linux            3.33    0       0       0       0       0       0   
 7 Nintendo         0       0       0       0       0       0       0.17
 8 Nintendo 3DS     0       0       0       0       0       0       0   
 9 Nokia Unknown    0       0       0       0       0       0       0   
10 Other            1.61    1.85    1.74    1.65    1.56    1.45    1.34
11 Playstation      7.51    6.39    6.35    5.4     5.02    5.04    4.43
12 Samsung          0       0       0       0       0       0       0.58
13 Series 40        0       0       0       0       0       0       0   
14 Sony Ericsson    0       0       0       0       0       0       0.11
15 SymbianOS        5.61    3.05    2.74    2.47    2.64    2.95    1.1 
16 Unknown         14.3     5.9     6.48    6.35    6.36    7.37    7.38
17 webOS            0       0       0       0       0       0       0.22
18 Windows          4.58    4.5     3.77    3.25    2.87    2.84    2.56
# … with 118 more variables: `2009-08` <dbl>, `2009-09` <dbl>,
#   `2009-10` <dbl>, `2009-11` <dbl>, `2009-12` <dbl>, `2010-01` <dbl>,
#   `2010-02` <dbl>, `2010-03` <dbl>, `2010-04` <dbl>, `2010-05` <dbl>,
#   `2010-06` <dbl>, `2010-07` <dbl>, `2010-08` <dbl>, `2010-09` <dbl>,
#   `2010-10` <dbl>, `2010-11` <dbl>, `2010-12` <dbl>, `2011-01` <dbl>,
#   `2011-02` <dbl>, `2011-03` <dbl>, `2011-04` <dbl>, `2011-05` <dbl>,
#   `2011-06` <dbl>, `2011-07` <dbl>, `2011-08` <dbl>, …
# ℹ Use `colnames()` to see all variable names
> # Define "MobileOS" as factor
> df %<>%
+   mutate(MobileOS = as.factor(MobileOS)) %>%
+   print()
# A tibble: 18 × 126
   MobileOS      2009-…¹ 2009-…² 2009-…³ 2009-…⁴ 2009-…⁵ 2009-…⁶ 2009-…⁷
   <fct>           <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
 1 Android          1.75    5.47    6.12    5.56    5.47    0.3     5.19
 2 BlackBerry OS    2.16   12.4     4.74    5.52   13.0    20.1    20.1 
 3 Brew             0       0       0       0       0       0       0   
 4 iOS             59.2    60.5    68.1    69.8    63.1    60.0    56.6 
 5 LG               0       0       0       0       0       0       0.14
 6 Linux            3.33    0       0       0       0       0       0   
 7 Nintendo         0       0       0       0       0       0       0.17
 8 Nintendo 3DS     0       0       0       0       0       0       0   
 9 Nokia Unknown    0       0       0       0       0       0       0   
10 Other            1.61    1.85    1.74    1.65    1.56    1.45    1.34
11 Playstation      7.51    6.39    6.35    5.4     5.02    5.04    4.43
12 Samsung          0       0       0       0       0       0       0.58
13 Series 40        0       0       0       0       0       0       0   
14 Sony Ericsson    0       0       0       0       0       0       0.11
15 SymbianOS        5.61    3.05    2.74    2.47    2.64    2.95    1.1 
16 Unknown         14.3     5.9     6.48    6.35    6.36    7.37    7.38
17 webOS            0       0       0       0       0       0       0.22
18 Windows          4.58    4.5     3.77    3.25    2.87    2.84    2.56
# … with 118 more variables: `2009-08` <dbl>, `2009-09` <dbl>,
#   `2009-10` <dbl>, `2009-11` <dbl>, `2009-12` <dbl>, `2010-01` <dbl>,
#   `2010-02` <dbl>, `2010-03` <dbl>, `2010-04` <dbl>, `2010-05` <dbl>,
#   `2010-06` <dbl>, `2010-07` <dbl>, `2010-08` <dbl>, `2010-09` <dbl>,
#   `2010-10` <dbl>, `2010-11` <dbl>, `2010-12` <dbl>, `2011-01` <dbl>,
#   `2011-02` <dbl>, `2011-03` <dbl>, `2011-04` <dbl>, `2011-05` <dbl>,
#   `2011-06` <dbl>, `2011-07` <dbl>, `2011-08` <dbl>, …
# ℹ Use `colnames()` to see all variable names
> # Select 2010 variable, convert to whole numbers
> df %<>%
+   mutate(OS_2010 = `2010-01` * 100) %>%
+   select(MobileOS, OS_2010) %>%
+   print()
# A tibble: 18 × 2
   MobileOS      OS_2010
   <fct>           <dbl>
 1 Android          1190
 2 BlackBerry OS    2013
 3 Brew                0
 4 iOS              5358
 5 LG                 52
 6 Linux               0
 7 Nintendo           79
 8 Nintendo 3DS        0
 9 Nokia Unknown       0
10 Other              77
11 Playstation       218
12 Samsung           174
13 Series 40           0
14 Sony Ericsson      43
15 SymbianOS         141
16 Unknown           433
17 webOS             117
18 Windows           105
> # Check for outliers
> df %>%
+   select(OS_2010) %>%
+   boxplot(horizontal = TRUE)
> # Convert to integer and then convert to rows
> df %<>%
+   mutate(OS_2010 = as.integer(OS_2010)) %>%
+   uncount(OS_2010) %>%
+   print()
# A tibble: 10,000 × 1
   MobileOS
   <fct>   
 1 Android 
 2 Android 
 3 Android 
 4 Android 
 5 Android 
 6 Android 
 7 Android 
 8 Android 
 9 Android 
10 Android 
# … with 9,990 more rows
# ℹ Use `print(n = ...)` to see more rows
>
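
One caveat about the fix above: `as.integer()` truncates toward zero rather than rounding, so a value stored slightly below a whole number would lose a full unit. The sketch below compares truncation with rounding on two made-up inputs (2.18 and 0.29 are assumptions chosen for illustration, not values from the workbook):

```r
# Two illustrative products that should be whole numbers on paper.
x <- c(2.18, 0.29) * 100

# Show the stored values with enough digits to see any residue.
print(x, digits = 17)

# as.integer() drops the fractional part (truncation toward zero).
truncated <- as.integer(x)

# round() moves to the nearest whole number before the conversion.
rounded <- as.integer(round(x))

print(truncated)
print(rounded)
```

In the pipeline from this thread, the more defensive variant would be `mutate(OS_2010 = as.integer(round(OS_2010)))` before `uncount(OS_2010)`. For this dataset the posted solution already yields 10,000 rows, so truncation did not drop anything here.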

system · October 30, 2022, 12:29am

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.
