ERROR: Assigned data `value` must be compatible with existing data

I tried to add a new column:
bikes_by_customer$total1 <- bikes_by_customer %>% group_by(member_casual) %>% summarise(total1 = sum(number_of_bikes))
but it didn't work, and this error appears:
Assigned data `value` must be compatible with existing data.
✖ Existing data has 5 rows.
✖ Assigned data has 2 rows.
ℹ Only vectors of size 1 are recycled.

There are two problems here: the error you are getting, and the approach you are using to add the column.

So first, regarding that error, consider the dimensions of bikes_by_customer. It is a data.frame with 5 rows and 3 columns. When you create a new column the way you are doing it (using data$newcol <- vector semantics), you can either assign a vector whose length equals the number of rows in your data frame, or assign a single value, which gets recycled to every row:

data$newcol <- 1 # or any other vector where length(x) == 1

However, in your case, look at what your expression evaluates to:

bikes_by_customer %>% 
    group_by(member_casual) %>% 
    summarize(
        total1 = sum(number_of_bikes)
    )
#> # A tibble: 2 x 2
#>   member_casual  total1
#> * <chr>           <dbl>
#> 1 casual        2670303
#> 2 member        3263737

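Your result isn't a vector at all, but a 2x2 data.frame (a tibble). It has 2 rows, which is neither 5 (the number of rows in bikes_by_customer) nor 1 (the only size that gets recycled), so R can't line it up with the existing rows and you get the error you are seeing.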
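To see the recycling rule in isolation, here is a small sketch. It assumes the tibble package is installed, and it rebuilds bikes_by_customer from the table printed in this thread; the column names flag, row_id and bad are made up purely for illustration.

```r
library(tibble)

# Data copied from the table printed in this thread
bikes_by_customer <- tribble(
  ~rideable_type,  ~member_casual, ~number_of_bikes,
  "classic_bike",  "casual",       1313878,
  "classic_bike",  "member",       2013712,
  "docked_bike",   "casual",        308231,
  "electric_bike", "casual",       1048194,
  "electric_bike", "member",       1250025
)

# Length 1: the value is recycled to every row
bikes_by_customer$flag <- 1

# Length equal to nrow(): assigned element by element
bikes_by_customer$row_id <- seq_len(nrow(bikes_by_customer))

# Any other length fails; tryCatch() captures the message instead of stopping
msg <- tryCatch(
  {
    bikes_by_customer$bad <- c(1, 2)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
```

After running this, flag and row_id exist, bad does not, and msg holds the same kind of "must be compatible with existing data" message you got.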

So, how do you actually add the new column that you are trying to add? If I understand what you are trying to do, the easiest way is going to be dplyr::mutate. Remember that mutate guarantees a data.frame with the same number of rows as its input, whereas summarize typically returns one row per group, so it only preserves the row count when every row is its own group (either using dplyr::rowwise, or dplyr::group_by on a variable where length(df$x) == length(unique(df$x))). So mutate is ALMOST ALWAYS the right choice when all you want to do is add a new column.

bikes_by_customer <- bikes_by_customer %>% 
    group_by(member_casual) %>% 
    mutate(
        total1 = sum(number_of_bikes)
    )

bikes_by_customer
#> # A tibble: 5 x 4
#> # Groups:   member_casual [2]
#>   rideable_type member_casual number_of_bikes  total1
#>   <chr>         <chr>                   <dbl>   <dbl>
#> 1 classic_bike  casual                1313878 2670303
#> 2 classic_bike  member                2013712 3263737
#> 3 docked_bike   casual                 308231 2670303
#> 4 electric_bike casual                1048194 2670303
#> 5 electric_bike member                1250025 3263737
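If you would rather keep the summarise step, one alternative (not what was asked for, just a sketch assuming dplyr and tibble are installed) is to store the 2-row summary and join it back onto the original rows by member_casual. The names totals, with_join and with_mutate are hypothetical. The sketch also shows ungroup(), since the mutate result above is still grouped by member_casual, which may or may not be what you want downstream.

```r
library(dplyr)
library(tibble)

# Data copied from the table printed in this thread
bikes_by_customer <- tribble(
  ~rideable_type,  ~member_casual, ~number_of_bikes,
  "classic_bike",  "casual",       1313878,
  "classic_bike",  "member",       2013712,
  "docked_bike",   "casual",        308231,
  "electric_bike", "casual",       1048194,
  "electric_bike", "member",       1250025
)

# One row per member_casual value: this is the 2-row table from the question
totals <- bikes_by_customer %>%
  group_by(member_casual) %>%
  summarise(total1 = sum(number_of_bikes))

# Join the totals back so every original row gets its group's total
with_join <- bikes_by_customer %>%
  left_join(totals, by = "member_casual")

# Same result via mutate, dropping the grouping afterwards
with_mutate <- bikes_by_customer %>%
  group_by(member_casual) %>%
  mutate(total1 = sum(number_of_bikes)) %>%
  ungroup()
```

Both with_join and with_mutate have 5 rows and the same total1 column; the mutate version just gets there in one step.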