How do I convert a specific column to numeric in a split data frame?

Take airquality as an example:

airquality1 <- split(airquality, airquality$Month)
str(airquality1)
List of 5
 $ 5:'data.frame': 31 obs. of 6 variables:
  ..$ Ozone  : int [1:31] 41 36 12 18 NA 28 23 19 8 NA ...
  ..$ Solar.R: int [1:31] 190 118 149 313 NA NA 299 99 19 194 ...
  ..$ Wind   : num [1:31] 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
  ..$ Temp   : int [1:31] 67 72 74 62 56 66 65 59 61 69 ...
  ..$ Month  : int [1:31] 5 5 5 5 5 5 5 5 5 5 ...
  ..$ Day    : int [1:31] 1 2 3 4 5 6 7 8 9 10 ...
 $ 6:'data.frame': 30 obs. of 6 variables:
  ..$ Ozone  : int [1:30] NA NA NA NA NA NA 29 NA 71 39 ...
  ..$ Solar.R: int [1:30] 286 287 242 186 220 264 127 273 291 323 ...
  ..$ Wind   : num [1:30] 8.6 9.7 16.1 9.2 8.6 14.3 9.7 6.9 13.8 11.5 ...
  ..$ Temp   : int [1:30] 78 74 67 84 85 79 82 87 90 87 ...
  ..$ Month  : int [1:30] 6 6 6 6 6 6 6 6 6 6 ...
  ..$ Day    : int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
 $ 7:'data.frame': 31 obs. of 6 variables:
  ..$ Ozone  : int [1:31] 135 49 32 NA 64 40 77 97 97 85 ...
  ..$ Solar.R: int [1:31] 269 248 236 101 175 314 276 267 272 175 ...
  ..$ Wind   : num [1:31] 4.1 9.2 9.2 10.9 4.6 10.9 5.1 6.3 5.7 7.4 ...
  ..$ Temp   : int [1:31] 84 85 81 84 83 83 88 92 92 89 ...
  ..$ Month  : int [1:31] 7 7 7 7 7 7 7 7 7 7 ...
  ..$ Day    : int [1:31] 1 2 3 4 5 6 7 8 9 10 ...
 $ 8:'data.frame': 31 obs. of 6 variables:
  ..$ Ozone  : int [1:31] 39 9 16 78 35 66 122 89 110 NA ...
  ..$ Solar.R: int [1:31] 83 24 77 NA NA NA 255 229 207 222 ...
  ..$ Wind   : num [1:31] 6.9 13.8 7.4 6.9 7.4 4.6 4 10.3 8 8.6 ...
  ..$ Temp   : int [1:31] 81 81 82 86 85 87 89 90 90 92 ...
  ..$ Month  : int [1:31] 8 8 8 8 8 8 8 8 8 8 ...
  ..$ Day    : int [1:31] 1 2 3 4 5 6 7 8 9 10 ...
 $ 9:'data.frame': 30 obs. of 6 variables:
  ..$ Ozone  : int [1:30] 96 78 73 91 47 32 20 23 21 24 ...
  ..$ Solar.R: int [1:30] 167 197 183 189 95 92 252 220 230 259 ...
  ..$ Wind   : num [1:30] 6.9 5.1 2.8 4.6 7.4 15.5 10.9 10.3 10.9 9.7 ...
  ..$ Temp   : int [1:30] 91 92 93 93 87 84 80 78 75 73 ...
  ..$ Month  : int [1:30] 9 9 9 9 9 9 9 9 9 9 ...
  ..$ Day    : int [1:30] 1 2 3 4 5 6 7 8 9 10 ...

If I want to change Ozone or another column to numeric, what should I do?
I know that

airquality$Ozone <- as.numeric(airquality$Ozone)
would work if the data frame hasn't been split,
but what should I do once it has been split into groups?

Just iterate over the list; many functions can do this.
For example, lapply from base R:

list1 <- split(airquality, airquality$Month)

# Convert Ozone in each data frame, then return the whole data frame
list1 <- lapply(list1, function(x) {
  x$Ozone <- as.numeric(x$Ozone)
  return(x)
})

Or the map function from purrr:

library(purrr)
library(dplyr)

list1 |> map(~ mutate(.x, Ozone = as.numeric(Ozone)))
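To confirm that a conversion like this took effect in every group, one option is to compare the column's class per list element before and after. This is only a minimal sketch in base R; the names `groups` and `converted` are made up for this illustration.

```r
# Split the built-in airquality data by month (a list of 5 data frames)
groups <- split(airquality, airquality$Month)

# Class of Ozone in each group before converting
before <- sapply(groups, function(x) class(x$Ozone))

# Convert Ozone in every group, returning the whole data frame each time
converted <- lapply(groups, function(x) {
  x$Ozone <- as.numeric(x$Ozone)
  x
})

# Class of Ozone in each group after converting
after <- sapply(converted, function(x) class(x$Ozone))

print(before)  # "integer" for every month
print(after)   # "numeric" for every month
```

Note that lapply does not modify the original list; the result has to be assigned to a name to keep it.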

Thanks!
I had tried lapply many times, but it didn't work.
Now I see that I was missing return(x),
but I don't understand why it is so crucial.
Without return(x), I got this:

v <- lapply(airquality1, function(x) {
  x$Ozone <- as.numeric(x$Ozone)
})

str(v)
List of 5
 $ 5: num [1:31] 41 36 12 18 NA 28 23 19 8 NA ...
 $ 6: num [1:30] NA NA NA NA NA NA 29 NA 71 39 ...
 $ 7: num [1:31] 135 49 32 NA 64 40 77 97 97 85 ...
 $ 8: num [1:31] 39 9 16 78 35 66 122 89 110 NA ...
 $ 9: num [1:30] 96 78 73 91 47 32 20 23 21 24 ...

With return(x), I got this:

v <- lapply(airquality1, function(x) {
  x$Ozone <- as.numeric(x$Ozone)
  return(x)
})

str(v)
List of 5
 $ 5:'data.frame': 31 obs. of 6 variables:
  ..$ Ozone  : num [1:31] 41 36 12 18 NA 28 23 19 8 NA ...
  ..$ Solar.R: int [1:31] 190 118 149 313 NA NA 299 99 19 194 ...
  ..$ Wind   : num [1:31] 7.4 8 12.6 11.5 14.3 14.9 8.6 13.8 20.1 8.6 ...
  ..$ Temp   : int [1:31] 67 72 74 62 56 66 65 59 61 69 ...
  ..$ Month  : int [1:31] 5 5 5 5 5 5 5 5 5 5 ...
  ..$ Day    : int [1:31] 1 2 3 4 5 6 7 8 9 10 ...
 $ 6:'data.frame': 30 obs. of 6 variables:
  ..$ Ozone  : num [1:30] NA NA NA NA NA NA 29 NA 71 39 ...
  ..$ Solar.R: int [1:30] 286 287 242 186 220 264 127 273 291 323 ...
  ..$ Wind   : num [1:30] 8.6 9.7 16.1 9.2 8.6 14.3 9.7 6.9 13.8 11.5 ...
  ..$ Temp   : int [1:30] 78 74 67 84 85 79 82 87 90 87 ...
  ..$ Month  : int [1:30] 6 6 6 6 6 6 6 6 6 6 ...
  ..$ Day    : int [1:30] 1 2 3 4 5 6 7 8 9 10 ...
 $ 7:'data.frame': 31 obs. of 6 variables:
  ..$ Ozone  : num [1:31] 135 49 32 NA 64 40 77 97 97 85 ...
  ..$ Solar.R: int [1:31] 269 248 236 101 175 314 276 267 272 175 ...
  ..$ Wind   : num [1:31] 4.1 9.2 9.2 10.9 4.6 10.9 5.1 6.3 5.7 7.4 ...
  ..$ Temp   : int [1:31] 84 85 81 84 83 83 88 92 92 89 ...
  ..$ Month  : int [1:31] 7 7 7 7 7 7 7 7 7 7 ...
  ..$ Day    : int [1:31] 1 2 3 4 5 6 7 8 9 10 ...
 $ 8:'data.frame': 31 obs. of 6 variables:
  ..$ Ozone  : num [1:31] 39 9 16 78 35 66 122 89 110 NA ...
  ..$ Solar.R: int [1:31] 83 24 77 NA NA NA 255 229 207 222 ...
  ..$ Wind   : num [1:31] 6.9 13.8 7.4 6.9 7.4 4.6 4 10.3 8 8.6 ...
  ..$ Temp   : int [1:31] 81 81 82 86 85 87 89 90 90 92 ...
  ..$ Month  : int [1:31] 8 8 8 8 8 8 8 8 8 8 ...
  ..$ Day    : int [1:31] 1 2 3 4 5 6 7 8 9 10 ...
 $ 9:'data.frame': 30 obs. of 6 variables:
  ..$ Ozone  : num [1:30] 96 78 73 91 47 32 20 23 21 24 ...
  ..$ Solar.R: int [1:30] 167 197 183 189 95 92 252 220 230 259 ...
  ..$ Wind   : num [1:30] 6.9 5.1 2.8 4.6 7.4 15.5 10.9 10.3 10.9 9.7 ...
  ..$ Temp   : int [1:30] 91 92 93 93 87 84 80 78 75 73 ...
  ..$ Month  : int [1:30] 9 9 9 9 9 9 9 9 9 9 ...
  ..$ Day    : int [1:30] 1 2 3 4 5 6 7 8 9 10 ...


If you don't return the whole x, the custom function returns the value of its last expression. Here that is the assignment, whose value is the converted Ozone column only.
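The difference can be seen on a single group. An R function returns the value of the last expression it evaluates, and the value of an assignment is the assigned value. The sketch below uses two made-up helper names, `f_assign` and `f_whole`, purely for illustration.

```r
# One group from the split data: May
may <- split(airquality, airquality$Month)[["5"]]

# Last expression is an assignment, so the function's value is the
# assigned vector (returned invisibly), not the data frame.
f_assign <- function(x) {
  x$Ozone <- as.numeric(x$Ozone)
}

# Last expression is x itself, so the whole data frame comes back.
# Writing return(x) instead of a bare x gives the same result.
f_whole <- function(x) {
  x$Ozone <- as.numeric(x$Ozone)
  x
}

res_assign <- f_assign(may)
res_whole <- f_whole(may)

print(class(res_assign))  # "numeric"
print(class(res_whole))   # "data.frame"
```

So an explicit return() is not strictly required; what matters is that the last evaluated expression is the full data frame.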

Got it! Thanks a lot!
