Map function to iteration across data frame

I have a data set that is a bit tidier than the original, thanks to a StackOverflow colleague to whom I'm really grateful.

The data frame holds, for each patient ID, several measurements over time (00 = baseline, 66 = 6 months, 01 = 12 months). I know the rows in the example below are not in chronological order.

df1 <- data.frame(pacient = c(6430, 6430, 6430, 6494, 6494, 6494, 6165, 6165, 6165),
                  time = c(00, 01, 66, 00, 01, 66, 00, 01, 66),
                  weight = c(115, 112, 110, 98, 95, 94, 88, 87, 86),
                  waist = c(123, NA, 112, 115, 112, 113, 112, 110, NA),
                  p14_total = c(7, NA, 4, 12, 5, NA, 15, 12, 13))
  • t.test

I am trying to compare the measurements between the different time points with unpaired tests. For example: weight at 00 vs weight at 66, weight at 00 vs weight at 01, and weight at 66 vs weight at 01. I am looking for a data frame or data.table of the statistics (t, p-value, means, ...) that I can export.

  • Create a column with the difference between the different time measurements for each patient. For example, for patient ID 6430: Weight_6months = Weight01 - Weight66 and Weight_12months = Weight01 - Weight00

I have been trying to do this with the purrr::map functions, but I can't get it to work.

There are probably a lot of ways to do this, but here's one way I thought of.

library(tidyverse)

df1<-tibble(pacient= c(6430, 6430, 6430, 6494, 6494, 6494, 6165, 6165, 6165),
                time= c(00, 01, 66, 00, 01, 66, 00, 01, 66),
                weight = c(115, 112, 110, 98, 95, 94, 88, 87, 86),
                waist = c(123, NA, 112, 115, 112, 113, 112, 110, NA),
                p14_total= c(7, NA, 4, 12, 5, NA, 15, 12, 13))

timepairs <- expand_grid(t1=unique(df1$time), t2=unique(df1$time)) %>%
   filter(t1>t2)

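compfun <- function(t1, t2){
   # keep only the rows for the two time points being compared
   df2 <- df1 %>%
      filter(time %in% c(t1, t2))
   # unpaired t-test (Welch by default) of weight between the two time points
   testout <- t.test(weight~time, data=df2)
   # groups are ordered by time code, so mean1 is the mean at t2 (the smaller code) and mean2 the mean at t1
   tibble(t1=t1, t2=t2, t=testout$statistic, p.value=testout$p.value, mean1=testout$estimate[1], mean2=testout$estimate[2], diff=mean1-mean2)
}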

map2_df(timepairs$t1, timepairs$t2, compfun)
#> # A tibble: 3 x 7
#>      t1    t2     t p.value mean1 mean2  diff
#>   <dbl> <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl>
#> 1     1     0 0.216   0.839  100.  98    2.33
#> 2    66     0 0.347   0.747  100.  96.7  3.67
#> 3    66     1 0.131   0.902   98   96.7  1.33

Created on 2021-09-30 by the reprex package (v2.0.1)
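
The second item in the question (a per-patient difference column) is not covered by the code above. A minimal sketch of one way to approach it, assuming the column names of df1 and taking the subtraction order exactly as written in the question; the names diffs, weight_6months and weight_12months are only illustrative:

```r
library(tidyverse)

df1 <- tibble(pacient = c(6430, 6430, 6430, 6494, 6494, 6494, 6165, 6165, 6165),
              time = c(00, 01, 66, 00, 01, 66, 00, 01, 66),
              weight = c(115, 112, 110, 98, 95, 94, 88, 87, 86))

diffs <- df1 %>%
  select(pacient, time, weight) %>%
  # one row per patient, one column per time code: weight_0, weight_1, weight_66
  pivot_wider(names_from = time, values_from = weight, names_prefix = "weight_") %>%
  # subtraction order copied from the question; swap the terms if another order is intended
  mutate(weight_6months  = weight_1 - weight_66,
         weight_12months = weight_1 - weight_0)

diffs
```

For patient 6430 this gives 112 - 110 = 2 and 112 - 115 = -3. Because the data are reshaped to one row per patient first, the original row order of df1 does not matter.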

I don't really understand what you've done

First of all here:

timepairs <- expand_grid(t1=unique(df1$time), t2=unique(df1$time)) %>%
   filter(t1>t2)

Afterwards you create a function, but I am not sure if I can iterate with this function, since it is hard-coded to weight (one specific variable):

  testout <- t.test(weight~time, data=df2)

How can I iterate through the rest of the columns (variables)? waist, p14_total, ... there are more in my database.
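
On the first point: expand_grid() builds every combination of the time codes, and filter(t1 > t2) keeps each pair only once (1 vs 0, 66 vs 0, 66 vs 1). On the second, one possible approach is to pass the variable name to the function as a string and build the formula from it. The following is a sketch under that assumption, not a tested solution for the full database; combos, results, mean_t1 and mean_t2 are illustrative names, and the vector of variable names would need to be extended by hand.

```r
library(tidyverse)

df1 <- tibble(pacient = c(6430, 6430, 6430, 6494, 6494, 6494, 6165, 6165, 6165),
              time = c(00, 01, 66, 00, 01, 66, 00, 01, 66),
              weight = c(115, 112, 110, 98, 95, 94, 88, 87, 86),
              waist = c(123, NA, 112, 115, 112, 113, 112, 110, NA),
              p14_total = c(7, NA, 4, 12, 5, NA, 15, 12, 13))

# every variable crossed with every pair of time codes, each pair kept once
combos <- expand_grid(variable = c("weight", "waist", "p14_total"),
                      t1 = unique(df1$time),
                      t2 = unique(df1$time)) %>%
  filter(t1 > t2)

compfun <- function(variable, t1, t2) {
  df2 <- df1 %>% filter(time %in% c(t1, t2))
  # build e.g. weight ~ time from the variable name
  f <- reformulate("time", response = variable)
  # unpaired Welch t-test; rows with NA in the variable are dropped by the formula method
  testout <- t.test(f, data = df2)
  # groups are ordered by time code, so estimate[1] belongs to t2 (the smaller code)
  tibble(variable = variable, t1 = t1, t2 = t2,
         t = unname(testout$statistic),
         p.value = testout$p.value,
         mean_t2 = unname(testout$estimate[1]),
         mean_t1 = unname(testout$estimate[2]))
}

# pmap() matches the columns of combos to the function arguments by name
results <- pmap(combos, compfun) %>% bind_rows()
results
```

The result has one row per variable and time pair, so it can be written out with write_csv() or similar. With only two or three non-missing values per group, as in this toy data, the tests have very little power; the point here is only the iteration pattern.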
