cleaning the table. Subgroups in the rows

Hi,
I know this might be a very simple issue, but I did not use R much and I can not find a solution...

I need to clean this table. test.pdf (248.7 KB)

Under the column "pathogen" are included pathogens but also the sample types (wound, CSF, respiratory, urinary...).

I would like to create a new row with the pathogens.
final2.pdf (223.3 KB)

Many thanks in advance

Hi!

To help us help you, could you please prepare a reproducible example (reprex) illustrating your issue? Please have a look at this guide, to see how to create one:

Hi Andrés,

Thanks for the link. I did not upload my database because contains sensitive information that’s why I created a small table with the issue.

I will follow your link and update later.

Thanks !

That is completely fine, you just have to share the sample data in a copy/paste friendly format to make it easier for people trying to help you. Read the reprex guide for details.


Hi Andrés,
hi everyone,

I am not sure if this is a reprex. I try to share with you some screenshots, if that helps.

The problem that I have here, is that under the colum "Names " there are not only Names but also the height, weight, age, run and time

I would like to mutate the 1st row with the ones from the 2nd to the 6th row, then mutate the 7th row with the one between the 8th and the 11th and finally the 12th with the ones between the 13th and the 17th row. Or create a new column with just the names...

test%>%
 mutate(row[(1,2:6)|(7,8:11)|(12,13:17)]

any ideas?

Many thanks in advance

Hello tacoba,

unfortunately that isn't exactly what we need. If you have a non sensitive data.frame, you can type dput(Data) in your console (where Data is the name of your non sensitive data.frame) and paste the resulting output into the forum. You can use triple ` followed by r (```r) to begin a code chunk and end the chunk with triple ` again.

This allows us to just copy your data.frame without hand typing everything from your screenshot. Additionally, you may want to add a wanted outcome of your example, because from your description, I am not 100% sure if I correctly understand what you would like to achieve. An expected outcome will leave no (or at least not so many) questions open.

Kind regards

Hi FactOREO,

thanks! Let me try...

structure(list(Names = c("Tom", "height", "weight", "age", "run",
"time", "Paul", "height", "weight", "age", "time", "Steven",
"height", "weight", "age", "run", "time"), t0 = c(NA, 171, NA,
55, 2, 15, NA, 150, 70, NA, 15, NA, 171, 60, 55, 7, 15), t1 = c(NA,
171, NA, 55, 3, 16, NA, 150, 70, 54, 14, NA, NA, NA, NA, NA,
16), t2 = c(NA, 171, NA, 55, 4, 18, NA, 150, 69, NA, 13, NA,
171, 60, 55, 8, 17), t3 = c(NA, 171, NA, 55, 5, 17, NA, 150,
68, NA, NA, NA, 171, 60, NA, 8, 18)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -17L))

I hope it worked.

My wanted outcome for my example would be (see below), to create a new column with the names of the people.

structure(list(Names = c("Tom", "Tom", "Tom", "Tom", "Tom", "Paul",
"Paul", "Paul", "Paul", "Steven", "Steven", "Steven", "Steven",
"Steven"), variables = c("height", "weight", "age", "run", "time",
"height", "weight", "age", "time", "height", "weight", "age",
"run", "time"), t0 = c(171, NA, 55, 2, 15, 150, 70, NA, 15, 171,
60, 55, 7, 15), t1 = c(171, NA, 55, 3, 16, 150, 70, 54, 14, NA,
NA, NA, NA, 16), t2 = c(171, NA, 55, 4, 18, 150, 69, NA, 13,
171, 60, 55, 8, 17), t3 = c(171, NA, 55, 5, 17, 150, 68, NA,
NA, 171, 60, NA, 8, 18)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -14L))

Many thanks in advance

1 Like

Hello again,

here is an approach you could use, depending on the actual dataset and the properties you have in there.

library(tidyverse)
# if you have a list of all the relevant names, this will work
names <- c('Tom','Paul','Steven')
Data |>
  mutate(
    Group = if_else(Names %in% names, Names, NA_character_)
  ) |>
  fill(
    Group, .direction = 'down'
  ) |>
  pivot_longer(
    cols = t0:t3, names_to = 'category', values_to = 'value'
  ) |>
  filter(
    Names != Group
  ) |>
  pivot_wider(
    values_from = 'value', names_from = 'category'
  ) |>
  rename(
    'Names' = Group,
    'variables' = Names
  ) |>
  select(Names,variables,everything())
#> # A tibble: 14 × 6
#>    Names  variables    t0    t1    t2    t3
#>    <chr>  <chr>     <dbl> <dbl> <dbl> <dbl>
#>  1 Tom    height      171   171   171   171
#>  2 Tom    weight       NA    NA    NA    NA
#>  3 Tom    age          55    55    55    55
#>  4 Tom    run           2     3     4     5
#>  5 Tom    time         15    16    18    17
#>  6 Paul   height      150   150   150   150
#>  7 Paul   weight       70    70    69    68
#>  8 Paul   age          NA    54    NA    NA
#>  9 Paul   time         15    14    13    NA
#> 10 Steven height      171    NA   171   171
#> 11 Steven weight       60    NA    60    60
#> 12 Steven age          55    NA    55    NA
#> 13 Steven run           7    NA     8     8
#> 14 Steven time         15    16    17    18

# if all names start with uppercase letter, do this
Data |>
  mutate(
    Group = if_else(str_detect(Names,'^[A-Z]'),Names,NA_character_)
  ) |>
  fill(
    Group, .direction = 'down'
  ) |>
  pivot_longer(
    cols = t0:t3, names_to = 'category', values_to = 'value'
  ) |>
  filter(
    Names != Group
  ) |>
  pivot_wider(
    values_from = 'value', names_from = 'category'
  ) |>
  rename(
    'Names' = Group,
    'variables' = Names
  ) |>
  select(Names,variables,everything())
#> # A tibble: 14 × 6
#>    Names  variables    t0    t1    t2    t3
#>    <chr>  <chr>     <dbl> <dbl> <dbl> <dbl>
#>  1 Tom    height      171   171   171   171
#>  2 Tom    weight       NA    NA    NA    NA
#>  3 Tom    age          55    55    55    55
#>  4 Tom    run           2     3     4     5
#>  5 Tom    time         15    16    18    17
#>  6 Paul   height      150   150   150   150
#>  7 Paul   weight       70    70    69    68
#>  8 Paul   age          NA    54    NA    NA
#>  9 Paul   time         15    14    13    NA
#> 10 Steven height      171    NA   171   171
#> 11 Steven weight       60    NA    60    60
#> 12 Steven age          55    NA    55    NA
#> 13 Steven run           7    NA     8     8
#> 14 Steven time         15    16    17    18

Hopefully this works for your actual data. If not, feel free to ask further questions. I will try my best to help you.

Kind regards
Created on 2022-10-08 by the reprex package (v2.0.1)

3 Likes

thanks very much FactOREO !!
It works perfectly. I only changed the native R pipe |> to the {magrittr} pipe %>% because it was giving me error with the native R pipe.

Kind regards

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.