R functions to create a variable

My data looks like this:

patient_id dia_ppal dia_02 dia_03 poad_02 poad_03
1 A41.9 R65.20 J18.9 S S
2 J12.89 B97.29 Z87.891 S E
3 J18.9 R68.89 E66.3 S S
4 J12.89 B97.29 I10 S S
5 J12.89 B97.29 J96.00 S S
6 J12.89 B97.29 I10 S S
7 J98.8 B97.29 D69.6 S S
8 J12.89 B97.29 E11.65 S N
9 J12.89 B97.29 R45.1 S N
10 J12.89 B97.29 E11.65 S N
11 J98.8 B97.29 J96.00 S S

Principal diagnosis = dia_ppal
Diagnosis 2 = dia_02
Present on admission = POAD (Sí/No, coded S/N)

I need to create a variable with a diagnosis at baseline. For example diabetes at baseline (yes/no). What I thought was:

1) when there is a diabetes code in dia_ppal, assign a 1; the rest are 0 or missing, as appropriate
2) create the variable dm_ppal: when dia_ppal=1 and poad_ppal=S, assign 1
3) repeat steps 1 and 2 for dia_02/poad_02, dia_03/poad_03, dia_04/poad_04, etc.
4) create diabetes_baseline: when there is a 1 in dm_ppal, or in dm_02, etc., assign 1; the rest are zeros or missing, as appropriate

I did this:

dm_basal <- with(dx,
  ifelse(dm_01 == 1 & poad_ppal == "S", 1,
  ifelse(dm_02 == 1 & poad_02 == "S", 1,
  ifelse(dm_03 == 1 & poad_03 == "S", 1, 0))))

However, there are 2 problems:

  • the codes for diabetes start with "E08", "E09", etc. That means there could be diabetes codes such as E081, and this syntax does not include them. I know I have to use grep, but I don't know how.
  • the number of diagnoses is actually 20, so I'm sure there is a way to avoid repeating everything.

Can somebody help me?
P.S.: I forgot to include poad_ppal in the sample, but that's the idea of the data set.

Thank you!!

Welcome to R and RStudio

We need more information, especially some data in dput() format.
Have a look at this.
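
In the meantime, here is a rough sketch of how the grep part and the "don't repeat everything" part could fit together. It is only an illustration under some assumptions, not a definitive answer: the data frame is retyped from the rows in the question, poad_ppal is made up (set to "S" for everyone) because it is missing from the sample, the prefix pattern for diabetes is my guess at the codes you mean (E08 to E13) and should be adjusted to your definition, and missing values are simply counted as 0.

```r
# Sample rows retyped from the question.
dx <- data.frame(
  patient_id = 1:11,
  dia_ppal = c("A41.9", "J12.89", "J18.9", "J12.89", "J12.89", "J12.89",
               "J98.8", "J12.89", "J12.89", "J12.89", "J98.8"),
  dia_02 = c("R65.20", "B97.29", "R68.89", "B97.29", "B97.29", "B97.29",
             "B97.29", "B97.29", "B97.29", "B97.29", "B97.29"),
  dia_03 = c("J18.9", "Z87.891", "E66.3", "I10", "J96.00", "I10",
             "D69.6", "E11.65", "R45.1", "E11.65", "J96.00"),
  poad_ppal = "S",  # ASSUMPTION: column not shown in the question, invented here
  poad_02 = "S",
  poad_03 = c("S", "E", "S", "S", "S", "S", "S", "N", "N", "N", "S"),
  stringsAsFactors = FALSE
)

# Build the column names instead of typing them; with the real data
# this would be 2:20 rather than 2:3.
dia_cols  <- c("dia_ppal",  sprintf("dia_%02d",  2:3))
poad_cols <- c("poad_ppal", sprintf("poad_%02d", 2:3))

# ASSUMPTION: "diabetes" means any code starting with E08 ... E13.
# "^" anchors the match to the start of the code, so E081 or E11.65 match
# but E66.3 does not.
dm_pattern <- "^E(0[89]|1[0-3])"

# Logical matrices: one row per patient, one column per diagnosis slot.
is_dm  <- sapply(dx[dia_cols],  function(x) grepl(dm_pattern, x))
is_poa <- sapply(dx[poad_cols], function(x) !is.na(x) & x == "S")

# Diabetes coded anywhere, and diabetes coded AND present on admission.
dx$dm_any   <- as.integer(rowSums(is_dm) > 0)
dx$dm_basal <- as.integer(rowSums(is_dm & is_poa) > 0)
```

With the sample rows, dm_any flags patients 8 and 10 (both E11.65 in dia_03), but dm_basal stays 0 for them because their poad_03 is "N". The matching only works if dia_cols and poad_cols are in the same order, so it is worth checking that before trusting the result.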


Thank you for the suggestion.
