Can't convert ODM XML files into a data frame

Hello Folks,

I want to convert ODM .xml files into an R data frame. I tried a couple of libraries, such as XML and xml2, but they failed. I have linked the .xml file below; please share code that converts it into a data frame so I can use it in my project.

This is the link: https://evs.nci.nih.gov/ftp1/CDISC/SDTM/CDASH%20Terminology.odm.xml

Thank you in advance.

Please show us the code that you tried.

library("XML")
library("methods")
results <- xmlParse(file = "/Users/kk/Desktop/SDTM Terminology.odm.xml")
print(results)
rootnode <- xmlRoot(results)
rootsize <- xmlSize(rootnode)
print(rootnode[1])
xml <- xmlToDataFrame("/Users/kk/Desktop/SDTM Terminology.odm.xml")
print(xml)

Maybe this is a start.

library(xml2)
results <- read_xml("https://evs.nci.nih.gov/ftp1/CDISC/SDTM/CDASH%20Terminology.odm.xml")
results = xml_ns_strip(results)

# find the CodeList elements
x = xml_find_all(results,"//CodeList")
# create description of the CodeLists
CodeListsTable = purrr::map_dfr(x,
                    function(x) {
                      data.frame(
                        OID=xml_attr(x,"OID"),
                        Name = xml_attr(x,"Name"),
                        # other CodeList descriptive elements you might be interested in
                        stringsAsFactors = FALSE
                      )
                    })
print(CodeListsTable)
#>                   OID                                                 Name
#> 1  CL.C78418.CMDOSFRM                     Concomitant Medication Dose Form
#> 2    CL.C78417.CMDOSU                    Concomitant Medication Dose Units
#> 3  CL.C78419.CMDOSFRQ Concomitant Medication Dosing Frequency per Interval
#> 4   CL.C78420.CMROUTE       Concomitant Medication Route of Administration
#> 5  CL.C78422.EGORRESU                                   ECG Original Units
#> 6  CL.C128690.ETHNICC                               Ethnicity As Collected
#> 7  CL.C78426.EXDOSFRM                                   Exposure Dose Form
#> 8  CL.C78745.EXDOSFRQ               Exposure Dosing Frequency per Interval
#> 9   CL.C78425.EXROUTE                     Exposure Route of Administration
#> 10   CL.C128689.RACEC                                    Race As Collected
#> 11    CL.C83004.SUNCF    Substance Use Never/Current/Former Classification
#> 12  CL.C78428.EXVOLTU                     Total Volume Administration Unit
#> 13  CL.C78427.EXINTPU      Unit for the Duration of Treatment Interruption
#> 14 CL.C78421.DAORRESU                   Unit of Drug Dispensed or Returned
#> 15  CL.C78429.EXFLRTU                        Unit of Measure for Flow Rate
#> 16   CL.C78423.EXDOSU                                   Units for Exposure
#> 17 CL.C78430.EXPDOSEU                           Units for Planned Exposure
#> 18    CL.C78431.VSPOS                      Vital Signs Position of Subject

# create definitions of one (the first) CodeList
x1 = x[1]

CodeListDefinitions = function (x1) {
  defs = xml_find_all(x1,".//EnumeratedItem")
  purrr::map_dfr(defs,
                 function(x) {
                   data.frame(
                     CodedValue=xml_attr(x,"CodedValue"),
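                     # Definition capped to 60 characters for visibility.
                     # xml_ns_strip() only removes the default namespace, so the
                     # nciodm: prefix still resolves in the XPath below.
                     Definition = substring(xml_text(xml_find_first(x,".//nciodm:CDISCDefinition")),1,60),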
                     # other EnumeratedItem elements you might be interested in
                     stringsAsFactors = FALSE
                   )
                 })
}

CodeListDefinitionsTable = CodeListDefinitions(x1)
print(CodeListDefinitionsTable)
#>     CodedValue                                                   Definition
#> 1      AEROSOL A product that is packaged under pressure and contains thera
#> 2      CAPSULE A solid pharmaceutical dosage form that contains medicinal a
#> 3        CREAM A semisolid emulsion of either the oil-in-water or the water
#> 4          GAS Any elastic aeriform fluid in which the molecules are separa
#> 5          GEL A semisolid (1) dosage form that contains a gelling agent to
#> 6     OINTMENT A suspension or emulsion, semisolid (1) dosage form, usually
#> 7        PATCH A drug delivery system that often contains an adhesive backi
#> 8       POWDER An intimate mixture of dry, finely divided drugs and/or chem
#> 9        SPRAY A liquid minutely divided as by a jet of air or steam. (NCI)
#> 10 SUPPOSITORY A solid body of various weights and shapes, adapted for intr
#> 11  SUSPENSION A liquid dosage form that contains solid particles dispersed
#> 12      TABLET A solid dosage form containing medicinal substances with or

Created on 2020-05-22 by the reprex package (v0.3.0)
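
If you want every CodeList in one long data frame rather than just the first one, something like the sketch below might work. It is only a sketch: the tiny XML string and its example.org namespace URIs are made up for illustration (they are not the real ODM namespaces), and the names odm_txt and all_terms are my own. With the real file you would pass the URL or path to read_xml() instead of the string.

```r
library(xml2)

# Hypothetical miniature ODM-like document, for illustration only.
# The namespace URIs are placeholders, not the ones in the real file.
odm_txt <- '<ODM xmlns="http://example.org/odm"
     xmlns:nciodm="http://example.org/nciodm">
  <Study OID="S1"><MetaDataVersion OID="M1" Name="demo">
    <CodeList OID="CL.1" Name="Dose Form">
      <EnumeratedItem CodedValue="TABLET">
        <nciodm:CDISCDefinition>A solid dosage form.</nciodm:CDISCDefinition>
      </EnumeratedItem>
      <EnumeratedItem CodedValue="GEL">
        <nciodm:CDISCDefinition>A semisolid dosage form.</nciodm:CDISCDefinition>
      </EnumeratedItem>
    </CodeList>
    <CodeList OID="CL.2" Name="Route">
      <EnumeratedItem CodedValue="ORAL">
        <nciodm:CDISCDefinition>Administered by mouth.</nciodm:CDISCDefinition>
      </EnumeratedItem>
    </CodeList>
  </MetaDataVersion></Study>
</ODM>'

doc <- read_xml(odm_txt)
xml_ns_strip(doc)  # drops the default namespace in place; nciodm: stays

code_lists <- xml_find_all(doc, "//CodeList")

# One small data frame per CodeList, then stack them.
per_list <- lapply(seq_along(code_lists), function(i) {
  cl <- code_lists[[i]]
  items <- xml_find_all(cl, ".//EnumeratedItem")
  data.frame(
    OID = rep(xml_attr(cl, "OID"), length(items)),
    Name = rep(xml_attr(cl, "Name"), length(items)),
    CodedValue = xml_attr(items, "CodedValue"),
    # xml_find_first() keeps one result per item, NA text if it is missing
    Definition = xml_text(xml_find_first(items, ".//nciodm:CDISCDefinition")),
    stringsAsFactors = FALSE
  )
})
all_terms <- do.call(rbind, per_list)
print(all_terms)
```

Each row is one EnumeratedItem with the OID and Name of its parent CodeList repeated, so you can filter or join on OID afterwards.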
