Extracting patterns and relationships between categorical data in keras

machinelearning
keras

Hi all,

I was wondering if anybody could help with some information I am struggling to extract from Keras after running a model I have built on my data. The main issue is the number of categorical variables I have and how to find relationships between them. I have built a sequential model for some genomic data as follows:

Data: genes as columns, with age and the other metadata as additional columns. Categorical variables such as age and organ are one-hot encoded:

onehot <- data.frame(to_categorical(onehot$Age, 7), to_categorical(onehot$Organ, 13), to_categorical(onehot$Sex, 4), onehot[, 1:5000])  # 1:5000 = number of genes; other variables ..etc
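To make the resulting column layout concrete, here is a minimal sketch in base R on made-up data. `one_hot()` is a hypothetical stand-in for `to_categorical()` that assumes 0-based integer codes, and the toy `meta` values are invented for illustration only.

```r
# Base-R stand-in for keras::to_categorical(), assuming 0-based integer codes.
one_hot <- function(codes, num_classes) {
  stopifnot(all(codes >= 0), all(codes < num_classes))
  out <- matrix(0, nrow = length(codes), ncol = num_classes)
  # Matrix indexing: put a 1 at (sample i, class code + 1).
  out[cbind(seq_along(codes), codes + 1)] <- 1
  out
}

# Hypothetical toy metadata: 4 samples, codes chosen for illustration only.
meta <- data.frame(Age   = c(0, 3, 6, 1),
                   Organ = c(12, 0, 5, 5),
                   Sex   = c(0, 1, 1, 0))

# Same block order as above: Age (7), then Organ (13), then Sex (4).
targets <- cbind(one_hot(meta$Age, 7),
                 one_hot(meta$Organ, 13),
                 one_hot(meta$Sex, 4))

dim(targets)      # 4 rows, 7 + 13 + 4 = 24 columns
rowSums(targets)  # one 1 per variable, so 3 per row
```

Each variable occupies its own contiguous block of columns, which is what any later per-variable analysis has to rely on.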

Split data (code not shown)

Build model (sequential):

model <- keras_model_sequential()

model %>%
   layer_dense(units=150, activation = 'relu', input_shape = 5000) %>%  
   #layer_dropout(0.4)%>%
   #layer_dense(units = 128, activation = "relu") %>%
   layer_dense(units = 64, activation = "relu") %>%   
   #layer_dense(units = 32, activation = "relu") %>%
   #layer_dropout(0.2) %>%
   #layer_dense(units = 5074, activation = "relu") %>%
   layer_dense(units = ncol(trainingtarget1), activation = "sigmoid")
# sigmoid scores each output unit independently (multi-label), which fits a
# target made of several one-hot blocks; softmax would suit a single block

model %>% keras::compile(loss='binary_crossentropy',
                          optimizer='adam',
                          metrics='accuracy')
history <- model %>%
   fit(as.matrix(training1),        # input: the gene columns
       as.matrix(trainingtarget1),  # targets: one-hot encoded metadata
       epochs = 200,
       batch_size = 32,
       validation_split = 0.15,
       callbacks = list(early_stop, print_dot_callback))

Run the test set through the model

I get good accuracy on this. I can predict classes for the individual categorical variables to see how well it identifies organ, age, etc.
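As a rough sketch of that per-variable step: the prediction matrix can be cut into its one-hot blocks and the highest-scoring class taken within each block. This assumes the target holds only the Age (7), Organ (13) and Sex (4) blocks, in that order. `predict_per_block()` and the random `pred` matrix are hypothetical; in practice `pred` would come from something like `predict(model, ...)` on the test inputs.

```r
# Widths of the one-hot blocks, in the order they were column-bound.
# Assumption: only Age (7), Organ (13) and Sex (4) are in the target.
block_sizes <- c(Age = 7, Organ = 13, Sex = 4)
block_end   <- cumsum(block_sizes)
block_start <- block_end - block_sizes + 1

# For each block, return the 0-based class with the highest score per sample.
predict_per_block <- function(pred) {
  sapply(names(block_sizes), function(v) {
    cols <- block_start[[v]]:block_end[[v]]
    max.col(pred[, cols, drop = FALSE], ties.method = "first") - 1
  })
}

# Made-up sigmoid-like outputs standing in for real model predictions.
set.seed(1)
pred <- matrix(runif(3 * sum(block_sizes)), nrow = 3)
predict_per_block(pred)  # one row per sample, one column per variable
```

Comparing each column of the result with the true codes gives a per-variable accuracy, which is more informative than the single pooled accuracy Keras reports for a concatenated target.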

What I would like to do is extract the genes that light up because of relationships between the one-hot encoded categorical variables, i.e. genes that are high in young ages and low in adults in 5 of the organs specifically because they are female organs (for example). Is there a way of extracting relational information like this? Given that this is a sequential model, I assume it must also pick up patterns linked to this.
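To make the kind of relationship concrete, here is a minimal sketch on made-up data. It does not use the Keras model at all: it scores each gene directly by how much its young-versus-older difference changes between a chosen set of organs and the rest. All names (`design`, `expr`, `in_group`, `contrast()`), the cut-off for "young" and the planted effect in `gene1` are assumptions for illustration, not part of my real data.

```r
set.seed(42)

# Hypothetical balanced design: 7 age codes (0 = youngest) x in/out of the
# organ group of interest x 5 replicates = 70 samples.
design   <- expand.grid(age = 0:6, in_group = c(TRUE, FALSE), rep = 1:5)
young    <- design$age <= 1   # assumed cut-off for "young"
in_group <- design$in_group

# Five made-up genes with pure noise expression.
expr <- matrix(rnorm(nrow(design) * 5), nrow = nrow(design),
               dimnames = list(NULL, paste0("gene", 1:5)))

# Plant the pattern in gene1: high in young samples of the chosen organs only.
expr[young & in_group, "gene1"] <- expr[young & in_group, "gene1"] + 5

# Mean expression difference, young minus older, within a set of samples.
contrast <- function(rows) {
  colMeans(expr[rows & young, , drop = FALSE]) -
    colMeans(expr[rows & !young, , drop = FALSE])
}

# Interaction score: how much larger the age effect is inside the group.
score <- contrast(in_group) - contrast(!in_group)
sort(score, decreasing = TRUE)  # gene1 should rank first
```

A score like this only captures one pre-specified pattern, so it is at best a baseline to compare against whatever the network has learned.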

I would be more than happy to provide any more code if necessary!

Any help would be greatly appreciated!

Many thanks!!
