How can I include in my PCoA analysis only the microorganisms that make up more than 0.01 of my total sample?

Family %>% 
  left_join(Metadata %>% select(SampleID, Month), by = c("Index" = "SampleID")) %>% 
  gather(Microbe, count, -c(Month, Index)) %>% 
  group_by(Month, Microbe) %>% 
  summarise(count = sum(count)) %>% 
  mutate(prop = count/sum(count)) %>% 
  filter(prop > 0.0001) %>% 
  select(Month, Microbe) %>%
  ungroup() %>% 
  inner_join(Family %>% 
               left_join(Metadata %>% select(SampleID, Month), by = c("Index" = "SampleID")) %>% 
               gather(Microbe, count, -c(Month, Index)), by = c("Month", "Microbe")) %>% 
  spread(Microbe, count)
#> Error in Family %>% left_join(Metadata %>% select(SampleID, Month), by = c(Index = "SampleID")) %>% : could not find function "%>%"

Created on 2020-03-05 by the reprex package (v0.3.0)

Andres, I'm getting this error: Error: by can't contain join column Index which is missing from LHS

And to be honest, I didn't understand what LHS means: https://www.rdocumentation.org/packages/pse/versions/0.4.7/topics/LHS

LHS stands for Left Hand Side; in a join it is the data frame on the left of the join, here Family. You are getting that error message because, as I told you in the previous post, you are eliminating the joining key (i.e. the Index column) before the join, so the join can't be executed. If you need to do these steps, do them after the join.
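A minimal sketch of what that error means, using small made-up tables (the family names and counts are hypothetical, not your data). The join only works while the left-hand data frame still has the key column named in `by`:

```r
library(dplyr)

# Toy stand-ins for the real tables (values are made up for illustration)
Family <- tibble(
  Index = c("S1", "S2"),
  Bacillaceae = c(10, 0),
  Vibrionaceae = c(3, 7)
)
Metadata <- tibble(SampleID = c("S1", "S2"), Month = c("Jan", "Feb"))

# Works: Family (the LHS) still has the Index column used as the join key
joined <- Family %>%
  left_join(Metadata %>% select(SampleID, Month), by = c("Index" = "SampleID"))

# Fails: Index was dropped from the LHS before the join,
# so the error message is captured instead of a data frame
failed <- tryCatch(
  Family %>%
    select(-Index) %>%
    left_join(Metadata, by = c("Index" = "SampleID")),
  error = function(e) conditionMessage(e)
)

# If Index has to go, drop it only after the join
joined_no_key <- joined %>% select(-Index)
```

The exact wording of the error depends on the dplyr version, but the cause is the same: the key column is no longer on the left-hand side.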

Thank you very much Andres, I solved this problem by renaming the SampleID column to Index in the Family dataset. I have one more question, Andrés: is it possible to export the result generated by this code to an Excel file?

Family %>%
left_join(Metadata %>% select(SampleID, Month), by = c("Index" = "SampleID")) %>%
gather-...................................................

I ask because I don't know how to program the way you do, using %>% (your scripts are more concise). I would like to open the file that was generated and then continue with my own script.

The easiest way is to use the write.xlsx() function from the openxlsx package:

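library(dplyr)
library(tidyr)

Family_filtered <- Family %>% 
  # attach the Month of each sample
  left_join(Metadata %>% select(SampleID, Month), by = c("Index" = "SampleID")) %>% 
  # reshape to long format: one row per sample and family
  gather(Microbe, count, -c(Month, Index)) %>% 
  # total count of each family per month, and its proportion within that month
  group_by(Month, Microbe) %>% 
  summarise(count = sum(count)) %>% 
  mutate(prop = count/sum(count)) %>% 
  # keep only the families above the threshold
  filter(prop > 0.0001) %>% 
  select(Month, Microbe) %>%
  ungroup() %>% 
  # recover the per-sample counts of the retained families and reshape back to wide
  inner_join(Family %>% 
               left_join(Metadata %>% select(SampleID, Month), by = c("Index" = "SampleID")) %>% 
               gather(Microbe, count, -c(Month, Index)), by = c("Month", "Microbe")) %>% 
  spread(Microbe, count)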

openxlsx::write.xlsx(Family_filtered, file = "file_name.xlsx")

But exporting to Excel in the middle of the analysis doesn't sound like a good idea, especially if you are doing serious research, since it could cause reproducibility issues.

And to create the document, do I have to use this command?

write.xlsx(
  x,
  file,
  sheetName = "Sheet1",
  col.names = TRUE,
  row.names = TRUE,
  append = FALSE,
  showNA = TRUE,
  password = NULL
)

You are right in what you mention, but all of this has been a challenge for me. If they ask me for proof of the analysis, I will send them this script together with the Excel files as evidence of the analyses.

There is an example in my previous post; check the last command.
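To make that concrete, here is a small sketch with a made-up stand-in for Family_filtered (the values are hypothetical). As far as I can tell, the long signature quoted above (sheetName, append, showNA, password) is the one from the xlsx package; with openxlsx, passing the object and the file name should be enough:

```r
library(openxlsx)

# Hypothetical stand-in for Family_filtered (values are made up)
Family_filtered <- data.frame(
  Index = c("S1", "S2"),
  Month = c("Jan", "Feb"),
  Bacillaceae = c(10, 0)
)

# Written to a temporary file here; replace with your own path, e.g. "file_name.xlsx"
out_file <- tempfile(fileext = ".xlsx")
write.xlsx(Family_filtered, file = out_file)

# Read the file back to confirm the export worked
check <- read.xlsx(out_file)
```

Reading the file back with read.xlsx() is only a sanity check; you can skip it and open the file in Excel directly.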
