How to perform a co-occurrence network analysis of microbial ecology data?

Hello everyone!
Is there anyone here with experience in co-occurrence network analysis of microbial ecology data? I would appreciate any advice or steps on how to conduct this analysis. What I'm looking for is to generate a graph very similar to this (network co-occurrence figure).

These are the data I have, and I'd like to analyze the entire dataset. Some genes, microorganisms, and antibiotics are repeated across samples. Thank you very much in advance.

data <-tibble::tribble(
                       ~IDs,          ~genes,     ~Antibioticos,           ~generos,
         "61_contigs.fasta",    "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
         "61_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "61_contigs.fasta",          "mexK",      "multidrugs",      "Pseudomonas",
         "61_contigs.fasta",        "catB11",       "Phenicols",       "Klebsiella",
         "61_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "61_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "61_contigs.fasta",          "smeD",      "multidrugs", "Stenotrophomonas",
         "61_contigs.fasta",          "smeE",      "multidrugs", "Stenotrophomonas",
         "61_contigs.fasta",          "smeF",      "multidrugs", "Stenotrophomonas",
         "61_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "61_contigs.fasta",       "tet(42)",   "tetracyclines",      "Micrococcus",
         "61_contigs.fasta",          "MexE",      "multidrugs",      "Pseudomonas",
         "61_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "61_contigs.fasta",         "KHM-1",      "multidrugs",      "Citrobacter",
         "61_contigs.fasta",   "APH(3'')-Ib", "aminoglycosides",      "Pseudomonas",
         "61_contigs.fasta",     "APH(6)-Id", "aminoglycosides",      "Pseudomonas",
         "62_contigs.fasta",          "mexQ",      "multidrugs",      "Pseudomonas",
         "62_contigs.fasta",         "PER-6",      "multidrugs",        "Aeromonas",
         "62_contigs.fasta",          "smeF",      "multidrugs", "Stenotrophomonas",
         "62_contigs.fasta",         "KHM-1",      "multidrugs",      "Citrobacter",
         "62_contigs.fasta",          "MexD",      "multidrugs",      "Pseudomonas",
         "62_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "62_contigs.fasta",          "mphG",      "macrolides",   "Photobacterium",
         "62_contigs.fasta",    "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
         "62_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "62_contigs.fasta",          "sul1",    "sulfonamides",           "Vibrio",
         "62_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "62_contigs.fasta",          "RbpA",      "macrolides",    "Mycobacterium",
         "62_contigs.fasta",          "smeD",      "multidrugs", "Stenotrophomonas",
         "62_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "62_contigs.fasta",         "rpoB2",      "macrolides",         "Nocardia",
         "62_contigs.fasta",       "tet(42)",   "tetracyclines",      "Micrococcus",
         "63_contigs.fasta",          "mexK",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",         "PER-6",      "multidrugs",        "Aeromonas",
         "63_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "63_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",         "PER-2",      "multidrugs",       "Salmonella",
         "63_contigs.fasta",    "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
         "63_contigs.fasta",          "RbpA",      "macrolides",    "Mycobacterium",
         "63_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",         "PER-6",      "multidrugs",        "Aeromonas",
         "63_contigs.fasta",          "mefC",      "macrolides",   "Photobacterium",
         "63_contigs.fasta",          "mtrA",      "multidrugs",    "Mycobacterium",
         "63_contigs.fasta",          "CpxR",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",          "MexE",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",   "APH(3'')-Ib", "aminoglycosides",      "Pseudomonas",
         "63_contigs.fasta",          "MexD",      "multidrugs",      "Pseudomonas",
         "63_contigs.fasta",          "mtrA",      "multidrugs",    "Mycobacterium",
         "63_contigs.fasta",          "smeF",      "multidrugs", "Stenotrophomonas",
         "63_contigs.fasta",          "smeE",      "multidrugs", "Stenotrophomonas",
         "63_contigs.fasta",          "smeD",      "multidrugs", "Stenotrophomonas",
         "63_contigs.fasta",          "EreD",      "macrolides",       "Riemerella",
         "63_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "63_contigs.fasta",         "KHM-1",      "multidrugs",      "Citrobacter",
         "63_contigs.fasta",          "smeE",      "multidrugs", "Stenotrophomonas",
         "63_contigs.fasta",          "smeD",      "multidrugs", "Stenotrophomonas",
         "63_contigs.fasta",          "RbpA",      "macrolides",    "Mycobacterium",
         "63_contigs.fasta",       "tet(42)",   "tetracyclines",      "Micrococcus",
         "64_contigs.fasta",          "mexQ",      "multidrugs",      "Pseudomonas",
         "64_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "64_contigs.fasta",       "tet(42)",   "tetracyclines",      "Micrococcus",
         "64_contigs.fasta",          "mtrA",      "multidrugs",    "Mycobacterium",
         "64_contigs.fasta",        "aadA27", "aminoglycosides",    "Acinetobacter",
         "64_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "64_contigs.fasta",       "tet(42)",   "tetracyclines",      "Micrococcus",
         "64_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "64_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "64_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "64_contigs.fasta",   "APH(3'')-Ib", "aminoglycosides",      "Pseudomonas",
         "65_contigs.fasta",          "mexW",      "multidrugs",      "Pseudomonas",
         "65_contigs.fasta",         "KHM-1",      "multidrugs",      "Citrobacter",
         "65_contigs.fasta",     "APH(6)-Id", "aminoglycosides",      "Pseudomonas",
         "65_contigs.fasta",          "smeE",      "multidrugs", "Stenotrophomonas",
         "65_contigs.fasta",          "smeD",      "multidrugs", "Stenotrophomonas",
         "65_contigs.fasta",          "CpxR",      "multidrugs",      "Pseudomonas",
         "65_contigs.fasta",       "tet(42)",   "tetracyclines",      "Micrococcus",
         "65_contigs.fasta",    "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
         "65_contigs.fasta",         "KHM-1",      "multidrugs",      "Citrobacter",
         "65_contigs.fasta",          "MexF",      "multidrugs",      "Pseudomonas",
         "65_contigs.fasta",          "mefC",      "macrolides",   "Photobacterium",
         "65_contigs.fasta",         "vanRO",   "glycopeptides",      "Rhodococcus",
         "65_contigs.fasta",         "PER-6",      "multidrugs",        "Aeromonas",
         "65_contigs.fasta",          "smeF",      "multidrugs", "Stenotrophomonas",
         "65_contigs.fasta",          "smeE",      "multidrugs", "Stenotrophomonas",
         "65_contigs.fasta",          "smeD",      "multidrugs", "Stenotrophomonas",
         "65_contigs.fasta",   "APH(3'')-Ib", "aminoglycosides",      "Pseudomonas"
         )

Created on 2023-04-24 with reprex v2.0.2

Have you converted this to a graph object yet? Or is that an unfamiliar term?

No, I haven't done that. I'm not familiar with what you're referring to.

Ok. When I get home I’ll start there. Thanks

OK, here's a reprex to start with. I'll write to explain separately.

library(ggraph)
#> Loading required package: ggplot2
library(network)
#> 
#> 'network' 1.18.1 (2023-01-24), part of the Statnet Project
#> * 'news(package="network")' for changes since last version
#> * 'citation("network")' for citation information
#> * 'https://statnet.org' for help, support, and other information
library(tidygraph)
#> 
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#> 
#>     filter

dat <- tibble::tribble(
  ~IDs, ~genes, ~Antibioticos, ~generos,
  "61_contigs.fasta", "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
  "61_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "61_contigs.fasta", "mexK", "multidrugs", "Pseudomonas",
  "61_contigs.fasta", "catB11", "Phenicols", "Klebsiella",
  "61_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "61_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "61_contigs.fasta", "smeD", "multidrugs", "Stenotrophomonas",
  "61_contigs.fasta", "smeE", "multidrugs", "Stenotrophomonas",
  "61_contigs.fasta", "smeF", "multidrugs", "Stenotrophomonas",
  "61_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "61_contigs.fasta", "tet(42)", "tetracyclines", "Micrococcus",
  "61_contigs.fasta", "MexE", "multidrugs", "Pseudomonas",
  "61_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "61_contigs.fasta", "KHM-1", "multidrugs", "Citrobacter",
  "61_contigs.fasta", "APH(3'')-Ib", "aminoglycosides", "Pseudomonas",
  "61_contigs.fasta", "APH(6)-Id", "aminoglycosides", "Pseudomonas",
  "62_contigs.fasta", "mexQ", "multidrugs", "Pseudomonas",
  "62_contigs.fasta", "PER-6", "multidrugs", "Aeromonas",
  "62_contigs.fasta", "smeF", "multidrugs", "Stenotrophomonas",
  "62_contigs.fasta", "KHM-1", "multidrugs", "Citrobacter",
  "62_contigs.fasta", "MexD", "multidrugs", "Pseudomonas",
  "62_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "62_contigs.fasta", "mphG", "macrolides", "Photobacterium",
  "62_contigs.fasta", "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
  "62_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "62_contigs.fasta", "sul1", "sulfonamides", "Vibrio",
  "62_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "62_contigs.fasta", "RbpA", "macrolides", "Mycobacterium",
  "62_contigs.fasta", "smeD", "multidrugs", "Stenotrophomonas",
  "62_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "62_contigs.fasta", "rpoB2", "macrolides", "Nocardia",
  "62_contigs.fasta", "tet(42)", "tetracyclines", "Micrococcus",
  "63_contigs.fasta", "mexK", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "PER-6", "multidrugs", "Aeromonas",
  "63_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "63_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "PER-2", "multidrugs", "Salmonella",
  "63_contigs.fasta", "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
  "63_contigs.fasta", "RbpA", "macrolides", "Mycobacterium",
  "63_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "PER-6", "multidrugs", "Aeromonas",
  "63_contigs.fasta", "mefC", "macrolides", "Photobacterium",
  "63_contigs.fasta", "mtrA", "multidrugs", "Mycobacterium",
  "63_contigs.fasta", "CpxR", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "MexE", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "APH(3'')-Ib", "aminoglycosides", "Pseudomonas",
  "63_contigs.fasta", "MexD", "multidrugs", "Pseudomonas",
  "63_contigs.fasta", "mtrA", "multidrugs", "Mycobacterium",
  "63_contigs.fasta", "smeF", "multidrugs", "Stenotrophomonas",
  "63_contigs.fasta", "smeE", "multidrugs", "Stenotrophomonas",
  "63_contigs.fasta", "smeD", "multidrugs", "Stenotrophomonas",
  "63_contigs.fasta", "EreD", "macrolides", "Riemerella",
  "63_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "63_contigs.fasta", "KHM-1", "multidrugs", "Citrobacter",
  "63_contigs.fasta", "smeE", "multidrugs", "Stenotrophomonas",
  "63_contigs.fasta", "smeD", "multidrugs", "Stenotrophomonas",
  "63_contigs.fasta", "RbpA", "macrolides", "Mycobacterium",
  "63_contigs.fasta", "tet(42)", "tetracyclines", "Micrococcus",
  "64_contigs.fasta", "mexQ", "multidrugs", "Pseudomonas",
  "64_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "64_contigs.fasta", "tet(42)", "tetracyclines", "Micrococcus",
  "64_contigs.fasta", "mtrA", "multidrugs", "Mycobacterium",
  "64_contigs.fasta", "aadA27", "aminoglycosides", "Acinetobacter",
  "64_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "64_contigs.fasta", "tet(42)", "tetracyclines", "Micrococcus",
  "64_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "64_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "64_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "64_contigs.fasta", "APH(3'')-Ib", "aminoglycosides", "Pseudomonas",
  "65_contigs.fasta", "mexW", "multidrugs", "Pseudomonas",
  "65_contigs.fasta", "KHM-1", "multidrugs", "Citrobacter",
  "65_contigs.fasta", "APH(6)-Id", "aminoglycosides", "Pseudomonas",
  "65_contigs.fasta", "smeE", "multidrugs", "Stenotrophomonas",
  "65_contigs.fasta", "smeD", "multidrugs", "Stenotrophomonas",
  "65_contigs.fasta", "CpxR", "multidrugs", "Pseudomonas",
  "65_contigs.fasta", "tet(42)", "tetracyclines", "Micrococcus",
  "65_contigs.fasta", "AAC(6')-Iz", "aminoglycosides", "Stenotrophomonas",
  "65_contigs.fasta", "KHM-1", "multidrugs", "Citrobacter",
  "65_contigs.fasta", "MexF", "multidrugs", "Pseudomonas",
  "65_contigs.fasta", "mefC", "macrolides", "Photobacterium",
  "65_contigs.fasta", "vanRO", "glycopeptides", "Rhodococcus",
  "65_contigs.fasta", "PER-6", "multidrugs", "Aeromonas",
  "65_contigs.fasta", "smeF", "multidrugs", "Stenotrophomonas",
  "65_contigs.fasta", "smeE", "multidrugs", "Stenotrophomonas",
  "65_contigs.fasta", "smeD", "multidrugs", "Stenotrophomonas",
  "65_contigs.fasta", "APH(3'')-Ib", "aminoglycosides", "Pseudomonas"
)

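# Edges link each column to the column on its right:
# gene -> antibiotic class, and antibiotic class -> genus.
pair1 <- dat[2:3] # genes, Antibioticos
pair2 <- dat[3:4] # Antibioticos, generos
fromto <- c("from", "to")
# Drop repeated pairs so each edge appears once per list.
edge_list1 <- unique(pair1)
edge_list2 <- unique(pair2)
colnames(edge_list1) <- fromto
colnames(edge_list2) <- fromto
edge_list <- rbind(edge_list1, edge_list2)

# Build a network object from the two-column edge list.
ntw <- network(edge_list,
  matrix.type = "edgelist",
  ignore.eval = FALSE
)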

graph <- as_tbl_graph(ntw)
graph
#> # A tbl_graph: 48 nodes and 44 edges
#> #
#> # A directed acyclic simple graph with 5 components
#> #
#> # A tibble: 48 × 2
#>   na    name      
#>   <lgl> <chr>     
#> 1 FALSE AAC(6')-Iz
#> 2 FALSE mexW      
#> 3 FALSE mexK      
#> 4 FALSE catB11    
#> 5 FALSE vanRO     
#> 6 FALSE MexF      
#> # ℹ 42 more rows
#> #
#> # A tibble: 44 × 3
#>    from    to na   
#>   <int> <int> <lgl>
#> 1     1    28 FALSE
#> 2     2    29 FALSE
#> 3     3    29 FALSE
#> # ℹ 41 more rows
ggraph(graph, layout = 'fr') + 
  geom_edge_link() + 
  geom_node_point() +
  theme_void()
#> Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

The plot in my reprex is a representation of a graph object. What's that?

A graph is a representation of relatedness between nodes that are connected by edges. In the source data, the gene, antibiotic and genus columns contain the nodes. An edge is the occurrence of one node with the adjacent node on its right in the same row. Thus, genes are connected to antibiotics and antibiotics are connected to genera. This is a unidirectional (directed) graph because all the connections go one way: genes have a relationship to antibiotics and antibiotics have a relationship to genera. The first thing to do is to determine whether that is a sensible abstract representation of the subject matter in the research setting. Antibiotics affect genus elements in the presence of genes. Leaving aside possibilities for epigenesis, neither antibiotics nor genus elements affect genes. Nor, we assume, does the genus affect the antibiotic in the sense of changing its properties; the process goes the other way around.

There are also bidirectional graphs, in which nodes may have reciprocal relationships, and multidirectional graphs, in which connections run in a mix of directions.
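To make the three kinds of node visible in a plot, one option is to record which column each node came from and map that to colour, with labels on the nodes. The following is only a sketch under assumptions: it uses four rows copied from the data so that it runs on its own, and the names `dat_small`, `nodes`, `graph_small` and the `type` column are chosen here for illustration, not taken from the reprex.

```r
library(tidygraph)
library(ggraph)

# A few rows copied from the data above, so the sketch is self-contained.
dat_small <- data.frame(
  genes        = c("mexW", "smeD", "vanRO", "tet(42)"),
  Antibioticos = c("multidrugs", "multidrugs", "glycopeptides", "tetracyclines"),
  generos      = c("Pseudomonas", "Stenotrophomonas", "Rhodococcus", "Micrococcus")
)

# Same edge definition as the reprex: each column is linked to the one on its right.
edges <- unique(rbind(
  data.frame(from = dat_small$genes,        to = dat_small$Antibioticos),
  data.frame(from = dat_small$Antibioticos, to = dat_small$generos)
))

# Node table recording the column each node came from (`type` is a hypothetical name).
nodes <- unique(data.frame(
  name = c(dat_small$genes, dat_small$Antibioticos, dat_small$generos),
  type = rep(c("gene", "antibiotic class", "genus"), each = nrow(dat_small))
))

# Character from/to values are matched against the `name` column of `nodes`.
graph_small <- tbl_graph(nodes = nodes, edges = edges, directed = TRUE)

p <- ggraph(graph_small, layout = "fr") +
  geom_edge_link(colour = "grey60") +
  geom_node_point(aes(colour = type), size = 4) +
  geom_node_text(aes(label = name), repel = TRUE, size = 3) +
  theme_void()
p
```

The same idea should carry over to the full `dat` by building `nodes` and `edges` from all rows; the Fruchterman-Reingold layout is random, so node positions will differ between runs.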

I said before that this is a graph object, and that's true in two senses. In the first sense, everything in R is an object: single characters, data frames, functions, operators. In the second sense, it is something that has mathematical properties. Among these are:

  • Betweenness
  • Cliques
  • Communities
  • Connectedness
  • Centrality
  • Degree
  • Distance
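Several of the properties listed above can be computed directly once the data are in graph form. The sketch below uses the igraph package, which the reprex does not load, so that choice is an assumption; the edge list is a small subset of gene, antibiotic and genus links copied from the data, and `edges` and `g` are names chosen for illustration.

```r
library(igraph)

# Toy edge list in the same shape as the reprex:
# gene -> antibiotic class, then antibiotic class -> genus.
edges <- data.frame(
  from = c("mexW", "smeD", "vanRO",
           "multidrugs", "multidrugs", "glycopeptides"),
  to   = c("multidrugs", "multidrugs", "glycopeptides",
           "Pseudomonas", "Stenotrophomonas", "Rhodococcus")
)

g <- graph_from_data_frame(edges, directed = TRUE)

# Degree: number of edges touching each node (incoming plus outgoing).
deg <- degree(g, mode = "all")
deg[["multidrugs"]]   # 4: two genes point in, two genera are pointed to

# Betweenness: how many shortest directed paths pass through each node.
btw <- betweenness(g, directed = TRUE)
btw[["multidrugs"]]   # 4: every gene-to-genus path in this cluster goes through it

# Connectedness: number of weakly connected components.
n_comp <- components(g, mode = "weak")$no
n_comp                # 2: the multidrug cluster and the glycopeptide cluster

# Distance: number of edges on the shortest directed path between two nodes.
d <- distances(g, v = "mexW", to = "Pseudomonas", mode = "out")[1, 1]
d                     # 2: mexW -> multidrugs -> Pseudomonas

# The graph is directed and has no cycles.
is_dag(g)
```

A `tbl_graph` such as the one in the reprex is also an igraph object, so these functions can probably be applied to `graph` directly, but that has not been checked here.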

Visualization may, or may not, be an effective way of examining these properties. Some graphs are so dense that it is impractical to see much. This paper discusses some methods by which dense networks may be treated to make a visual representation feasible. If the data in the reprex is the complete data, this should not be necessary for your purposes. Still, you will need to consider carefully, and articulate, the properties of the data that the graph is intended to elucidate.
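If the property of interest is co-occurrence in the sense of the original question, meaning which genes turn up in the same samples, a different graph from the one in the reprex may be a better fit. One possible sketch, not the method used above, counts the samples that each pair of genes shares and uses that count as an edge weight. It assumes the igraph package; `dat_small`, `incidence`, `cooc` and `g_cooc` are names chosen here, and the rows are a subset of the `IDs` and `genes` columns.

```r
library(igraph)

# Subset of rows from the IDs and genes columns of the data above.
dat_small <- data.frame(
  IDs   = c("61_contigs.fasta", "61_contigs.fasta", "61_contigs.fasta",
            "62_contigs.fasta", "62_contigs.fasta",
            "64_contigs.fasta", "64_contigs.fasta"),
  genes = c("mexW", "mexK", "vanRO",
            "mexW", "vanRO",
            "vanRO", "aadA27")
)

# Presence/absence matrix: rows are samples, columns are genes.
tab <- table(dat_small$IDs, dat_small$genes)
incidence <- matrix(as.numeric(tab > 0),
  nrow = nrow(tab),
  dimnames = list(rownames(tab), colnames(tab))
)

# t(X) %*% X gives, for each pair of genes, the number of samples containing both.
cooc <- crossprod(incidence)
diag(cooc) <- 0 # a gene co-occurring with itself is not an edge

# Undirected weighted graph: edge weight = number of shared samples.
g_cooc <- graph_from_adjacency_matrix(cooc, mode = "undirected", weighted = TRUE)

# Edge list with weights, for inspection.
as_data_frame(g_cooc, what = "edges")
```

Applied to the full data, the same steps would use `dat$IDs` and `dat$genes` (or `dat$generos` for genus co-occurrence). With only five samples the counts stay small, so any pattern is best treated as descriptive.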
