# Determining valid directed graph selections

Hello,

There are two things I am hoping to get right:

1. I have the data below and want to drop invalid `selection` columns. A selection column is invalid if it contains both "a -> b" and "b -> a", i.e. if the source of one selected path is the target of another and vice versa. An edge between two nodes can only point in one direction. See the example data below:
```r
df1 <- data.frame(
  stringsAsFactors = FALSE,
  source = c("a", "b", "c", "d", "e", "c", "a", "e", "a"),
  target = c("b", "c", "d", "e", "a", "b", "d", "d", "e"),
  selection1 = c(1, 0, 1, 0, 0, 1, 1, 0, 0),
  selection2 = c(1, 1, 0, 0, 1, 1, 0, 0, 1),
  selection3 = c(1, 1, 1, 1, 1, 0, 0, 0, 0)
)
```
2. I also have a second problem, which I believe is more complicated, and I am hoping there is a package for it. I want to exclude selections that contain a contradictory self-reference, i.e. a -> b, b -> c, c -> a. This combination is essentially a triangle whose start and end point are the same node, and it should not occur in a valid selection. So a -> b together with c -> a would be fine, as would b -> c with either path 1 or path 3, but not the combination of all three. These toy problems are relatively simple, but I would want the approach to work for more complicated data frames too.
```r
df2 <- data.frame(
  stringsAsFactors = FALSE,
  source = c("a", "b", "c", "a", "b"),
  target = c("b", "c", "a", "d", "d"),
  selection1 = c(1, 1, 1, 1, 0),
  selection2 = c(1, 0, 1, 0, 0),
  selection3 = c(1, 0, 0, 0, 1)
)
```

Any help would be massively appreciated!

To start off, I would do something like:

```r
library(dplyr)

df1 <- data.frame(
  stringsAsFactors = FALSE,
  source = c("a", "b", "c", "d", "e", "c", "a", "e", "a"),
  target = c("b", "c", "d", "e", "a", "b", "d", "d", "e"),
  selection1 = c(1, 0, 1, 0, 0, 1, 1, 0, 0),
  selection2 = c(1, 1, 0, 0, 1, 1, 0, 0, 1),
  selection3 = c(1, 1, 1, 1, 1, 0, 0, 0, 0)
)

# Group rows by their selection pattern and keep only groups with more
# than one row; single-row groups are replaced by an empty data frame.
(identify_possibles <- group_by(
  df1,
  selection1,
  selection2,
  selection3
) %>% group_modify(
  ~ if (nrow(.x) > 1) {
    data.frame(.x)
  } else {
    data.frame()
  }
))
```

Some selections are unique, so they have no source-target combinations to assess. It is best to strip them out and focus on those that do. The code above produces a new (grouped) table of the selections worth investigating further.

Then I would try to integrate my solution here:

Conceptually it seems the same; it's just that first name is source and last name is target.
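For the first question, the reverse-pair rule could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: `has_reverse_pair()` is a hypothetical helper, not something from a package, and it assumes every selection column starts with `selection` and uses 1 to mark a selected row.

```r
df1 <- data.frame(
  stringsAsFactors = FALSE,
  source = c("a", "b", "c", "d", "e", "c", "a", "e", "a"),
  target = c("b", "c", "d", "e", "a", "b", "d", "d", "e"),
  selection1 = c(1, 0, 1, 0, 0, 1, 1, 0, 0),
  selection2 = c(1, 1, 0, 0, 1, 1, 0, 0, 1),
  selection3 = c(1, 1, 1, 1, 1, 0, 0, 0, 0)
)

# Hypothetical helper: TRUE if the selected edges contain both x -> y and y -> x.
# Note that a self-loop (x -> x) is also flagged, since it is its own reverse.
has_reverse_pair <- function(source, target, selected) {
  src <- source[selected == 1]
  tgt <- target[selected == 1]
  forward <- paste(src, tgt, sep = "->")
  backward <- paste(tgt, src, sep = "->")
  any(forward %in% backward)
}

# Assumption: selection columns are identified by their name prefix.
selection_cols <- grep("^selection", names(df1), value = TRUE)

invalid <- vapply(
  selection_cols,
  function(col) has_reverse_pair(df1$source, df1$target, df1[[col]]),
  logical(1)
)

# Keep source, target and only the selection columns that passed the check.
valid_df1 <- df1[, c("source", "target", selection_cols[!invalid])]
```

Here `selection2` is flagged because it contains both b -> c and c -> b (and also e -> a and a -> e). `selection3` passes this particular check even though it forms a loop, which is what the second question is about.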

Good luck.


Hello,

I will have a look, thanks. I have realised that what I am looking for relates to graph theory: the properties are called `connected` and `acyclic`. The `semPLS` package performs these checks as functions on the model that is passed in, but I want to evaluate the combinations before they are inserted into the model object. I intend to run thousands of combinations, so I would like to reduce the set before pushing it to the model. I tried to extract those functions from the package, but they are not straightforward to extract and set up.
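A pre-filter for the `acyclic` part could be sketched in base R without pulling anything out of `semPLS`. The sketch below is not how `semPLS` does it; `is_acyclic()` is a hypothetical helper that repeatedly removes edges whose source node has no incoming edge, and reports a cycle if any edges are left. It assumes selection columns start with `selection` and use 1 for a selected row, and it does not cover the `connected` check. If a package dependency is acceptable, `igraph::is_dag()` on a graph built with `igraph::graph_from_data_frame()` should give the same answer.

```r
df2 <- data.frame(
  stringsAsFactors = FALSE,
  source = c("a", "b", "c", "a", "b"),
  target = c("b", "c", "a", "d", "d"),
  selection1 = c(1, 1, 1, 1, 0),
  selection2 = c(1, 0, 1, 0, 0),
  selection3 = c(1, 0, 0, 0, 1)
)

# Hypothetical helper: TRUE if the selected edges form no directed cycle.
is_acyclic <- function(source, target, selected) {
  src <- source[selected == 1]
  tgt <- target[selected == 1]
  remaining <- rep(TRUE, length(src))
  repeat {
    # An edge whose source node has no remaining incoming edge cannot be
    # part of a cycle, so it can be removed safely.
    has_incoming <- src %in% tgt[remaining]
    removable <- remaining & !has_incoming
    if (!any(removable)) break
    remaining[removable] <- FALSE
  }
  # Any edge that could never be removed lies on, or downstream of, a cycle.
  !any(remaining)
}

# Assumption: selection columns are identified by their name prefix.
selection_cols <- grep("^selection", names(df2), value = TRUE)

acyclic <- vapply(
  selection_cols,
  function(col) is_acyclic(df2$source, df2$target, df2[[col]]),
  logical(1)
)

# Selections that are safe to pass on to the model step.
keep <- selection_cols[acyclic]
```

On the toy data, `selection1` is rejected because it contains the triangle a -> b, b -> c, c -> a, while `selection2` and `selection3` are kept. Each pass is vectorised, so running this over thousands of candidate selections should be cheap compared with fitting the model, though that has not been measured here.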