I have data on the number of unique visits for each site. These counts overlap because people visit multiple websites: a user who visited site1 could also have visited site3 and site4. So the true unique visits for site1 might be 20M instead of 23M. To remove the overlaps, we obtained the percentage overlap between each pair of websites. Based on this, how can I calculate the actual unique visits for each website?
tibble::tribble(
  ~X1, ~unique_visits, ~site1, ~site2, ~site3, ~site4, ~site5, ~site6, ~site7,
  "site1", 23873274, 100, 96.1, 95.6, 95, 91.6, 96.6, 92.9,
  "site2", 4249486, 54.1, 100, 46.2, 46.5, 56.9, 77.5, 43.2,
  "site3", 887786, 47.2, 40.5, 100, 41.5, 38.5, 55.3, 85,
  "site4", 3727497, 41.7, 36.3, 36.9, 100, 38.6, 56.3, 51.5,
  "site5", 1833995, 23.7, 26.2, 20.2, 22.8, 100, 28, 87.5,
  "site6", 1617476, 5.1, 7.3, 5.9, 6.8, 5.7, 100, 2.9,
  "site7", 760829, 0.2, 0.2, 0.4, 0.3, 0.8, 0.1, 100
)