# Clustering algorithm: optimizing spatial points pairing

Hi,

I am currently facing a clustering procedure issue.
I have been digging into different kind of clustering algorithms but can't find the appropriate one to solve my problem.

To simplify the situation, here is what I am trying to do: let's say I have points defined with GPS coordinates.

• I have n points of type A
• and m points of type B, with m>n.

I want to build pairs of A and B, so that each type-A point would be paired with a close type-B point, additionnal B points would be left alone. I want to apply an algorithm that would optimize the formation of pairs so that mean A-B distances would be minimized.

Any idea of a function/package that would work?

Than you

Use a k nearest neighbor function, knn in one of the many R packages with this functionality.

While you can surely use clustering algorithms to maybe speed up the process, what you have is an optimization problem.
I don't think any out-of-the-box clustering function or package will get you what you want.
can B be paired multiple times or just once?

I would try to formulate this as an optimization problem and then use one of the optimizer packages.

1 Like

Yes it is typically an optimization problem, but I have no idea what I should use to solve it.

A and B can't be paired several times.

Once you've settled on your optimization method, check out CRAN Task View on Optimization.

I think you are jumping a couple of steps by wondering what package or solver to use.
First, figure out the formulation, then method, then package & solver.

What I see is a possible IP formulation

Minimize:
\sum d_{ij} x_{ij}
where d_{ij} are the distances, x_{ij} are binary decision variables (1 if you use that pairing), i indexes the A's, and j indexes the B's.

Constraints:
\sum_{j} x_{ij} = 1 \; \; \forall i so that A is paired but only once
\sum_{i} x_{ij} \le 1 \; \; \forall j so that B is only paired 0 or 1 time
assuming N < M where N, M are the index size of i,j respectively.

I did this very quickly so take this with a huge grain of salt.
I hope that gives you an idea on how to start the problem.