I am currently facing a clustering procedure issue.
I have been digging into different kind of clustering algorithms but can't find the appropriate one to solve my problem.
To simplify the situation, here is what I am trying to do: let's say I have points defined with GPS coordinates.
I have n points of type A
and m points of type B, with m>n.
I want to build pairs of A and B, so that each type-A point would be paired with a close type-B point, additionnal B points would be left alone. I want to apply an algorithm that would optimize the formation of pairs so that mean A-B distances would be minimized.
While you can surely use clustering algorithms to maybe speed up the process, what you have is an optimization problem.
I don't think any out-of-the-box clustering function or package will get you what you want.
can B be paired multiple times or just once?
I would try to formulate this as an optimization problem and then use one of the optimizer packages.
I think you are jumping a couple of steps by wondering what package or solver to use.
First, figure out the formulation, then method, then package & solver.
What I see is a possible IP formulation
Minimize: \sum d_{ij} x_{ij}
where d_{ij} are the distances, x_{ij} are binary decision variables (1 if you use that pairing), i indexes the A's, and j indexes the B's.
Constraints: \sum_{j} x_{ij} = 1 \; \; \forall i so that A is paired but only once \sum_{i} x_{ij} \le 1 \; \; \forall j so that B is only paired 0 or 1 time
assuming N < M where N, M are the index size of i,j respectively.
I did this very quickly so take this with a huge grain of salt.
I hope that gives you an idea on how to start the problem.