Questions about "which algorithm" to use are extremely problem-specific and very difficult to answer from a distance. Things to think about:
- How much data do you have? How many features?
- What is the balance of classes?
- What is the cost of Type I vs. Type II errors?
And that's really just the beginning.
If by "2000+" you mean there are 2,000 documents in your corpus, the main thing that jumps out at me is that this is probably too little data for a neural network. It's difficult to say for sure without knowing exactly how you're building the model, but, for example, many simple bag-of-words models have well over 2,000 features. With powerful non-linear estimators like deep neural nets (or even random forests), it's very easy to overfit when you have many features and few examples.
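To make the "more features than documents" point concrete, here's a quick sketch with a synthetic stand-in corpus (the document counts and vocabulary size are made-up illustrative numbers, not from your data). A plain bag-of-words representation gets one feature per distinct token, so the feature count tracks the vocabulary size, not the corpus size:

```python
import random

random.seed(0)
# Hypothetical corpus: 2,000 "documents" of 50 tokens each,
# drawn from a 5,000-word vocabulary (purely illustrative numbers)
vocab = [f"word{i}" for i in range(5000)]
docs = [" ".join(random.choices(vocab, k=50)) for _ in range(2000)]

# Bag of words: one feature per distinct token seen anywhere in the corpus
features = {tok for doc in docs for tok in doc.split()}
print(f"{len(docs)} documents, {len(features)} features")
```

Here the feature count lands near the full 5,000-word vocabulary, i.e. more than twice the number of documents. In that regime a flexible model can effectively memorize the training set, which is why cross-validated performance (not training accuracy) is the number to watch.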