Is it a bad idea to stack multiple ensemble models like random forest and GBM even if I got good results?
Your question implies you think it might be. Why is that?
A colleague told me that having more than one ensemble model can cause bias. That got me a bit confused, so I thought I should check.
You may be referring to this? Bias–variance tradeoff - Wikipedia
You certainly don't want to overfit, but you should test for that: if your stack underperforms on a holdout set, then you wouldn't consider it good (I hope). And if it performs well on the holdout, then it's good.
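As a rough illustration of that check, here's a minimal sketch of stacking a random forest and a GBM with scikit-learn and comparing training vs. holdout R². The dataset, hyperparameters, and Ridge meta-learner are all made up for this example, not anything from the thread:

```python
# Sketch: stack a random forest and a GBM, then compare training
# vs. holdout R^2 to check for overfitting.
# Data and hyperparameters are illustrative, not a recommendation.
from sklearn.datasets import make_regression
from sklearn.ensemble import (GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=20, noise=10.0,
                       random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=200, random_state=0)),
        ("gbm", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=Ridge(),  # simple meta-learner combining the two
    cv=5,  # meta-learner is trained on out-of-fold predictions,
           # which limits leakage from the base models
)
stack.fit(X_train, y_train)

# A large gap between these two numbers is the overfitting signal
print(f"training R^2: {stack.score(X_train, y_train):.3f}")
print(f"holdout R^2:  {stack.score(X_test, y_test):.3f}")
```

Note that `StackingRegressor` already guards against one source of optimism: the meta-learner sees cross-validated predictions from the base models, not in-sample ones.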
Say the stack had a training R² of 0.98 and a test R² of 0.9. Is that considered underperformance, or does the test R² have to be much lower than the training R² before it counts as underperformance?