Stacking RF and GBM

Is it a bad idea to stack multiple ensemble models like random forest and GBM even if I got good results?

Your question implies you think it might be. Why is that?

A colleague told me that having more than one ensemble model can cause bias.
That got me a bit confused, so I thought I should check.

You may be referring to this: Bias–variance tradeoff (Wikipedia)

For sure, you don't want to overfit, but you should test for that. If your stack underperforms on holdout data, then you wouldn't consider it good (I hope). But if it's good, then it's good.
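A rough sketch of that holdout check, assuming scikit-learn is available. The synthetic data, the 75/25 split, the RidgeCV meta-learner, and all hyperparameters are placeholders chosen for illustration, not anything from your setup:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

# Synthetic regression data (assumption: stands in for your real dataset).
X, y = make_regression(n_samples=400, n_features=10, noise=10.0, random_state=0)

# Keep a holdout set that the stack never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Stack a random forest and a GBM; the meta-learner is trained on
# out-of-fold predictions (cv=5), which limits leakage between levels.
stack = StackingRegressor(
    estimators=[
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
        ("gbm", GradientBoostingRegressor(random_state=0)),
    ],
    final_estimator=RidgeCV(),
    cv=5,
)
stack.fit(X_train, y_train)

# score() returns R-squared for regressors.
train_r2 = stack.score(X_train, y_train)
test_r2 = stack.score(X_test, y_test)
print(f"train R^2 = {train_r2:.3f}, holdout R^2 = {test_r2:.3f}")
```

It is also worth fitting the random forest and the GBM on their own and scoring them on the same holdout set, so you can see whether the stack actually beats its best single member.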

Let's say the stack had an R-squared of 0.98 on training and 0.9 on testing. Is that considered underperformance? Or does the testing R-squared have to be significantly lower than the training one to count as underperformance?

Thank you