For Q1: you can compute them, but AFAIK the bootstrap is the only resampling method that has been theoretically shown to have the statistical properties that make those computations valid.
Q2: You can "pick the winner" using the resampling statistics; that is what I typically do. There is an issue of optimization bias that may lead to some overfitting. IMO it is a real but negligible bias in most data sets. Your mileage may vary.
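A minimal sketch of the "pick the winner" idea, assuming you already have a resampled performance metric per candidate model. The model names and accuracy values below are hypothetical, purely for illustration:

```python
# Hypothetical resampled metric values (e.g., accuracy) for three
# candidate models, one value per resample. All names and numbers
# are made up for illustration.
resampled_acc = {
    "glmnet": [0.81, 0.79, 0.83, 0.80, 0.82],
    "rf":     [0.85, 0.84, 0.86, 0.83, 0.85],
    "svm":    [0.82, 0.83, 0.81, 0.84, 0.82],
}

# "Pick the winner": choose the model with the best mean resampled metric.
means = {m: sum(v) / len(v) for m, v in resampled_acc.items()}
winner = max(means, key=means.get)
print(winner, round(means[winner], 3))  # → rf 0.846
```

The optimization bias mentioned above comes from this very selection step: the winner's resampled estimate is slightly optimistic because it was chosen for being the best.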
Q3: If you are trying to get percentile intervals for statistics, you'll need a very large number of resamples (since we are trying to estimate the area of the distribution's tail). For model tuning, I usually use 25-50, depending on the characteristics of the data set. In that application, we are only trying to estimate the distribution's mean with reasonable uncertainty.
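A sketch of why the two uses need such different resample counts, using only the standard library. The data here are simulated, and the specific counts (50 vs. 2000) are illustrative assumptions:

```python
import random
import statistics

random.seed(1)

# Hypothetical sample of a performance metric; values are simulated.
data = [random.gauss(0.80, 0.05) for _ in range(30)]

def boot_means(x, b):
    """Means of b bootstrap resamples (sampling with replacement)."""
    n = len(x)
    return [statistics.mean(random.choices(x, k=n)) for _ in range(b)]

# Tuning use case: ~25-50 resamples are enough to estimate the mean
# of the metric's distribution with reasonable uncertainty.
est = statistics.mean(boot_means(data, 50))

# Percentile intervals estimate tail areas, so they need many more
# resamples (here 2000) for stable 2.5%/97.5% quantiles.
big = sorted(boot_means(data, 2000))
lower, upper = big[int(0.025 * 2000)], big[int(0.975 * 2000)]
print(round(est, 3), round(lower, 3), round(upper, 3))
```

The mean estimate stabilizes quickly with b, while the tail quantiles are driven by the few most extreme resample values, which is why they demand far more resamples.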
Q4: Never. Leave the test set alone until you really need it at the end. Do not train or optimize with it.
Q5: For intervals, we do compute it; we can discard it if we want, and certain types of intervals need it. We wouldn't re-run the resamples for different types of intervals just because that option changed.
TBH, there are a lot of resources out there on model tuning. Here are some online ones that may help: FES, chapter 3, and TMwR. There are many others.