Branching strategy is not a remedy for instability
4 years, 5 branching strategies. First we all worked in one branch. Then we became hyper-branched. Then we consolidated into a couple of branches. Then I switched companies. First we were all in one branch. Now we're splitting into branches.
This has all been in Perforce, since it is the de facto SCM system for the games industry. But if we were using a DVCS we'd probably have the same issues. The problem has not been merging changes, so DVCS is not the answer here (though I love DVCS).
I've been through this at two companies and have read about the experiences and strategies of other companies. I've found one constant across the differences in companies and strategies:
Branching strategy changes are in response to the instability that follows fast growth.
You cannot simply take a working model of how some project manages its branches, apply it to your studio, and be done with it. In fact, you cannot seek out or design an "ideal" branching strategy for your studio that is going to fix your instability problems. Why?
Branching is not designed to fix code instability.
Branching is a way to isolate changes and manage a release. It allows a much more flexible and intuitive use of version control by both developers and the studio, and allows sane release management. The DVCS branching model has proven itself, and now we're stuck trying to figure out how to get something similar in SCM systems like Perforce. But this is largely orthogonal to the problem of code instability.
You can keep unstable code in a branch, but it does nothing to fix the instability. You can require developers to run smoke tests, but they're still going to integrate broken stuff, and they even get less 'free QA' while in their branch. You can put everyone on their own branch, or group teams on branches, or use whatever strategy you want to come up with, and I don't think any of them is guaranteed to work for your studio. Furthermore, studios change people and size, so what works one year may not work the next.
Yet we put so much effort into branching strategy as a way to solve these problems. We design a system for how the branches are laid out. We make some tools for creating and managing branches. We focus communication and training on how people are supposed to work. Yet branching is not and should not be the way we actually fix the problems that caused the instability that caused us to change our strategy.
How do I know this? Because with every change in strategy, there is a much less prominent component at work.
Infrastructure and automated testing are coincidentally improved when we change branching strategies.
I think everyone considers these two things important for improving code stability. It is just that I think they're almost totally responsible. I think that if you were to trace the successes of people's branching experiments, they'd be completely dependent upon when their automated testing and infrastructure (like continuous integration and better messaging) turned a corner and became robust. So the reason Strategy D worked is that the improvements to testing and infrastructure made from A to B, B to C, and C to D have accumulated to the point where you have far fewer instability problems.
So what's my beef with branching, or more specifically, changing strategies?
I don't have any. I think there are, definitely, better and worse ways to do things. My problem is when we focus on branching strategies as the most important part of the instability solution. My problem is that we document, educate, and build in order to support branching. We talk about "how we are going to be working in branches," rather than "how we are going to build testable systems and get legacy code under test." We put our resources behind developing tools and fixing the fallout of branching, instead of making a focused education and cleanup effort towards getting things into a more testable state (which often means the testing infrastructure as much as the application code).
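To make "getting legacy code under test" a little more concrete, here is a minimal sketch of a characterization test: a test that pins down what existing code does today so it can be changed safely later. The function `legacy_damage` and its formula are made up for illustration; in a real codebase you would import the untested routine rather than define it next to the test.

```python
import unittest


def legacy_damage(base, armor, crit):
    """Hypothetical stand-in for an untested legacy routine."""
    dmg = base - armor // 2
    if crit:
        dmg *= 2
    return max(dmg, 0)


class LegacyDamageCharacterization(unittest.TestCase):
    """Records what the code does today, not what it 'should' do."""

    def test_pins_current_behaviour(self):
        # Each expected value was obtained by running the existing code once
        # and writing down the result.
        cases = [
            ((10, 4, False), 8),   # plain hit
            ((10, 4, True), 16),   # critical hit doubles the damage
            ((1, 10, False), 0),   # damage is clamped at zero
        ]
        for args, expected in cases:
            with self.subTest(args=args):
                self.assertEqual(legacy_damage(*args), expected)
```

Saved as a module, this runs with `python -m unittest`. The point is not this particular test but that it can run on every integration, in whatever branch the change happens to live.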
Imagine if every time you heard 'branching' it was replaced with 'testing/infrastructure.' My guess is you've never heard managers talk about testing and infrastructure that much. Unfortunately you are unlikely to, because branching is an easy problem to think about. It is a chess board. No real work, no personalities, no real-world spikes. Just figuring out how to best move around your pieces in a theoretical way.
When you're creating infrastructure, it isn't a chess board. It is a world of incremental changes, no glamour, making do with the bare minimum, all on mission-critical systems that have countless tentacles. It isn't the world of a plumber, it is the world of a septic tank diver.
But the real reason you're not likely to see branching effort replaced with testing and infrastructure effort is that doing so can require a huge cultural and educational shift at a studio. Good luck teaching dozens of really smart developers who have decades of experience on successful projects that their code isn't sufficient anymore, that you want to use your newfangled techniques that have actually proven successful in the rest of the development world. Those conversations aren't why people become managers.
But mark my words, if you have a studio where testing is a fact of life, where it is not just an ideal but a requirement, where your infrastructure and developer systems are well understood, documented, extensible, and reliable, you are going to see very little code instability, regardless of what your branching strategy looks like.
If you're thinking about changing how you branch, consider instead what would happen if all of that effort were spent on turning your codebase into something testable, and your infrastructure and systems into something widely usable and reliable. If you want to achieve stability, you are going to have to do it anyway. The question is, do you do it as a side effect and keep taking the painful medicine of changing branching strategies to keep getting the side effect, or do you do the much more difficult thing in the short term and approach your instability problem head-on, by building testing and infrastructure and creating a culture around them?