When your team is too big
How we got there
Before splitting, our team had 3 engineering managers and 19 engineers, excluding cross-functional roles like PM and design. The team had grown organically to support the needs of our company’s new app development platform. The three of us engineering managers worked in unison to plan and execute a single roadmap, while each of us supported a few engineers along the reporting lines.
We executed projects in smaller squads, with each manager supporting one or two such squads. We were executing 3–5 concurrent projects at any given time, with each project assigned to one of the engineering managers and run by a feature lead. Each project had its own rituals (standup, planning, and retro), but additionally we had team-wide bi-weekly standups and health monitors in order to stay in touch.
The projects came in different shapes and forms, with no clear way to categorize them into independent roadmaps. Our team maintained more than a dozen micro-services across a few mono-repos, each with its own unique development history and technical complexities. To summarise the situation, we had a fair bit of context and cognitive load to get our heads around.
Enter team health monitor
Although management believed this team would ultimately be broken into multiple teams, there was no sense of urgency to do it sooner rather than later while things seemed to be working and projects were being delivered.
My company doesn’t have any rules around team size, like the ‘two-pizza team’ rule 🍕, so it’s up to the teams to find the right structure and advocate for change if the current structure no longer works for them.
What made us confident there was a problem worth solving was the health monitor discussions. Team cohesiveness was a recurring theme, manifesting in different forms based on the context of that week. Here are some of the discussion points:
I feel very connected to my immediate stream, but for the rest of the team I can barely remember who’s working on what.
Knowledge seems very spread out, so sometimes it takes a while to figure out how to do things.
Everyone is responsible for everything / nothing. Responsibilities are temporary, no-one is taking responsibility to support own code.
This is what I reported after running the last health monitor:
The “big team” problem comes up over and over again, which dominated this discussion as well. I see no point in repeating health monitor until we have resolved it.
It triggered a set of discussions that led to a team split within a few weeks. This seemed like a very interesting topic to me, so very quickly I found myself in the center of it, pushing to find a solution. Everyone involved was supportive of the idea. The only things left to figure out were “when” and “how”.
The stepping stone
A few months earlier, we were in a state where a manager would pick up a project and assemble a team based on who was available and had the right skill set. Aware that a team split was possible, the three of us had spent a few months aligning projects with reporting lines, so that people reporting to the same manager would work together on the same project. A live “project allocation” document was very useful in planning our way towards the desired state.
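To make the idea of aligning projects with reporting lines concrete, here is a minimal sketch of the kind of check a “project allocation” document makes possible. Everything in it is an assumption for illustration: the `Allocation` record, the `alignment` and `misaligned` helpers, and the sample names are hypothetical and do not describe our actual document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """One row of a hypothetical project allocation sheet."""
    engineer: str
    reports_to: str      # manager on the engineer's reporting line
    project: str
    project_owner: str   # manager the project is assigned to


def alignment(allocations: list[Allocation]) -> float:
    """Fraction of rows where the engineer's manager also owns the project."""
    if not allocations:
        return 0.0
    aligned = sum(1 for a in allocations if a.reports_to == a.project_owner)
    return aligned / len(allocations)


def misaligned(allocations: list[Allocation]) -> list[Allocation]:
    """Rows that still cross reporting lines and would be affected by a split."""
    return [a for a in allocations if a.reports_to != a.project_owner]


# Hypothetical snapshot, not real data.
snapshot = [
    Allocation("eng-1", "manager-a", "project-x", "manager-a"),
    Allocation("eng-2", "manager-a", "project-y", "manager-b"),
    Allocation("eng-3", "manager-b", "project-y", "manager-b"),
    Allocation("eng-4", "manager-c", "project-z", "manager-c"),
]

print(f"alignment: {alignment(snapshot):.0%}")
print("still crossing reporting lines:", [a.engineer for a in misaligned(snapshot)])
```

Tracking a number like this from one planning cycle to the next is one way to see how close the team is to a state where a split changes nothing in day-to-day work.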
We were very close to the desired state when the team split discussions became serious. This meant we already had fairly balanced teams, and the split would have no immediate impact on the day-to-day work of our engineers. We only needed to figure out the long-term mission and areas of responsibility for each of these teams.
I found the Team Topologies book very useful at this time. It didn’t give me a breakthrough idea, but it provided a good language to talk about the problem. Understanding the fundamental topologies and the different interaction models was not only useful here, but also helped me identify possible miscommunication issues in other situations.
We applied an inverse Conway maneuver to form teams around our ideal architecture. However, we didn’t put heavy weight on the “wish list” part of our architecture, i.e. proposals we were not so sure about, or that wouldn’t be implemented in the next year or two even if we were.
We chose software boundaries to match each team’s cognitive load. We had grown so much that no-one could effectively understand all the different flows in our platform. We split those flows along with the services that backed them, in order to increase the team-scoped flow and reduce the (unnecessary) communication between teams. This split used the natural seams in our software, so that high-scale low-latency APIs were mostly separated from high-consistency transactional APIs.
What I learned along the way
I can recall a few useful lessons I learned along the way that can be applied in similar situations:
- Be flexible and put progress over perfection. You can go around in circles trying to figure out the best split, but by the time you execute it the world has changed. You can get better buy-in from key stakeholders if you plan to be flexible and iterate on your plan based on feedback. A good sign of being flexible is keeping a few decisions open, pending further validation.
- When applying the inverse Conway maneuver, involve senior engineers and architects, but don’t overindex on an ideal architectural state that is not yet in sight. Doing so might diminish the immediate impact of the split in reducing cognitive load.
- You don’t need to solve everything at once. Initially we didn’t plan to split the team’s roadmap, and the new teams worked off a single priority list determined by a single PM, so that we could honour our previous commitments. We are now starting to see a more specific mission and an independent roadmap for each team.
- The incremental approach to align projects with people worked very well for us. At first it required some effort to apply this “artificial” constraint and fight the obvious choices for upcoming projects, but soon we started to see the immediate benefits and it became easy to apply.
- Going through your current year’s roadmap and the next possible projects will provide enough confidence that none of these teams will run out of work soon. Don’t overindex on a balanced roadmap; instead, look for themes that will continue in the future. Goals and objectives can change very quickly.
- Timing turned out to be very important in the end. Our changes helped shape a bigger organisational change happening around us at the same time. Our teams came out of that bigger re-org basically unchanged, in their best possible state to execute their plans.
- It pays to own a problem. I was involved in conversations around organisational changes outside and above my level, which happened only because I had more context. I had the opportunity to ask questions and provide feedback when those plans were still in draft. Of course, my peers and managers were kind and supportive in giving me this visibility.
Thanks for reading all the way to here. Let me know what you think in the comments. Here is a cat photo as a reward. It’s actually a rare photo of Schrödinger’s cat.