Modern Software Engineering Collaboration Tools
Weaving Technology and Culture to Write Correct Software Fast
Software development is the weaving of technical and cultural systems to deliver a product. Writing code is the easy part – the complexity comes with customers and collaboration: requirements, testing, deploying, task tracking, and conveying updates. Although tangential to the “real work” (read as: product code) of making a product, these systems are just as critical a part of a product’s success, with drastic effects on its pace and odds of succeeding. As anyone who’s worked on a moderately complicated software project can tell you, fighting a bad system can be like pulling teeth. You’d better be ready for a coffee break after rm -rf’ing the output directory just to get it to rebuild correctly! Or the all-too-common hour-plus continuous integration run, often resulting in the dreaded “master failed to build” as changes are merged before being fully tested. However, neither of these is solely the fault of the technical system – cultural norms around technical expertise (knowing the tools) and process controls (controlling how changes get into the shared project) are just as important to writing highly reliable software.
Investing in good systems is the difference between a highly effective and an incredibly frustrating software development experience. A common trope in software development is “Adding manpower to a late software project makes it later”, coined in the venerable The Mythical Man Month by Fred Brooks. This suggests that the additional communication overhead from more contributors overwhelms their marginal utility, resulting in an overall decrease in pace, contrary to conventional expectations. As Brooks claims, this is a good “zeroth-order approximation” with many caveats and highly context-dependent results; however, it makes a useful analogy for the second-order effects of scale: adding complexity to a project (in the form of new tools, components, or just more people) will result in slower progress unless systems are put in place to manage it.
Good systems include both effective technical and social tools. In the technical domain, tools like version control, automated testing, automated deployment, and static analysis are all force multipliers that help ensure code is correct, consistent, and solving the problems it sets out to solve. However, the best tools are useless without the social structure to ensure they’re used consistently and effectively. Cultural standards like code review, code integration, issue tracking, and status updates (as with the often misused stand-up1) are all important prerequisites for an effective team. In this context, building the correct technical system for an existing (or desired) social system is much more important than building a technically impressive one.
With this context, let’s take a look at the current set of technical tools available:
Build Systems
Generally, these are tools for taking some input (source code) and transforming it into something that can be deployed. Depending on context, this can be many things: Docker containers, DEB/RPM packages, language-specific packages (e.g. Python .whl or JavaScript npm packages), apps (Android .apk or iOS .ipa packages), OS images, or even source code tarballs.
I’d separate these into three classes:
Special-purpose systems, often tied to a specific language like Maven (Java), webpack and esbuild (JavaScript), Cargo (Rust), or go build (Go). In most cases, these include language-specific package management as well.
Legacy generic Directed Acyclic Graph (DAG) based build systems such as Make, SCons, and CMake. These all provide an “analysis phase” building a language-independent set of targets (nodes) with dependencies (edges), along with a “build phase” where commands are run to satisfy each target in order.
Modern DAG build systems like Bazel, Buck, and Pants. Compared to legacy systems, these add dependency management, analysis/build caching, build environment isolation, and remote execution. They are designed for large monorepos, which brings both benefits (a focus on scalability) and drawbacks (a large initial setup cost).
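The core idea shared by all of these DAG-based systems fits in a few lines. Here’s a minimal sketch – a hypothetical build graph and runner, not any real tool’s API – showing an “analysis phase” (the graph itself) followed by a “build phase” that runs each target’s command only after its dependencies:

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# Hypothetical build graph: target -> (dependencies, build action).
# In a Makefile this would be "app: lib.o main.o" plus recipe lines.
GRAPH = {
    "lib.o":  ((), lambda: print("compile lib.c -> lib.o")),
    "main.o": ((), lambda: print("compile main.c -> main.o")),
    "app":    (("lib.o", "main.o"), lambda: print("link lib.o main.o -> app")),
}

def build(graph):
    """Run every target's action in dependency order (the 'build phase');
    the 'analysis phase' already produced the graph passed in."""
    ts = TopologicalSorter({t: deps for t, (deps, _) in graph.items()})
    order = list(ts.static_order())  # raises CycleError on a dependency cycle
    for target in order:
        graph[target][1]()
    return order

build(GRAPH)
```

Real systems layer caching, sandboxing, and parallelism on top, but the target/dependency graph and topological execution order are the common skeleton.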
I’m a huge nerd when it comes to build systems, and I think effective use of whichever one you choose is crucial to moving fast without breaking things. Quick feedback on changes (whether via compiling, static analysis, or testing) is essential to modifying a system without fear.
Continuous Integration Systems
These are tools for automatically running a set of actions on code, either periodically or on change. Often, they are integrated into a version control / code review system like GitHub, GitLab, or Bitbucket. Most of the current options can be reduced down to “something that can run a shell script”. Historically, this configuration lived outside the source code (in some admin console); however, with the advent of DevOps / Infrastructure as Code, it is now commonly configured in the source code repository via a domain-specific language (DSL) such as Jenkins’ Groovy-based Pipeline DSL or the YAML files used by GitHub Actions and GitLab CI.
Again, a non-exhaustive list of commonly used options includes Jenkins, CircleCI, Travis CI, GitHub Actions, GitLab CI, and Bitbucket Pipelines.
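As a concrete illustration of the in-repo DSL style, a minimal GitHub Actions workflow really does boil down to “run a shell script on change” (the file path is where GitHub expects workflows; the script name is a made-up placeholder):

```yaml
# .github/workflows/ci.yml -- runs on every push and pull request
name: ci
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build and test
        run: ./scripts/ci.sh   # hypothetical script checked into the repo
```

Everything else – caching, matrices, secrets – is layered on top of this basic “checkout, then run commands” shape.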
Recently, I’ve been thinking about the overlap between build systems and continuous integration. Both describe a set of actions to run against a source tree, and they often encode duplicate information (X depends on Y; only rebuild X if Y changes). For more reading on this, I’d suggest Gregory Szorc’s blog post Modern CI is Too Complex and Misdirected. In industry, most large tech companies have landed on similar solutions where DAG-based build systems are tightly integrated into their CI systems2, allowing them to build and test only the components that have changed, rather than every project in the repo. Unfortunately, there are very few publicly available options that provide this same level of integration. The only commercial offering I’ve seen for this is BuildBuddy Workflows, which provides a hosted CI tightly integrated with Bazel.
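The “only build and test what changed” trick reduces to a reverse-dependency query over the same build DAG. A minimal sketch (hypothetical graph and function names – in Bazel the analogous query is `bazel query 'rdeps(//..., <target>)'`):

```python
# Hypothetical dependency graph: target -> set of direct dependencies.
DEPS = {
    "app":  {"lib", "ui"},
    "ui":   {"lib"},
    "lib":  set(),
    "tool": set(),
}

def affected(changed, deps):
    """Return every target that transitively depends on a changed one --
    the only targets CI actually needs to rebuild and retest."""
    dirty = set(changed)
    grew = True
    while grew:  # iterate until no new reverse dependencies are found
        grew = False
        for target, ds in deps.items():
            if target not in dirty and ds & dirty:
                dirty.add(target)
                grew = True
    return dirty

print(affected({"lib"}, DEPS))  # "tool" is untouched, so CI can skip it
```

With this, a change to `lib` triggers rebuilds of `lib`, `ui`, and `app`, while `tool` is skipped entirely – exactly the integration most standalone CI systems lack.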
Version Control Tools
Git has clearly won as the most-used version control tool, as expressed by this graph from the Debian packaging team3:
There are some alternative version control tools, however they’re mostly relegated to niche communities or large companies. Some of the options are:
Mercurial: A tool of a similar era and with similar data structures to Git, but without the same level of community adoption.
EdenSCM: Facebook’s Mercurial-inspired VCS written in Rust to support large-scale monorepos.
Piper: Google’s large-scale VCS, historically based on Perforce, although seemingly very custom now. It was hinted at in the article Why Google Stores Billions of Lines of Code in a Single Repository, but otherwise has a very limited public presence.
Version control systems are an area that I think has seen very limited open-source innovation in the last decade, and I want to write more about this sometime in the future.
Code Review Systems
Within “Git-compatible collaboration tools”, there is still quite a diverse set of repository hosting and collaboration tools, such as GitHub, GitLab, Bitbucket, Gerrit, and the now-dead Phabricator (although I hear Facebook’s internal variant is still alive and quite good).
I’d split these into two categories:
Branch-based review, whose main unit of review is the branch: each Pull Request (GitHub, Bitbucket) or Merge Request (GitLab) asks to merge one branch into another. Code review is often done on the set of changes between the branch and its target, which may include multiple commits. Merges can either generate a merge commit (intermixing commit history) or rebase the new commits onto the head of the default branch.
Patch-based review, where the unit of review is the set of changes against the target branch: a Patch Set (Gerrit) or Diff (Phabricator). This is functionally equivalent to branch-based review with fast-forward and squash merges, although most branch-based systems support this workflow poorly.
Once again, I’d suggest reading Gregory Szorc’s Problems with Pull Requests and How to Fix Them, which expands on the challenges of reviewing in branch-based systems that are fixed by a commit-based workflow.
This is one of the most tightly culture-integrated tools, where team norms (number of reviewers, required testing, comment noise vs signal, time-to-review, and time-to-land) drastically affect the day-to-day speed of making changes.
Task Tracking Systems
Project management systems — most notably ticket based systems like Jira — are the core driver of many teams. Many modern code review tools also provide integrated first-party support for issue tracking.
Using ticket tracking systems effectively is something I’m not super familiar with. I’ve experienced plenty of poorly implemented setups that seemed to do more harm than good, so perhaps I’ll think on this more and expand on it in the future.
Summary
In each of the categories explored here, the most important question isn’t which tool is the absolute best – it is far more important that the tool fits your cultural system. If your team is ineffective, it’s unlikely that a different set of tools will magically make it better! No tool is stronger than an intentional set of cultural norms that make everyone more effective.
I’m hoping to dive into some of these tools and effective patterns for using them more in future posts. Making an effective team is much more than just picking the best tools.