We’ve had some discussion at SCADA MINDS about what the best size of a pull request is, so I’ve been looking at what the internet thinks. Let’s dive in!
People tend to suggest very small pull requests as the optimum. E.g. see [1], [2], [3] that both suggest 2-400 lines for a pull request.
A lot of teams have done their own study of pull request efficiency, but a lot of them also seem to rely on the SmartBeart study, which is worth a read.
In the SmartBear study they tested defect rates and optimum code review sizes during 10 months and 2500 pull requests. Their findings seem to be in line with a lot of the other recommendations, but briefly summarized:
These two metrics put a pretty big damper on how big pull requests can be, at least if they are to be reviewed in one go.
But there’s also a time cost involved with creating multiple smaller pull requests which you potentially can calculate up against the cost of having larger pull requests that go unmerged.
This article does a good job of weighing those two costs up against each other, and end up recommending a pull request size where the key metric is that it shouldn’t take longer than 1 hour to review it.
They have a good point that as your pull request grows, so does the risk that you have implemented the wrong thing. That risk is mitigated by smaller pull requests or factors like good problem understanding, design reviews or pair programming.
Maybe for large established codebases but I don’t think that “400 lines or bust” is a policy that makes sense for all projects. Both because project maturity varies, but also because not all lines of code are identical:
Two examples:
Pull requests in greenfield projects will also tend to be larger - there’s less old code you have to deal with and when building new features from the ground up it might be harder to split them into small well-defined pull requests.
So to sum up - I don’t think pull requests should be less than 400 lines, but I do think they should be on the smaller side.
I think “Time taken to review” is a better metric than lines of code - and I think trying to keep that below an hour is a good rule of thumb.
This is also a balancing act. Programmers famously hate to be interrupted so your colleague shouldn’t bang you on the head with every newly created pull request. Interrupting your work to review code will slow you down.
However leaving your colleagues waiting for too long might slow them down!
Particularly in less-established projects where you might have to continue building your work on the pull request you’ve just created, being blocked by a code review can be frustrating.
For more established projects there’s generally a few more strategies you can employ
To me, two times a day is a good interval for when to review. I like to check for pull requests in the morning and after lunch.
At those times you naturally have a break from all the context you’ve built up in your mind. At these points (or after other natural interruptions) you lose the least attention - so they’re good times to see if there are any pull requests waiting for you.