Have you ever wondered why it takes minutes to install npm dependencies on your CI server, even when they should be fully cached? We have experienced such an issue, taking more than 6 minutes
to install the npm dependencies, even when they haven't changed between builds, and being fully cached by the CI service.
This short post will analyze a case study to detect slow dependencies during installation and debug the source of the problem.
Case study
Before diving into the problem, let's describe our context to understand the premises. We've experienced long installation times for our dependencies on the continuous integration (CI) server. However, the problem wasn't the CI itself, as we could also reproduce the same problem locally.
- We're using npm for dependencies management.
- We're executing
npm ci --production
to install dependencies on the CI. - We have properly set up the cache folder for npm.
npm ci
command performs a clean install, deleting any existing node_modules
folder and installing dependencies based solely on the exact versions found in the package-lock.json
file.Additionally, the
--production
flag will skip installing devDependencies
, which for us was a micro-optimization.We've noticed that the dependencies install time was surprisingly high, more than 6 minutes. However, when the dependencies are cached and haven't changed from the previous build, they should have been copied from the cache, which would result in short install times.
However, this wasn't working as expected.
Detecting slow dependencies
To get to the root of the problem, we needed a way to tap into the npm installation process and collect some cold metrics. My colleague Nicu Surdu found a neat package called slow-deps
built by Nolan Lawson, which is a:
CLI to measure which dependencies in a project are the slowest to npm install.
Running npx slow-deps
on the repository will output the install time for all dependencies in package.json
, sorted from slowest to fastest:
$ npx slow-deps
npx: installed 86 in 15.755s
Analyzing 170 dependencies...
---------------------------------------------------------------
| Dependency | Time | Size | Deps |
---------------------------------------------------------------
| react-toolbox | 7m 42.9s | 4.6 MB | 12 |
| react-scripts | 1m 12.8s | 149 MB | 1037 |
| @storybook/react | 1m 5.9s | 133 MB | 791 |
| @storybook/addon-essentials | 56.0s | 141 MB | 816 |
| craco-esbuild | 16.2s | 34 MB | 201 |
| cypress | 10.0s | 17 MB | 157 |
# ...
In our case, the output appalled us: one of the smallest packages, with only 4.6MB
, took the longest time to install: 7m 42.9s
. react-toolbox
is an old and discontinued components library for React, which we used in our project.
Now the question was, what is so special about react-toolbox
?
The problem
So far, we know what exactly is causing the problem, but we don't know why. So another one of my colleagues, Cristian Diaconescu, took a peek at our package.json
file, where he discovered something interesting.
Apparently, react-toolbox
was installed directly from a github repository, not from the npm registry. The reason for this is that the package was abandoned in the past, so we had to fork it and maintain our own version of the library.
{
"dependencies": {
"@storybook/addon-essentials": "6.4.8",
"@storybook/react": "6.4.8",
"craco-esbuild": "0.5.0",
"cypress": "9.5.0",
"react-scripts": "4.0.3",
"react-toolbox": "github:********/react-toolbox#9d38c6"
}
}
Now, there are 2 potential problems here:
- First of all, npm didn't cache the dependency. Subsequent installs should have been quite fast, regardless of how long the first download takes.
- Secondly, downloading the package is way too slow. This could indicate a problem with the Github repository.
As we couldn't find any documentation regarding the lack of caching for dependencies loaded directly from Github, we took a deeper look at the second problem.
Verbose logsCristi also ran a verbose npm install to get more details regarding what's happening during the installation of the package:
npm install -ddd
# -ddd flag outputs the most verbose log
Searching for something meaningful, we've found an intriguing clue in the output:
[...]
Retrying git command: ls-remote -h -t git://github.com/********/react-toolbox.git attempt # 2
Retrying git command: ls-remote -h -t git://github.com/********/react-toolbox.git attempt # 3
git manifest for react-toolbox@github:********/react-toolbox#9d38c6 fetched in 394569ms
[...]
There were some failed attempts to read the repository using git ls-remote
command. A third colleague, namely Florin Sarghe, ran the same command in isolation, confirming that, indeed, this was the bottleneck:
> git ls-remote -h -t git://github.com/********/react-toolbox.git
# 🚫 fatal: unable to connect to github.com: errno=Operation timed out
Notice that we used the git://
protocol for the repository url. However, Florin also discovered that running the same command, but using the https
protocol instead, resulted in an instant response:
> git ls-remote -h -t https://github.com/********/react-toolbox.git
# ✅ instant response
Optimization
Fortunately, we didn't share the library across multiple projects in different repositories. Therefore, we could move the entire library inside our repository since it was already forked and maintained internally.
Surprisingly, after this simple refactoring, we have witnessed a huge speed boost, from more than 6:30 minutes
down to 30 seconds
, as we expected in the first place.
Takeaways
In case you're experiencing slow npm install command, locally or on your CI, there are some easy ways to debug potential bottlenecks:
- Run
npx slow-deps
on your repository to get an idea regarding which packages are slow to install. - Use
npm install -ddd
if you need verbose logs and timings during the installation process. - In case you're using git urls for the slow dependencies with the
git://
protocol, try switching tohttps://
. - Use
git ls-remote -h -t <repository-url>
to confirm that the optimization actually worked.