Caching in CI is a double-edged sword: it speeds up builds dramatically, but if it’s wrong, it silently corrupts your tests.
Let’s watch a real CI job using GitHub Actions to build a Node.js project, specifically focusing on caching the node_modules directory.

    name: CI with Caching
    on: [push]

    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3

          - name: Cache node modules
            uses: actions/cache@v3
            id: cache-nodemodules
            with:
              path: '**/node_modules'
              key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
              restore-keys: |
                ${{ runner.os }}-node-

          - name: Install dependencies
            if: steps.cache-nodemodules.outputs.cache-hit != 'true'
            run: npm ci

          - name: Run tests
            run: npm test

Here’s what’s happening:
- `actions/checkout@v3` pulls your code.
- `actions/cache@v3` is the star. It looks for a cache entry matching the `key`.
- The `key` is constructed from the OS (`runner.os`) and a hash of your `package-lock.json` (`hashFiles('**/package-lock.json')`). This means if your dependencies change, the key changes, and a new cache is created.
- `restore-keys` provides fallbacks. If an exact `key` match isn’t found, it looks for the most recently created cache whose key starts with a prefix like `Linux-node-` (on `ubuntu-latest`, `runner.os` evaluates to `Linux`). This matters when `package-lock.json` has changed: the exact key no longer exists, but an older cache for the same OS can still be restored. A fallback restore does not count as a hit, so `cache-hit` is not `'true'`.
- The `Install dependencies` step only runs `npm ci` if the cache wasn’t hit (`steps.cache-nodemodules.outputs.cache-hit != 'true'`). If the cache was hit, `npm ci` is skipped, and `node_modules` is restored from the cache. After a fallback restore, `npm ci` still runs, and because it removes any existing `node_modules` before installing, the result is a clean install.
- `npm test` then runs against the `node_modules` that are either freshly installed or restored from cache.
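The `cache-hit` output from the `actions/cache` step is your primary indicator: it is `'true'` only when the exact `key` matched. The step log tells the same story, with a line along the lines of Cache restored from key ... for a hit, or Cache not found for key ... for a miss.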
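
To surface that output directly in the job log, here is a minimal sketch of an extra step. It assumes the step is placed after the `Cache node modules` step in the workflow above; the step name `Report cache status` is made up for illustration.

```yaml
# Hypothetical debugging step: prints the cache-hit output of the cache step.
# It prints "true" on an exact key match; any other value means npm ci will run.
- name: Report cache status
  run: echo "Exact cache hit: ${{ steps.cache-nodemodules.outputs.cache-hit }}"
```

Comparing against `'true'`, as the `Install dependencies` condition does, is the safe check, since every other value means there was no exact match.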
The mental model is simple:
- Cache Hit: The `node_modules` directory is exactly as it was when the cache entry was created. This is the fast path. `npm ci` is skipped.
- Cache Miss: No entry matches the `key` exactly, so `node_modules` is either absent or restored from an older fallback entry. `npm ci` runs, installs everything, and then this new state is uploaded as a cache entry at the end of the job for future runs.
- Cache Expiry: Caches aren’t kept forever. GitHub removes cache entries that have not been accessed in over 7 days, and evicts older entries once a repository exceeds its cache storage limit. An expired entry simply shows up as a miss on the next run; your `key` and `restore-keys` strategy determines what, if anything, is restored in its place.
This setup ensures that npm ci (which is generally faster and more deterministic than npm install) runs only when necessary. When it does run, it produces a clean, reproducible node_modules state that can then be cached.
The one thing most people don’t realize is how sensitive the `key` is. A change in any file that `hashFiles` is watching will invalidate the cache. This is usually `package-lock.json` or `yarn.lock`, which is good! But if you’re caching other things (like build artifacts), be incredibly precise with your `hashFiles` glob patterns. An overly broad pattern will cause unnecessary cache misses, because edits to unrelated files change the hash. For example, `hashFiles('**/*')` is almost always wrong for caching dependencies.
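
As an illustration of a narrow pattern, here is a sketch of a build-artifact cache step. The `dist` output directory and the `src/` source directory are assumptions about the project layout, not part of the workflow above. `hashFiles` accepts several comma-separated patterns, so the key can cover exactly the inputs that affect the build.

```yaml
# Sketch only: `dist` and `src/**` are assumed paths for this example.
- name: Cache build output
  uses: actions/cache@v3
  with:
    path: dist
    # The key changes only when sources or the lockfile change,
    # not when unrelated files (docs, CI config) are edited.
    key: ${{ runner.os }}-build-${{ hashFiles('src/**', 'package-lock.json') }}
```

Whether this set of patterns is narrow enough depends on what actually feeds your build; anything that influences the output but is missing from the key is a source of stale hits.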
The next problem you’ll hit is cache invalidation leading to stale dependencies.