Research IT


Keeping Track of Dependencies

Dependency tracking is one of those tasks that is easy to forget, yet it is important for code of any lasting significance. GitHub provides tools to help you keep dependencies up to date, so that you avoid unsupported versions and known vulnerabilities. Anyone producing code at the University is encouraged to take advantage of these tools so that problems due to upstream code are addressed more rapidly.


Once a software project builds up any complexity, it tends to end up depending on other packages. This is extremely prevalent in some language ecosystems (such as Python, JavaScript, Java, C#, and Rust) but it is a wider problem: with each dependency comes the need to ensure that you are tracking appropriate versions of that package. Of particular note is that fixes for software vulnerabilities in your dependencies often come as updates to those packages.

In general, this is a complex topic (especially if you want to support a range of versions), but there is support for monitoring your graph of dependencies for problems. If your project is on GitHub, it can make use of a software robot called Dependabot, which parses some common project dependency description files and checks for updated versions of those dependencies.

You enable this by creating a file, .github/dependabot.yml, that says what it should look for, how often it should look, and (optionally) what it should do about it. You should probably always include the github-actions ecosystem in your checks, at least if you run any CI workflows on GitHub. As with any software artifact, workflows can have dependencies, and those can be a vector for problems if allowed to go out of date.

Dependabot and Python

This example is from the Python ecosystem:

version: 2
updates:
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    assignees:
      - "dkfellows"
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly

This defines two update rules: one for pip (that is, Python dependencies listed in a requirements.txt file, where the definitive list of packages is hosted on PyPI), and the other for GitHub Actions themselves (which GitHub knows definitively how to resolve). Both sets of dependencies are checked for updates weekly (it’s rare that you need it more frequently than that).

When the robot runs, it parses the list of current dependencies and compares it against the latest versions in the package registry, filing Pull Requests to update the version if things are behind. (In our example, the PR is automatically assigned to me for the Python dependencies, and left unassigned for GitHub Actions. You can also add labels to the PR and perform other basic operations, though the defaults are often good enough.) This makes it much easier to keep software up to date and to avoid vulnerabilities due to outdated dependencies, which are a major vulnerability class.
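As a rough sketch of those extra operations, the pip rule from the example could be extended along the following lines. The label names and the commit message prefix are placeholders chosen for illustration, not values taken from a real project.

```yaml
version: 2
updates:
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    assignees:
      - "dkfellows"
    # Hypothetical label names; pick ones that already exist in your repository
    labels:
      - "dependencies"
      - "python"
    # Optional prefix for the commit messages and PR titles (illustrative value)
    commit-message:
      prefix: "deps"
```

If you omit these keys, the robot falls back to its defaults, which is usually fine for a small project.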

Dependabot and Java

Here's another example, this time for a Maven (Java) project:

version: 2
updates:
  - package-ecosystem: maven
    directory: "/"
    schedule:
      interval: weekly
      day: tuesday
    open-pull-requests-limit: 10
    reviewers:
      - "dkfellows"
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: weekly
      day: monday
    reviewers:
      - "dkfellows"

Apart from the different package ecosystem identifiers, we are also specifying which day of the week to run the check on, and raising the limit on the number of Pull Requests that may be made (in case many updates happen at once).

Note that if you’re using Maven, you might need to include the advanced-security/maven-dependency-submission-action action at the end of your CI build workflows. Maven dependency graphs are complex to analyse, and this action lets Dependabot figure out exactly what you are really using.
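A build workflow that ends with that action might look roughly like the sketch below. The workflow name, branch name, Java version, Maven command and action version tags are all assumptions made for illustration; check the action's own documentation for the currently recommended version and permissions.

```yaml
# .github/workflows/build.yml (hypothetical file name)
name: build
on:
  push:
    branches: [main]   # assumed name of the primary branch

permissions:
  contents: write      # the dependency submission step needs write access

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-java@v4
        with:
          distribution: temurin
          java-version: "17"   # assumed; use whatever your project targets
      - name: Build and test
        run: mvn --batch-mode verify
      # Final step: report the resolved Maven dependency graph to GitHub
      - name: Submit dependency graph
        uses: advanced-security/maven-dependency-submission-action@v4
```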

Dependabot and JavaScript

Dependabot control files can be quite short. Here’s a full example from a JavaScript project:

version: 2 
updates: 
  - package-ecosystem: "npm" 
    directory: "/" 
    schedule: 
      interval: "weekly" 
  - package-ecosystem: "github-actions" 
    directory: "/" 
    schedule: 
      interval: "weekly"

You can provide a lot more detail than that (e.g., listing exactly where to find the project dependency descriptions) but rarely need to do that.
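For instance, if the package.json file lived in a subdirectory rather than at the top of the repository, the rule could point there instead. This is only a sketch; the "/frontend" path is a made-up example.

```yaml
version: 2
updates:
  - package-ecosystem: "npm"
    # Hypothetical location of package.json within the repository
    directory: "/frontend"
    schedule:
      interval: "weekly"
```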

Using Dependabot Effectively

Be aware, however, that merely having the robot there does not guarantee that its proposed updates are correct. While it is good at proposing a minimum set of changes to update a version, it is poor at picking up breaking changes where the dependency’s API has been modified in a way that requires you to adapt your code, such as when a major version change has removed some operations you were using. Because of that, I strongly recommend that you only adopt a service like Dependabot once you have some automated testing in place (such as via a GitHub Actions on-push or on-pull-request rule). That way, you find out that a dependency update has caused a problem before it breaks your primary branch. (Fixing such problems can be easy or difficult, but a good set of tests will at least detect them and warn you.)
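A minimal sketch of such a testing workflow is given here, assuming a Python project whose dependencies are listed in requirements.txt and whose tests run under pytest. The file name, branch name, Python version and action version tags are assumptions for illustration; adapt them to your own project and language.

```yaml
# .github/workflows/tests.yml (hypothetical file name)
name: tests
on:
  push:
    branches: [main]   # assumed name of the primary branch
  pull_request:        # also runs on the PRs that the robot opens

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"   # assumed; use the version you support
      - name: Install dependencies
        run: pip install -r requirements.txt pytest
      - name: Run the test suite
        run: pytest
```

Because the workflow triggers on pull requests, each proposed update is tested on its own branch, and a failing check shows up on the PR before anything is merged.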

Dependabot’s pull requests include instructions for how to work with them further. In the simplest cases, just merge or rebase the PR (after any checks run) and the bot will handle the cleanup. In more complex cases, you can take some ownership of the branch by adding commits to it to fix the API changes, or you may tell the bot to recreate the branch anew (if you suspect it has been outdated by other changes you have made). You can also tell the bot (by a suitable comment on the PR) to ignore a version of a dependency if you know it to have a problem for other reasons.
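A comment such as "@dependabot ignore this major version" handles this for a single PR. If you would rather record the decision in the repository, the control file accepts an ignore list as well. The sketch below assumes a made-up package name and version range.

```yaml
version: 2
updates:
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    ignore:
      # "example-package" and the version range are placeholders
      - dependency-name: "example-package"
        versions: ["4.x"]
```

Keeping the rule in the file makes the reason for skipping a version visible to everyone working on the project, at the cost of needing to remember to remove it later.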

Once you have Dependabot monitoring, you can also have it perform simple security auditing. It scans for vulnerability reports in your dependencies and provides you with a confidential warning if it finds any, both by email and in the GitHub web user interface. You can choose to act on such reports or not (sometimes the problems occur in parts of the software that you aren’t using, or you have other mitigations in place) but at least you get the option to find out and act before they become an embarrassing problem.

Recap: If you have dependencies, you need to keep on top of updates to them. Dependabot helps with a lot of the basics of that.

Having problems keeping up to date with your dependencies, or just suspect that you might be? Please contact the Research Software Engineering Department and we’ll help you out.