Java’s Dependency (Mis)Management: How Maven and Gradle Cope

Dependency management is a hard problem, and Java doesn’t make our lives any easier. Popular build tools like Maven and Gradle do what they can, but their default behavior doesn’t always alert us to potential problems. Both tools can be configured to fail the build when there are dependency conflicts, but neither does so by default. Below we will walk through the evolution of a set of modules and see how they can go from having no version conflicts to having runtime failures that should be build-time failures. We’ll examine why this happens, and what tools Maven and Gradle have to help guard against it.

Stage 0

Stage 0 Dependency Graph

Consider this simple dependency graph, with three modules, A‑1.0.0, B‑1.0.0, and C‑1.0.0. Each module is semantically versioned. Each of these modules contains one class in its public API.

Module A contains

package com.implementsblog.a;

public class A {
    public static String call() {
        return "A-1.0.0";
    }
}

Module B contains

package com.implementsblog.b;

import com.implementsblog.a.A;

public class B {
    public static String call() {
        return "B-1.0.0 -> " + A.call();
    }
}

Module C contains

package com.implementsblog.c;

import com.implementsblog.a.A;
import com.implementsblog.b.B;

public class C {
    public static String call() {
        return "C-1.0.0 -> " + A.call() + "\n"
             + "C-1.0.0 -> " + B.call();
    }

    public static void main(String[] args) {
        System.out.println(C.call());
    }
}

When each module is built and C#main is invoked, it outputs the expected result, regardless of the build tool used:

C-1.0.0 -> A-1.0.0
C-1.0.0 -> B-1.0.0 -> A-1.0.0

Stage 1

Stage 1 Dependency Graph

Suppose now that a bug was found in module A, so a new patch version is released, A‑1.0.1. Module B is modified to use the new version of A, so it too must make a new release. Since the public API of B did not change, it also increments its patch version, B‑1.0.1. Module C decides to use the new version of B, but not the new version of A (either on purpose or by mistake). Since C’s public API does not change, its new version is C‑1.0.1.

What’s the result of calling C#main?

Stage 2

Stage 2 Dependency Graph

Suppose now that we change the API of class A to this:

package com.implementsblog.a;

public class A {
    public static String call(String dummy) {
        return "A-2.0.0";
    }
}

Since this change is incompatible with the previously released version, the next release will be A‑2.0.0. Module B is updated to use A‑2.0.0, and is released as B‑1.0.2. Next, module C updates to use B‑1.0.2 but continues to use A‑1.0.0.

Now what’s the result of calling C#main?

Stage 3

Stage 3 Dependency Graph

In this final iteration, module C finally upgrades to use A-2.0.0, but module B has regressed to A-1.0.1.

What’s the result of calling C#main?

The -classpath

Different build tools have different strategies for managing the dependency graph of our Java projects, so the answer to “what’s the result of calling C#main in stages 1, 2, and 3?” depends on which build tool is used.

The dependencies of a Java module (including transitive dependencies) can be represented as a graph, specifically a directed acyclic graph. However, when it comes time to run that module, the graph needs to be flattened. This is because of the way Java identifies and searches for classes.

Java uniquely identifies a class by its fully qualified name, which is its package and class name. Java searches for classes along the classpath. The -classpath option of the javac and java commands (among others) is an ordered “list of directories, JAR archives, and ZIP archives which contain class files”. Furthermore, “the Java interpreter will look for classes in the directories in the order they appear in the class path variable” (emphasis mine). There is no notion of a class’s “version”; the versioning convention used by the above Stages was adopted many years after Java was introduced.

If there are two classes with the same fully qualified name on the -classpath, the first one wins. This can lead to runtime exceptions, depending on how the dependency graph is flattened. Stages 1, 2, and 3 all have version conflicts. Let’s see how Maven and Gradle behave in each of these stages.
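The “first one wins” rule can be sketched with a toy model. This is not how the JVM’s class loaders are implemented; it assumes each classpath entry is just a map from fully qualified class name to a label, and the names FirstMatchClasspath, resolve, and jar are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a classpath: each entry stands in for a JAR and maps
// fully qualified class names to a label for the version that JAR ships.
class FirstMatchClasspath {

    // Returns the label from the first entry that contains the class, or null.
    static String resolve(List<Map<String, String>> classpath, String className) {
        for (Map<String, String> entry : classpath) {
            String found = entry.get(className);
            if (found != null) {
                return found; // later entries are never consulted
            }
        }
        return null;
    }

    // Builds a one-class "JAR" for the example.
    static Map<String, String> jar(String className, String label) {
        Map<String, String> jar = new LinkedHashMap<>();
        jar.put(className, label);
        return jar;
    }

    public static void main(String[] args) {
        String a = "com.implementsblog.a.A";
        List<Map<String, String>> classpath =
                Arrays.asList(jar(a, "A-1.0.0"), jar(a, "A-1.0.1"));
        System.out.println(resolve(classpath, a)); // prints A-1.0.0
    }
}
```

Swapping the order of the two entries makes the same lookup return A-1.0.1, which is why the way a build tool flattens the graph matters so much.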

Maven (3.2.1)

When Stage 1 is built and run using Maven, the output is

C-1.0.1 -> A-1.0.0
C-1.0.1 -> B-1.0.1 -> A-1.0.0

Even though module B explicitly depends on A-1.0.1, the two versions of A cannot both exist on the -classpath, so Maven must pick one. It resolves the conflict by using the “nearest definition”, that is, “it will use the version of the closest dependency to your project in the tree of dependencies”. Since A‑1.0.0 is only one step away from C and A‑1.0.1 is two steps away, Maven chooses the former.
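A rough way to picture “nearest definition” is a breadth-first walk from the root module in which the first version seen for each module is kept. The sketch below is only a model of that idea, not Maven’s resolver. The class NearestWins and the "module:version" string encoding are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical model of "nearest definition" mediation.
class NearestWins {

    // Walks the declared dependencies breadth-first. The first version seen
    // for a module is the shallowest one, so it is the one that is kept.
    static Map<String, String> resolve(String root, Map<String, List<String>> declared) {
        Map<String, String> chosen = new LinkedHashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            for (String dep : declared.getOrDefault(node, Collections.<String>emptyList())) {
                String[] parts = dep.split(":"); // "module:version"
                if (!chosen.containsKey(parts[0])) {
                    chosen.put(parts[0], parts[1]);
                    queue.add(dep); // only the winning version is explored further
                }
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Stage 1: C-1.0.1 declares A-1.0.0 and B-1.0.1; B-1.0.1 declares A-1.0.1.
        Map<String, List<String>> stage1 = new HashMap<>();
        stage1.put("C:1.0.1", Arrays.asList("A:1.0.0", "B:1.0.1"));
        stage1.put("B:1.0.1", Arrays.asList("A:1.0.1"));
        System.out.println(resolve("C:1.0.1", stage1)); // prints {A=1.0.0, B=1.0.1}
    }
}
```

A-1.0.1 is reached only after A has already been fixed at 1.0.0, so it is silently dropped, which matches the output above.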

This scares me, and it should scare you, too. In Stage 1, the consequences aren’t so bad, but in Stages 2 and 3, the consequences are disastrous. Maven will happily build each module. But when we call C#main, it throws a runtime exception. In Stage 2 it’s:

Exception in thread "main" java.lang.NoSuchMethodError:
    com.implementsblog.a.A.call(Ljava/lang/String;)Ljava/lang/String;
        at com.implementsblog.b.B.call(B.java:7)
        at com.implementsblog.c.C.call(C.java:9)
        at com.implementsblog.c.C.main(C.java:13)

And in Stage 3:

Exception in thread "main" java.lang.NoSuchMethodError:
    com.implementsblog.a.A.call()Ljava/lang/String;
        at com.implementsblog.b.B.call(B.java:7)
        at com.implementsblog.c.C.call(C.java:9)
        at com.implementsblog.c.C.main(C.java:13)

Now that’s bad!
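The error messages name the method by its full descriptor (name, parameter types, and return type), because that is what the calling class was compiled against. As a loose analogue, reflection can show that a call() with no parameters and a call(String) are different methods as far as lookup is concerned. This is an illustration only: the nested class A is a stand-in for A-2.0.0, hasMethod is a hypothetical helper, and reflection is not the linking step the JVM performs.

```java
// Hypothetical demo: a method is found by name plus parameter types,
// so changing the parameter list makes the old method "disappear".
class DescriptorLookup {

    // Stand-in for A-2.0.0: only call(String) exists.
    static class A {
        public static String call(String dummy) {
            return "A-2.0.0";
        }
    }

    // Reports whether a public method with exactly these parameter types exists.
    static boolean hasMethod(Class<?> type, String name, Class<?>... params) {
        try {
            type.getMethod(name, params);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMethod(A.class, "call"));               // prints false
        System.out.println(hasMethod(A.class, "call", String.class)); // prints true
    }
}
```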

How to Fix It

Maven provides the Enforcer Plugin, which has a dependency convergence rule. This rule fails your build when the same dependency appears with different versions anywhere in the dependency tree. If we had been using this rule, C’s build would have failed in Stage 1.
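The idea behind a convergence check can be sketched in a few lines: walk the whole declared graph, collect every version requested for each module, and report any module with more than one. This is a simplified model, not the Enforcer Plugin’s code. The class ConvergenceCheck and the "module:version" encoding are assumptions for the example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of a dependency convergence check.
class ConvergenceCheck {

    // Returns every module that is requested with more than one version.
    static Map<String, Set<String>> conflicts(String root, Map<String, List<String>> declared) {
        Map<String, Set<String>> versions = new TreeMap<>();
        Set<String> visited = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (!visited.add(node)) {
                continue; // already expanded this module:version
            }
            for (String dep : declared.getOrDefault(node, Collections.<String>emptyList())) {
                String[] parts = dep.split(":"); // "module:version"
                versions.computeIfAbsent(parts[0], k -> new TreeSet<>()).add(parts[1]);
                stack.push(dep);
            }
        }
        versions.values().removeIf(v -> v.size() < 2); // keep only real conflicts
        return versions;
    }

    public static void main(String[] args) {
        // Stage 1: C-1.0.1 declares A-1.0.0 and B-1.0.1; B-1.0.1 declares A-1.0.1.
        Map<String, List<String>> stage1 = new HashMap<>();
        stage1.put("C:1.0.1", Arrays.asList("A:1.0.0", "B:1.0.1"));
        stage1.put("B:1.0.1", Arrays.asList("A:1.0.1"));
        System.out.println(conflicts("C:1.0.1", stage1)); // prints {A=[1.0.0, 1.0.1]}
    }
}
```

A build that treats a non-empty result as a failure would stop at Stage 1 instead of shipping a mismatched classpath.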

Gradle (1.12)

For the Gradle builds, I am using the maven-publish plugin.

When Stage 1 is built and run using Gradle, the output is

C-1.0.1 -> A-1.0.1
C-1.0.1 -> B-1.0.1 -> A-1.0.1

We can see that, while Maven uses the “nearest definition” to resolve the dependency conflict, Gradle by default uses “the newest version of the dependency”. This is (usually) fine if your dependencies are backwards compatible, but since Semantic Versioning allows for backwards-incompatible changes in a new major version, this strategy doesn’t always work.
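“Newest wins” can be modelled as picking the highest of all the versions requested for a module. The sketch below assumes plain dotted numeric versions and ignores everything else a real resolver has to handle (qualifiers, ranges, forced versions). NewestWins and its methods are hypothetical names, not Gradle APIs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of a "newest version wins" strategy.
class NewestWins {

    // Compares dotted numeric versions such as "1.0.1" and "2.0.0".
    static int compareVersions(String left, String right) {
        String[] l = left.split("\\.");
        String[] r = right.split("\\.");
        for (int i = 0; i < Math.max(l.length, r.length); i++) {
            int li = i < l.length ? Integer.parseInt(l[i]) : 0;
            int ri = i < r.length ? Integer.parseInt(r[i]) : 0;
            if (li != ri) {
                return Integer.compare(li, ri);
            }
        }
        return 0;
    }

    // Picks the highest of the requested versions.
    static String newest(List<String> requested) {
        String best = requested.get(0);
        for (String candidate : requested) {
            if (compareVersions(candidate, best) > 0) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Versions of A requested somewhere in the graph at each stage.
        System.out.println("Stage 1: A-" + newest(Arrays.asList("1.0.0", "1.0.1"))); // Stage 1: A-1.0.1
        System.out.println("Stage 2: A-" + newest(Arrays.asList("1.0.0", "2.0.0"))); // Stage 2: A-2.0.0
        System.out.println("Stage 3: A-" + newest(Arrays.asList("2.0.0", "1.0.1"))); // Stage 3: A-2.0.0
    }
}
```

In Stages 2 and 3 the winner is A-2.0.0, so whichever module was written against the older call() signature is the one that breaks.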

Gradle does a little better than Maven with Stage 2: the build for C fails with the much-preferred compile-time error:

error: method call in class A cannot be applied to given types;
    return "C-1.0.2 -> " + A.call() + "\n"
                            ^
  required: String
  found: no arguments
  reason: actual and formal argument lists differ in length

But it’s no better with Stage 3:

Exception in thread "main" java.lang.NoSuchMethodError:
    com.implementsblog.a.A.call()Ljava/lang/String;
        at com.implementsblog.b.B.call(B.java:7)
        at com.implementsblog.c.C.call(C.java:9)
        at com.implementsblog.c.C.main(C.java:13)

How to Fix It

Similar to Maven’s Enforcer Plugin, you can configure Gradle to fail whenever there’s a version conflict.

configurations.all {
    resolutionStrategy {
        failOnVersionConflict()
    }
}

Discussion

The purpose of showing these different Stages was not to compare Maven versus Gradle, but to show that no build tool can make up for Java’s flawed dependency management system. While the default behavior of these tools allows dependency conflicts, the draconian “fail when any version differs in the dependency graph” strategy is far too rigid because all modules within a dependency graph must upgrade to the same version of a particular module at the same time. This does not scale, and is not tenable unless you control all modules and their dependencies within the system.

Where’s the middle ground? Semantic Versioning is just one convention, but since it’s not baked into Java, the version number is only a suggestion; it’s difficult to tell if switching from A-1.0.0 to A-1.0.1 will break our module. This is where unit and integration tests can help. They validate the correctness of a module when the version of one of its dependencies changes. They also verify the integrity of the system as a whole: all tests must run when the entire system is built, because a module is not guaranteed to have the same dependency version at runtime that it had at compile time.

You might be asking, “if dependency conflicts can happen so easily and are so hard to detect, why doesn’t every project’s build fail or constantly throw runtime exceptions?” Undoubtedly most large code bases are about to throw a runtime exception at any moment because of dependency issues, but version mismatches do not always spell disaster. To see why, consider a module X that depends on a module Y. X only depends on a subset of Y’s API, call it {X→Y}. If all of the changes to Y’s API fall outside of {X→Y}, Y can update to any version without breaking X.
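The {X→Y} argument is just a statement about set overlap, which can be sketched directly. The member names below (including A.other()) are invented for the example, and ApiSubset.safeToUpgrade is a hypothetical helper; in practice, working out which members a module really uses is the hard part.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the {X→Y} argument: an upgrade is safe for X
// when none of the members that changed in Y are members X uses.
class ApiSubset {

    static boolean safeToUpgrade(Set<String> usedByX, Set<String> changedInY) {
        return Collections.disjoint(usedByX, changedInY);
    }

    public static void main(String[] args) {
        // B uses only A.call(), which is the method A-2.0.0 changed.
        Set<String> usedByB = new HashSet<>(Arrays.asList("A.call()"));
        Set<String> changedInA2 = new HashSet<>(Arrays.asList("A.call()"));
        System.out.println(safeToUpgrade(usedByB, changedInA2)); // prints false

        // If the change had touched only a member B never calls, B would be fine.
        Set<String> unrelatedChange = new HashSet<>(Arrays.asList("A.other()"));
        System.out.println(safeToUpgrade(usedByB, unrelatedChange)); // prints true
    }
}
```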

The above stages are trivial, and their problems are easy to solve when we can control all the moving parts. But suppose this happens in a project with dozens of direct dependencies and hundreds of transitive dependencies? We quickly find our way into the Pit of Dependency Despair.
