Friday, December 5, 2008

Maven Continues to Suck

Maven is supposed to be a tool to support collaborative software development.

Maven is based on the concept of software modules, each of which is versioned, and which have interdependencies. Let's say version 1.0.2 of MyThing depends on version 2.4.1 of YourThing, for instance. Now, what if I want to make some changes to both YourThing and MyThing? Well, there's a special version called "SNAPSHOT" - as in 2.4.1-SNAPSHOT - that means "the latest and greatest up-to-the-minute version of YourThing version 2.4.1". Setting aside the discussion of how the concept of a "version of a version" is inherently flawed, this introduces a big problem: it does not distinguish between changes I make and changes somebody else makes.

Case in point. Another developer at my company and I are both working on modifications to a module that our other stuff depends on. My modifications won't ever be shared with the rest of the world, they're just for my temporary purposes, but nonetheless I need them. The other developer, however, made some changes and committed them to version control. When I went to do a build of MyThing, Maven checked dependencies and noticed that there was a new SNAPSHOT of YourThing.

In a sane world, a collaborative software tool would notice the conflict, and perhaps notify me: "a newer version of YourThing is available, but it conflicts with your local changes - do you want to upgrade and lose your changes, or keep the stale but familiar stuff you've got?" The default, of course, would be to keep what I've got; after all, if I made some local changes, it was presumably for a reason.

Not Maven, though. Because I said SNAPSHOT (so that I could make changes locally), Maven silently and transparently discards my local version and updates me to somebody else's changes, at a random time of its deciding (typically, in the first build I do after midnight, of any project that happens to depend on YourThing).

Fortunately, the other developer's changes contained a serious bug. I say fortunately, because otherwise I might not have noticed that my changes had been silently discarded, and I might have spent a lot of time trying to figure out why my code, which used to work, no longer did.

What kind of collaborative software development tool is it that can't gracefully handle the simple case of two people working on a shared module?

Maven is not really a collaborative software development tool, it seems. Maven is a tool for letting one person develop a single module of code, with external dependencies on an otherwise-static world. That does not describe any software project I have ever worked on.

I keep hoping to see the light, and discover that I'm wrong about Maven. But it keeps on sucking.

19 comments:

Anonymous said...

You do realize that if you build and install your snapshot, it will override the one downloaded from the central repository, and you will not get changes from your fellow developer until you check out from the repository?

Walter Harley said...

My experience has been that that works for an indefinite, but finite, period of time. That is, if I build and install my snapshot it won't download for a while, but at some point it will decide that what's out there in the world is newer than my snapshot and it will download and override my local changes.

Anonymous said...

The anonymous guy is right: remember that the latest snapshot is retrieved from your repository (that is, from both local and remote repos, if not using the trouble-saving -o option), so the midnight snapshot should not bother you too much.
Then again, if many other devs restlessly deploy their snapshots on the remote repo, you may be forced to stay in offline mode (using -o)... but you may also ask them not to do it too often. :)

Anonymous said...

> That is, if I build and
> install my snapshot it won't download
> for a while, but at some point it will
> decide that what's out there in the
> world is newer than my snapshot and it
> will download and override my local
> changes.

You mean it's behaving like you didn't turn off "update snapshots"? You know, the one that causes Maven to download updated snapshots once a day. Might be related to your "usually happens after midnight" problem, don't you think?

Walter Harley said...

My point here is that Maven's basic model of how software development works is broken. "Update" should never happen implicitly. Similarly, a cross-machine repo should never contain snapshots, because work on one project destabilizes work on another.

The problem with carving out my own steel-wire path through Maven - e.g., setting up per-project repositories, always using -o until I am ready for an explicit update - is that it requires too much knowledge of Maven; it is too fragile; and I'm swimming upstream. In fact, if substantial customization is required, then I don't understand why Maven provides any benefit at all.

Commenters might want to check out my more recent post on features of an ideal build language.

Anonymous said...

Walter, I don't think the problem is Maven but an incompatibility between your development process and Maven's.

You said "My modifications won't ever be shared with the rest of the world, they're just for my temporary purposes, but nonetheless I need them."

In this case your changes don't belong in the snapshot. They should be put in another version and also in another branch in your SCM.

That's what version and branch are for.

If you don't want to share your changes with the World, why even put them on Maven repository?

I really think that statements like "Maven sucks" are a bit too strong when you don't show enough evidence that Maven sucks.

I am inclined to question your decision to even put what seems to be a personal temporary change in a shared module, but I'm sure you have good reasons to do so.

Good luck buddy.

Jason van Zyl said...

Here is a reference to the settings.xml and specifically you will want to look at the updatePolicy.

http://books.sonatype.com/maven-book/reference/appendix-settings-sect-settings-repository.html

Your setup is not optimal for your mode of operation. We generally have some teams decide to be agile in which case they want to be exposed to all changes happening. In this mode people actually lower the check interval: by default it's daily but people working with a high degree of flux will sometimes make it 10 minutes. This is predicated on the setup of a CI server that consistently publishes SNAPSHOTs. You cannot rely on developers doing this properly. People just forget.

The other mode is what you describe where you want to work in peace for a given period of time. In this case you set the update interval to never and then you update when you choose using the "-U" option.

We have groups in the hundreds and even thousands that use SNAPSHOTs and work with many people to produce a final product, and it works just fine. But it's predicated on a good team setup, having a good CI system in place, and actually knowing enough about Maven. I don't think it's so much that Maven continues to suck as that we need to do more than the free book we already have to get people past these simple setup problems that cause frustration.
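
A minimal sketch of the second mode, assuming a settings.xml profile is used to declare the snapshot repository. The profile id, repository id, and URL are placeholders and do not come from this thread; only the updatePolicy element is the setting under discussion.

```xml
<!-- ~/.m2/settings.xml (sketch; profile id, repository id, and URL are hypothetical) -->
<settings>
  <profiles>
    <profile>
      <id>quiet-snapshots</id>
      <repositories>
        <repository>
          <id>company-snapshots</id>
          <url>http://repo.example.com/snapshots</url>
          <releases>
            <enabled>false</enabled>
          </releases>
          <snapshots>
            <enabled>true</enabled>
            <!-- daily is the default; other values are always, interval:X (minutes), and never -->
            <updatePolicy>never</updatePolicy>
          </snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>quiet-snapshots</activeProfile>
  </activeProfiles>
</settings>
```

With something like this in place, snapshots from that repository should only be re-checked when a build is run with -U (for example, mvn -U install). A team that wants the opposite behavior could use interval:10 in place of never.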

Walter Harley said...

@Handerson: you say "In this case your changes don't belong to the snapshot. It should be put in another version and also in another branch in your SCM." Maybe I wasn't clear in the blog post: I'm talking about local changes on my local machine, like adding a println() in an upstream module for debugging purposes. You're saying I need to branch, version, and check into SCM, just to tweak a line of code locally on my machine? That would support my argument, I think.

@Jason - thanks for the settings ref, which I will check out, and for the concrete suggestion. -U might indeed be a good answer for me, for this problem, although when I did go to update it would still have blown away my locally changed version without asking, which is counter to the way I think it should work.

I will respond at more length in a separate post, that will go up in a moment.

Anonymous said...

You're trying to use snapshots for something they are not supposed to be used for. Not sure why.

If you want to make some small local change to another project without having it overridden, check out the code, change the version number to something else like 1.0-WH, run mvn install, and then have your own code depend on the local version. It's not exactly rocket science. :)
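
A sketch of what that could look like, using the YourThing and MyThing names from the post. The groupId is hypothetical, and the fragments omit everything else a real POM would contain. First, the locally modified checkout of YourThing gets a private version label:

```xml
<!-- pom.xml of the locally modified YourThing checkout (groupId is hypothetical) -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>yourthing</artifactId>
  <!-- private version label, changed from the shared SNAPSHOT version -->
  <version>1.0-WH</version>
</project>
```

After running mvn install there, MyThing's POM would point at that private version:

```xml
<!-- inside the <dependencies> section of MyThing's pom.xml -->
<dependency>
  <groupId>com.example</groupId>
  <artifactId>yourthing</artifactId>
  <version>1.0-WH</version>
</dependency>
```

Because 1.0-WH does not end in -SNAPSHOT, Maven should treat it as a fixed release and keep using the copy that mvn install placed in the local repository. The cost is that every downstream POM that should see the change has to be edited the same way.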

Walter Harley said...

@Henrik - changing version numbers isn't viable when there are more than one or two downstream dependents. But even if it were, it seems like your suggestion just supports my point: in order to accommodate what is in my own experience a very common part of collaborative software development, I have to do a fragile, hacky bypass around Maven's version model.

Anonymous said...

If you want to do a quick hack like inserting a println() for debugging purposes in an upstream module, then Maven will allow you to do that. Of course, it will still be a hack. You were talking about a quick local code change that will not be committed to your SCM.

If you want to do it the right way, then you need to either convince the person responsible for the module that a log statement would be nice, or you can create a branch.

What you are dealing with in two out of the three cases is another version of the module that is off the main path of development. Thus, it should at least have its own version number and if it's anything more than a quick hack it should have its own branch as well. I don't consider that ugly at all, and Maven will support any choice.

Walter Harley said...

@Henrik - you say "If you want to do a quick hack like inserting a println() for debugging purposes in an upstream module, then Maven will allow you to do that."

Yes, that's the basic idea of what I want to do. So, how do I get Maven to let me do that? Assume that I want my local hack to last until I revert the local code change (rather than silently vanishing when the clock strikes midnight); that I don't want to completely disable Maven's ability to download other unrelated artifacts, at least on explicit demand; and that I do not want to have to change the POMs of all the downstream projects that depend on the component I'm modifying. Is there a way? I'm afraid I'm not seeing it.

Jason van Zyl said...

I believe I told you how to do this when I first responded: the repository can have settings so that snapshots are not updated unless you ask. Maven cannot magically know for what set of artifacts you want update and the ones you don't. If you have different policies for different sets of artifacts, they are generally in different repositories, and for each repository you can have different update policies.

Walter Harley said...

@Jason - thanks again; doing customer support for an admittedly hostile customer on his blog rather than on your forum is probably beyond the call of duty :-)

It's an interesting idea to have multiple local repositories, using one for "locally hacked versions" and one for everything else, with the update policies set differently. I did read the page of the reference that discussed repository settings (as well as every other section I could find that discussed repositories or dependencies) but I'm still not quite able to tell whether that would actually work, and how I would selectively deploy to one or the other repository.

But I think perhaps all this has gotten off track of the broader point that I was trying to make, which is that I think Maven is by design unpredictable and nondeterministic in a way that interferes with my workflow. Maven's normal operation is to do updates as part of another goal (e.g., compile) and when it does an update, it honors newer code regardless of whether the original source was me or someone else.

You say "Maven cannot magically know for what set of artifacts you want update and the ones you don't." But that's a consequence of Maven's design: it's because, for instance, Maven doesn't track whether an artifact in the repository came from a local project. To me the fact that I built something locally is highly salient. My source management software doesn't blow away local changes when someone else updates a file; rather, it tells me when I try to commit, or when I explicitly ask it. That model fits my workflow much better.

Anonymous said...

Interesting discussion :)

There is another way of handling some of your issues: Use a repository manager (Nexus) locally. I'm told the memory footprint is reasonable and it's relatively easy to set up. Configure Nexus to proxy the company repository and configure Maven to always go through your local Nexus when downloading. That should in theory give you a better experience when coding off-line. In addition, you can block the proxy when you aren't interested in snapshots from somewhere else.

Anonymous said...

I just set the above up on my MacBook. As it turns out, it works really well. It took about 20 minutes, but I have a bit of experience with Nexus. Here is the documentation: http://www.sonatype.com/books/nexus-book/reference/index.html

For settings.xml setup look here: http://www.sonatype.com/books/nexus-book/reference/maven.html
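
A minimal sketch of the Maven side of that setup, assuming Nexus is running on the local machine. The mirror id, port, and URL path are assumptions for illustration; the linked documentation describes the full configuration.

```xml
<!-- ~/.m2/settings.xml (sketch; mirror id and URL are assumptions) -->
<settings>
  <mirrors>
    <mirror>
      <id>local-nexus</id>
      <!-- route requests for every repository through the local Nexus -->
      <mirrorOf>*</mirrorOf>
      <url>http://localhost:8081/nexus/content/groups/public</url>
    </mirror>
  </mirrors>
</settings>
```

With all downloads going through the local Nexus, blocking its proxy of the company repository should keep other people's snapshots from arriving until the proxy is unblocked.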

Walter Harley said...

@Hendrik - thanks, that sounds like the way to go then. Based on comments in a different thread I'd been under the impression that it was not practical or typical to install Nexus on a local machine (we do use it for our company-wide repository) but apparently that was incorrect, which is good.

This still feels to me like a workaround, though. I don't think my workflow is that unusual, and I think that Maven would better map to my workflow if it was able to treat locally-produced artifacts differently than external ones.

Another example of this is that right now I am working on two Maven projects, one of which (B) depends on the other (A). The system tests are in A but need to be run from B (because they also apply to C, D, ...). But I don't know any command that I can launch from B, that will get A to rebuild; so instead I keep two console windows open, so that after I change A, I 'mvn install' A and then 'mvn integration-test' in B. What I want instead is for Maven to understand that A is a local project, and instead of checking the internet for updates, I want it to check whether A's build product is out of date from its local source code and if so, rebuild it.

In other words, in general I want Maven to preferentially treat artifacts (at least SNAPSHOT versions) as if they came from source code, rather than from the internet.

That would also solve the ugly problems with M2Eclipse not showing source code for upstream projects even when the project is in the workspace.

Sorry to harp on this, but it seems like the responses have focused on the specific problem of getting unexpected updates, rather than on the broader point about workflow that I'm trying to make.

Anonymous said...

First of all, I find this discussion very helpful. I'm by no means a Maven expert. I've reviewed it on and off in the last 18 months and with the emergence of Nexus it finally seemed mature enough for me to pitch to the company I'm working for. That was about six months ago. It took a while but we've decided to migrate from some rather complex Ant scripts (basically build management "programmed" in Ant) and so far the response has been very positive. On the other hand we've only been at it for about a month and we're migrating as needed so it isn't that complex yet. Nothing but positive response makes me deeply suspicious so it's nice to get a critical POV and I do think you have some valid points. :)

With regards to Nexus, based on http://blogs.exist.com/bporter/2008/11/19/learning-from-maven-training/ I don't think it is typical to install a repository manager locally, but it's definitely practical and I believe that Nexus was created with this use in mind as well.

With regard to your other scenario, you might want to look into modules: http://www.sonatype.com/books/maven-book/reference/multimodule.html . I haven't used modules a lot yet as I've mostly been making proof of concept code for projects in our existing code base. Integration testing the Maven way is not something I've looked at extensively yet but my initial feeling is that it's something of an afterthought in Maven.
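
A minimal sketch of a multi-module (aggregator) POM for the A and B scenario, assuming both projects are checked out as sibling directories named A and B under a common parent directory. The groupId, artifactId, and version are hypothetical.

```xml
<!-- pom.xml in the directory that contains the A and B checkouts (sketch) -->
<project>
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>a-and-b-aggregator</artifactId>
  <version>1.0-SNAPSHOT</version>
  <!-- an aggregator produces no artifact of its own -->
  <packaging>pom</packaging>
  <modules>
    <module>A</module>
    <module>B</module>
  </modules>
</project>
```

Running mvn integration-test from the directory holding this POM should build A first and then B in a single invocation, since Maven orders modules by their dependencies. That replaces the two console windows, although it still rebuilds A every time instead of checking whether A's sources have changed.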

In our setup any installation or deployment results in a source jar (and a javadoc jar) being uploaded as well, but we use IntelliJ instead of Eclipse so I have no experience with m2eclipse.