Are there any maven experts here? I've been writing my own pom parser + dependency resolver and the full flow of how POMs are resolved is still super wacky to me and idk if I get it (details in thread)
So first, I know how to parse the content literally represented in the POM
from there, I think the next step is "find the parents"
Then there is a step where the inheritance tree is collapsed and properties are subbed
and at some point <scope> import is handled in some way?
which is recursive, since i think I need to fully parse that pom too!?!
and then i need to collapse dependencies and dependencyManagement
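(The "collapse dependencies and dependencyManagement" step can be sketched roughly like this. It is a minimal sketch, not the thread author's code: `ManagementCollapse`, `Dep`, and `collapse` are hypothetical names, and the key is simplified to group:artifact, whereas Maven also keys on type and classifier. The one rule it shows is that a direct dependency only takes a managed version or scope when it does not declare its own.)

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;
import java.util.stream.Collectors;

final class ManagementCollapse {
    // Hypothetical, simplified dependency: real POM dependencies also carry
    // type, classifier, optional and exclusions.
    record Dep(String group, String artifact, Optional<String> version, Optional<String> scope) {
        String key() {
            return group + ":" + artifact;
        }
    }

    // Fill in whatever a direct dependency leaves out from the matching
    // dependencyManagement entry. Values declared on the dependency itself win.
    static List<Dep> collapse(List<Dep> dependencies, List<Dep> dependencyManagement) {
        Map<String, Dep> managed = dependencyManagement.stream()
                .collect(Collectors.toMap(Dep::key, Function.identity(), (first, second) -> first));
        return dependencies.stream()
                .map(dep -> {
                    Dep m = managed.get(dep.key());
                    if (m == null) {
                        return dep; // nothing managed for this key
                    }
                    return new Dep(
                            dep.group(),
                            dep.artifact(),
                            dep.version().or(m::version),
                            dep.scope().or(m::scope));
                })
                .toList();
    }
}
```

After this step every direct dependency that had a managed entry carries a concrete version, which is what the final "just a list of deps" manifest needs.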
/**
 * POM (xml file)
 * |
 * | Parse XML
 * v
 * PomInfo
 * |
 * | Fetch all parent poms
 * v
 * ChildHavingPomInfo
 * |
 * | Squash parent poms and replace properties
 * v
 * EffectivePomInfo
 * |
 * | Fetch BOMs and other deps with import scope
 * v
 * ?????
 * |
 * | Collapse dependency and dependency-management
 * v
 * PomManifest [ contains just list of deps ]
 * |
 * | Normalize snapshots and version ranges
 * v
 * PomManifest [ type unchanged ]
 */
i'm just kinda lost
and i've been reading both other code doing this same task and the documentation - its helping but its painful
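(One way to picture the `?????` step in the diagram: every `dependencyManagement` entry with `scope` import and `type` pom is replaced by the managed entries of the POM it points at, and that POM's own imports are expanded the same way, which is where the recursion comes from. The sketch below is hedged: `ImportScopeSketch`, `Managed`, and the `loadManaged` function are hypothetical, the loader is assumed to return the already-effective `dependencyManagement` of the imported POM, cycles are not detected, and the precedence rule (locally declared entries win, then imports in declaration order) is how I read the Maven docs rather than something confirmed in this thread.)

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.function.Function;

final class ImportScopeSketch {
    // Hypothetical, simplified dependencyManagement entry.
    record Managed(String group, String artifact, String version, String type, String scope) {
        String key() {
            return group + ":" + artifact; // simplification: Maven also keys on type and classifier
        }

        boolean isImport() {
            return "import".equals(scope) && "pom".equals(type);
        }
    }

    // Replace import-scoped entries with the managed entries of the POM they reference.
    // loadManaged is an assumed callback that fetches and fully resolves the imported POM.
    static List<Managed> expandImports(
            List<Managed> declared,
            Function<Managed, List<Managed>> loadManaged) {
        var result = new LinkedHashMap<String, Managed>();
        // Entries declared directly in this POM are registered first, so they take precedence.
        for (Managed entry : declared) {
            if (!entry.isImport()) {
                result.putIfAbsent(entry.key(), entry);
            }
        }
        // Imports are expanded in declaration order; the first one to define a key wins.
        for (Managed entry : declared) {
            if (entry.isImport()) {
                for (Managed imported : expandImports(loadManaged.apply(entry), loadManaged)) {
                    result.putIfAbsent(imported.key(), imported);
                }
            }
        }
        return new ArrayList<>(result.values());
    }
}
```

The output has no import-scoped entries left, so the later collapse step can treat it as a flat lookup table.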
There are a couple maven books they wrote a long while ago that might actually cover some of this
You didn’t mention property resolution above but that also should be in there
But yes, it is painful
yeah i squished property resolution and parent poms into the same step
maybe they should be distinct
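(If property resolution becomes its own step, it boils down to a string function like the `resolveProperties(properties, str)` that the `EffectivePomInfo.from` code in this thread calls but never shows. A minimal sketch of what it could look like, assuming `${key}` placeholders, unknown keys left untouched, and a hypothetical pass limit so that properties referring to each other in a cycle cannot loop forever. This is not Maven's actual interpolator.)

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PropertyInterpolation {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");
    private static final int MAX_PASSES = 16; // hypothetical cap, guards against cyclic references

    // Replace ${key} with its value, repeating so that values which themselves
    // contain placeholders get resolved too. Unknown keys stay as written.
    static String resolveProperties(Map<String, String> properties, String input) {
        String current = input;
        for (int pass = 0; pass < MAX_PASSES; pass++) {
            Matcher matcher = PLACEHOLDER.matcher(current);
            StringBuilder out = new StringBuilder();
            boolean changed = false;
            while (matcher.find()) {
                String value = properties.get(matcher.group(1));
                if (value == null) {
                    // Keep the placeholder text exactly as it appeared.
                    matcher.appendReplacement(out, Matcher.quoteReplacement(matcher.group()));
                } else {
                    matcher.appendReplacement(out, Matcher.quoteReplacement(value));
                    changed = true;
                }
            }
            matcher.appendTail(out);
            current = out.toString();
            if (!changed) {
                break; // nothing left that we know how to resolve
            }
        }
        return current;
    }
}
```

Keeping this separate from the parent squash means the squash only has to build the merged property map, and every later step can call the same function.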
this is the "flow" i got rn (huge code chunks)
record PomInfo(
    PomGroupId groupId,
    PomArtifactId artifactId,
    PomVersion version,
    List<PomDependency> dependencies,
    PomParent parent,
    List<PomDependency> dependencyManagement,
    List<PomProperty> properties,
    PomPackaging packaging
) {
}
|
v
record ChildHavingPomInfo(
    PomGroupId groupId,
    PomArtifactId artifactId,
    PomVersion version,
    List<PomDependency> dependencies,
    List<PomDependency> dependencyManagement,
    List<PomProperty> properties,
    PomPackaging packaging,
    Optional<ChildHavingPomInfo> child
) {
}
|
v
record EffectivePomInfo(
    PomGroupId groupId,
    PomArtifactId artifactId,
    PomVersion version,
    List<PomDependency> dependencies,
    List<PomDependency> dependencyManagement,
    PomPackaging packaging
) {
    static EffectivePomInfo from(final ChildHavingPomInfo childHavingPomInfo) {
        // Walk from the top-most parent down to the leaf so child values override parent values.
        var properties = new LinkedHashMap<String, String>();
        var top = childHavingPomInfo;
        while (top != null) {
            for (PomProperty property : top.properties()) {
                properties.put(property.key(), property.value());
            }
            top = top.child().orElse(null);
        }

        Function<String, String> resolve =
            str -> resolveProperties(properties, str);
        Function<PomDependency, PomDependency> resolveDep = dependency ->
            new PomDependency(
                dependency.groupId().map(resolve),
                dependency.artifactId().map(resolve),
                dependency.version().map(resolve),
                dependency.exclusions().stream()
                    .map(exclusion -> new PomExclusion(
                        exclusion.groupId().map(resolve),
                        exclusion.artifactId().map(resolve)
                    ))
                    .collect(Collectors.toUnmodifiableSet()),
                dependency.type().map(resolve),
                dependency.classifier().map(resolve),
                dependency.optional().map(resolve),
                dependency.scope().map(resolve)
            );

        PomGroupId groupId = PomGroupId.Undeclared.INSTANCE;
        PomVersion version = PomVersion.Undeclared.INSTANCE;
        PomPackaging packaging = PomPackaging.Undeclared.INSTANCE;
        var dependencies = new LinkedHashMap<PomDependencyKey, PomDependency>();
        var dependencyManagement = new LinkedHashMap<PomDependencyKey, PomDependency>();

        // The last declared groupId/version/packaging in the chain wins.
        top = childHavingPomInfo;
        while (top != null) {
            if (top.groupId() instanceof PomGroupId.Declared) {
                groupId = top.groupId().map(resolve);
            }
            if (top.version() instanceof PomVersion.Declared) {
                version = top.version().map(resolve);
            }
            if (top.packaging() instanceof PomPackaging.Declared) {
                packaging = top.packaging().map(resolve);
            }
            top = top.child().orElse(null);
        }
        groupId.ifDeclared(value -> properties.put("project.groupId", value));
        version.ifDeclared(value -> properties.put("project.version", value));

        // artifactId is not inherited, so read it from the leaf (the last POM in the chain),
        // not from the top-most parent.
        var leaf = childHavingPomInfo;
        while (leaf.child().isPresent()) {
            leaf = leaf.child().get();
        }
        var artifactId = leaf.artifactId().map(resolve);
        artifactId.ifDeclared(value -> properties.put("project.artifactId", value));

        // Merge dependencies and dependencyManagement; entries from children replace parent entries.
        top = childHavingPomInfo;
        while (top != null) {
            top.dependencies()
                .forEach(dependency -> {
                    var newDep = resolveDep.apply(dependency);
                    dependencies.put(PomDependencyKey.from(newDep), newDep);
                });
            top.dependencyManagement()
                .forEach(dependency -> {
                    var newDep = resolveDep.apply(dependency);
                    dependencyManagement.put(PomDependencyKey.from(newDep), newDep);
                });
            top = top.child().orElse(null);
        }

        return new EffectivePomInfo(
            groupId,
            artifactId,
            version,
            dependencies.values().stream().toList(),
            dependencyManagement.values().stream().toList(),
            packaging
        );
    }
}
|
v
record PomManifest(
    @Override List<Dependency> dependencies
) implements Manifest {
}
I'm curious why you're doing this :)
For what it is worth I find the maven APIs overwhelming too. I have learned a thing or two by reading https://github.com/clj-commons/pomegranate and https://github.com/clojure/tools.deps sources.
bright side - i have successfully detached resolution from all this nonsense in the same way tools.deps has
> I'm curious why you're doing this
Mental illness, tbh
every time I read the Maven source I just want to throttle someone. it's such a paragon of 00's Java style, but it's all just data that could be maps and you would need none of the "abstractions" and "patterns"
but the thought is that java should have an actual build tool built in
I think maybe we've finally reached a point where new languages actually think about this early
and one part of that would be a resolver
I have no power to make that happen
but I do have the ability to experiment in a public way
I remember well building Java applications before Maven and repos existed
it was not fun
Trying to make sense of the Maven APIs put me in a similar mood to when I tried to make sense of Windows shell escaping rules.
And continuous integration YAML config.
yeah there is stuff maven does like "every repo is checked for every library" that feels insane
and I don't feel a strong need to recreate
pretty sure I can just attach a list of repos to check to the coordinate.
it's not only insane, it has all kinds of security issues
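(A rough sketch of the "attach a list of repos to the coordinate" idea: lookups only ever touch the repositories named on the coordinate, in order, so an artifact can never be silently served by some other repository that happens to be configured. `PinnedRepositories`, `Repository`, `MapRepository`, and this cut-down `Coordinate` are hypothetical stand-ins, with in-memory maps in place of HTTP.)

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class PinnedRepositories {
    // Hypothetical minimal repository abstraction: path in, file content out.
    interface Repository {
        Optional<String> fetch(String path);
    }

    // In-memory stand-in for a real remote repository.
    record MapRepository(String name, Map<String, String> files) implements Repository {
        @Override
        public Optional<String> fetch(String path) {
            return Optional.ofNullable(files.get(path));
        }
    }

    // Cut-down coordinate: a version plus the only repositories allowed to serve it.
    record Coordinate(String version, List<Repository> repositories) {
    }

    // Try the coordinate's own repositories in order and nothing else.
    static Optional<String> fetch(Coordinate coordinate, String path) {
        for (Repository repository : coordinate.repositories()) {
            Optional<String> found = repository.fetch(path);
            if (found.isPresent()) {
                return found;
            }
        }
        return Optional.empty();
    }
}
```

With this shape, a coordinate pinned to an internal repository simply fails to resolve if the file is missing there, instead of falling through to a public one.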
right now this is my conception of a coordinate
public interface Coordinate {
    default Coordinate normalize(Library library, Cache cache) {
        return this;
    }

    enum VersionComparison {
        INCOMPARABLE,
        GREATER_THAN,
        LESS_THAN,
        EQUAL_TO;

        public static VersionComparison fromInt(int comparisonResult) {
            return comparisonResult == 0 ? EQUAL_TO : comparisonResult > 0 ? GREATER_THAN : LESS_THAN;
        }
    }

    CoordinateId id();

    Manifest getManifest(Library library, Cache cache);

    /**
     * Gets the location of the given library on disk, assuming the library was located
     * with this coordinate.
     *
     * <p>
     * If the library is not downloaded on disk, this method will do so before returning.
     * </p>
     */
    Path getLibraryLocation(Library library, Cache cache);

    default Optional<Path> getLibrarySourcesLocation(Library library, Cache cache) {
        return Optional.empty();
    }

    default Optional<Path> getLibraryDocumentationLocation(Library library, Cache cache) {
        return Optional.empty();
    }
}
Manifest is just a named List<Dependency> and thats what all this pom BS collapses into
public record MavenCoordinate(
    Version version,
    List<MavenRepository> repositories,
    List<Scope> scopes
) implements Coordinate {
    // ...
}
(scopes...idk need to read one of those books i guess)
but the big ? for me at this level is whether exclusions are conceptually part of the coordinate
I would say yes
scopes tell you how to make different kinds of classpaths out of the same set of dependencies (at least that's how I think about it)
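(That reading of scopes can be written down directly: one list of dependencies, and each kind of classpath is just a filter over it. The sketch below uses the standard Maven rules for direct dependencies as I understand them, with compile and provided on the compile classpath, compile and runtime on the runtime classpath, and everything on the test classpath. The `system` and `import` scopes are left out, transitive scope mediation is not modelled, and all names are hypothetical.)

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class ScopeClasspaths {
    enum Scope { COMPILE, PROVIDED, RUNTIME, TEST }

    // Each classpath kind is defined purely by which scopes it lets through.
    enum Classpath {
        COMPILE(EnumSet.of(Scope.COMPILE, Scope.PROVIDED)),
        RUNTIME(EnumSet.of(Scope.COMPILE, Scope.RUNTIME)),
        TEST(EnumSet.allOf(Scope.class));

        private final Set<Scope> included;

        Classpath(Set<Scope> included) {
            this.included = included;
        }

        boolean includes(Scope scope) {
            return included.contains(scope);
        }
    }

    // Hypothetical direct dependency: a name and the scope it was declared with.
    record Dep(String name, Scope scope) {
    }

    // Same dependency list in, different classpath out depending on the kind asked for.
    static List<String> classpath(List<Dep> deps, Classpath kind) {
        return deps.stream()
                .filter(dep -> kind.includes(dep.scope()))
                .map(Dep::name)
                .toList();
    }
}
```

Seen this way, scopes look less like part of what a coordinate is and more like a label on the edge from a project to its dependency.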
you seem to be missing classifiers too unless that's part of your version. classifiers are a weird little world of their own as they bring in conditional stuff
they are not - i was just thinking of source/javadocs/none
like on maven repository there is this method
abstract InputStream getFile(
    Library library,
    Version version,
    Classifier classifier,
    Extension extension
) throws LibraryNotFound;
they are really important for a small percentage of Java libs
particularly those with native stuff
the two big conditionals they cover usually are jdk (which is rarely used anymore and shouldn't even be needed now that we have multi release jars), and native architecture
classifiers are weird in that all classifiers technically share the same pom and have the same coordinate. that they wedged docs and source into that is a bit of a hack
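(The "same pom, same coordinate" point shows up in the standard repository layout: the classifier only changes the file name inside the version directory, and the POM path never has one. A small sketch, assuming release versions only since snapshots use timestamped file names; `RepositoryLayout` and `path` are hypothetical names.)

```java
import java.util.Optional;

final class RepositoryLayout {
    // Standard Maven 2 layout for a release:
    // group/with/slashes/artifact/version/artifact-version[-classifier].extension
    static String path(
            String group,
            String artifact,
            String version,
            Optional<String> classifier,
            String extension) {
        String directory = group.replace('.', '/') + "/" + artifact + "/" + version + "/";
        String fileName = artifact + "-" + version
                + classifier.map(c -> "-" + c).orElse("")
                + "." + extension;
        return directory + fileName;
    }
}
```

For example, `path("org.clojure", "clojure", "1.11.1", Optional.of("slim"), "jar")` and the same call with `Optional.empty()` and `"pom"` land in the same directory, which is why every classifier ends up sharing one POM and one set of dependencies.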
so i can conceive of how to add that to the MavenCoordinate - like use this classifier for the jar - but how would that affect resolution?
record MavenCoordinateId(Version version)
    implements CoordinateId {
}
rn this is what I am keying things on in my (non-functioning) clone of t.deps algo
in deps I decided to make it part of the artifact: group/artifact[$classifier]
vs making it part of the version
ugh so thats not suuuper fun
Library is currently this
public record Library(
    Group group,
    Artifact artifact
) {
}
which is already a little maven specific, but slapping Classifier on there feels gross...
classifier is really a variant of an artifact that is specific to jdk/arch/something else (we use "aot" as a classifier in a few of the contrib libs)
I can make Library an interface - its basically just a key in a hashmap
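(If the library really is just a hashmap key, the `group/artifact[$classifier]` form mentioned earlier in the thread can be modelled as a record with an optional classifier, so the plain jar and a classifier variant resolve as two different libraries. A minimal sketch with hypothetical names `ClassifierKeys` and `LibraryKey`; the parsing is only an illustration and is not taken from tools.deps.)

```java
import java.util.Optional;

final class ClassifierKeys {
    // Records get equals/hashCode for free, so this works directly as a map key.
    record LibraryKey(String group, String artifact, Optional<String> classifier) {

        // Parse "group/artifact" or "group/artifact$classifier".
        static LibraryKey parse(String text) {
            int slash = text.indexOf('/');
            if (slash < 0) {
                throw new IllegalArgumentException("expected group/artifact: " + text);
            }
            String group = text.substring(0, slash);
            String rest = text.substring(slash + 1);
            int dollar = rest.indexOf('$');
            if (dollar < 0) {
                return new LibraryKey(group, rest, Optional.empty());
            }
            return new LibraryKey(
                    group,
                    rest.substring(0, dollar),
                    Optional.of(rest.substring(dollar + 1)));
        }
    }
}
```

The trade-off is that sources and docs, being classifiers too, would then look like separate libraries unless they are special-cased.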
clojure itself publishes both the "normal" compiled jar, but also a "slim" classifier jar with just source
just questioning how to handle sources + docs in that world
well like I said, those are imo hacks - they are also classifiers
really, Maven should have had a way to associate some kind of ancillary information (like Clojure metadata) to an artifact
gradle has that with its module metadata
you see why im thinking about this? I think tools.deps does its job for clojure, but would be neat if a foundational part of modern programming didn't elicit reactions like
> I just want to throttle someone.
and
> a similar mood to when I tried to make sense of Windows shell escaping rules.
Fight the good fight :)
@emccue are you writing your own version of maven and why? :)
@borkdude of the resolver, not the build tool
@emccue and why, if I may ask?
Reasoning only extends as far as this https://clojurians.slack.com/archives/C0H28NMAS/p1680363571330679?thread_ts=1680362691.333169&cid=C0H28NMAS
Right, I'm asking since I considered doing something like this once when I had trouble compiling tools.deps.alpha (at the time) to native
so I wondered what your motivations were
That would be a benefit
I'm not using any SPI or dynamic stuff and the number of deps is 1
So id imagine compiling to native would be easier
what is the dep? perhaps it could also run with bb :)
No it is the dep
As of now the whole thing is zero dep
You are historically a way more productive person, so I would welcome participation
It would be interesting to have a pure .clj solution (even more so if it had a chance of being adopted by tools.deps). The resolution mechanism isn't changing all the time is it?
I’d consider it if it got me everything I needed, but really this is just part of the puzzle
I'm doing all my stuff in a style that would be idiomatically translatable enough
Idk if that would be useful or not for that goal but 🤷‍♀️
There’s so many features we can provide via the maven libs and it’s a lot to take all of that on
Which is why I’ve given up every time I’ve considered it :)
yes, it's going to be work
btw in the end I managed to compile tools.deps to native: https://github.com/babashka/tools-deps-native and this is one of the projects using it: https://github.com/babashka/tools.bbuild nothing more than a "look how far we can go" project at this point, in bb I'm still shelling out to Java via deps.clj to fetch deps