#tools-deps
2022-09-05
didibus00:09:09

When using git libs, what should my lib name be? With Maven it was kind of obvious it was group/name, but now say my namespaces start with com.foo but I publish git deps on GitHub, so is the git lib com.github.foo? If I put that as the git lib, it will figure out the URL, but what happens if someone else also depends on com.foo {:git/url "..." ...}? Will it now pull the same lib twice? So basically, is the lib name in a dependency declaration just whatever someone puts? So they could even typo it by accident and it would work if a git/url is provided? And suddenly they've pulled in the same lib twice with potentially different shas?

seancorfield00:09:52

io.github.<username>/<repo-name> is the coordinate.

seancorfield00:09:15

The nses are unrelated and can be anything.

seancorfield00:09:51

Now, you could tell people to use a different libname -- but then they'd have to specify the :git/url in full.

seancorfield00:09:38

Like:

didibus/my-lib {:git/url "" :git/sha "..."}
vs
io.github.whoever/whatever {:git/sha "..."}
Does that help @U0K064KQV?

didibus00:09:48

The thing is, it's not the coordinate, it's whatever you want. I tried it, so I can pull multiple different versions of the same lib:

com.github.xadecimal/riddley {:git/tag "0.2.2" :git/sha "d1ac17e"}
com.wtv/riddley {:git/url ""
                 :git/tag "0.2.2"
                 :git/sha "d1ac17e"}
com.lololol/riddley {:git/url ""
                     :git/tag "0.2.1"
                     :git/sha "c905720"}
com.xadecimal/riddley {:git/url ""
                       :git/tag "0.2.1"
                       :git/sha "c905720"}

seancorfield00:09:23

It's convention to use io.github.<username>/<repo-name> so you avoid conflicts 🙂

didibus00:09:49

But I own com.xadecimal and that's my namespace prefix

seancorfield00:09:56

With :local/root, the libname can also be absolutely anything.

didibus00:09:58

So now it seems at risk of conflicts

seancorfield00:09:14

I don't understand.

didibus00:09:19

I don't know, this seems like a potential disaster slowly in the making 😛

seancorfield00:09:27

What's your GH username?

didibus00:09:35

(ns com.xadecimal.riddley)

;; my code
http://www.xadecimal.com <-- my website
I publish to git and Clojars
com.xadecimal <-- my Clojars groupId

didibus00:09:01

So now in github, where I publish as a git deps, ideally it should be:

com.xadecimal/riddley {:git/url ""}
But you can also write it like:
com.github.xadecimal/riddley {...}
Except now this is treated as two separate libs, which will just get random order for which one is used at runtime

seancorfield00:09:07

If I was using your lib from GH, I'd use:

io.github.xadecimal/riddley {:git/sha "d1ac17e667d0159d65a7b6d3fc5893bc0402ff7d"}
so that it deduces the URL.
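
As a minimal sketch of what "deduces the URL" means here: under the io.github convention quoted in this thread, the short form below should be equivalent to spelling out the URL. The explicit URL is the one given later in this discussion; the two maps are alternatives for separate deps.edn files, not one file.

```clojure
;; Short form: the git URL is inferred from the lib name
;; (io.github.<username>/<repo-name>).
{:deps {io.github.xadecimal/riddley
        {:git/sha "d1ac17e667d0159d65a7b6d3fc5893bc0402ff7d"}}}

;; Explicit form: same lib name, URL written out by hand.
;; Assumed equivalent to the short form above.
{:deps {io.github.xadecimal/riddley
        {:git/url "https://github.com/xadecimal/riddley.git"
         :git/sha "d1ac17e667d0159d65a7b6d3fc5893bc0402ff7d"}}}
```

Either way the lib name stays the same, so both declarations would be treated as the same lib when versions are compared.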

didibus00:09:44

Isn't that just a 3rd way to import the lib?

didibus00:09:11

So what will happen is, in a chain of transitive deps, if a lib depends on a different version of my lib but just happens to put a different name in its dependencies, it won't conflict

didibus00:09:31

So we're back at tools.deps not detecting and resolving conflicts between libs

seancorfield00:09:34

Well, that's the danger if you publish to Clojars and to GH.

didibus00:09:43

It's the danger if I publish only to GH as well

seancorfield00:09:52

But, yeah, for :local/root and git deps, the lib name can be "anything" so you're responsible for conflicts. But bear in mind that something published to Clojars cannot depend on a git dep

didibus00:09:34

Right, which is where I'd expect tools.deps to manage conflicts across. Without it, I feel it's pretty precarious, like I'm in the JS ecosystem 😛 I think it could introduce a mandatory lib name inside deps.edn or something and rely on that for conflicts. Now it just relies on whatever name people want to give the lib in their deps

seancorfield00:09:39

If you only "publish" via GH deps and encourage folks to use the "standard" (convention) then conflicts are less likely (and t.d.a is more likely to be able to detect them). If you go off-script then...

didibus00:09:17

As far as I can tell, t.d.a mentions no standard, the doc just mentions the multiple ways to do it

didibus00:09:01

And even then, I'm very hesitant to adopt languages with unreliable package managers that can create conflicts. Random ordering of deps at runtime causes the most costly bugs

didibus00:09:16

And solving them is a ton of headaches

seancorfield00:09:57

If you're publishing to Clojars, be consistent and document the git deps using the same coordinate 🙂 So far, the reality has been that this isn't a problem with Clojure.

didibus00:09:14

Because no one uses git deps yet

didibus00:09:25

I don't know, this seems like a huge disaster in the making

seancorfield00:09:04

tools.build is GH only so that's clearly not "no one".

didibus00:09:28

It's a build tool that can't conflict, since it's not included in your prod app

seancorfield00:09:11

I think you're imagining a problem that won't come to pass. ¯\_(ツ)_/¯

didibus00:09:20

Now all it takes is one typo in a popular lib pulling in a common package, and suddenly tons of people are secretly getting two versions of that package, it goes for a while without issue, until they diverge in version enough

seancorfield00:09:52

"popular lib" and "no one" seem mutually exclusive?

didibus00:09:15

No, eventually more people will move to git deps, so it's just an issue that will creep up on the community

seancorfield00:09:11

The deps/CLI docs say:
> repository lib - the Clojure CLI uses a convention where the URL does not need to be specified if you use a library name like io.github.yourname/time-lib for the GitHub url

seancorfield00:09:28

Where are you seeing it suggest other git approaches?

seancorfield00:09:28

Just because it shows both io.* and com.* mapping to the same thing?

didibus00:09:56

I mean, it doesn't mention any preferences, it just says here's many ways to specify the URL

seancorfield00:09:38

I quoted from the guide above where it mentions the convention. The reference just goes into more detail. I think you're making a mountain out of a molehill.

didibus00:09:44

The issue is, at least if it used the URL for conflicts... Then at least within git deps it would detect conflicts, though I still wish it also detected them across maven

didibus00:09:58

Maybe I read that differently than you, but that just tells me that if I want I can use a different lib name and it avoids me having to specify the URL. Which I find is even more risky, because now it recommends people use a different name that can conflict

didibus00:09:43

Why does T.D.A bother checking the sha? I mean, at this point, it's all pretending to detect conflicts anyways

seancorfield00:09:01

I think you'll find that libraries will be clear and consistent about how to refer to them and folks will use whatever libname is documented in the README...

didibus00:09:25

I'm making a lib myself, and even I had no idea what name to use

seancorfield00:09:38

You can't compare a git dep and a Maven dep -- neither is obviously newer than another.

didibus00:09:46

It just feels weird that the group name in my deps is not the namespace prefix

seancorfield00:09:58

You are publishing to Clojars already. You have a group/artifact name already yes?

didibus00:09:20

No, I was planning on going GH-only for now, but now I'm thinking no GH, since it's unreliable, and probably only Clojars

seancorfield00:09:09

Clojars has group name restrictions so if you are publishing there, you have less leeway. And by now you should be used to Clojure not telling you how to build and distribute projects! 🙂

didibus00:09:20

It's not that, I cannot trust any dependencies now. Like almost need to have a build step to validate we don't have git deps

seancorfield00:09:13

I don't know whether libraries will ever really move to GH only since they can't be used in another lib that wants to publish to Maven/Clojars -- and some companies simply don't allow "random" GH libraries to be used right?

seancorfield00:09:12

If you want other libraries to use your lib, you pretty much have to publish to Clojars. That's been my deciding factor: is it tooling? GH only; is it a reusable library? Clojars only.

didibus00:09:00

I mean, if t.d.a doesn't take reliability and reproducibility of GH deps seriously, probably people won't adopt it. But I think if it did, having internal mirrors of git deps is much easier than Maven, and the publishing story for them is much more convenient. I could see it becoming popular if it adds features to properly detect conflicts and maybe also to detect conflicts against Clojars/Maven, which it very much could if it just introduced a more formal lib name and version as part of deps.edn

didibus00:09:55

> is it tooling? GH only; is it a reusable library? Clojars only.
I guess until t.d.a matures around Git deps, I'll have to follow that as well.

seancorfield00:09:05

This isn't just a git deps issue. Suppose I fork a library that's published to Clojars and I publish it under different coordinates (but with the same namespaces) -- that's not a detectable conflict.

seancorfield00:09:31

And that's true of Java libraries published to Maven today. The group/artifact is independent of the Java packages.

didibus00:09:44

Some languages actually validate that the namespace prefixes all match, I think Maven Central does that as well

seancorfield00:09:15

No, it doesn't. I use a bunch of Java libraries from Maven that are totally inconsistent in package names.

didibus00:09:54

Hum, ok maybe Maven Central doesn't. I wasn't sure.

didibus00:09:00

But still, I think that's more acceptable.

seancorfield00:09:14

My point is that this isn't a unique problem to t.d.a and git deps and if your concern was a real world problem we'd be running into it already with Java libs and stuff on Clojars.

didibus00:09:37

The scenario now is just prone to typos, or even people just randomly using a different convention, io.github or com.github or groupID with url, etc.

didibus00:09:17

I disagree, it's a very different problem. One requires a fork that did not change namespace names, and for libs to depend on the fork and the original

didibus00:09:34

The other just requires people to declare the deps with a different name

didibus00:09:45

It's the same lib, not a fork

seancorfield00:09:58

If you don't use :git/url then you can't typo the libname -- else t.d.a won't find it.

seancorfield00:09:22

io.gitflub.xadecimal/riddley isn't going to resolve.

seancorfield00:09:06

If you use :git/url then, yeah, you need to be more careful. But I think that's really only likely for regularly-published libs (to Maven/Clojars) that folks want to test against the latest source version -- and they already have a group/artifact so they'll just comment out :mvn/version and add :git/url / :git/sha.

seancorfield00:09:26

I'm basing this on my experience as an early adopter (and fairly extensive user) of the CLI, t.d.a, and git deps with several projects out there using :git/tag / :git/sha and, at work, relying on a number of tools (and some libraries) via git deps. Once you've had real world experience doing this, we can compare notes. Otherwise you're just imagining problems that I don't think will happen in the real world.

seancorfield00:09:51

(and this isn't the first time we've had this kind of discussion -- LOL 😆 )

didibus01:09:34

io.github.xadecimal/riddley
com.github.xadecimal/riddley
com.xadecimal/riddley {:git/url "https://github.com/xadecimal/riddley.git"}
are all going to resolve and won't conflict though

seancorfield01:09:31

Well, choose your poison, document it in your readme and live with it. Users of your library will generally do what you tell them in the README.

seancorfield01:09:18

If you think you're going to publish to Clojars as well you need to pick coordinates you can use for both.

seancorfield01:09:50

Clojars lets you use a domain you own or you can also verify a github-style domain (by logging in via GH if your username matches as I recall). So those are your Clojars choices.

seancorfield01:09:12

If you put two different group/artifact values in your README, you're just asking for trouble.

didibus01:09:15

Hopefully the problem doesn't manifest. But I've seen all possible version issues happen in prod in my short career. They're my most dreaded issues, so I'm sensitive to it 😛

didibus01:09:59

Ya, so I think I will go with explicit git url:

com.xadecimal/riddley {:git/url ""
                               :git/tag "0.2.1"
                               :git/sha "c905720"}

didibus01:09:47

This one allows me to change my git hosting in the future without breaking people, while still detecting conflicts once I change to, say, GitLab or some other git hosting. And it can match Clojars or Maven Central coordinates

seancorfield01:09:53

There ya go! You looked at the tradeoffs, made a decision, and now you're going to stick with it 🙂

didibus01:09:59

It should also detect conflicts between git deps and clojars correct? Since they'll have the same coordinate?

seancorfield01:09:26

It'll detect that they can't be compared, yes.

👍 1
didibus01:09:28

lol, well the decision I have not made yet, is if I want to ban git deps from work 😛

seancorfield01:09:20

That will ban several tools -- including tools.build and build-clj and deps-new

didibus01:09:28

This behavior makes me more worried as a user of git deps, as in, someone who might pull in a git deps, what kind of hidden conflicts will I suffer from if I do 😛

didibus01:09:59

Ya, might need to accept-list specific build tooling

seancorfield01:09:03

So much easier to go to a given GH repo and SHA and see exactly what it depends on tho'...

seancorfield01:09:16

With some random JAR on Maven, you can't do that.

seancorfield01:09:48

You have to trust that contents are built from something you can trust.

didibus01:09:10

But I don't understand why equal coordinates can't be treated equal by t.d.a and just conflict?

seancorfield01:09:09

Equal coordinates => choose the newest version (and throw an error if you can't determine the newest one). That's exactly what it does.
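
To make the rule concrete, here is a hedged sketch of three hypothetical libraries (lib-a, lib-b, lib-c are invented names) that each depend on riddley. It assumes, as described in this thread, that versions are only compared when the lib name (group/artifact) is identical. The tags, shas, and URL are the ones quoted elsewhere in the conversation.

```clojure
;; lib-a/deps.edn (hypothetical library)
{:deps {com.xadecimal/riddley {:git/url "https://github.com/xadecimal/riddley.git"
                               :git/tag "0.2.1"
                               :git/sha "c905720"}}}

;; lib-b/deps.edn (hypothetical library)
;; Same lib name as lib-a: the two versions are compared and the newest
;; one is chosen, or an error is thrown if "newest" cannot be determined.
{:deps {com.xadecimal/riddley {:git/url "https://github.com/xadecimal/riddley.git"
                               :git/tag "0.2.2"
                               :git/sha "d1ac17e"}}}

;; lib-c/deps.edn (hypothetical library)
;; Different lib name, same repo: treated as an unrelated lib, so it is
;; never compared against the two declarations above.
{:deps {io.github.xadecimal/riddley {:git/tag "0.2.1"
                                     :git/sha "c905720"}}}
```

An application depending on all three would end up with one riddley from the lib-a/lib-b comparison and a second copy from lib-c, which is the scenario being debated here.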

didibus01:09:26

Hum... ok so wait, maybe there is no problem then. You mean that:

io.github.xadecimal/riddley
com.github.xadecimal/riddley
com.xadecimal/riddley {:git/url ""}
garbage/whatever {:git/url ""}
All these actually are treated equal and will be auto-resolved to only pull in the newest version?

seancorfield01:09:26

No, coords = group/artifact

didibus01:09:28

I forgot that. When I tested, I thought it didn't throw a conflict error, but I forgot t.d.a auto-resolves to the newest. So I may be worried for nothing

didibus01:09:20

Generally, I trust the domain validation done by Maven Central.

didibus01:09:52

And the groupId prevents obvious name squatting issues

seancorfield01:09:15

But you have no guarantee of what's in a given Maven JAR: just compiled code. You can't guarantee what it was built from. Do you decompile and verify every Maven JAR? With GH deps you know exactly what source code is being used...

seancorfield01:09:50

And with all the git deps, the source is downloaded locally to your CLI process for inspection

didibus01:09:58

Yes I agree, there's a lot to like about Git deps, that's why I'm more so disappointed that it introduces other issues. I don't feel it needs to, seems a bit self-inflicted. Like deps.edn could add a :lib key and a :version key, and then t.d.a could use that for conflicts. Or it could use the actual coordinate, in the sense of where it pulls from, everything pulled from the same place would conflict. Not as good since it can't detect conflicts across sources, but still better.

seancorfield01:09:19

Can't be in deps.edn in the repo - that would allow people to hijack other libs

didibus01:09:54

I was thinking the combination, like if inside deps.edn you declare a deps on com.foo/bar then in the deps.edn of com.foo/bar it has to say :lib com.foo/bar

didibus01:09:41

That way, it's not just my README that enforces the coordinate name of my lib, but if I say the lib is meant to have coordinate com.foo/bar that's how people have to declare it.

seancorfield01:09:53

Interesting idea. You should put that on http://ask.clojure.org! I'd vote for that optional check.

👍 1
seancorfield01:09:45

Just to mess with you further, a git dep could have pom.xml and no deps.edn I believe 😉

didibus02:09:43

lol, I need the Jackie Chan emoji that goes like Why? for that one. But, I think that would be fine, you can validate that the coordinate used to declare the dependency on it is equal to its groupId/name

seancorfield02:09:55

How? I'm talking about a random git repo with a pom and no deps.edn

seancorfield02:09:55

🥋 Jackie Chan?

didibus02:09:36

So a package with a deps.edn depends on a git dep that only has a pom file, correct? And pom files include groupId and name, no? So t.d.a can compare the coordinate I used in my :deps map against what's inside the pom, no?

seancorfield02:09:48

Is that required in a pom file? Does any tooling check that?

didibus02:09:52

This emoji 🤣

seancorfield02:09:57

I don't know if aether exposes that when analyzing a pom file but it's an interesting option. Again, add that to the http://ask.clojure.org item! 👍

👍 1
didibus02:09:11

I'm not sure, I assumed it was required in a POM, but maybe it's not. That said, I think what I'm asking could be optional, might have to be anyways for backwards compatibility. So maybe it can be: if pom.xml or deps.edn declares a lib group/name then assert it matches, otherwise fall back to current behavior.

didibus01:09:57

So to recap my concern:

{io.github.xadecimal/riddley {...}}

{com.github.xadecimal/riddley {...}}

{com.xadecimal/riddley {:git/url "" ...}}

{com.xadecimal/riddley {:mvn/version ...}}

{garbage/whatever {:git/url ""}}
Will all resolve and will not conflict with one another, and be treated as separate libs if I understand, even though they are all the same lib, thus allowing to bring multiple conflicting versions on the classpath and having random order of which one will actually be used in production. And this is true even in the transitive dependency closure. Is this just an accepted issue, or are there plans to address this in the roadmap?

hiredman01:09:18

Same issue exists with maven alone, just takes more effort to trigger, because you have to publish the same artifact under different group and artifact ids

hiredman01:09:38

Which definitely happens on clojars (people have a habit of publishing 3rd party jars to clojars under their own group id, so the same library is there under different maven coords)

didibus01:09:48

Yes, but at least here, in theory, it is two different libs, albeit with people not taking care to properly namespace their code so it won't conflict. I guess we don't know the statistical likelihood of this scenario, and shall see if it happens a lot or not. But I also feel t.d.a could address it before we even have to find out, and was curious if it's already planned, or if it's predicted to not be a major issue and won't get addressed.

hiredman01:09:40

Nah, it is the same lib, just published under different names

hiredman01:09:38

Also git is a distributed vcs, and tools.deps supports local repos too, so you can go absolutely wild, no reason the artifact id has to be anything like the url of the repo

hiredman01:09:43

This is fundamentally the issue in maven, that the maven coordinate names have no relationship to the names of the code in the artifact

hiredman01:09:44

Which can express itself in different ways (badly packaged jars that include some dependency in them), the same code packaged slightly different ways and pushed to different maven coords, etc

hiredman02:09:14

The difference between this and git deps is who gets to assign the name

didibus02:09:50

It's a fork no? If it's published by someone else? I mean yes it might be the same code, but in theory it's a fork. I guess it's philosophical if a fork is the same lib or not. Anyway, I feel the forking scenario logically would be much less common. Also, for some forks you'd want them to be treated differently, so you can have both the fork and the original on your classpath (for good forks that remembered to rename their packages/namespaces). But here it's the same lib in every possible way. Same code, same git repo, same author, same publisher, etc.

hiredman02:09:00

In maven it is whoever publishes an artifact (which doesn't have to be the person who wrote or packaged it)

hiredman02:09:19

For git deps the consumer chooses the name

hiredman02:09:48

You can absolutely find the exact same code published on clojars under different artifact names

didibus02:09:02

Yes, same code, I said same code and author and publisher and git repo, not or.

didibus02:09:02

Anyways, consumer choosing the name seems at odds with conflict resolution.

hiredman02:09:14

I think the best thing would be to use the git url as the artifact name directly, instead of having the url in the version map, but the ship has likely sailed on that

hiredman02:09:27

Again, that doesn't solve all possible clashes, but puts things in a similar place to maven

hiredman02:09:37

But, depending on your view of the capabilities of the consumer, putting the choice in their hands could be seen as a way to give them an escape hatch to rectify poorly chosen names

hiredman02:09:19

So maybe instead of making the git deps situation more closely match maven, there should be a way to override the artifact id for maven deps

hiredman02:09:40

For the discerning build engineer

didibus02:09:14

I think overriding might be a good feature in addition to also being better at properly identifying conflicts between equivalent coordinates.

didibus02:09:08

Right now I'll create an http://ask.clojure.org that suggests adding a validation that if a lib defines a :lib com.foo/bar inside its deps.edn, then people that take a dependency on it have to use that same lib name otherwise t.d.a errors. That would solve the issue, in that the owner can now again decide what the coordinate should be and make it standard. And possibly a second feature could be to override that when needed, if say someone else happened to use the same lib-name.
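
A rough sketch of how that proposal might look, purely for illustration. The `:lib` key is hypothetical: it is the idea floated in this thread, not something tools.deps recognizes today, and the contents of the library's own deps.edn are assumed.

```clojure
;; HYPOTHETICAL: `:lib` is the key proposed in this discussion, not an
;; existing tools.deps feature.

;; deps.edn inside the riddley repo: the maintainer declares the
;; canonical lib name (the :deps contents here are assumed).
{:lib com.xadecimal/riddley
 :deps {}}

;; Consumer deps.edn, name matches the declared :lib:
;; the proposed check would pass.
{:deps {com.xadecimal/riddley {:git/url "https://github.com/xadecimal/riddley.git"
                               :git/tag "0.2.1"
                               :git/sha "c905720"}}}

;; Consumer deps.edn, name does not match the declared :lib:
;; under the proposal, t.d.a would raise an error here.
{:deps {garbage/whatever {:git/url "https://github.com/xadecimal/riddley.git"
                          :git/tag "0.2.1"
                          :git/sha "c905720"}}}
```

As discussed above, the check would likely need to be optional so that repos without a declared name keep today's behavior.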

hiredman02:09:03

That would lock things down beyond what maven does

didibus02:09:25

How so? Seems more to me it would bring parity. You publish a lib on git, you get to choose the unique name for it that will also be used for conflict resolution.

seancorfield02:09:04

Nope. Any old garbage can be published to Maven/Clojars once you have the group verified.

seancorfield02:09:54

You can't currently enforce much on the Maven side. But I think this would be a great enhancement on the git deps side.

Alex Miller (Clojure team)03:09:09

I didn't read all of this, but basically you, as a library maintainer, should tell people what coords to use. To date, this has not been an issue. I'm open to adding more validation around this - at the moment it's low priority. Also, I've scoped out how to validate maven deps against deps (maven deps usually have git coords and that's sufficient to determine “newer”). Not implemented yet, but there's a hole for it.

👍 2
didibus03:09:10

@U064X3EF3 The request on http://ask.clojure.org summarizes it, you don't have to read all this. But basically, I'm asking for a way for tools.deps to assert that users use the coordinate that the lib maintainer wants them to use. That will also let the lib author make sure that, if they double publish to Maven Central or Clojars as well as git deps, the coordinate used by users is the same between those.
