Betterstack attempts a definition
cyclic graph be like:
:ekil eb hparg cilcyc
What do people think of the recent NPM supply chain attack, and how Ruby just removed all access to RubyGems, and so on... And how that all relates to Clojure? Like Clojars, Maven, git-deps, etc. ? And I guess also ClojureScript, since that's at the mercy of npm as well.
I'd argue that the biggest issue was brought in by Maven with its transitive dependencies. With them, the dependency graph just exploded. Prior to that you were forced into a conscious relationship with your dependencies (or at least an explicit one). Scrutinizing dependencies in a JVM-based project today is overload.
... but convenience always wins, i guess
Your messages seem to contradict each other. How come scrutinizing is an overload while convenience wins? If we compare two hypothetical projects with identical functionality, one having well-scoped transitive dependencies and the other having only top-level dependencies that all implement what they need or explicitly require you to manually add a dependency on something else, the first one would be much more maintainable. I would have to look at some transitive dep X@2 just once to know what it is and what it does. I wouldn't have to look at A's, B's, and C's version of X in their own sources. In my eyes, transitive deps are both more convenient and less overload. But they are more prone to supply chain attacks simply because the attack surface is larger. If the convenience to review is there, it should be used.
No, transitive dependencies lead to more dependencies, just because it's convenient. It's a complete culture change.
That's why I mentioned "well-scoped". More dependencies does not necessitate more code to review. It also draws a clearer and much stricter boundary between different pieces. It makes it easier for you to substitute something if you find it unappealing for whatever reason.
That last one doesn't make sense. If you'd like to substitute something 7 levels deep in the graph, then good luck. A flat list is always easier to reason about. Yes, it's much harder to maintain dependencies manually, but at least it gives you the control.
Why would substituting it be a problem if it has a separate API? It's literally just a coordinate change in your deps.edn, it's a trivial thing that I've done more than once.
I don't mean "substitute" as in "use a different dependency with a different API thus rewriting all the client code".
I mean "use the same API but a different impl".
I still have full control with deps.edn, there's nothing that can happen that I wouldn't be able to alter.
I guess that this is much less of a problem with Clojure and the culture therein. But with Java and frameworks the dependency graphs quickly end up being unwieldy. And I still argue that because of the convenience of transitive deps, we get many more deps. If we didn't have that convenience we'd have more self-contained libraries and maybe even a richer standard library.
npm is a great example of dependency explosions, btw
(and it took npm several years to admit that automatic upgrades of transitive deps was a potential bad thing)
Yes, we're arguing about different things. I'm staying in the "well-scoped, at least mostly" realm, and you argue that in reality it's harder to stay in that realm with transitive deps. And I do agree with that notion.
> (and it took npm several years to admit that automatic upgrades of transitive deps was a potential bad thing)
Ha, last time I checked (years ago), the incredibly idiosyncratic author of one of the most popular Python package managers, pipenv, had an explicit and not-to-be-argued-with strong opinion that all deps must be automatically upgraded if you want to upgrade even a single library.
Ugh! That's horrible. That was also the prevailing opinion in the npm community: that it was much better to automatically upgrade dependencies because it allowed fixing security issues easily. But the potential cost of the side effects was completely ignored for a long period. (Anecdotally, we were using npm in a project on Windows a long time ago, and someone managed to break a very central utility library in a patch upgrade because they didn't test it on Windows. It was impossible to fix except by manually patching the library, which was not easy to do on the CI server. Shortly after that npm introduced shrinkwrap, which I guess is the precursor to lock files.)
So it seems we're not any better guarded than NPM was.
1. You have to be careful about git preps.
2. You can't tell if a transitive dependency gets updated under your feet.
3. Clojars tokens have full permissions to all groups/packages.
4. No 2FA on CI/CD possible.
5. Top-level code running in tests or at AOT time can do anything the process is allowed to do.
6. No way to lock transitive deps easily.
7. No way to add a "lag", like only grabbing a new version once it has been released for a month.
8. No trusted publishing with short-lived tokens and so on.
> You can't tell if a transitive dependency gets updated under your feet.
That doesn't happen when fixed versions are used, which is almost always the case with Clojure.
> No way to lock transitive deps easily
With the above, there's no reason to lock them (assuming an already existing version cannot be overridden in the repos).
But there's still a way - you just print out all the deps and put them all into deps.edn. But again, there's no reason to do it.
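For what it's worth, the "print out all the deps and put them all into deps.edn" step can be scripted. A minimal sketch in Python, assuming you already have the resolved set as a lib-to-version mapping and that everything is a Maven dependency (git deps would need `:git/sha` coordinates instead). The function name `to_deps_edn` and the sample versions are made up for illustration.

```python
def to_deps_edn(resolved):
    """Render a {lib: version} mapping as a deps.edn :deps map with every version pinned."""
    entries = [
        f'{lib} {{:mvn/version "{version}"}}'
        for lib, version in sorted(resolved.items())
    ]
    return "{:deps\n {" + "\n  ".join(entries) + "}}"


# Hypothetical resolved set, e.g. copied by hand from the output of `clj -Stree`.
resolved = {
    "org.clojure/spec.alpha": "0.3.218",
    "org.clojure/clojure": "1.11.1",
}
print(to_deps_edn(resolved))
```

Top-level entries in deps.edn take precedence over transitive versions, which is why listing everything there acts as a pin.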
> Top-level code running in tests or at AOT time can do anything the process is allowed to do.
And any non-top level code can do anything your app can do, yes. That's why any dependency should be vetted. :)
Assuming nothing on your dependency tree uses "latest". But what I mean is: if you bump up the version of a direct dependency, you cannot easily know what else has been bumped up transitively.
Doing anything your app can do is fine. The issue is also being able to do anything your build script can do when it launches test runner and AOT.
> Assuming nothing on your dependency tree uses "latest". But what I mean is: if you bump up the version of a direct dependency, you cannot easily know what else has been bumped up transitively.
Depends on what you mean by "easily". It's just two runs of clj -Stree with two different versions of deps.edn and a diff on top. (Aliases should be taken into account of course, that's highly app-specific. Perhaps something like antq can actually be extended to output the list of transitive dependencies that would be changed after any suggested update.)
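A rough sketch of that "two runs and a diff" idea in Python. It assumes you saved the two tree outputs as text, and it assumes a line shape of optional indentation, an optional one-character marker ("." for included, "X" for excluded), then "group/artifact version". The real output format may differ between CLI versions, so treat the parser as a placeholder. All library names below are hypothetical.

```python
def parse_tree(text):
    """Collect {lib: version} from tree-style output (assumed line shape, see above)."""
    deps = {}
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0] in {".", "X"}:
            if tokens[0] == "X":
                continue  # excluded entry, not part of the resolved set
            tokens = tokens[1:]
        if len(tokens) >= 2 and "/" in tokens[0]:
            deps[tokens[0]] = tokens[1]
    return deps


def diff_deps(before, after):
    """Return {lib: (old_version, new_version)} for libs that differ; None means absent."""
    changes = {}
    for lib in sorted(set(before) | set(after)):
        old, new = before.get(lib), after.get(lib)
        if old != new:
            changes[lib] = (old, new)
    return changes


# Hypothetical outputs captured before and after bumping example/app-lib.
before = parse_tree("""example/app-lib 1.0.0
  . example/json 2.1.0
  . example/http 0.9.0
""")
after = parse_tree("""example/app-lib 1.1.0
  . example/json 2.3.0
  . example/http 0.9.0
  . example/retry 0.1.0
""")
for lib, (old, new) in diff_deps(before, after).items():
    print(lib, old, "->", new)
```

Anything in the result that you did not bump yourself is a transitive change worth a look.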
> Doing anything your app can do is fine.
There are attacks that target things like this. Take any cryptominer for example.
There are always threat vectors, but you want to reduce the surface. The particular issues that were brought up are not as well known. That your app can be compromised and then someone could do anything it can do is an issue, but it's well known. That simply pulling down a repo and running the build can be a threat vector, without ever running the app, is much less known.
I don't know if you read the details. But it was able to make its way through more popular packages by stealing CI/CD access one at a time. It would add itself to another build pipeline to steal their keys. Then it would add itself to all the pipelines those keys had access to, and so on. That way it managed to make its way into many packages.
Once it found itself in a popular package, it could now push malicious code to the library itself.
And it didn't have to modify the GitHub code, it could publish a new version of the lib with malicious code directly to NPM.
> That simply pulling down a repo and running the build can be a threat vector, without ever running the app, is much less known.
Not sure I agree that it's less known, but alright. You run third-party code in your CI/CD pipeline. I struggle to see how anyone would not consider it as inherently unsafe without proper measures. To me it has always seemed the case of people not caring enough. In any case, this particular disagreement is unlikely to be of any use in this thread.
> You run third-party code in your CI/CD pipeline. I struggle to see how anyone would not consider it as inherently unsafe without proper measures
I feel it's pretty easy to not realize you are running third-party code. In NPM's case, a lot of people didn't realize that just pulling down a package could automatically run some arbitrary install script, for example. In Clojure, you might not realize that calling prep on a dependency will execute arbitrary code on your machine. You might not realize that using the AOT task in your build tool to create an AOT uberjar will run arbitrary code.
And even if you realize it, the risk isn't simple to mitigate. Even if you containerize and sandbox it all, it could be enough to exfiltrate the Clojars token by just having it println the env variable. With that token, an attacker can just publish a new artifact that is entirely malicious code.
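One partial mitigation for the env-variable leak is to run the steps that execute third-party code (tests, prep, AOT) with credentials stripped from the environment, and expose the token only to a separate publish step. A minimal Python sketch of that idea; the marker list, the `CLOJARS_TOKEN` variable name, and both function names are assumptions for illustration, and a strict allowlist would be safer than this name-based filter.

```python
import os
import subprocess

# Hypothetical naming convention: any variable whose name looks like a credential.
SECRET_MARKERS = ("TOKEN", "SECRET", "PASSWORD", "KEY")


def scrubbed_env(env):
    """Return a copy of env without variables whose names look like credentials."""
    return {
        name: value
        for name, value in env.items()
        if not any(marker in name.upper() for marker in SECRET_MARKERS)
    }


def run_without_secrets(command):
    """Run a build or test command with credential-like variables removed."""
    return subprocess.run(command, env=scrubbed_env(os.environ), check=True)


# Hypothetical CI environment.
env = {"PATH": "/usr/bin", "CLOJARS_TOKEN": "s3cr3t", "HOME": "/home/ci"}
print(scrubbed_env(env))
```

This does not help if the publish step itself runs untrusted code, so it only narrows the window rather than closing it.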
> Teams should start enforcing hardware-based 2FA, short-lived tokens, off-by-default install scripts in CI, cool-down period before adoption and organization-wide review of new package versions. Pairing these practices with SBOM-driven inventory with automated blocklists provides even better protection.
> also ClojureScript, since that's at the mercy of npm as well
Only if you use NPM deps. Which you can avoid completely.
It's nothing new, even places with the most scrutiny get successful attacks like that from time to time. But proper protocols and scrutiny reduce the probability of attacks, so hypothetically that would be nice to have. But it also incurs a lot of additional costs of various kinds and might end up not being worth it. But also, it's commonly stated that dependency management in reasonably large projects is a full-time job. That's one of the reasons why. Don't just blindly install X@3 because you had X@2 and 3 > 2.
Maybe, just maybe, we don't need a lot of dependencies? I guess frameworks and all are cool, but there are tons of deps out there that export a single function. LLMs might help with that a bit, I guess, as these one-function kinds of solutions are usually easy to solve with LLMs. I use Claude Sonnet 4 for these and it just works.
I think maven, while there is the behemoth of central, has more support for running your own and more people taking advantage of that support
> it just works
Till it doesn't. :D Just yesterday, an LLM was feverishly trying to convince me that RLS policies in PostgreSQL are restrictive by default. "Quoting" its "sources" and telling me that I misunderstood the docs. Then I found a few very professional-looking articles that claimed the same, all written after 2024. Now, it's not hard to imagine that someone would believe an LLM or such an article...
The common practice of using version ranges in JS projects makes this kind of attack more effective: make a patch release with your payload and people just download it.
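To see how exposed a project is, you can list which top-level entries in package.json are ranges rather than exact versions. A small Python sketch, assuming anything that is not a plain `MAJOR.MINOR.PATCH` (with optional pre-release or build suffix) counts as unpinned. The package names are hypothetical, and transitive ranges would need the lock file instead.

```python
import json
import re

# Exact versions only, e.g. "1.2.3" or "1.2.3-beta.1"; everything else is treated as a range.
EXACT = re.compile(r"^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$")


def unpinned(package_json_text):
    """Return {name: spec} for top-level dependencies that are not exact versions."""
    manifest = json.loads(package_json_text)
    found = {}
    for section in ("dependencies", "devDependencies"):
        for name, spec in manifest.get(section, {}).items():
            if not EXACT.match(spec):
                found[name] = spec
    return found


# Hypothetical manifest.
sample = """
{
  "dependencies": {"example-http": "^1.3.0", "example-utils": "4.17.21"},
  "devDependencies": {"example-tool": "~2.0.0"}
}
"""
print(unpinned(sample))
```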
Two things could make the JS ecosystem less dangerous:
• A package manager flag (or default) which ignores the "optional upgrade" aspect of npm dep specifications and just pins to the actual version number. This would make package.json act like a package lock with exact versions specified, similar to how Clojure/Maven deps work. It should apply throughout the deps hierarchy, not just at the top level.
• On new installs of e.g. something@latest, don't install any version that was published less than one week (or maybe one day) ago, unless explicitly specified by the user. These backdoors of big packages generally get patched within hours. The reason they are so dangerous is CI and people doing npm install x for the first time.
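The cool-down idea from the second bullet is simple to express. A minimal sketch in Python, assuming you can get a version-to-publish-time mapping from the registry; the function name `newest_settled`, the version numbers, and the dates are made up. It picks by publish time rather than by semver ordering, which is a simplification.

```python
from datetime import datetime, timedelta, timezone


def newest_settled(releases, now, min_age=timedelta(days=7)):
    """Return the most recently published version that is at least min_age old, or None."""
    settled = [
        (published, version)
        for version, published in releases.items()
        if now - published >= min_age
    ]
    return max(settled)[1] if settled else None


# Hypothetical release history for one package.
now = datetime(2025, 9, 20, tzinfo=timezone.utc)
releases = {
    "1.4.0": datetime(2025, 6, 1, tzinfo=timezone.utc),
    "1.4.1": datetime(2025, 9, 1, tzinfo=timezone.utc),
    "1.4.2": datetime(2025, 9, 19, tzinfo=timezone.utc),  # one day old, skipped
}
print(newest_settled(releases, now))
```

With the default one-week window the one-day-old release is skipped, and widening `min_age` to a month falls back one release further.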
Unlike other ecosystems, npm does this crazy yolo install-upgrades-automatically thing by default. It's absolute madness. It means 100s or 1000s of pieces of software are getting modified underneath you every time you install without a package lock in place.
The basic philosophy is newer > older. Whereas in my opinion the opposite of that is actually more often true (the Lindy effect: https://en.m.wikipedia.org/wiki/Lindy_effect).
Upgrades should be considered an intentional act.
I feel the "don't use dependencies" or "audit everything" advice, while good, is best intentions and not super realistic. I'm sure there are systematic ways to limit the attack surface in the design of things that go beyond these best intentions. NPM seems more prone to supply chain attacks, but I'm wondering, is it really? How do Clojars/Maven fare? In this case it seems one big issue was that CI/CD had a very broad token that allowed modifying NPM packages. It could modify any package owned by that person, not just the package that the CI/CD was auto-publishing.
Something else I read is whether 2FA or something similar should be required for CI/CD publish, so there's always a human in the loop for approval.
This blog mentions some steps NPM is taking as well as GitHub: https://github.blog/security/supply-chain-security/our-plan-for-a-more-secure-npm-supply-chain/
PyPI introduced trusted publishing in 2023, and NPM is adding support for it. I don't know if Maven/Clojars supports it.
JS is a very small language, with almost no "core" library. If you want to do anything remotely useful with JS, you'll need a dependency. That might be why NPM is vulnerable to these attacks: if you want to know if something is a number, it's not straightforward. So we end up with these immensely useless-ish "is-odd" packages.
Also, one aspect which at least is a bit safer in the Maven world is that there isn't an automatic post-install script that runs on each deps installation as you'd have in the JS world.
@ovidiu.stoica1094 True, though I've wondered: if your CI/CD does AOT compilation, it could run macros that would steal env vars maybe... Or I guess any top-level form would get executed during AOT.
If you're using a Clojure library with a prep step aren't you potentially pwned? Package managers are always a risk at some level. You're reliant on a central repository. What if it's down or compromised?
Security is a spectrum. Current npm defaults are quite far on the wrong end of the spectrum.
Ya, and here I was curious where on that spectrum Clojure would land. We benefit from being a niche nobody targets as actively to attack, but if we were?
If you manage to sneak malicious code into a repository you can wreak as much if not more havoc than you could with npm.
There are a handful of developers in this community who, if you manage to compromise them, will pwn half the ecosystem if not more. On the other hand, we're a small diligent community and I wouldn't be surprised if those libraries' commits are read for fun by n > 1 people.