2019-12-26
Channels
- # beginners (74)
- # cider (3)
- # cljsrn (1)
- # clojure (182)
- # clojure-dev (4)
- # clojure-europe (1)
- # clojure-spec (5)
- # clojure-uk (58)
- # clojurescript (44)
- # core-async (5)
- # core-logic (10)
- # cursive (6)
- # datomic (13)
- # duct (1)
- # fulcro (2)
- # graalvm (4)
- # leiningen (1)
- # off-topic (7)
- # overtone (6)
- # random (3)
- # re-frame (17)
- # reitit (2)
- # shadow-cljs (6)
- # spacemacs (4)
- # sql (12)
- # tools-deps (3)
I've noticed a common use of comp with a function that is passed in as an argument. For example (comp f vals) vs (f vals). Why the comp?
Can you show an example? From what you have here, (comp f vals) is not the same as (f vals). Perhaps you meant apply? If vals is a list of (say) 3 values (1 2 3), and f takes 3 arguments, then (apply f vals) would be the same as (f 1 2 3).
What I've seen is functions which use (comp f arg) where I would normally just do (f arg) or maybe (apply f args), and I wasn't sure why you would use comp. I've seen this more than once, so thought there must be a reason. Is there any situation where you would use comp when you just have a single function to evaluate?
@theophilusx That doesn't make sense. (comp f arg) is not the same as (f arg), nor is it the same as (apply f arg). Can you link to examples of functions that you think are using (comp f arg) in this way?
Unsure if this is the right channel to ask this question... I'm designing a microservice on AWS, and its artefact is an orchestration of a couple of SQS queues, 3 Lambdas, APIGW, among other things. How do I structure my code, and write my build program (invokable with cli-tools) so that it produces three artefacts, one per lambda? I'd like to stick to one git repo if possible. Some amount of domain logic will be shared b/w the three lambdas.
@jaihindhreddy What sort of artifacts do you want to build? Basic JVM uberjars? Is the JVM/Clojure startup time acceptable for Lambda for you?
Assuming uberjars are fine, you're talking about a monorepo with multiple subprojects -- which is what we have at work -- and we build uberjars with depstar / CLI / deps.edn.
Each subproject has its own deps.edn file and treats the other subprojects as :local/root dependencies (via ../<subproject> paths) as needed.
Since depstar builds the requested JAR/uberjar based on the classpath, you would just cd into each lambda subproject and build the JAR based on the classpath created from the deps.edn file, just as if you were running the lambda locally (but running hf.depstar.uberjar as your main namespace instead of whatever is in the lambda code).
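For illustration, a minimal sketch of what one lambda subproject's deps.edn and build invocation might look like (the subproject names, paths, and versions here are hypothetical):
;; lambda-a/deps.edn (hypothetical layout)
{:paths ["src"]
 :deps  {org.clojure/clojure {:mvn/version "1.10.1"}
         my-org/common       {:local/root "../common"}}   ; shared domain logic
 :aliases
 {:depstar {:extra-deps {seancorfield/depstar {:mvn/version "0.3.4"}}}}}
;; then, from inside lambda-a, build the uberjar off that classpath:
;;   clojure -A:depstar -m hf.depstar.uberjar target/lambda-a.jar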
Does that help?
Helps a lot!
At least one of those is latency sensitive, but doesn't do much computation, so I'm thinking Graal Native image.
Never tried Graal though, and not sure how it works with Clojure.
But yeah, the monorepo route makes sense.
There's a #graalvm channel so you can check that out. There are some restrictions that go with it (can't use Spec, for example, at the moment, and certain "dynamic" things) but startup times are awesome.
@seancorfield Sorry, typo on my part - should have been ((comp f) v). One example is
(defn process-map [f m]
  (into {} (map (fn [[k v]]
                  [k ((comp f) v)])
                m)))
which I think is just poor code, but wasn't sure. I can think of a couple of alternative ways to do the above which are shorter and I think clearer. However, I saw this ((comp f) v) construct a couple of times and was wondering if there is something subtle I'm not understanding
No reason whatsoever, since (comp f) is just f. Probably leftover code that used to be something like (comp f g).
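For illustration, a quick REPL check that (comp f) behaves the same as f, plus the same function written without the redundant comp (the example values are arbitrary):
user=> ((comp inc) 41)
42
user=> (inc 41)
42
;; process-map without the (comp f):
(defn process-map [f m]
  (into {} (map (fn [[k v]] [k (f v)]) m)))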
@p-himik Thanks. That was my suspicion, but thought I'd check
I've seen this a few times, people asking how to build several services together because of shared code. Wondering if this is common in Clojure (I'm a Clojure newbie). What I would do in Java, though, is isolate that code into its own project and build it as a dependency for the other services. You would also be able to have independent build routines for each then.
I don't know if it's more or less common in Clojure than in other languages. Having multiple projects, potentially multiple git repos, instead of one single big codebase is an open question IMO. I had a very annoying experience at a previous job where we had many common libraries, each living in their own git repo, each used in multiple projects.
There’s a lot of folklore around it but once you’ve worked in a monorepo, you probably won’t want to go back.
> many common libraries
It sounds wrong. Maybe you made libraries from things which are used in only 1 project, or split them too much. Like micro libraries ;)
I'm sure if we were a bigger team I would have found advantages to that, but as it stood, our small 3-man dev team spent a lot of time doing commit -> pr -> release -> update dependencies of all relevant projects
whenever we needed to make a change in one of our common libs. Which was often. Working there I dreamed at night of working with a monorepo.
You could use CI/CD for this purpose: if it detects a new version of X, it updates the dependency in the project and tests it with the new dependencies. If the tests pass, it makes an automatic commit with the new version of your library.
This is probably the way I would choose if I found myself always bumping the version in each project after a release.
But definitely not keeping everything in one repository. That always ends in a mess and complexity created by developers.
Yes, context definitely matters. I don't think he was talking about a monolith though (one big code base). But they can also be so small that having them in the same project could make sense (it has its pros & cons). Alper also brings up monorepos, which many companies seem to be using now (never used one myself). You could also group small utility projects into a super pom, then use it as a dependency in your services. It's maybe just a personal preference, but I feel that edge/outward-facing services should be as independent as possible.
Monorepos have an intrinsic risk of coupling everything to everything (because there are no clear boundaries), and they also hinder reusability. Say you want to reuse a given snippet in a completely different project -- what do you do?
- copy/paste
- add the new project to the monorepo, further increasing size/complexity
Another problem I can think of is: what if you want to keep 2 versions of a function? With libraries it's easy, you keep 2 versions of the library. Also, probably say goodbye to sharing functions as OSS. ...I don't doubt one can do monorepos right, but they are also some sort of self-fulfilling prophecy when it comes to their actual benefits.
I do lib-driven development. I'd say the trick is in making lib creation, CI integration, and .jar releasing damn easy. For that I leverage https://github.com/technomancy/leiningen/blob/master/doc/TEMPLATES.md quite to the extreme
> you want to reuse a given snippet in a completely different project
You add it to the monorepo. That’s what it’s for.
> what if you want to keep 2 versions of a function?
Keep two versions, but separate them using a different name/folder, whatever?
> You add it to the monorepo. That’s what it’s for.
Not necessarily. It's also plausible to have 1 monorepo for a single logical project, however big. Having different logical projects (e.g. accounting + garden maintenance) under the same repo starts becoming ugly. Think IDE integration, git history, etc.
> Keep two versions but separate them using a different name/folder whatever?
Expensive way of versioning.
To be fair, that approach has its advocates, but in practice people aren't making a v2 namespace for every single change that can be deemed breaking. It's not practical as developers/branches/... scale.
A lot of the problems (besides folklore) are tooling support, that’s true (that’s why FB and GOOG have their own versions of VCS to support this).
It’s usually called ‘branching by abstraction’ and I recommend it 100x over git branch.
We have this very same situation at work but nothing says that you couldn’t build libraries from the monorepo and ship them to your local artifact repository for instance. Linking does not need to be at the source code level. The whole Golang dependency/library situation seems to be messed up anyway.
Sure thing! I just thought it best to keep this out of the main channel.
For context we have 90k lines of Clojure in a monorepo with about thirty subprojects representing about a dozen services.
An alternative way to handle monorepos with lots of libraries in them is with https://github.com/ingydotnet/git-subrepo Makes it much easier to both have easy code availability and the ability to share changes with other projects.
I think that just adds extra complexity...?
Not if you vendor in shared libraries and you both 1. Expect them to be updated externally 2. Want to change them and push the changes upstream
> For context we have 90k lines of Clojure in a monorepo with about thirty subprojects representing about a dozen services.
That doesn't tell much (positive or negative) about codebase health
As I wrote, I don't doubt one can do monorepos right. It's just somewhat unfortunate that the current status quo seems to be defaulting to monorepos. Reinventing git/lein/maven/IDEs with techniques that subvert their base assumptions seems just wasteful (but not unfeasible!) to me.
@p-himik ah, external libs... OK, I can see a small benefit if you do that a lot. I would rather avoid that practice tho'
Not necessarily external - just shared. "Externally" w.r.t. the project. To each its own. :)
@U45T93RA6 how does a monorepo reinvent or subvert how git/lein/etc work?
(We switched from lein to boot in 2015 and then to CLI/deps.edn in 2018... So maybe monorepos are harder to use with lein?)
It’s not that much about the code I believe, it’s about the scalability of your organization. There is a reason why larger orgs go for this approach. In part because they can afford the tooling overhead, but in larger part I believe because the overhead of separate repos across many contributors/projects adds up and slows everything down.
* git history becomes far less useful
* one may find oneself using submodules or one of the various alternatives
* GitHub issue/PR templates (generally) become one-size-fits-all
* one may start using e.g. https://github.com/amperity/lein-monolith (many languages have an equivalent thing), depending strongly on unofficial features
* normally IDEs assume one directory = 1 runnable project, so one would have to work around that
* likewise, CI runners generally assume 1 repo = 1 project
I bet all of these are workable. The point is about using things as they were designed, which IMO tends to be the simplest approach (in the 'simple' sense) to begin with.
Git history -- maybe, but the upside is also being able to see all related changes across all subprojects in a single PR.
Submodules -- why? I mean, just no, don't do that. It adds unnecessary overhead/complexity.
Issue/PR templates -- OK, if they're used at all. And I wouldn't advocate a monorepo for public OSS use except in very specific circumstances; I think a lot of companies like the monorepo approach internally for a lot of reasons to do with simpler workflows and management of everything.
lein-monolith -- argh, no, just don't! Again, like submodules.
IDEs -- Not seen that in Clojure. We run a single REPL with "everything" on the classpath and do all our development via that single REPL.
CI -- we have a single, fairly simple build shell script which wraps clojure so we can automate running multiple clojure commands from a single shell invocation -- and that's the only nod to non-standard setup we have: build tests * will drop into each subproject and run clojure -A:test:runner basically.
If you like submodules or multiple repos, use them. They just add complexity as far as we're concerned at work.
And I also disagree about them cramping any OSS efforts -- we've spun off several things as separate OSS libraries and that library just becomes a dependency in the subproject it was lifted from. Although I will admit that doing so adds overhead (both of documentation and maintenance, in addition to fragmented PRs if you need to rev the external lib first as part of some internal rev -- so it's easier to do this for stable pieces of subprojects, rather than things that are still evolving).
(but, yes, I will also concede that I'm sure there are orgs that do monorepos really badly -- but there are orgs that do all sorts of aspects of software really badly)
> IDEs -- Not seen that in Clojure. We run a single REPL with "everything" on the classpath and do all our development via that single REPL.
If you're using tools.namespace refresh, everything will become slower. ...I think you don't use refresh, but that's a different (and large) topic. I do think, though, that monorepos impose a specific namespace dependency layout, whereas libs keep things separate. You can refresh libs in isolation, or compose a mega project if you really feel like it.
> CI -- we have a single, fairly simple build shell script which wraps clojure so we can automate running multiple clojure commands from a single shell invocation
Also sounds slower? i.e. one changes a single utility defn, the whole integration suite may be triggered
Btw, I'm happy to see all kinds of points in this thread, reflecting the variety of techniques used among practitioners. But personally I'm not here to "argue on the internet".
We don't use the refresh workflow. We're in the same camp as Eric Normand and Stu Halloway on that.
Re: CI being slow -- we run tests as we're working in the editor/REPL so it's not like we suffer the CI overhead locally. Our build script can also run "just" the relevant tests for a given subproject. But, yeah, we're looking forward to whatever Cognitect do with automating the concept of test-only-what-changed...
(and, yes, this is an interesting discussion -- always good to hear people's opinions on pros and cons of various approaches!)
> We're in the same camp as Eric Normand and Stu Halloway on that
@seancorfield Can you elaborate a bit? I cannot find anything relevant quickly enough.
Eric talks about this in his REPL-Driven Development course as something he's never used -- and the course talks about best practices around working in the REPL and how to avoid the problems that refresh is supposed to solve.
Stu has talked about this in podcasts, and perhaps in one or two of his talks.
Both of them favor very simple, streamlined tooling/workflows, with as little "magic" as possible, as do I.
This is a good place to start https://danluu.com/monorepo/
So they didn't really choose a monorepo, they had a legacy monorepo and a whole infrastructure built around it. So when they moved away from Perforce they just kept that model.
I think I would really need to work somewhere that uses a monorepo across a large number of teams to know if I'd like it better.
At Google scale you definitely need custom tooling since there nobody will have the entire thing checked out on their laptop ever, but for most companies at their scale it is very much doable to have all source code in one repo. I would go for that if ever I had a greenfield situation again.
One thing: if you get developers used to working trunk-based with a monorepo, then it becomes very hard for them to go back. I have one such case right now.
Described here https://gist.github.com/chitchcock/1281611
Basically what @U45T93RA6 was saying. The lack of strong boundaries means people see the whole monorepo as one project and start to couple components, introduce circular dependencies, and all that, since the monorepo makes those easier.
As I said earlier -- some developers will do bad things regardless of the environment. That's why code reviews / PRs are a good thing -- to help educate everyone to the same level in terms of good (or at least accepted) practice within an org...
Ya, that's something I've been struggling with. Obviously, as a Clojure fan, I believe the developer is the most important part of any software project. And I like tools and languages that empower them instead of constraining and limiting them. But, as I go along, I'm finding it more and more difficult to mentor and scale. You can't do all the CRs, and as an organization grows from one team to 5 and then 10, with more and more junior devs, it gets hard, and it becomes really tempting to automate best practices by having tools enforce them, or systems that make the good way the most obvious and easiest.
Well, the problem there isn't the language or the tooling -- it's a management problem in terms of how they've handled growing their organization: they've allowed quality to slip as they've added more developers, and more junior developers, without proper support for mentoring and training. But you're right that is a common problem in industry, unfortunately, and it's often deeply entrenched in organizations that have already grown large -- and are often beyond help at that point 😞
And as I go, I'm also realizing a lot of developers want that and enjoy it. Which is surprising to me, because it's so different from my personality. Where I want more power, options, flexibility so that I can innovate and optimize everything to what's best for me and my team. I'm starting to realize a lot of developers don't want that. They want things to be consistent, straightforward, with checks in place, and clear directives, etc.
I worked at a firm of actuaries in the UK for a while, and when I left, the head of department asked me if I had any advice on my way out. I said they could fire about half their IT department, flatten the management structure, and they'd probably be twice as productive for about two-thirds of current costs.
They had a terrible interview process. They promoted good developers into a tall, very hierarchical management structure very quickly, backfilling with juniors, and they had no training or mentoring programs in place.
The head of dept was very proud that he had 100 people working for him 😞
It's weird. I always believed in "dream teams": a select set of developers with complementary skills who work effectively together. Give them focus and independence, and you get the legendary 10x team 🤪 But the industry really doesn't believe in that.
Indeed. Very few companies really believe in small, high-quality, autonomous teams 😐
Adobe was the same (but Macromedia was not). At Macromedia, I was considered "director" level but had no direct reports. I was a free agent to move from project to project and inject architectural guidance across the whole of IT. When Adobe bought Macromedia, I was told "we don't have architects" and because I had no direct reports, I was demoted two levels down to "tech lead". It was miserable.
I quit within a year. I could never work in that sort of environment again.
Unrelated, and back to monorepo. What does that link mean by managing dependencies? Does it mean that you'd add the src folder of your dependencies to the say.. Lein sources vector? Instead of using artifact dependencies like maven?
The whole monorepo having a single version number? I don't fully get it. Doesn't that just make it one giant code base, like a big monolithic app, which I feel is not the same as a monorepo.
"version number"? With Git/SHA-based deps, that isn't relevant.
I don't know how you'd manage a monorepo with lein -- we use CLI/`deps.edn` and that supports :local/root for dependencies, so each subproject can depend on others that way: worldsingles/environment {:local/root "../environment"} for example.
I see, and it's unversioned. So it's assumed you can only commit when all dependencies have similarly been updated? Or do you adopt a forever-backwards-compatible approach?
It's still versioned. It's just versioned via Git. We tag full system releases, and individual artifact releases are versioned based on their SHA and how many commits they are past the last tag, e.g., 2019-12-17_14.44.00-3-g361052e30 (that's the current production version of one of our services -- 3 commits past the last full system release, with a git SHA starting with 361052e30).
Looking in our Git history, I can see that includes a fix for WS-11932, vs the tagged release, so I can find the JIRA ticket relating to it easily.
The whole repo has a SHA. That is the version of any given dependency for a particular artifact build.
Like I mean, if I use local/root with a subfolder I would always be getting the latest code of that dependency. But say it's maintained by another team, and their latest broke my code?
And I don't want to wait for their fix. I want to deploy my code changes which maybe also provide a critical fix. But now the code on their subfolder doesn't work? In a multi-repo, I can choose the version of their package I want to use. How would I do that in the monorepo style?
master is always stable and fully-tested. Features are developed on branches. So you can release from master or a branch and each will be consistent.
Git branches provide that level of control.
There is no concept of "versions" -- there's no point in that.
Does it? Without a cherry-pick? Like, can I grab team B's latest, team A's from two commits before, and my latest?
You're still thinking in terms of "large releases". I'm in basically a CI/CD world.
Master is always releasable to production. You can branch/merge to get the latest working set of features you want to develop on top of. But the granularity between releases is generally very small so you're never "waiting" for a completed feature -- it will always be on master already.
(the only reason we're a week past our last full system release is the holidays and folks have been out of the office so there just hasn't been any work to release)
Maybe its just a whole other approach so I'm getting confused thinking in multi-repo terms
I think one of the things that Cognitect are pushing with CLI/`deps.edn` and for Clojure in general is to get away from this idea of "versioned artifacts" as "things" that you deploy to some archive and then fetch particular ones when you "build" your system. Instead, you can depend on source code via local "dependencies" and git URL/SHA.
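For illustration, the two styles of source-level dependency in deps.edn look roughly like this (the coordinates, URL, and SHA below are all made up):
{:deps {;; a sibling subproject within the same monorepo
        my-org/environment {:local/root "../environment"}
        ;; an external project pinned to a specific commit
        my-org/some-lib    {:git/url "https://github.com/my-org/some-lib.git"
                            :sha     "0123456789abcdef0123456789abcdef01234567"}}}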
For us, we make releases of services to production based on a "snapshot" of (a portion of) our source code at a given point (where all tests pass etc). We can make independent releases of each service as we need. Each service is built from a git SHA that is "consistent" across all the source code in that "build", and it is arrived at however that "team" needs it. In reality, that's nearly always a "snapshot" from master but it can also be from any branch since the SHA uniquely identifies any committed "world" of source code -- even tho' not all of that "world" is packaged into a service for deployment (just the subset it actually depends on).
Because we have the SHA of each released service (directly queryable via an API each service exposes), we can easily reproduce any past release by resetting the repo to that SHA -- and branching from it to do any additional work we need, such as for releasing a patch etc, and then that branch will ultimately get merged back to master (and may be merged across to other services' branches if needed for other interim releases).
Hum. It seems interesting. Very different. I'm still not sure I'm following all of it.
Happy to continue the discussion any time via DM if you feel this thread has exhausted itself...
These two articles are interesting https://medium.com/@mattklein123/monorepos-please-dont-e9a279be011b and https://medium.com/@adamhjk/monorepo-please-do-3657e08a4b70
The first was directly related to my prior question, and the second kind of explains it by saying that maybe you should feel the pain of breaking the entire company
Anyway, ya. I might revisit this in the future. Will DM you if I have any more question. Thanks for the insights into monorepos
I remember Matt Klein's article when it appeared... and of course I disagreed with most of it 🙂
Which builds, and runs everyone's tests, and makes a deployment everywhere on every code change?
We can run CI/CD on a per-service basis, where it only runs the tests for the subprojects that a given service depends on. We can also run it for the entire system. If we run it per-service, then there are quite a few repeated tests between runs since some of the low-level subprojects are used by every service but, on the other hand, the slowest tests tend to be those for the "top" of the service chain so those typically are not repeated by multiple per-service test runs.
I don't think I'd read Adam Jacob's response before so thanks for that link. Good reading. I agree with him 🙂
OK I see. Well, I don't have a lot of ways to try out a monorepo. But maybe I will change my personal projects to all be in one repo, see how that would go.
Ya, I found the rebuttal pretty relatable. I definitely encountered scenarios where I was team D. And you go to team A, hey you guys broke us. And they're like.... Oh.. Well you see, B and C already did all the migration work, and we forgot about you, so we've already added more features on top, and you took longer to notify us, so I guess you'll just have to migrate as well or use an old version
You want to say.. Well... clearly you made a wrong decision and you shouldn't have made such a breaking change without consulting all your clients first, but you're kind of forced to capitulate since everyone else already migrated and moved on.
I will say, if you have Facebook/Twitter/Google scale and the sheer volume of source code won't fit on an average developer's hard drive then, yeah, you have a real problem. But a polyrepo only mitigates that if no developer needs to work across enough of the code that all the necessary repos can't fit locally -- and then you're back in the A/B/C/D team API breaking scenario...
Ya, honestly I can see both having pros/cons. And it doesn't totally seem like either solves all the pain points
I think I tend to like the open source model, and I treat internal packages the same generally
If you are not at that scale -- and you can fit "all" the code on every developers' laptop -- then a monorepo has a lot of benefits with very few downsides in my opinion.
Aye, and OSS projects don't generally work well with a monorepo approach (again, IMO).
In my search I found this as well: https://github.com/mateodelnorte/meta
Any time you use a one of those shell scripts or git extensions, I’m pretty sure you’ll enter a world of pain. (I consider git submodules already to be unusably abstruse.)
Seems pretty straightforward too, it just runs shell commands over a set of directories
And it stores the set of repos/local directories to run the command over and that are part of the meta repo in a git repo of its own.
I liked the simplicity of the idea. Now, it might have bugs or rough edges so I can't really say.
Our build shell script at work is essentially a specialized variant of loop -- but it doesn't require Node.js/npm 🙂
(`meta` is a Node.js/npm system built on top of loop)
Ya, I wasn't super stoked about that part. Tried to see if there was an equivalent not in Node
Yes. build tests api auth login system will drop into the api, auth, login, and system folders and run clojure -A:test in each one, essentially.
(it does a bit more than that since it has some conveniences around certain aliases, and it "knows" about expanding :local/root deps.edn files to get a recursive list of subprojects to operate on: build test-all api expands to build tests api and all the subprojects it depends on)
I have defined these predicates and spec, but when I execute this line in the REPL I get false as the result:
(invalid_transfer? {:uuid_account "745286b0-24d3-4b17-ab24-d1265e9fb8d1" :transfer_data {:uuid_account_destination "44444444444"}})
amount is missing from (s/def :unq/transfer (s/keys :req-un [::uuid_account ::transfer_data]))
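As a sketch of why such a spec check returns false when a required un-namespaced key is missing (the spec names and predicates here are illustrative, not the originals):
(require '[clojure.spec.alpha :as s])

(s/def ::uuid_account_destination string?)
(s/def ::amount pos-int?)
(s/def ::transfer_data (s/keys :req-un [::uuid_account_destination ::amount]))

(s/valid? ::transfer_data {:uuid_account_destination "44444444444"})
;; => false -- :amount is required via :req-un but missing from the map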
qq, how do you pass a map as an arg with lein run -m?
lein run -m my-ns/my-fun {:opt1? true}
obviously this doesn't work, but I am wondering how to format the hashmap so that it ends up calling (my-ns/my-fun {:opt1? true})
thanks!
Yup, that would turn the sequence of strings into a hash map from string to string, pairwise.
user=> (let [{:as opts} (range 4)] opts)
{0 1, 2 3}
@U0K064KQV That would pass it in as a string which would then need to be parsed (read as EDN).
The suggested solution is to provide the key value data without { } and without : -- just as strings on the command line, and then use & {:as opts} to automatically turn that into a hash map (string -> string).
But quoting it on the command-line (as a regular EDN hash map) and then parsing a single argument (first args) from -main [& args] would be another reasonable approach.
Either way, the function on the receiving end is going to need to do some work if the goal is to pass values that are not strings (such as true).
Ah I see. What's the machinery at play here? Does destructuring auto-parse strings into maps in any function as well? Or is that specific to the way clojure.main bootstraps the main fn?
Right. So either you quote the EDN as-is and the receiving function must parse it (read it back as EDN from a string) or you pass the key/value pairs as plain sequential values on the command line and deal with them as a hash map of string -> string which still may require some parsing.
Bottom line, you can't pass actual Clojure data directly into the target function without at least some parsing (in a wrapper function) @U07C4S0EM
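Putting both suggestions together, a rough sketch (the namespace and option names are made up):
(ns my-ns
  (:require [clojure.edn :as edn]))

;; 1. lein run -m my-ns/my-fun opt1? true
;;    -- the args arrive as the strings "opt1?" and "true"
(defn my-fun [& {:as opts}]
  (println opts))   ; prints {"opt1?" "true"} -- keys and values are still strings

;; 2. lein run -m my-ns/my-fun-edn '{:opt1? true}'
;;    -- a single quoted string, read back into Clojure data
(defn my-fun-edn [& args]
  (println (edn/read-string (first args))))   ; prints {:opt1? true}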
ok, thanks guys!