#clojure
2020-07-01
solf01:07:45

I have a very simple but long regex, with hundreds of alternatives joined by |:

(foo|bar|hello|world|...)
Performance is acceptable, but I wonder, does the JVM automatically optimize the regex, or should I do it myself? For example, should I extract the common parts, or is this done automatically?
(aaa|aab) => aa(a|b)
I might consider switching to something like aho-corasick

seancorfield03:07:23

@dromar56 Regexes are "compiled" to an intermediate form before matching is actually done. You can follow along in the JDK here https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/util/regex/Pattern.java#L1742 but it is gnarly stuff. I can't say this particular implementation compiles out common prefixes but when I've worked with regex parsers before (which was, admittedly, before Java appeared), it was fairly common to compile things to state machines and optimizing common prefixes was often part of that.
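
The thread leaves open whether java.util.regex factors common prefixes out of an alternation. A minimal sketch, assuming a hypothetical word list, of building the long alternation programmatically and checking that the hand-factored form from the question accepts the same strings (the names `words`, `alternation`, and `same-match?` are made up for illustration):

```clojure
(ns example.regex-alternation
  (:require [clojure.string :as str])
  (:import (java.util.regex Pattern)))

;; Hypothetical word list standing in for the "hundreds" of alternatives.
(def words ["foo" "bar" "hello" "world"])

(defn alternation
  "Builds a pattern like (?:foo|bar|...) with every word quoted literally."
  [ws]
  (re-pattern (str "(?:" (str/join "|" (map #(Pattern/quote %) ws)) ")")))

(def word-re (alternation words))

;; The two forms from the question: plain alternation vs. factored prefix.
(def plain    #"(aaa|aab)")
(def factored #"aa(a|b)")

(defn same-match?
  "True when both patterns agree on whether s matches as a whole."
  [s]
  (= (some? (re-matches plain s))
     (some? (re-matches factored s))))
```

If only whole tokens need to be recognised, a plain set lookup such as `(contains? (set words) token)` avoids the regex entirely; whether that fits depends on how the pattern is actually used.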

Adam Helins09:07:17

It's a small thing really but it has often bothered me... Is there a particular reason that split-with is implemented as [(take-while ...) (drop-while ...)] ? It is definitely not the best implementation one can imagine.

emccue09:07:00

@adam678 Probably laziness?

emccue09:07:36

like - if you have a lazy infinite seq, that might be the best way to make two independent seqs at the end
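
For reference, a small sketch of the take-while/drop-while shape being discussed, applied to an infinite seq. `split-with-sketch` is a hypothetical name used here so that clojure.core/split-with is not shadowed:

```clojure
;; Same shape as the implementation quoted in the question: two independent
;; lazy seqs over the same input. If both halves are realized, the predicate
;; is evaluated over the matching prefix twice.
(defn split-with-sketch [pred coll]
  [(take-while pred coll) (drop-while pred coll)])

;; Works on an infinite seq as long as only a finite part is realized.
(let [[head tail] (split-with-sketch #(< % 3) (range))]
  [(vec head) (vec (take 3 tail))])
;; => [[0 1 2] [3 4 5]]
```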

Adam Helins09:07:52

@emccue This is a semi-lazy version that works with infinite seqs and behaves as I expect (question is, am I the only one to expect that?) https://gist.github.com/dvlopt/03317879b5722baa9e915ff7dde28e3a

Ivar Refsdal12:07:04

I don't think this gist is equivalent to clojure.core/split-with:

(do
    (def p #(>= 5 %))
    (set! *print-length* 10)
    (println (clojure.core/split-with p (range 1e3)))
    (println (split-with p (range 1e3))))
[(0 1 2 3 4 5) (6 7 8 9 10 11 12 13 14 15 ...)]
[[] (0 1 2 3 4 5 6 7 8 9 ...)]

Adam Helins12:07:52

@UGJE0MM0W Indeed, this version would reproduce the first result by using #(= % 6) which I find to be semantically more accurate (but that's a personal opinion).

Adam Helins12:07:21

Or simply with #(>= % 6) which is closer to what you mean.

Jan-Paul Bultmann16:07:30

Say, if you have java code in a deps.edn project, you have to compile it with a makefile or something right? ^^'

deactivateduser16:07:22

Yes, though there are plugins that can do that from “within” a clj / clojure command line. Some examples here: https://github.com/clojure/tools.deps.alpha/wiki/Tools

deactivateduser16:07:01

The best way to think about “vanilla” deps.edn is that it’s not a build tool - it simply knows how to build classpaths and then run some code. This just happens to be a good first step towards various code-related tools (including build tools), and various plugins provide those kinds of capabilities.

Jan-Paul Bultmann16:07:37

Those things aren't necessarily separable though. For example if you have a deps.edn project and it contains java files and a custom build script...

Jan-Paul Bultmann16:07:06

With boot and lein, as long as it's published via the default publishing mechanism -> maven, you're guaranteed to be able to use it.

Jan-Paul Bultmann16:07:26

With deps.edn and a git url and hash, tough luck.

Jan-Paul Bultmann16:07:30

Also I don't think anybody wrote that particular kind of tool... ^^'

vlaaad16:07:35

> With deps.edn and a git url and hash, tough luck. Unless you commit compiled class files and state them in deps.edn as source paths 😄
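
A sketch of what that workaround could look like in a library's deps.edn. The directory name `classes` and the Clojure version are assumptions for illustration, not taken from any project mentioned here:

```clojure
;; deps.edn of a hypothetical library that commits its compiled Java classes
{:paths ["src" "classes"]   ; "classes" holds the checked-in .class files
 :deps  {org.clojure/clojure {:mvn/version "1.10.1"}}}
```

Because the directory is just another classpath root, a consumer using a git coordinate would pick up the classes without any compile step.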

Jan-Paul Bultmann17:07:50

😂🤣 good one

kwladyka17:07:02

Just make Java as deps of your clojure code and do:

{:aliases {:uberjar {:extra-deps {seancorfield/depstar {:mvn/version "1.0.94"}}
                     :main-opts ["-m" "hf.depstar.uberjar" "api.jar" "-C" "-m" "api.core"]}}}
clj -Spom
clj -A:uberjar
You can do it in a few ways, with a pom or without. More in the depstar docs.

kwladyka17:07:42

Personally, I think dependencies written in a different language, for example Solidity (Ethereum for cljs), should be compiled separately from Clojure / Java. People often put that step in lein, but IMHO that is wrong.

Jan-Paul Bultmann17:07:14

but that still wouldn't solve the recursive java dependency issue, or do I misunderstand?

Jan-Paul Bultmann17:07:53

you'd still have to commit prebuilt stuff into your repo

seancorfield18:07:47

@USB60P79B Right. @U0WL6FA77’s suggestion doesn't help you with Java source code at all.

seancorfield18:07:51

You still have to compile your Java source to class files somehow. Manually or via some Clojure wrapper for running the javac process.

kwladyka18:07:08

What do you mean? Don’t you use Java in Clojure to just use :import? Maybe I misinterpret the question.

seancorfield18:07:38

Java source needs to be compiled separately.

deactivateduser18:07:44

This sounds like two different use cases:
1. I have Java code as part of my project that I need to compile (for which there may be deps.edn plugins that do it - see earlier link)
2. There is 3rd party code hosted elsewhere (e.g. github) that I want to use as an upstream dependency, and it isn’t deployed as a binary artifact anywhere (for which there are no deps.edn plugins that I’m aware of, but which may be solved via http://jitpack.io)

Jan-Paul Bultmann18:07:55

@U04V70XH6 Yeah that's what I feared, so basically clojure libraries containing java code are completely broken in the git-as-a-dependency world. 😞

deactivateduser18:07:07

@USB60P79B which use case do you have?

Jan-Paul Bultmann18:07:55

3. I have java code as part of my project and I want to release it as a git dependency.

seancorfield18:07:12

Yes, any Clojure library that contains Java source code cannot be used as a git dependency, unless it also builds the Java source and adds the classes to git (yuk!), or builds the Java as a separate artifact and makes it a normal dependency of the Clojure code.

deactivateduser18:07:16

As part of your project or separately?

Jan-Paul Bultmann18:07:35

as part of the project

seancorfield18:07:13

Note: clojure.tools.deps is in this boat now -- it contains Java source code and therefore can no longer be used as a git dependency.

deactivateduser18:07:19

Some of the plugins I listed earlier may do this, or if not you may have to fall back to Leiningen (which absolutely supports this).

Jan-Paul Bultmann18:07:20

@U04V70XH6 yeah exactly, oh my god, I've been gone for 2 years and the ecosystem is completely broken 😭

seancorfield18:07:39

(well, you can depend on an early SHA before the Java source code was added 🙂 )

deactivateduser18:07:12

@U04V70XH6 I’d (strongly!) argue that projects that contain Java source code should be deployed as “binaries” (JARs, containing the compiled Java code) to Clojars or whatever.

deactivateduser18:07:25

It’s not insane at all, it’s just Java. 😉

seancorfield18:07:30

I don't think you could ever depend on a mixed language project as a plain git dependency @USB60P79B?

☝️ 3
💯 3
Jan-Paul Bultmann18:07:46

@U04V70XH6 I think lein should support it

seancorfield18:07:50

You can still build and package a mixed language project (and deploy it to Clojars/Maven). Clojure CLI doesn't support Java out of the box but there are tools (listed on the t.d.a wiki) that support building mixed language projects.

seancorfield18:07:16

I've never tried to use lein for git-based projects (and that's not core to Leiningen -- that also needs a plugin, right?)

Jan-Paul Bultmann18:07:43

but I think it manages to resolve correctly; I don't really remember, though, because when everybody publishes jars it's a non-issue

seancorfield18:07:47

(I haven't used Leiningen for close to five years, at this point)

Jan-Paul Bultmann18:07:03

the problem I have is actually that I'm depending on a library that hasn't been released as a jar

Jan-Paul Bultmann18:07:06

they use git dependency only

Jan-Paul Bultmann18:07:16

my library in turn has java files

seancorfield18:07:27

Sure, and if you package and deploy a JAR from a mixed language project there's no issue (and you can do that with deps-based tooling).

Jan-Paul Bultmann18:07:31

yeah I can use deps and package it

deactivateduser18:07:45

Does the upstream library you want to use contain Java source code though?

Jan-Paul Bultmann18:07:11

but this kind of multi class citizens stuff is exactly what makes other languages so annoying

Jan-Paul Bultmann18:07:22

no it doesn't, so I could build an uberjar from mine

deactivateduser18:07:25

If not, you can use a git/SHA dependency for it, but you’ll still have to build (compile) and deploy your library as a binary.

Jan-Paul Bultmann18:07:27

but that stuff is just ugly

seancorfield18:07:37

"depending on a library that hasn't been released as a jar" -- how do they expect people to use it? What instructions do they provide for it?

deactivateduser18:07:39

So don’t use Java? 🤷

deactivateduser18:07:21

So the upstream library is absolutely not an issue here, and we should focus on your mixed language project.

Jan-Paul Bultmann18:07:35

the issue here is that deps is not well thought out

seancorfield18:07:36

I thought Arachne had been abandoned?

seancorfield18:07:53

Clojure CLI / deps.edn is extremely well thought out.

☝️ 6
Jan-Paul Bultmann18:07:13

@U04V70XH6 you mean it's stabilised with extreme prejudice ;)

deactivateduser18:07:21

To repeat - Arachne is not your issue. The issue is the Java in your project, and how to build it.

Jan-Paul Bultmann18:07:08

I've been writing clojure professionally for 7 years without the gap, and mixing java into clojure has always been one of the main points

deactivateduser18:07:22

Arachne contains a deps.edn, and can be consumed downstream (using Clojure CLI) via a git/SHA coordinate.
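
A minimal sketch of such a git/SHA coordinate in a downstream deps.edn, assuming the `:git/url` / `:sha` keys used by the CLI at the time of this conversation. The library name, URL, and SHA are placeholders, not the real coordinates of any project mentioned here:

```clojure
;; deps.edn of a downstream project (all names and the SHA are placeholders)
{:deps {com.example/some-lib
        {:git/url "https://github.com/example/some-lib"
         :sha     "0000000000000000000000000000000000000000"}}}
```

The CLI checks out that exact commit and reads the dependency's own deps.edn, which is why this only works when nothing in the checkout needs a build step first.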

Jan-Paul Bultmann18:07:23

we're not talking about groovy here

Jan-Paul Bultmann18:07:48

yeah nah, I'm just gonna port that stuff to rust 😂

deactivateduser18:07:51

If you choose to mix Clojure and Java, that has nothing to do with your use of upstream libraries.

deactivateduser18:07:34

But as I said originally, I think there are plugins that can build a mixed language project via deps / CLI.

seancorfield18:07:53

Badgeon does it, I believe?

seancorfield18:07:04

(or however it is spelled)

deactivateduser18:07:12

Yeah that was the first one to pop up when I googled this.

deactivateduser18:07:18

There’s a blog post on it too.

Jan-Paul Bultmann18:07:48

mind sharing your google filter bubble?

seancorfield18:07:57

From Badigeon's README "Compile java sources" -- that's the very first bullet 🙂

vlaaad18:07:46

I don't see any problem with checking in compiled classes. Distribute binary artifacts, versioned by git? sign me up!

Jan-Paul Bultmann18:07:53

@U04V70XH6 hm yeah that might work thanks!

Jan-Paul Bultmann18:07:07

@U47G49KHQ the issue with that is that you break the "reference any hash as a version" contract

Jan-Paul Bultmann18:07:02

alternative is you run your build script on every commit and commit the artifacts

seancorfield18:07:09

Given that Cognitect work with mixed Java/Clojure projects and they've designed the whole deps.edn stuff and are working on tools.build (whatever that is), I expect this to become a solved problem with standard Cognitect / core Clojure tooling at some point, in the not too distant future.

🙏 3
vlaaad18:07:18

*on every commit that touches java

Jan-Paul Bultmann18:07:42

which is potentially every commit

Jan-Paul Bultmann18:07:37

@U04V70XH6 yeah but it's apparently been like this for 2 years, that's a pretty long time for a partitioned ecosystem where you have an officially endorsed rogue player that doesn't play nicely with the others

vlaaad18:07:29

I understand your frustration that tools-deps is not a full-blown build tool that can compile and run everything

vlaaad18:07:57

I am certainly looking forward to a build tool from cognitect that they are working on

seancorfield18:07:39

The design goals of the CLI/`deps.edn` have been very clearly stated from the start -- and dealing with mixed language projects was never on the table. Folks are always free to continue to use Leiningen (or Boot).

☝️ 6
vlaaad18:07:01

but as a "tool to download dependencies and assemble classpaths from them" it is the best I've ever used

seancorfield18:07:27

It is indeed awesome in that respect -- which is why we switched from Boot to the CLI back in 2018.

seancorfield18:07:06

We have a build shell script wrapper for the CLI that handles "build" tasks as opposed to classpath/running code.

Jan-Paul Bultmann18:07:24

yeah I don't mind that it does dependency resolution well, my issue is that it endorses git repositories as an alternative to maven, and that gives people the wrong idea that they don't have to publish their stuff anymore

seancorfield18:07:04

If tools.build means we can get rid of (or at least simplify) our build script, I'll be very happy. If it also handles Java source compilation, we may be more inclined to mix Java into our project, or at least start relying on "building" a couple of 3rd party Java projects that we've otherwise had to manually build and deploy ourselves (since they are not published on Maven).

seancorfield18:07:31

Anyone who has been following the development of the CLI stuff would not have the wrong idea 🙂

vlaaad18:07:21

I wonder if it's possible to compile java sources in-process...

seancorfield18:07:23

You came back after a few years away, dove into the CLI deps.edn stuff without reading the design goals etc, and then you complain that it doesn't replace Leiningen 🙂

seancorfield18:07:40

Which, well, no, it was never designed to replace Leiningen.

Jan-Paul Bultmann18:07:21

I don't care that it doesn't replace leiningen

Jan-Paul Bultmann18:07:38

my complaint is not that it does too little or too much

Jan-Paul Bultmann18:07:44

my complaint is that it breaks other build tools

seancorfield18:07:49

@U47G49KHQ Yup, you can run the Java compiler in process easily enough. Or at least you used to be able to. I can't remember where it is these days after the tools.jar changes in the JDK.
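
One way this can be sketched today is through the standard `javax.tools` API rather than tools.jar. The function name `compile-java!` and the example paths are hypothetical, and this assumes the process is running on a JDK that ships a compiler:

```clojure
(ns example.javac
  (:require [clojure.java.io :as io])
  (:import (javax.tools JavaCompiler ToolProvider)))

(defn compile-java!
  "Compiles source-files (paths to .java files) into out-dir.
  Returns true when the compiler reports success (exit code 0)."
  [out-dir source-files]
  (let [^JavaCompiler compiler (ToolProvider/getSystemJavaCompiler)]
    (when (nil? compiler)
      ;; getSystemJavaCompiler returns null when no compiler is available
      (throw (ex-info "No system Java compiler available (JRE only?)" {})))
    (.mkdirs (io/file out-dir))
    (let [^"[Ljava.lang.String;" args
          (into-array String (concat ["-d" (str out-dir)]
                                     (map str source-files)))]
      ;; nil streams fall back to System.in / System.out / System.err
      (zero? (.run compiler nil nil nil args)))))

(comment
  ;; Hypothetical layout: Java sources under src/java, classes on the classpath
  (compile-java! "classes" ["src/java/com/example/Hello.java"]))
```

The output directory still has to be on the classpath before the compiled classes can be imported, so in practice this tends to run as a separate step before the main program starts.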

seancorfield18:07:18

@USB60P79B That's a ridiculous argument. How can it possibly break "other build tools"? They are independent.

Jan-Paul Bultmann18:07:27

because of git dependencies

Jan-Paul Bultmann18:07:34

which are advocated for

seancorfield18:07:26

None of that breaks other tools. They all still work the same as they always have.

seancorfield18:07:10

You're complaining because you're trying to use Arachne and it's never published an artifact consumable by lein/`boot` right?

Jan-Paul Bultmann18:07:36

I'm complaining because the reason they did that is that deps encouraged them to

seancorfield18:07:08

The reason there are no published artifacts is because the project was abandoned long before it was anywhere near complete 🙂

Jan-Paul Bultmann18:07:38

"clojure.deps is my own personal preferred toolchain as well as others who are currently using this project. However, if you or anyone else wished to submit a PR with a working leiningen or boot config and version 1.0, I'd be happy to push to clojars using that config."

Jan-Paul Bultmann18:07:53

deps gave them very much the impression that they don't have to publish it

seancorfield18:07:11

I give up. You're just not being reasonable. I'm out.

vlaaad18:07:41

https://github.com/arachne-framework/arachne-core by the way it checks in class files, so using it as git dep should work

vlaaad18:07:10

see they have class folder in the repo that is mentioned in deps.edn

Jan-Paul Bultmann18:07:16

@U04V70XH6 how is quoting the guy giving that exact reason not being reasonable?

Jan-Paul Bultmann18:07:18

@U47G49KHQ yeah no, that's just ... urgh, I'm gonna port this stuff to rust, at least they have good error messages and a decent package managing system

seancorfield18:07:20

You're ranting and raging about tooling whose specific design goal did not include mixed language projects. And you're using as a straw man, a project that was abandoned nearly two years ago and never got an actual release.

vlaaad18:07:47

...and works with git deps 😛

Jan-Paul Bultmann18:07:35

@U04V70XH6 because it wasn't abandoned at the time, again I don't complain about deps itself, I complain that the documentation doesn't give any warnings about git repository deps

Jan-Paul Bultmann18:07:56

@U04V70XH6 I don't care how deps does it, I think it's great that there is a minimal way to resolve dependencies, my problem is with the consequences this has on incompatible library publication schemes, namely git

Jan-Paul Bultmann18:07:25

@U47G49KHQ yeah, but I've never had compilation issues there, I also never had compilation issues with leiningen or boot, which is one of the reasons I love(d) clojure so much, but apparently those times are over...

seancorfield18:07:43

Stop @-ing me. I don't want to discuss this any further. Thank you.

Jan-Paul Bultmann18:07:35

@U04V70XH6 have a nice day still!😊

deactivateduser19:07:12

That way you can consume deps.edn projects just the same as if they were deployed to a Maven/Clojars style artifact repository.

Jan-Paul Bultmann20:07:09

@U0MDMDYR3 thanks for the link looks like a really cool project!

Jan-Paul Bultmann20:07:45

adding deps.edn support would indeed solve the gripes I have with it

Jan-Paul Bultmann20:07:01

I think the main issue I have with it is that clj ignores a lot of best practices that the community built over the years

Jan-Paul Bultmann20:07:48

contrary to Sean's assertions I've read all of the deps documents and watched all of the conj talks about it and related tooling

Jan-Paul Bultmann20:07:08

but I still find pushing git deps into the world without a note that says "git deps are meant as a convenience during development, and as a tech preview of the capabilities to come, please still publish a maven artifact for backwards compatibility" incredibly careless

deactivateduser20:07:10

Yeah I can understand that. But I also know that Clojure has always had a bit of a culture of “Rich’s way or the highway”. 😜 (grabs popcorn and waits for the pitchfork wielding masses to appear…)

Jan-Paul Bultmann20:07:44

the thing is, clj is advocated to newcomers, those are not the people that "followed the developments around deps over the years" and those people will run into these issues head first

deactivateduser20:07:54

And to a great extent, that has served the language very well. Perhaps like you I just feel that sometimes it’s not so great for the community side of things.

deactivateduser20:07:51

Yeah - the git/SHA coordinates stuff absolutely assumes a “clean room” environment of tools.deps only, and that’s not the real world (at all - I mean 99% of what I do with Clojure involves Java libraries).

Jan-Paul Bultmann20:07:19

I totally agree on both counts

Jan-Paul Bultmann20:07:28

spec for example has made errors worse

Jan-Paul Bultmann20:07:59

the spec 2.0 stuff is super cool

Jan-Paul Bultmann20:07:28

but still, the idea that "just spec your macros and all the difficult errors will become much easier"

Jan-Paul Bultmann20:07:51

was simply a false hope, and now my error messages have become much worse 😞

deactivateduser20:07:41

I haven’t really checked out spec yet. It seems to solve a problem I don’t really have on the small scale hobby projects I’m mostly using Clojure for. That said, one change related to error message that has greatly hurt my productivity was pushing stack traces into a temporary file. I find that frustrating in the extreme.

deactivateduser20:07:17

But perhaps that’s because I came from Java, and so didn’t find stack traces problematic to read.

deactivateduser20:07:38

I get that they’re intimidating for beginners, but most languages I know have them in one form or other, so it’s not like the concept is alien - it’s mostly just the format and length that I think surprises newcomers.

Jan-Paul Bultmann20:07:23

yeah, especially because clojure doesn't have a good alternative

Jan-Paul Bultmann20:07:47

if you've used rust for a while the difference becomes unbearable when you go back

Jan-Paul Bultmann20:07:40

it's insane how nice and helpful good error messages can be, even if you know what you're doing

emccue01:07:47

Maybe a hot take on the tools.deps thing, but maybe this wouldn't be such an issue if gen-class worked without AOT

emccue01:07:17

That being said, from a quick skim that class doesn't seem needed