#announcements
2022-04-06
djblue06:04:05

Just released https://github.com/djblue/portal/releases/tag/0.23.0, a data exploration / navigation tool for clojure. Some recent highlights include:
- Gruvbox color theme
- :deps/prep-lib for git deps support
- Table viewer now supports group-by style maps
- Runtime icons for the log viewer
Drop by #portal with any questions.

🙏 18
👍 5
🚀 3
🎉 3
Matthew Davidson (kingmob)09:04:29

After all these years, we’re announcing primitive-math version 1.0.0, the first under clj-commons. https://clojars.org/org.clj-commons/primitive-math/versions/1.0.0 Let’s get this out of the way first: there are no bug fixes or changes to the existing code, so if you don’t use Graal or clj-easy, this update won’t interest you. If you do use Graal/clj-easy, all code has been copied from a top-level single-segment ns to be also under a clj-commons ns. If you update your requires, you should no longer have any issues with single-segment namespaces. (There have been some slight updates to the docs, which are now available at cljdoc.) Many thanks to @p2b for bringing up the single-segment issue and especially @skynet for fixing it! Plus, thx to @slipset for setting up the CircleCI tag deployment.
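For anyone updating their requires, the change could look roughly like this. It is a minimal sketch: the new namespace name `clj-commons.primitive-math` is an assumption based on the "clj-commons ns" wording above, and the `p` alias is only illustrative.

```clojure
;; Before: the top-level single-segment namespace.
;; (require '[primitive-math :as p])

;; After: the copy under a clj-commons namespace
;; (assumed name, check the cljdoc page for the exact one).
(require '[clj-commons.primitive-math :as p])

;; Primitive addition through the aliased namespace.
(p/+ 1 2)
```

Since the announcement says the code was copied rather than moved, existing requires of the old namespace should keep working.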

💯 14
2
🎉 5
Asko Nōmm19:04:13

Introducing Clarktown: https://github.com/askonomm/clarktown A zero-dependency, pure-Clojure Markdown parser. Also entirely modular and extensible, so you can take out parts of it or add new parts to it with ease and really make it your own. I’m currently using it to render my own blog (http://bien.ee), and it works wonders. Do note however that it does not support the whole CommonMark spec, and I do not aim for it either, but rather would like to be more pragmatic and implement the parts that I myself use, or that people point out that they could really use.

🎉 22
Noah Bogart19:04:38

> it does support
is this supposed to be "it doesn't support"?

Noah Bogart19:04:41

otherwise, great job!

Asko Nōmm19:04:31

Yup you are totally right! It does not support, fixed now!

👍 1
russmatney19:04:57

nice work! i wonder how this compares to https://github.com/kiranshila/cybermonday

Asko Nōmm19:04:18

From quickly parsing cybermonday the big difference is that Clarktown only outputs HTML. That being said, since it is modular, it is quite easy to make it output whatever you want, but my goal was just HTML.

👍 3
russmatney19:04:46

definitely makes sense to be practical - i'll give it a read soon. no deps is pretty fun!

Asko Nōmm20:04:39

I’m a fan of tiny, self-contained software. My experience keeps telling me that the fewer dependencies there are, the longer the software lasts.

borkdude20:04:16

@U026NQLSBLH Aw yeah:

$ bb -cp src -e "(require '[clarktown.core :as clarktown]) (clarktown/render \"**Hello, world**\")"
"<ol><li><strong>Hello, world</strong></li></ol>"

👍 1
Asko Nōmm20:04:03

I’m happy it works with Babashka out of the box! But you also just discovered a bug 😄 It renders bold text in an ordered list. I’ll get right on that.

😂 5
babashka 2
borkdude20:04:30

@U026NQLSBLH I'm using markdown-clj in https://blog.michielborkent.nl/ (rendered by bb). Where do you think your lib differs?

Asko Nōmm20:04:52

Bug is fixed in version 1.0.3! As for the difference from markdown-clj: at the outset, it doesn’t differ much. In fact, it has less Markdown support built in than markdown-clj; for example, markdown-clj has footnote support, which Clarktown doesn’t. But Clarktown is modular and extensible (pick it apart and make it your own), so if you wanted to make some quick changes or add new features, that would be very straightforward to do. Like I said above, I do not intend to target the full CommonMark spec as a goal of this library, and will only add things from that spec that people request or that I myself happen to need, so as to be more pragmatic in its development, perhaps eventually reaching spec completeness organically.

👍 3
👏 1
Vincent Cantin10:04:35

Congratulations on the release

chrisn19:04:17

Lighting https://cnuernber.github.io/dtype-next/tech.v3.datatype.char-input.html#var-read-json without the jackson dependency hairball. Same speed (or faster with options) as jsonista which by my tests put it about 5-10x faster than clojure.data.json. This is the same story as the fast CSV parser in the same library - don't use pushback reader and write tight loops in java. In any case, here is a https://github.com/cnuernber/fast-json. Overall if you are already using jsonista then there is no benefit unless you don't want immutable datastructures but for https://github.com/techascent/tech.ml.dataset, for example, it doesn't make a difference if the input is an array of java hashmaps or an array of clojure persistent maps. Interestingly the fastest option is a mix where for small maps I use a persistent array map and for larger maps I use a hashmap. Given that you get the full array of key-value objects up front I bet there is some fancy way to make a persistent hash map very fast (like computing all hashes in parallel type operations) - this I haven't researched. One more thing - the relative performance of clojure.data.json gets worse moving from jdk-8 to jdk-17 🙂. If you want to know why please read the replies to the earlier announcement about the CSV parsing system - PushbackReader's single character performance drops noticeably from JDK-8 to JDK-17. In any case, enjoy - https://cnuernber.github.io/dtype-next/tech.v3.datatype.char-input.html#var-read-json.
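The "mix" idea described above (persistent array map for small maps, hashmap for larger ones) can be sketched in plain Clojure. This is not the dtype-next implementation: `kvs->map` and `small-map-limit` are hypothetical names, and the cutoff of 8 entries is an assumption, since the message does not say where "small" ends.

```clojure
(import '[clojure.lang PersistentArrayMap]
        '[java.util HashMap])

;; Hypothetical cutoff; the message above does not give the real one.
(def small-map-limit 8)

(defn kvs->map
  "Builds a map from a flat Object[] of alternating keys and values.
  Small inputs become a persistent array map, larger ones a java.util.HashMap."
  [^objects kvs]
  (let [n-entries (quot (alength kvs) 2)]
    (if (<= n-entries small-map-limit)
      ;; Small: wrap the key-value array in a persistent array map.
      (PersistentArrayMap/createAsIfByAssoc kvs)
      ;; Large: fill a mutable HashMap in a tight loop.
      (let [m (HashMap.)]
        (loop [i 0]
          (when (< i (alength kvs))
            (.put m (aget kvs i) (aget kvs (inc i)))
            (recur (+ i 2))))
        m))))

;; Usage: two entries stay persistent, twenty entries become a HashMap.
(def small-result (kvs->map (object-array ["a" 1 "b" 2])))
(def large-result (kvs->map (object-array (mapcat (fn [i] [(str i) i]) (range 20)))))
```

The trade-off is that callers get two different map types back, which only matters if they later try to `assoc` onto the result.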

🎉 26
🚀 17
clojure-spin 6
gotta_go_fast 10
⏱️ 3
🔥 2
Daniel Jomphe20:04:26

Impressive, Chris!
> Overall if you are already using jsonista then there is no benefit unless you don't want immutable datastructures
No dependencies on Jackson is a very serious advantage. What do you mean about immutable datastructures?
> In any case, here is a https://github.com/cnuernber/fast-json.
Have you also profiled impact on RAM? I suppose there's quite an improvement there too.

chrisn20:04:22

I haven't checked the RAM usage - criterium does record such things so I could but I would guess everyone's RAM usage is nearly identical as we are all returning the entire object graph in memory at once. The performance really comes down to how fast you can create maps. So I profiled a few variations such as always use persistent maps, use a mix of persistent and java hashmaps, etc. Turns out the fastest thing aside from just returning the object[] in place in a marker object is to use https://github.com/cnuernber/dtype-next/blob/master/src/tech/v3/datatype/char_input.clj#L377.

Daniel Jomphe20:04:08

And when you mentioned "unless you don't want immutable datastructures", is that just an internal implementation detail? If we read json with this lib and want immutable data, should we then add a step for that?

chrisn22:04:22

It's an option - the default is immutable. See the docs, specifically the part about :profile.
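Choosing a profile might look like the sketch below. It assumes dtype-next is on the classpath, that `read-json` accepts an in-memory string, and that options are passed as trailing keyword arguments; the `:mixed` profile name comes from the read-json docstring quoted elsewhere in this thread. `json-text` is just sample input.

```clojure
;; Assumes dtype-next is already on the classpath.
(require '[tech.v3.datatype.char-input :as char-input])

(def json-text "{\"a\": 1, \"b\": [1, 2, 3]}")

;; No options: the default profile, which returns immutable data.
(def immutable-result (char-input/read-json json-text))

;; Assumed call shape for picking another profile; :mixed trades
;; immutability for speed by mixing persistent maps and hashmaps.
(def mixed-result (char-input/read-json json-text :profile :mixed))
```

If the rest of the code expects persistent data, leaving `:profile` unset avoids a conversion step.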

domparry05:04:51

This is super interesting! Well done Chris. I agree that removing the dependency on Jackson is a great advantage.

ikitommi06:04:34

Well done, impressive numbers!

ikitommi06:04:36

there is also a https://github.com/metosin/jsonista/blob/master/src/clj/jsonista/core.clj#L103-L108 btw, would be interesting to see that too. Bit surprised that clojure.data.json is so much slower here, based on the Jsonista tests, it’s much closer. I could add this (and Jsonista mutable) to Jsonista JMH perf suite.

eskos10:04:01

Hi, this might be a bit of a daunting request, but since this is a custom implementation of JSON, could you run the conformance suite used to produce this: http://seriot.ch/json/parsing.html

chrisn12:04:05

@U8SFC8HLP - Note that I used the https://github.com/cnuernber/dtype-next/blob/master/test/tech/v3/datatype/json/json_test_suite_test.clj as clojure.data.json. @U055NJ5CC - Yes for sure - it isn't clear exactly how to use this module instead of the clojure one. Will mess with it. If you have a snippet that does this exactly it would be helpful.

❤️ 1
chrisn12:04:16

I think the timing discrepancies are because I decided to parse the data from in-memory strings, the reason being that I think any high-perf use case can separate getting the data prepared and into memory from actually parsing the data. The dtype char-input system also includes a simple blocking IO abstraction for larger files or network input streams, but performance testing that involves grid-searching buffer sizes and such, so it is much more involved.

chrisn12:04:37

I wanted to see the performance of the parsers specifically, not the io subsystem.

chrisn13:04:55

@U055NJ5CC - Updated with jsonista-mutable.

chrisn14:04:06

Also I should say this - if you want much better performance then using a https://github.com/techascent/tech.ml.dataset and serializing https://github.com/cnuernber/tmdjs/blob/master/src/tech/v3/libs/transit.clj#L124 is much faster than anything mentioned thus far. So in essence the above benchmarks are useful insofar as they measure generic JSON performance, but if you want to transfer large amounts of data via JSON the general pathway is definitely not the best one.

Jakub Holý (HolyJak)13:04:19

@UDRJMEFSN I guess there is a typo in https://cnuernber.github.io/dtype-next/tech.v3.datatype.char-input.html#var-read-json or I just do not understand the sentence; I guess "has" -> "and":
> :mixed produces a mixture of persistent maps has hashmaps and is nearly as fast as :raw

chrisn12:04:16

Definitely a typo - thanks.

👍 1