This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-03-04
HoneySQL -- Turn Clojure data structures into SQL -- com.github.seancorfield/honeysql {:mvn/version "2.4.1002"}
-- https://github.com/seancorfield/honeysql -- this is the @p-himik release! Almost every issue here was raised by him and several were fixed by pull requests from him -- thank you!
• Address https://github.com/seancorfield/honeysql/issues/474 by adding dot-selection special syntax.
• Improve docstrings for PostgreSQL operators via PR https://github.com/seancorfield/honeysql/pull/473 https://github.com/holyjak.
• Address https://github.com/seancorfield/honeysql/issues/471 by supporting interspersed SQL keywords in function calls.
• Fix https://github.com/seancorfield/honeysql/issues/467 by allowing single keywords (symbols) as shorthand for a single-element sequence in more constructs via PR https://github.com/seancorfield/honeysql/pull/470 @p-himik.
• Address https://github.com/seancorfield/honeysql/issues/466 by treating [:and] as TRUE and [:or] as FALSE.
• Fix https://github.com/seancorfield/honeysql/issues/465 to allow multiple columns in :order-by special syntax via PR https://github.com/seancorfield/honeysql/pull/468 @p-himik.
• Fix https://github.com/seancorfield/honeysql/issues/464 by adding an optional type argument to :array via PR https://github.com/seancorfield/honeysql/pull/469 @p-himik.
• Address https://github.com/seancorfield/honeysql/issues/463 by explaining :quoted nil via PR https://github.com/seancorfield/honeysql/pull/475 https://github.com/nharsch.
• Address https://github.com/seancorfield/honeysql/issues/462 by adding a note in the documentation for set operations, clarifying precedence issues.
Follow-up in #C66EM8D5H
The parser combinator library parsesso has been released.
https://github.com/strojure/parsesso
Congrats! I love parser combinators and having a maintained one in the CLJ ecosystem is great.
I'm interested in using this in bb for scripts.
I noticed you're dropping down to .first interop in clj/cljs. Does this really matter much for performance? I notice micro-benchmarks in the repo, but I'd be interested in macro-benchmarks, as in, parse a 1mb file 1000 times and then see if it matters. At least in bb it won't work today, but with a small tweak it will:
https://github.com/strojure/parsesso/pull/2 (also added a CI config to run lein test + bb test:bb)
> Does this really matter much for performance?
Well, checking for if (x instanceof ISeq) on every token has some effect...
> instanceof is the cheapest operation there is and totally useless in the context...
true, but in bb's case, :bb has to go before :clj, else it will pick the :clj branch, which most of the time is what you want
that's how reader conditionals work: the language picks the first one that is applicable
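To make the ordering rule concrete, here is a minimal sketch, assuming JVM Clojure 1.7 or later. It simulates babashka by adding :bb to the reader's feature set by hand; the branch keywords :clj-branch and :bb-branch are made up for illustration.

```clojure
;; Reader conditionals take the first branch whose feature is active.
;; On the JVM the reader always adds :clj to the feature set itself,
;; so with :bb added by hand both features are active, as in babashka.
(def opts {:read-cond :allow :features #{:bb}})

(def clj-first (read-string opts "#?(:clj :clj-branch :bb :bb-branch)"))
(def bb-first  (read-string opts "#?(:bb :bb-branch :clj :clj-branch)"))

(println clj-first) ; :clj-branch, the :bb branch is never reached
(println bb-first)  ; :bb-branch
```

So a bb-specific branch only takes effect when it is listed before :clj.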
you're right that in a JVM it can matter significantly:
user=> (time (dotimes [i 100000000] (.seq [123])))
"Elapsed time: 9.2935 msecs"
nil
user=> (time (dotimes [i 100000000] (seq [123])))
"Elapsed time: 489.359167 msecs"
especially in a hot loop of course, which parsers are prone to
but it seems the instance check itself isn't the performance problem:
user=> (time (dotimes [i 100000000] (instance? clojure.lang.ASeq [123])))
"Elapsed time: 6.648 msecs"
> Does this really matter much for performance?
https://github.com/strojure/parsesso/blob/default/test/perf/bench.clj#L127 this benchmark becomes 2x slower with first/next.
ISeq and ASeq differ significantly 🙂
(instance? clojure.lang.ASeq [123])
;; Execution time mean : 4.182330 ns
(instance? clojure.lang.ISeq [123])
;; Execution time mean : 36.959700 ns
> instanceof is the cheapest operation there is
It depends on the tested type: the more classes inherited, the more expensive the test (when it misses).
I haven't looked at the implementation but how important is it that the input is a seq? it could e.g. also be a vector so you don't have to go through the seq abstraction at all? Just thinking out loud.
You're right about the inheritance chain, yeah
> (also added a CI config to run lein test + bb test:bb)
this made your PR not so small 🙂
feel free to take whatever you want and leave out whatever you want, it's your project :)
you can do something about that. Add a blank line followed by:
Co-authored-by: Michiel Borkent <[email protected]>
in the commit message
@U04V15CAJ Doesn't this part require changes? https://github.com/strojure/parsesso/blob/default/src/strojure/parsesso/impl/char.cljc#L14-L17
for me the rule looks like “use :bb before any :clj with cross-platform implementation”
for comparison:
(require '[clojure.string :as string])
(def c1 \f)
(def c2 \g)
(time (dotimes [i 1000000]
(.equals ^Object (Character/toLowerCase ^char c1)
(Character/toLowerCase ^char c2))))
(time (dotimes [i 1000000]
(= (string/lower-case c1)
(string/lower-case c2))))
$ bb /tmp/perf.clj
"Elapsed time: 2007.004583 msecs"
"Elapsed time: 112.205875 msecs"
$ clj -M /tmp/perf.clj
"Elapsed time: 19.156291 msecs"
"Elapsed time: 49.091 msecs"
what is best for bb here?
(defn- c
"Cross-platform char."
[s]
#?(:cljs s, :clj (first s)))
nbb uses :org.babashka/nbb and it has the same rule with respect to the :cljs branch
ok, I'll check locally
I think adding a test runner for bb would be good as a follow up? this is better than checking manually imo
$ bb test:bb
Running tests in #{"test"}
Testing strojure.parsesso.char-test
Testing strojure.parsesso.expr-test
Testing strojure.parsesso.parser-test
Ran 52 tests containing 320 assertions.
0 failures, 0 errors.
This is what I had in my PR
I think I didn't understand your last reply. Are you asking what a test runner is for?
Adding that would enable you to run the tests in test with babashka, as a verification that nothing is broken
And the .github/workflows that was also part of my previous PR then runs both lein test and bb test:bb in CI on every commit, so you automatically know that everything still works, without having to run things locally
what is this file for? https://github.com/strojure/parsesso/blob/3d00c45679aea43154f853f3c5bd7a8223f71db9/deps.edn
It allows you to use your library as a deps.edn dependency (local or via git). This is useful for both clojure users and babashka users, as they both use the deps.edn dependency manager, which is the official clojure tooling
In the bb.edn test runner file I am using your dependency as a local deps.edn dependency
so do I understand correctly?
1. Build and run tests
• ci.yml + bb.edn
2. Use library as deps.edn
• deps.edn
yes. also 2 allows you to use your library as a git dependency so people can test it even before you release it to clojars
how do you maintain these dependency changes?
:extra-deps {com.cognitect/test-runner
{:git/url ""
:sha "a522ab2851a2aa5bf9c22a942b45287a3a019310"}
the above means that I'm using https://github.com/cognitect-labs/test-runner as a library at a specific commit. https://github.com/cognitect-labs/test-runner/commit/a522ab2851a2aa5bf9c22a942b45287a3a019310 Does this answer your question? If not, please clarify
You have tools for this. How do you do this for project.clj? I will give you a link to a similar tool for deps.edn
This tool is a popular one for deps.edn: https://github.com/liquidz/antq
In a similar way people can use your library with:
{:deps {com.github.strojure/parsesso {:git/sha "..."}}}
It's pretty cool :)
When you use com.github. or io.github. you don't have to specify the :git/url, then it can be inferred from the library name
You can also specify a :git/tag in which case you can use a short :git/sha (7 characters)
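For illustration, a deps.edn entry in that tag-plus-short-sha form might look like the sketch below. The tag and sha values are placeholders, not real parsesso coordinates; substitute the ones from the release you want.

```clojure
;; deps.edn: :git/url is inferred from the io.github. prefix of the library name.
;; "<tag>" and "<short-sha>" are placeholders for a real tag and the
;; first 7 characters of the commit sha that the tag points to.
{:deps {io.github.strojure/parsesso {:git/tag "<tag>" :git/sha "<short-sha>"}}}
```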
hmm, locally it works:
$ bb test:bb
Running tests in #{"test"}
Testing strojure.parsesso.char-test
Testing strojure.parsesso.expr-test
Testing strojure.parsesso.parser-test
Ran 52 tests containing 320 assertions.
0 failures, 0 errors.
clj-kondo should add better support for linting multiple languages, currently it only does :clj and :cljs
@U0HJNJWJH I’d appreciate a paragraph in the readme explaining what parser combinators are & are good for 😅 🙏
@U0522TWDA Should I compile some text from this? https://en.wikipedia.org/wiki/Parser_combinator
@U04V15CAJ Maybe you have somewhere a configuration for cljs tests on node as well?
@U0HJNJWJH absolutely, I recommend: https://github.com/Olical/cljs-test-runner
here is my config for babashka cli: https://github.com/babashka/cli/blob/26aad2fc56c82a89822e94f021c15235b29147a3/deps.edn#L26 here is how to call it from the command line: https://github.com/babashka/cli/blob/26aad2fc56c82a89822e94f021c15235b29147a3/bb.edn#L21
An eluded (we all know parser parse strings) simplified, more pragmatic explanation? Perhaps worth an example, and most importantly a list of cases you use it for? How does it differ eg from instaparse, which also parses text into data?
@U0522TWDA A parser combinator library is a library with functions that can be composed into a parser
instaparse takes a grammar specification, but in a parser combinator library you build the specification from functions, rather than a DSL
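As a rough sketch of the "parser built from functions" idea, here is a toy hand-rolled version. It is not parsesso's API: token, many1 and fmap are hypothetical names, and parse-long assumes Clojure 1.11 or later.

```clojure
;; A toy parser is a function: input -> {:value v :rest remaining}, or nil on failure.

(defn token
  "Parser that consumes one token satisfying pred."
  [pred]
  (fn [input]
    (when-let [[x & more] (seq input)]
      (when (pred x)
        {:value x :rest more}))))

(defn many1
  "Parser that applies p one or more times, collecting the results in a vector."
  [p]
  (fn [input]
    (loop [acc [] in input]
      (if-let [r (p in)]
        (recur (conj acc (:value r)) (:rest r))
        (when (seq acc)
          {:value acc :rest in})))))

(defn fmap
  "Parser that transforms the result of p with f."
  [f p]
  (fn [input]
    (when-let [r (p input)]
      (update r :value f))))

;; Small parsers are composed into bigger ones with plain function calls.
(def digit  (token (set "0123456789")))
(def number (fmap #(parse-long (apply str %)) (many1 digit)))

(println (:value (number "42abc"))) ; 42
(println (number "abc"))            ; nil, no leading digit
```

The grammar here is ordinary Clojure code, so helpers like many1 can be reused and extended the same way as any other function.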
@U0522TWDA kern has the best docs I saw in various Clojure implementations https://github.com/blancas/kern/wiki
Borkdude's 2 comments are exactly what I needed
@U04V15CAJ Can you tell if this is implemented correctly and works? https://github.com/strojure/parsesso/blob/default/resources/clj-kondo.exports/com.github.strojure/parsesso/config.edn
you can unduplicate this config by replacing this line: https://github.com/strojure/parsesso/blob/bd18de34e04aba22faec6dd0709c7f90c64e8c4f/.clj-kondo/config.edn#L9 with
:config-paths ["../resources/clj-kondo.exports/com.github.strojure/parsesso"]
on the top level
> looks great!
only looks or also works?
I could not test in Cursive.
The confusing part in the documentation is whether the path should contain the ns strojure.parsesso.parser or a piece of the project name com.github.strojure
In a project with your dependency
mkdir .clj-kondo
clj-kondo --lint "$(clojure -Spath)" --dependencies --copy-configs
clj-kondo --lint src
You can test your commit with:
{:deps {com.github.strojure/parsesso {:git/sha "..."}}}
It seems clj-extras also supports it: https://github.com/search?q=repo%3Abrcosta%2Fclj-extras-plugin%20--copy-configs&type=code
@U0HJNJWJH When should I pick parser combinators over EBNF? Do they offer the same and it is only question of which one I prefer to learn or is there some distinct advantage over a DSL such as EBNF? Perhaps it is easier to describe more complex grammars b/c I can make my own helper functions, or something?
Looks like there is a good explanation here: https://softwareengineering.stackexchange.com/questions/338665/when-to-use-a-parser-combinator-when-to-use-a-parser-generator. I learned parser combinators for parsing mustache, whose grammar is context dependent.
And another good article https://medium.com/@chetcorcos/introduction-to-parsers-644d1b5d7f3d
I gathered this as the answer:
> in general, parser combinators such as parsesso are for creating top-down (i.e. LL) parsers, with the ability to reuse common code (this lib). Parser Generators typically generate a finite state automaton for a bottom-up (LR) parser. Though nowadays there are also combinators for LR grammars and generators for LL ones (e.g. ANTLR). “Which one you should use, depends on how hard your grammar is, and how fast the parser needs to be.” Especially if the grammar has a lot of non-trivial ambiguities then it might be easier with the more flexible combinators approach.
ring-control: More controllable composition of Ring middlewares.
https://github.com/strojure/ring-control
https://github.com/babashka/json: a JSON abstraction library. It lets you choose your preferred JSON implementation (currently data.json and cheshire, more to come) by providing them on the classpath, while not coupling your libraries or scripts to them.