This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-01-14
https://github.com/dainiusjocas/lucene-grep Lucene based grep-like utility compiled with GraalVM native-image. Grab a binary and tell me what you think. Cheers!
thanks!
If I do this in a Clojure repo:
lmgrep "select-keys" .
should this work? it turns up empty
the problem is with the . at the end
as of now the file pattern is GLOB
the problem with glob is always: is it recursive or not? this is always different per platform
@U04V15CAJ if you specify then it is recursive
@U04V15CAJ yeah, put the GLOB in double quotes 😉
@U0P7ZBZCK as of now it is not supported, but there is a Class in Lucene that does just that, so it is possible
@U04V15CAJ for code search I'd suggest to specify the letter tokenizer, because the default analyzer doesn't split text on . (which is a bit unexpected IMO), e.g. lmgrep --tokenizer=letter "select-keys" "**.*"
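To illustrate why a letter-based tokenizer helps with code search, here is a minimal Python sketch. It assumes a letter tokenizer simply emits maximal runs of letters. Both function names are hypothetical, and the whitespace split is only a contrast case, not a model of Lucene's default analyzer.

```python
import re

def letter_tokens(text):
    """Return maximal runs of letters; every non-letter acts as a separator."""
    # [^\W\d_] matches word characters minus digits and underscore, i.e. letters.
    return re.findall(r"[^\W\d_]+", text)

def whitespace_tokens(text):
    """Contrast case: split on whitespace only, keeping punctuation attached."""
    return text.split()

line = "(clojure.core/select-keys m [:a :b])"
print(letter_tokens(line))      # "select" and "keys" become separate tokens
print(whitespace_tokens(line))  # "select-keys" stays glued to "(clojure.core/"
```

With letter tokens, a query for select-keys can match on the terms select and keys even when the symbol is namespace-qualified.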
yeah. it would be cool if the score was returned as @U0P7ZBZCK suggests and EDN output would also be nice, so you could sort the results (e.g. pipe the results to babashka and then do some processing)
@U0P7ZBZCK, @U04V15CAJ, I agree that it would be nice to sort on score, but can you give me a hint about how you would like the output to look?
probably just maps with :file, :line, :column, the line :text (optionally) and :score?
Got it. So I imagine it will be something like lmgrep --with-score "query" GLOB, i.e. under a flag
Assuming compatibility with grep isn't a concern and results are sorted by score: [SCORE]:[FILE_PATH]:[LINE_NUMBER]:[LINE_WITH_A_COLORED_HIGHLIGHT]. I personally don't have much of a preference. I could awk/cut the output easily enough. As @U04V15CAJ suggests, edn output would be super helpful
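As a rough sketch of the post-processing discussed here, the following Python snippet parses lines in the proposed SCORE:FILE_PATH:LINE_NUMBER:TEXT shape into maps and sorts them by score. The format was only a proposal at this point in the conversation, so the function names and sample lines are made up for illustration. It also assumes the file path contains no colon.

```python
def parse_hit(raw):
    """Split one SCORE:FILE:LINE:TEXT line into a dict (format is an assumption)."""
    # maxsplit=3 keeps any colons inside the matched text intact.
    score, path, line_no, text = raw.split(":", 3)
    return {"score": float(score), "file": path, "line": int(line_no), "text": text}

def sort_hits(lines):
    """Parse all lines and return the hits with the highest score first."""
    return sorted((parse_hit(l) for l in lines), key=lambda h: h["score"], reverse=True)

# Made-up sample output, not produced by a real lmgrep run.
sample = [
    "0.42:src/core.clj:10:(select-keys m [:a])",
    "1.30:src/util.clj:3:(defn pick [m] (select-keys m ks))",
]

for hit in sort_hits(sample):
    print(hit)
```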
@U0FT43GKV Maybe you can make this more flexible by allowing a --columns argument with a comma separated list of options, which also determines the order
or even better, a template:
--template "{{score}}:{{file}},{{line}}:{{column}}:{{text}}"
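A template like the one above could be rendered along these lines. This is only a sketch of the idea in Python, not how lmgrep or clj-kondo implement it. The render function, the placeholder regex, and the sample hit are all assumptions.

```python
import re

# Matches placeholders such as {{score}} or {{file}}.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def render(template, hit):
    """Replace each {{key}} with the matching value from the hit dict."""
    # An unknown key raises KeyError, which surfaces typos in the template.
    return PLACEHOLDER.sub(lambda m: str(hit[m.group(1)]), template)

# Hypothetical search hit used only for illustration.
hit = {"score": 1.3, "file": "src/util.clj", "line": 3, "column": 17,
       "text": "(select-keys m ks)"}

print(render("{{score}}:{{file}},{{line}}:{{column}}:{{text}}", hit))
```

Compared with a fixed --columns list, a template also lets the user choose the separators between fields.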
Yeah, I was thinking about a template or a pattern as an option 👍 left it out for the first iteration
https://github.com/clj-kondo/clj-kondo/blob/master/doc/config.md#print-results-with-a-custom-format
Nice! I'll shamelessly copy it as much as possible 😄
That was fast, @U0FT43GKV! https://github.com/dainiusjocas/lucene-grep/commit/4e4556b7602e21aa4188f30c788e56e98a74220e 🔥
It was not complicated 😄
However, the issue is that with the default Lucene I can get either scoring or highlighting.
I have to implement a class that does both 🙂
@U0P7ZBZCK the feedback is welcome 😉
🙂 Scoring is more important to me. And normalization of scores too. From my prior experience with Elasticsearch, I remember that scores across indices were not comparable. I'm hoping scores across different files are 🤞 Again, thanks for taking the time to create lmgrep, @U0FT43GKV
Yeah, with elasticsearch scores are not comparable not only between indices but also between fields within an index 🙂
Scoring with lmgrep comes with gotchas. As of now, every line is scored separately. Every line is treated as a document with one field. A temporary index is created with that one document. Then the query is run against that temporary index.
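The per-line approach described above can be sketched in Python as follows. The scoring function is a toy stand-in (the share of a line's tokens that match a query term), not Lucene's scoring, and all names are hypothetical. The point is only that each line is scored in isolation, as if it were the sole document in its own index.

```python
import re

def tokenize(text):
    """Lowercased runs of letters (a simple letter-style tokenizer)."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

def score_line(query, line):
    """Toy score: fraction of the line's tokens that match a query term."""
    terms = set(tokenize(query))
    tokens = tokenize(line)
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in terms) / len(tokens)

def grep_scored(query, lines):
    """Score every line on its own and return matches, best score first."""
    hits = []
    for line_no, line in enumerate(lines, start=1):
        score = score_line(query, line)  # no information from other lines is used
        if score > 0:
            hits.append((score, line_no, line))
    return sorted(hits, key=lambda h: -h[0])

for hit in grep_scored("select-keys", ["(select-keys m ks)", "(println x)"]):
    print(hit)
```

Because no statistics are shared between lines, a score here depends only on the query and the line itself.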
I plan to write a blog post on the details in the coming week
It's that time of year again - the State of Clojure 2021 survey (https://www.surveymonkey.com/r/clojure2021) is now open! We would love to get feedback from all Clojure/ClojureScript/ClojureCLR users. Takes < 10 minutes and we release all the data. Please share with your colleagues who might not be seeing it in forums like these.

May I suggest something too, or would you prefer another way of suggesting an addition?
here's fine
please add that as an Other response - I look at those every year and anything with high responses I add for the next year
Other for that particular question that is
I review all of those from prior year
Insurance was only mentioned 8 times last year in the other responses
if only there was some way you could have been ready for the risk and claimed some kind of compensation (sry, sry)

Seems like I don't know something but it's hard to find out about it. Q24 lists both "Browsers" and "Chromium". Is there something named "Chromium" that's not a browser? Or was the intention to figure out how many people target Chromium specifically? If it's the latter, then why is knowing that important?
as mentioned above, added babashka for consideration next year. I don't think anyone is actually using clojerl in anger.
@U2FRKM4TW I think Chromium can be used independently as a component? David Nolen requested that, can't remember now why
@U050B88UR Could you please comment on the above? I'm genuinely interested but can't find any information.