This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
https://github.com/dainiusjocas/lucene-grep Lucene based grep-like utility compiled with GraalVM native-image. Grab a binary and tell me what you think. Cheers!
If I do this in a Clojure repo:
lmgrep "select-keys" .
should this work? It turns up empty
the problem with glob is always: is it recursive or not? this is always different per platform
@U0P7ZBZCK as of now it is not supported, but there is a Class in Lucene that does just that, so it is possible
@U04V15CAJ for code search I'd suggest specifying the letter tokenizer, because the default analyzer doesn't split text on ., which is a bit unexpected IMO, e.g.
lmgrep --tokenizer=letter "select-keys" "**.*"
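To illustrate what splitting at non-letters means for code search, here is a minimal Python sketch. It is not lmgrep's or Lucene's implementation, only an approximation of the idea that a letter tokenizer emits runs of letters and treats everything else (including . and -) as a separator. The name letter_tokenize is hypothetical.

```python
import re

# Runs of letters; digits, underscores, and punctuation act as separators.
# This only approximates a letter tokenizer and is an assumption of this sketch.
LETTER_RUN = re.compile(r"[^\W\d_]+")


def letter_tokenize(text):
    """Return the runs of letters found in text, in order."""
    return LETTER_RUN.findall(text)


line = "(clojure.core/select-keys m [:a :b])"
print(letter_tokenize(line))
# ['clojure', 'core', 'select', 'keys', 'm', 'a', 'b']
```

With tokens like these, a query for "select-keys" can match a namespaced call such as clojure.core/select-keys, because the dot no longer glues the parts into one token.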
yeah. it would be cool if the score was returned as @U0P7ZBZCK suggests and EDN output would also be nice, so you could sort the results (e.g. pipe the results to babashka and then do some processing)
probably just maps with :file, :line, :column, the line :text (optionally) and :score?
Got it. So I imagine it will be something like lmgrep --with-score "query" GLOB, i.e. under a flag
Assuming compatibility with grep isn't a concern and results are sorted by score: [SCORE]:[FILE_PATH]:[LINE_NUMBER]:[LINE_WITH_A_COLORED_HIGHLIGHT]. I personally don't have much of a preference. I could awk/cut the output easily enough. As @U04V15CAJ suggests, EDN output would be super helpful
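As a rough sketch of how such score-prefixed output could be post-processed, the Python below parses lines in the proposed SCORE:FILE_PATH:LINE_NUMBER:TEXT shape and sorts them by score. The format is only a proposal in this thread, the sample lines are made up, and the sketch assumes file paths contain no colons.

```python
def parse_scored_line(raw):
    """Parse 'SCORE:FILE_PATH:LINE_NUMBER:TEXT' into a dict (hypothetical format)."""
    # Split at most three times so colons inside the matched text survive.
    score, path, line_no, text = raw.split(":", 3)
    return {"score": float(score), "file": path, "line": int(line_no), "text": text}


# Made-up sample output, for illustration only.
sample = [
    "0.42:src/core.clj:10:(select-keys m [:a])",
    "1.37:src/util.clj:3:(defn pick [m] (select-keys m [:a :b]))",
]

# Highest score first.
ranked = sorted(
    (parse_scored_line(r) for r in sample),
    key=lambda r: r["score"],
    reverse=True,
)
for r in ranked:
    print(r["score"], r["file"], r["line"])
```

Structured output such as EDN would make the parsing step unnecessary, which is the appeal of the suggestion above.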
@U0FT43GKV Maybe you can make this more flexible by allowing a --columns argument with a comma-separated list of options, which also determines the order
or even better, a template:
Yeah, I was thinking about a template or a pattern as an option 👍 left it out for the first iteration
However, the issue is that with the default Lucene I can get either scoring or highlighting
🙂 Scoring is more important to me. And normalization of scores too. From my prior experience with Elasticsearch, I remember that scores across indices were not comparable. I'm hoping scores across different files are 🤞 Again, thanks for taking the time to create lmgrep, @U0FT43GKV
Yeah, with Elasticsearch scores are not comparable, not only between indices but also between fields within an index 🙂
Scoring in lmgrep comes with gotchas. As of now, every line is scored separately: each line is treated as a document with one field, a temporary index is created containing just that one document, and the query is then run against that temporary index.
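To make the per-line scoring concrete, here is a toy Python sketch. It does not use Lucene and swaps in a simple length-normalized term-frequency score instead of Lucene's similarity. The names tokenize and score_line are hypothetical. The point it shows is that each line is scored in isolation, so no statistics from other lines or files enter the score.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercased runs of letters (a stand-in for a real analyzer)."""
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


def score_line(query, line):
    """Score one line as if it were the only document in a temporary index."""
    terms = tokenize(line)
    if not terms:
        return 0.0
    counts = Counter(terms)
    # Toy score: matched term frequency, normalized by sqrt of line length.
    matched = sum(counts[q] for q in set(tokenize(query)))
    return matched / math.sqrt(len(terms))


lines = [
    "(select-keys m [:a])",
    "(println :hello)",
]
for text in lines:
    print(score_line("select-keys", text), text)
```

Because every line is its own one-document index in this sketch, the score depends only on the query and that line, never on how common a term is elsewhere.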
It's that time of year again - the survey at https://www.surveymonkey.com/r/clojure2021 is now open! We would love to get feedback from all Clojure/ClojureScript/ClojureCLR users. It takes < 10 minutes and we release all the data. Please share with your colleagues who might not be seeing it in forums like these.
May I suggest something too, or would you prefer another way of suggesting an addition?
please add that as an Other response - I look at those every year and anything with high responses I add for the next year
if only there was some way you could have been ready for the risk and claimed some kind of compensation (sry, sry)
Seems like I'm missing something, but it's hard to find out what. Q24 lists both "Browsers" and "Chromium". Is there something named "Chromium" that's not a browser? Or was the intention to figure out how many people target Chromium specifically? If it's the latter, then why is knowing that important?
as mentioned above, added babashka for consideration next year. I don't think anyone is actually using clojerl in anger.
@U2FRKM4TW I think Chromium can be used independently as a component? David Nolen requested that, can't remember now why