This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2017-10-24
Are there any security/injection concerns when using the fulltext feature of Datomic?
be careful with these characters:
#{"\t" "\n" "\r" " " "!" "\"" "(" ")" "*" "+" "-" ":" "?" "[" "\\" "]" "^" "{" "}" "~"}
They may cause a ParseException: '*' or '?' not allowed as first character in WildcardQuery com.datomic.lucene.queryParser.QueryParser.getWildcardQuery (QueryParser.java:982)
Is there some place in the docs that says that #{"\t" "\n" "\r" " " "!" "\"" "(" ")" "*" "+" "-" ":" "?" "[" "\\" "]" "^" "{" "}" "~"}
could break fulltext searches? That set is in my code.
@mbutler The string given to the fulltext function is actually a lucene query syntax: https://lucene.apache.org/core/2_9_4/queryparsersyntax.html
there's no injection danger, but it is a minilanguage and may not match a user-facing expectation
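One way to keep the set above from reaching the query parser is to replace those characters with spaces before searching. This is a minimal sketch in plain Clojure with no Datomic dependency; `risky-chars` and `strip-risky` are hypothetical names, and the set is simply the one quoted in the conversation, written as characters instead of strings.

```clojure
(require '[clojure.string :as str])

;; The set quoted in the conversation, as characters rather than strings.
(def risky-chars
  #{\tab \newline \return \space
    \! \" \( \) \* \+ \- \: \? \[ \\ \] \^ \{ \} \~})

(defn strip-risky
  "Replaces every risky character with a space, then collapses runs of
  whitespace and trims the result."
  [s]
  (-> (apply str (map #(if (risky-chars %) \space %) s))
      (str/replace #"\s+" " ")
      str/trim))

(strip-risky "*foo? (bar)") ;; => "foo bar"
```

Input made up only of these characters comes back as an empty string, so a caller would probably want to skip the fulltext search in that case. Stripping also throws away the query syntax entirely, which may or may not be what a user-facing search box wants.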
Thanks, @souenzzo @favila. I knew about the lucene query syntax; I rely on it to handle searching of emails, as the tokenizer seems to split on @
(e.g. I turn "
).
Just wanted to check that the worst thing that can happen is a Parse Exception, rather than any security concern.
As I understand it, it's still not possible to change the tokenizer's settings or use a different tokenizer, but do you know if it's possible to escape the input so that I can get the full string "
into the index?
however, what kind of query are you doing? Sounds like exact-match? in which case why use fulltext at all?
I was just using the exact match as the most clear example
you could make your datomic query try exact match (normal indexed field), and use that to boost scores
yeah, I considered/was doing that at some point. Gets a bit messy since I allow a variable number of fields to constrain the search, and since I do that it's not a big deal that I have to treat email a bit oddly
Was just hoping that there was some easy answer to the tokenizer problem 🙂
Also considered doing as you said and storing a "normalised" version of email, but it seemed like more tech debt than it was worth. If the current implementation proves too poor a UX, I'll probably move to doing that.
Thanks for the advice btw 🙂
don't forget about query rules to abstract some of this. e.g.:
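;; exact match on the indexed attribute gets a fixed high score
'[[(email-search [?email] ?e ?score)
   [?e :email-attr ?email]
   [(ground 2.0) ?score]]
  ;; fulltext match: the search string goes in, [entity value tx score] tuples come out
  [(email-search [?email] ?e ?score)
   [(fulltext $ :email-attr ?email) [[?e _ _ ?score]]]]]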
Trying to model the behaviour of the query in this case. If you invoke this rule once, it's a logical or, right? So in the case where the exact match returned, it would bind that ?e and "exit early", and not try to do the fulltext?
It would still try both, but you can aggregate with :find (max ?score) ?e to dedup and make exact matches float higher.
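To make the dedup step concrete, here is a small sketch of what the max-score aggregation does, written over plain data so it runs without a database. The `best-scores` name and the sample rows are hypothetical; the rows stand in for the `[?e ?score]` pairs that both branches of the rule would produce.

```clojure
(defn best-scores
  "Given [entity score] pairs, keeps the highest score per entity and
  returns [entity score] pairs sorted by descending score."
  [rows]
  (->> rows
       ;; keep the largest score seen so far for each entity
       (reduce (fn [acc [e score]]
                 (update acc e (fnil max score) score))
               {})
       (sort-by val >)
       (mapv (fn [[e score]] [e score]))))

;; Entity 1 matched both the exact branch (2.0) and the fulltext branch (0.7).
(best-scores [[1 2.0] [1 0.7] [2 0.5] [3 1.1]])
;; => [[1 2.0] [3 1.1] [2 0.5]]
```

Entity 1 appears once in the result, carrying the exact-match score, so it sorts ahead of the fulltext-only hits. The 2.0 constant only works as a boost if fulltext scores stay below it, which is an assumption here.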
http://lucene.apache.org/core/3_6_1/api/core/org/apache/lucene/queryParser/QueryParser.html#escape(java.lang.String) there's an escape function too
I thought lucene was an implementation detail though. It would be nice if fulltext escaping was provided by datomic, so you didn't have to depend on this.
awesome @dominicm
(com.datomic.lucene.queryParser.QueryParser/escape "|&&|")
=> "\\|\\&\\&\\|"
Maybe datomic.api could wrap this function (d/fulltext-escape s)
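Until something like that exists, a wrapper that does not reach into the bundled Lucene classes could look roughly like the sketch below. `fulltext-escape` here is a hypothetical local function, not part of datomic.api, and the character set is an assumption based on what Lucene's QueryParser/escape is understood to escape; check it against the Lucene version in use.

```clojure
;; Characters assumed to be special in the Lucene query syntax.
(def lucene-special
  #{\\ \+ \- \! \( \) \: \^ \[ \] \" \{ \} \~ \* \? \| \&})

(defn fulltext-escape
  "Prefixes every special character in s with a backslash."
  [s]
  (apply str (mapcat #(if (lucene-special %) [\\ %] [%]) s)))

(fulltext-escape "|&&|")
;; => "\\|\\&\\&\\|"
```

The result for "|&&|" matches the REPL output shown above. Escaping keeps the characters in the query as literals, but it does not change how the tokenizer splits the indexed values.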
How are folks achieving ordering on cardinality many attributes?