This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-03-26
I’m currently writing a ticket for my project, in advance of the next Crux release, to update a few of our queries that currently use eql/project. Is the new pull going to be considered more stable going forward, or is it slated to be alpha and subject to change again? If it’s the former, I’ll just plan on us fixing/tweaking it for the new api; if the latter, then I at least want to consider the option of removing the use of project/pull from queries so we don’t churn on future updates.
Hey @U01GXCWSRMW we're definitely dropping the "ALPHA" warning with the rename, as per https://github.com/juxt/crux/commit/a0dc81f41e21542b1e370f37a20dbfec2d811309#diff-868d3597c7324a08da9c6f15712d4d98972d2f582e9b1314dcc0c53b5c096fc5L130 ...so you should be able to depend on it even more confidently 🙂
@U899JBRPF Any idea when the next release will be? Looks like lots of changes since the last one…
The timetable is currently looking like Wednesday or Thursday next week - although I can't confirm until Tuesday. Not long though!
Hello! I'm playing with the lucene search (great feature btw! 🙂), and having some issues. Not sure if this should work:
(defn lucene-text-query [title]
  (crux/q
   (crux/db (-> system :db :db))
   '{:find [?e ?s ?t]
     :in [?q]
     :where [[(lucene-text-search "mext.headline\\/title:%s" ?q) [[?e ?s]]]
             [?e :mext.headline/title ?t]]
     :order-by [[?s :desc]]}
   title))
(defn lucene-text-query [title]
  (crux/q
   (crux/db (-> system :db :db))
   '{:find [?e ?s ?t]
     :in [?q]
     :where [[(lucene-text-search "mext.headline\\/title:%s" "cov*") [[?e ?s]]]
             [?e :mext.headline/title ?t]]
     :order-by [[?s :desc]]}
   title))
just ignore ?q in this case and passing the value. The doc says it will be passed via format
the query does work if I do the formatting before the param is passed to the query, which is what I would do anyway:
(defn lucene-text-query [title]
  (crux/q
   (crux/db (-> system :db :db))
   '{:find [?e ?s ?t]
     :in [?q]
     :where [[(lucene-text-search ?q) [[?e ?s]]]
             [?e :mext.headline/title ?t]]
     :order-by [[?s :desc]]}
   (format "mext.headline\\/title:%s" title)))
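A minimal sketch of the pre-formatting approach above, pulled out into a helper so the field name and escaping live in one place. The name lucene-field-query is hypothetical (it is not part of Crux), and it only escapes the / in the attribute name, as the thread does; it does not escape Lucene special characters in the search term itself.

```clojure
(require '[clojure.string :as str])

(defn lucene-field-query
  "Hypothetical helper: builds a \"field:term\" Lucene query string for a
  (possibly namespaced) attribute keyword, escaping the / in the field name."
  [attr term]
  (let [field (-> (subs (str attr) 1)        ; drop the keyword's leading colon
                  (str/replace "/" "\\/"))]  ; literal replace: / becomes \/
    (str field ":" term)))

;; Builds the same string as (format "mext.headline\\/title:%s" "cov*")
(lucene-field-query :mext.headline/title "cov*")
```

The resulting string can then be passed as the ?q argument, exactly as in the last version of lucene-text-query above.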
I had initially tried the Lucene multi-field stuff before switching to wildcard and I had a similar issue (though I wasn't sure if it was the way I was using it or not)... from what I remember of digging through the Lucene docs, that Lucene multi-field string is surprisingly finicky.
The test doesn't actually cover the internal formatting case described in the docs, I just noticed: https://github.com/juxt/crux/blob/master/crux-lucene/test/crux/lucene/multi_field_test.clj
Did you find a test case that provided you an example that helped you get the built-in formatter to work?
I thought it's just the standard clojure format (and I guess it is), but you are right, there are no tests that cover the same example that exists in the docs
Hey @U050BA9V5 thanks for reporting this, I've now fixed it ahead of the new release today 🙂 https://github.com/juxt/crux/commit/8d07a45deaee2b64f029ba2eadc6c3a23cb597ed
hello! nice! thanks! btw, I also noticed something weird while I was playing with this. I didn't have time to dig deeper into it, so I'll just quickly explain it before I forget. I was poking at this for a few hours on a db that had very little data, but I noticed my disk led constantly flashing, and I realized that it was this java process that was constantly doing I/O. This process had around 10G of I/O over a few hours, and I only had 30-50 documents with a few fields. Didn't seem normal 🙂
Oh, hmm, that does sound weird! Which OS are you using? Is that with Rocks for tx-log and doc-store?
I'm on Arch Linux. It's Rocks for the index and Postgres for the tx and doc stores. I only noticed it when I was poking at the text search.
Exception in thread "crux-polling-tx-consumer" java.nio.channels.ClosedByInterruptException
at java.base/java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:199)
at java.base/sun.nio.ch.FileChannelImpl.endBlocking(FileChannelImpl.java:162)
at java.base/sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:388)
at org.apache.lucene.store.NativeFSLockFactory$NativeFSLock.ensureValid(NativeFSLockFactory.java:182)
at org.apache.lucene.store.LockValidatingDirectoryWrapper.createOutput(LockValidatingDirectoryWrapper.java:43)
at org.apache.lucene.index.SegmentInfos.write(SegmentInfos.java:484)
at org.apache.lucene.index.SegmentInfos.prepareCommit(SegmentInfos.java:804)
at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:4914)
at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:3308)
at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:3597)
at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1099)
at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:1140)
at crux.lucene$__GT_lucene_store$fn__6357.invoke(lucene.clj:239)
at crux.bus.EventBus$fn__19497.invoke(bus.clj:104)
at clojure.lang.AFn.run(AFn.java:22)
at crux.lucene$$reify__6355.execute(lucene.clj:237)
at crux.bus.EventBus.send(bus.clj:104)
at crux.tx.InFlightTx.commit(tx.clj:343)
at crux.tx$index_tx_log$fn__20371$fn__20376.invoke(tx.clj:440)
at crux.tx$index_tx_log$fn__20371.invoke(tx.clj:429)
at crux.tx$index_tx_log.invokeStatic(tx.clj:421)
at crux.tx$index_tx_log.invoke(tx.clj:419)
at crux.tx$__GT_polling_tx_consumer$fn__20392.invoke(tx.clj:464)
at clojure.lang.AFn.run(AFn.java:22)
at java.base/java.lang.Thread.run(Thread.java:834)
Suppressed: org.apache.lucene.util.ThreadInterruptedException: java.lang.InterruptedException
at org.apache.lucene.index.IndexWriter$EventQueue.close(IndexWriter.java:369)
at org.apache.lucene.index.IndexWriter.rollbackInternalNoCommit(IndexWriter.java:2300)
at org.apache.lucene.index.IndexWriter.rollbackInternal(IndexWriter.java:2267)
at org.apache.lucene.index.IndexWriter.shutdown(IndexWriter.java:1104)
... 14 more
Caused by: java.lang.InterruptedException
at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1343)
at java.base/java.util.concurrent.Semaphore.acquire(Semaphore.java:475)
at org.apache.lucene.index.IndexWriter$EventQueue.close(IndexWriter.java:367)
... 17 more
thanks for sharing that detail. It might now be resolved by this change I made anyway https://github.com/juxt/crux/pull/1468
if you have 5m spare you could try to test against the latest dev-SNAPSHOT
we released yesterday, but no pressure 🙂
damn, thanks for trying though. I'll try to repro this today. Can you .close the node before doing the reset as a workaround?
this project is open source and it should be relatively easy to bootstrap: https://github.com/kongeor/mext/blob/alpine/src/clj/mext/systems.clj if that saves you some time in order to set things up.
btw, that's a toy - almost like a scratchpad - project, I just do experimentation on it 🙂
> disk led constantly flashing
That caught me off guard, leading me to realize I haven't actually had a desktop computer in a decade. 😳
Playing around with crux-sql for the first time today and I’m curious if there’s a way to work around a data problem. The query in the :crux.sql.table/query needs to include all the attributes of the document I want to express out in the :crux.sql.table/columns, but if I have some records that don’t have one of the columns, they get excluded from the results. Makes perfect sense. I’m not surprised it would do that, since my datalog query doesn’t match the entity. But nullable data is pretty common in sql-land.
I’m admittedly still quite weak at datalog and still feeling my way around with Crux so maybe there’s a simple or clever solution to this. Or maybe not?
That's right, the SQL columns require there to be some value in the index under the given attribute, so you would need to explicitly store nil if you want the entity to appear in the table. The nil values are then treated the same as SQL's NULL, see https://github.com/juxt/crux/blob/master/crux-sql/test/crux/calcite_test.clj#L361-L368
This is pretty much the case for modelling with Datalog also, i.e. explicitly storing nils is usually the right strategy. Otherwise certain shapes of queries can require ~exhaustive scanning of indexes
Makes sense. And workable. Thanks.
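One way to apply the "explicitly store nil" advice is to normalise documents before they are submitted. This is only a sketch: with-explicit-nils and the :person/... attribute names are hypothetical, and the helper is plain Clojure that does not call Crux at all.

```clojure
(defn with-explicit-nils
  "Hypothetical helper: returns doc with every attribute in attrs present,
  using nil for any that are missing. Existing values are kept as they are."
  [attrs doc]
  (merge (zipmap attrs (repeat nil)) doc))

;; :person/email is absent from the input, so it is added with a nil value;
;; :person/name keeps its original value.
(with-explicit-nils [:person/name :person/email]
                    {:crux.db/id :ivan :person/name "Ivan"})
```

The attribute list would mirror whatever the table's query and columns refer to, so that every document has some value, possibly nil, under each of them.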
> This is pretty much the case for modelling with Datalog also, i.e. explicitly storing nils is usually the right strategy. Otherwise certain shapes of queries can require ~exhaustive scanning of indexes
that's surprising, I thought clojure usually doesn't like to model data this way. Could you give an example of a query that would cause scanning?
This is fast when explicit nils are stored, because it can look up the :att nil combination in the AVE index:
{:find [e]
 :where [[e :att nil]]}
However this version, where nils are not stored in the documents or indexes, has to scan through all e's to look for ones that don't contain :att values:
{:find [e]
 :where [[e :crux.db/id]
         (not [e :att])]}
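The difference between the two queries can be illustrated with a toy in-memory model. This is a sketch under stated assumptions, not how Crux implements its indexes: docs, ave, entities-with and entities-missing are all hypothetical names, and the "index" is just a nested map from attribute to value to entity ids.

```clojure
(def docs
  [{:crux.db/id 1 :att nil}   ; explicit nil stored
   {:crux.db/id 2 :att "x"}
   {:crux.db/id 3}])          ; :att absent altogether

;; Toy AVE-style index: attribute -> value -> set of entity ids.
(def ave
  (reduce (fn [idx doc]
            (let [id (:crux.db/id doc)]
              (reduce-kv (fn [idx a v]
                           (update-in idx [a v] (fnil conj #{}) id))
                         idx
                         (dissoc doc :crux.db/id))))
          {}
          docs))

(defn entities-with
  "Direct lookup of one attribute/value pair; nil is an ordinary key here."
  [a v]
  (get-in ave [a v] #{}))

(defn entities-missing
  "Absence is not indexed, so every document has to be visited."
  [a]
  (->> docs
       (remove #(contains? % a))
       (map :crux.db/id)
       set))

(entities-with :att nil)   ; single map lookup
(entities-missing :att)    ; walks all of docs
```

In this model the first call touches one entry, while the second grows with the number of documents, which is the shape of the cost difference described above.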
@U899JBRPF is binding to nil supposed to work? It seems like it is equivalent to not including the clause at all:
(c/q (:app.crux/node integrant.repl.state/system)
     '{:find [?link]
       :in [x]
       :where [[?e2 :embed/content x]
               [?e2 :embed/id ?link]]}
     nil)
returns empty set, while
(c/q (:app.crux/node integrant.repl.state/system)
     '{:find [?link]
       :in [x]
       :where [[?e2 :embed/content nil]
               [?e2 :embed/id ?link]]})
returns everything
@U899JBRPF bumping this, not sure if you saw. quick repro:
(c/put node {:crux.db/id 1 :foo nil})
(c/put node {:crux.db/id 2 :foo 2})
@(c/q node '{:find [?e]
             :where [[?e :foo nil]]})
;; #{[2] [1]}
@U797MAJ8M I'm opening an issue for this tomorrow, but essentially nil is being treated like _ in your query. As a workaround you can wrap the nil in a literal set #{nil} and it should work as you were expecting
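Applied to the quick repro above, the suggested workaround would look roughly like this. It is a sketch only: the name foo-is-nil-query is hypothetical, and since the query is plain data it is defined here without a node; whether it returns just entity 1 depends on the Crux version in use.

```clojure
;; The query from the repro with the bare nil wrapped in a literal set.
(def foo-is-nil-query
  '{:find [?e]
    :where [[?e :foo #{nil}]]})

;; It would be passed to the same query call used in the repro,
;; in place of the original '{:find [?e] :where [[?e :foo nil]]} map.
(:where foo-is-nil-query)
```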