#xtdb
2019-06-30
jjttjj16:06:33

What am I doing wrong with this query:

(crux/q (crux/db system)
        '{:find  [e t1]
          :where [[e :end-time t1]
                  [(after? t1 t2)]]
          :args  [{:t2 (.minus (ZonedDateTime/now) (Duration/parse "P100D"))}]})
Ignore any of the valid-time crux features for now (I have more questions about that after 🙂). I have a field :end-time that's a java.time.ZonedDateTime. Is it possible to make queries for times that come after it like this? I'm currently getting
Execution error (ClassCastException) at java.lang.Class/cast (Class.java:3606).
Cannot cast clojure.lang.PersistentList to java.time.chrono.ChronoZonedDateTime

jjttjj17:06:07

oh wait, the t2 expression is quoted. I just needed to move the quotes to the individual find/where clauses instead of the whole query map. Never mind!

refset19:06:10

Cool, I think we've all been there! I added a note in the docs about this exact thing just a couple of days ago 🙂
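
The quoting mistake described above can be reproduced in plain Clojure, without a Crux node. A minimal sketch that reuses the clauses from the question; the names `quoted-whole`, `quoted-parts` and `t2-of` are made up for illustration, and the maps are only built here, never passed to `crux/q`:

```clojure
(import '(java.time ZonedDateTime Duration))

;; Quoting the whole map: nothing inside it is evaluated,
;; so the value under :t2 stays an unevaluated list form.
(def quoted-whole
  '{:find  [e t1]
    :where [[e :end-time t1]
            [(after? t1 t2)]]
    :args  [{:t2 (.minus (ZonedDateTime/now) (Duration/parse "P100D"))}]})

;; Quoting only the :find and :where clauses: the :args expression
;; is evaluated normally and yields a ZonedDateTime.
(def quoted-parts
  {:find  '[e t1]
   :where '[[e :end-time t1]
            [(after? t1 t2)]]
   :args  [{:t2 (.minus (ZonedDateTime/now) (Duration/parse "P100D"))}]})

;; Hypothetical helper: pulls the :t2 argument out of a query map.
(defn t2-of [query]
  (-> query :args first :t2))

(list? (t2-of quoted-whole))                   ;; => true
(instance? ZonedDateTime (t2-of quoted-parts)) ;; => true
```

The first result matches the ClassCastException in the question: the value handed over as `t2` was a `clojure.lang.PersistentList`, not a date-time.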

jjttjj17:06:11

Ok, now for the higher level question 🙂 I have a bunch of stock market time series data in the form of OHLC over time, for a lot of different stocks. This seems like a similar use case to the one described here: https://github.com/juxt/crux/issues/47
I am currently adding a document per stock series, with a “data” attribute that just stores the full sorted-map of all the data for that stock. This works fine for now, as the data is observed at a daily interval with a few thousand points per stock. Eventually I want to support more fine-grained data, i.e. a point per minute, and will need to partition the data across documents, probably with a “head” document that links to the subdocuments where data can be found for that stock at a given time, so that I can query for a subset of the data in time. So in short, I have documents that contain a series of data, plus keys describing that series’ start and end times, and the one main query type I will need to support is selecting a subset of the data by start/end time.
My question is, does it seem like the valid-time features of crux buy me anything for this usage pattern? I could set the valid-time of each series document to the last observed moment in the series, so that I could use the history api to go back to that point in time. But this won't help for a series whose end time is after the queried time but which contains data that is before the queried time. And even then I would need to implement the search for a start time manually. So am I correct that in a situation like this valid-time doesn’t really buy me anything and it’s best to just ignore it and query for start/end times manually? Or is there an obvious gain that I’m missing here?
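
The partitioning scheme and the manual start/end query described above can be sketched in plain Clojure data, independent of Crux. This is only an illustration under assumptions: the `:series/...` keys, the function names, the ticker "ABC" and the sample bars are all hypothetical, and the chunks are kept in memory rather than stored as documents.

```clojure
(import '(java.time Instant))

;; Splits a sorted map of Instant -> bar into chunks of at most n points.
;; Each chunk records the start and end time of the points it holds.
(defn chunk-series [ticker n series]
  (map-indexed
   (fn [i points]
     {:series/id         (keyword (str ticker "-" i))
      :series/ticker     ticker
      :series/start-time (key (first points))
      :series/end-time   (key (last points))
      :series/data       (into (sorted-map) points)})
   (partition-all n series)))

;; A chunk is relevant when its [start, end] interval overlaps [from, to].
;; This also covers a chunk that ends after `to` but starts before it.
(defn overlaps? [{:series/keys [start-time end-time]} from to]
  (and (not (.isAfter start-time to))
       (not (.isBefore end-time from))))

;; Picks the overlapping chunks, then trims their points to [from, to].
(defn select-range [chunks from to]
  (->> chunks
       (filter #(overlaps? % from to))
       (mapcat :series/data)
       (filter (fn [[t _]]
                 (and (not (.isBefore t from))
                      (not (.isAfter t to)))))
       (into (sorted-map))))

;; Ten made-up one-minute bars, split into chunks of four points.
(def series
  (into (sorted-map)
        (for [i (range 10)]
          [(.plusSeconds (Instant/parse "2019-06-28T14:30:00Z") (* 60 i))
           {:close (+ 100 i)}])))

(def chunks (chunk-series "ABC" 4 series))

(def result
  (select-range chunks
                (Instant/parse "2019-06-28T14:33:00Z")
                (Instant/parse "2019-06-28T14:35:00Z")))

(mapv :close (vals result)) ;; => [103 104 105]
```

The requested range straddles the first two chunks, so both are selected by the overlap test and then trimmed to the three points inside the range.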

refset20:06:37

Wow, thanks for all the upfront detail here! I think your logic is generally sound but I will have to spend some time reflecting before I can give a confident answer / recommendation.

refset20:06:08

My instinct says that valid-time really is only useful for this kind of data when you want to make corrections, or when you want to correlate stock series from multiple sources (that are arriving out of order)

refset20:06:49

I could arrange a video call to discuss more easily if you'd be interested - just let me know 🙂 (though I will write back in any case!)

👍 4
jjttjj12:07:54

Thanks so much for the response! That neo4j link seems useful and I will dig into that. And thanks for the offer to video chat! I think I'd like to dig a bit deeper into the problem first and then it would be great to discuss it. But I'm traveling this week and might not have much time, so maybe the middle of next week if that's ok with you.

✔️ 4
refset12:07:43

Cool, alright, let's stay in touch! (and I will try to do some more thinking)