#datomic
2017-12-08
caleb.macdonaldblack00:12:39

So reading further I learned that [[?title ?box-office]] is an input and is a vector of vectors. It is being destructured

caleb.macdonaldblack00:12:06

And the ?box-office value has no effect on the query itself; it is simply output alongside the title

daveliepmann08:12:56

This indeed looks very weird. I don't see a way to experiment without recreating the entire dataset locally but this example looks half-finished. For instance, all the outputs are also inputs?

manutter5112:12:53

What’s missing is the larger context. It’s been a while since I did that tutorial, but as I recall, this is illustrating how to combine db + external data to answer questions like “Show the title and box-office receipts for each movie directed by a specific director.”

manutter5112:12:32

No one data source has all the information you need: the db is missing the box-office info, and the list of movies is missing the director info.

manutter5112:12:14

This query combines the two in a way that lets you get the specific answer you want.

daveliepmann14:12:54

@U06CM8C3V I see that. But is ?director meant to be input, or output? Without running the query, it's unclear to me that ?director gets used at all except as a filter to verify that titles in the input list have a director at all.

manutter5114:12:23

Director is an input

manutter5114:12:59

You can tell because it’s listed on the :in line

manutter5115:12:57

In datomic you would call it like this:

(d/q '[:find ?title ?box-office
       :in $ ?director [[?title ?box-office]]
       :where
       [?p :person/name ?director]
       [?m :movie/director ?p]
       [?m :movie/title ?title]]
     db "Stanley Kubrick" box-office-receipts)

daveliepmann23:12:09

I think I now see the confusion. I was aware that ?director was listed as an input; what was unclear was whether that was intentional. The phrase "to find box office earnings for a particular director" made me think it was intended to return something like ?director and the sum of box office earnings. I see it's just meant to subset the input list (which already has most of the info needed) based on information that the database has (the director).

sova-soars-the-sora13:12:04

Datomic fam, I have a query for you all 😃 I have a rating system. I want to let users rate things but only show the effects of their rating 24 hrs later, meaning someone upvotes something, someone downvotes something, but the effects aren't apparent in the score until 24 hrs later. Can I structure a datomic query to get all the results that are at least 24 hours old?

stijn13:12:45

@sova you can query the database 24 hours in the past by using (datomic.api/as-of db some-date) and pass that database to the query
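A minimal sketch of that, assuming a `conn` var and `:item/title` / `:item/score` attributes (all hypothetical here, not from this thread):

```clojure
;; Hedged sketch: `conn` and the :item/* attributes are assumptions.
;; Requires datomic.api on the classpath.
(require '[datomic.api :as d])

(let [day-ms    (* 24 60 60 1000)
      yesterday (java.util.Date. (- (System/currentTimeMillis) day-ms))
      db-24h    (d/as-of (d/db conn) yesterday)]
  ;; Scores as they stood 24 hours ago; later votes are invisible.
  (d/q '[:find ?title ?score
         :where
         [?e :item/title ?title]
         [?e :item/score ?score]]
       db-24h))
```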

sova-soars-the-sora14:12:28

some things are so simple 😄

kenbier19:12:06

I wonder if anyone has ever had to join datomic data stored on postgres with postgres data, for a very simple join.

kenbier19:12:25

I.e. a postgres table has a unique column with an id; you want to join against a datomic entity with the same id as a lookup ref, and add some datomic attribute values to the resulting rows.

kenbier19:12:41

In theory you could do it?

kenbier20:12:32

in other words, use SQL to look up an attribute’s current value using only the entity lookup ref.

kenbier20:12:02

It may require hardcoding the attribute id when creating the schema for better performance though.

kenbier20:12:46

I ask this because some teams are interested in exploring datomic for its audit and read scalability, but they do merge some of their CRUD app data with Postgres tables to fill in missing columns. And these Postgres tables are often owned by other teams.

favila20:12:13

@kenbier are you wanting to do this in a datomic query? in theory you could, just call a function inside your datomic query to lookup postgres

favila20:12:21

use the returned value for further joins

kenbier20:12:08

@favila no, from a SQL query

kenbier20:12:09

the use case would be query a postgres table, but join in some missing columns from datomic data also stored in postgres

kenbier20:12:43

the postgres primary key would be a lookup ref in datomic

favila20:12:08

that's going to be tough

favila20:12:17

it's better just to process the result after

kenbier20:12:23

in memory?

favila20:12:05

are you using the missing columns purely for select, or for where/group-by/etc

kenbier20:12:16

just select

favila20:12:29

then postprocess should be easy, no additional memory burden

favila20:12:09

to do what you want would require issuing a datomic lookup from within the postgresql server itself

favila20:12:40

you could probably write a stored procedure which called out to datomic, but your data fetch is going to leave the postgres query process no matter what you do

favila20:12:25

why burden the postgresql server with that cpu load and additional complexity just so SELECT works?

kenbier20:12:49

because our BI is going to be querying our data warehouse, not just applications.

kenbier20:12:44

they are going to want to join in missing fields, like an entity’s title based on an id, for example, to generate reports and look into insights for customers

kenbier20:12:17

siloing data from the rest of our large organization really hurts the value prop of adopting a new database engine, even if the data lives on postgres in reality.

kenbier20:12:04

though i am curious how to do the postgres query from datomic, the aforementioned is a more pressing issue for adoption.

favila20:12:00

you would be better off having a derived sql-shaped view of your datomic data I think

favila20:12:46

you can have a process read the datomic tx queue and write the data you want into postgres

favila20:12:56

basically a streaming materialized view of sorts

favila21:12:16

at that point it's just normal postgres data, so you can use it in queries

favila21:12:25

and it's kept up to date automatically

favila21:12:39

the source of truth would still be datomic

kenbier21:12:07

oh wow, that’s not a bad idea

kenbier21:12:29

what if the postgres server fails over? how would i pickup where i left off in the txn queue?

favila21:12:56

write each transaction’s T into postgres

favila21:12:53

you can then restart your stream process from there

favila21:12:13

(playing catch-up from datomic's tx-log) before switching to the live-streaming tx-queue
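A rough sketch of that catch-up-then-stream loop, assuming you persist the last applied basis-t in postgres (`last-t` and `apply-tx!` are hypothetical names for illustration):

```clojure
;; Hedged sketch: `last-t` (read back from postgres on restart) and
;; `apply-tx!` (writes one transaction's datoms into postgres) are assumptions.
(require '[datomic.api :as d])

(defn resume-stream [conn last-t apply-tx!]
  (let [queue (d/tx-report-queue conn)]        ; subscribe before catching up
    ;; 1. replay the durable tx-log from where we left off
    (doseq [tx (d/tx-range (d/log conn) (inc last-t) nil)]
      (apply-tx! (:t tx) (:data tx)))
    ;; 2. switch to the live queue, skipping anything already replayed
    (loop [applied-t last-t]
      (let [report (.take queue)
            t      (d/basis-t (:db-after report))]
        (if (> t applied-t)
          (do (apply-tx! t (:tx-data report))
              (recur t))
          (recur applied-t))))))
```

Subscribing to the queue before replaying the log means transactions landing during catch-up are seen twice, which the basis-t comparison handles.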

kenbier21:12:42

and the postgres query within a datomic query?

favila21:12:02

datomic queries are run on the querying machine, not the transactor, so they can call anything they want

kenbier21:12:36

as a param I pass in? like the db param, but a second one as a bunch of rows?

favila21:12:40

:where [(some-postgres-result x) [[?col1 ?col2 ?col3]]]

favila21:12:49

whatever you want

favila21:12:09

you just need a clojure function that returns something shaped appropriately for query destructuring
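For example, a function returning rows shaped like the binding form can be called straight from :where (the `fetch-rows` helper, the :entity/* attributes, and `pg-spec` are all hypothetical here):

```clojure
;; Hedged sketch: names are illustrative, not a real schema.
(require '[datomic.api :as d])

(defn fetch-rows
  "Stands in for a jdbc call, e.g. (jdbc/query pg [\"select id, title from items\"]).
   Must return something destructurable as [[?id ?title] ...]."
  [pg]
  [[1 "Foo"] [2 "Bar"]])

(d/q '[:find ?id ?title ?status
       :in $ ?pg
       :where
       [(user/fetch-rows ?pg) [[?id ?title]]]  ; bind the rows as a relation
       [?e :entity/id ?id]
       [?e :entity/status ?status]]
     db pg-spec)
```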

favila21:12:22

that function can do anything (thread safe) you want

kenbier21:12:32

jdbc does that already, I thought? it’s a vector of vectors?

favila21:12:00

I don't know offhand, depends on what you use

kenbier21:12:17

datomic is back on the menu!

favila21:12:41

you can supply the postgres transaction context as a :in var if you want

favila21:12:14

datomic queries don't care what's in there unless they start with $ or you try to destructure them

favila21:12:32

so you can just give that value to your functions which do your postgres selects