#datomic
2020-02-14
Sam DeSota03:02:41

I'm starting to look into full-text search with cloud. I'd like to use a full-text search database hosted in the Datomic VPC, and sync data via the Log API if that's reasonable. Ideally, I could keep the full-text search in sync with Datomic relatively quickly (&lt; 30s). Are there any resources or directions anyone could point me to for working on this?

em04:02:17

I'm really interested in the same thing, and have been meaning to build out this functionality in the near future. I believe previous discussion mentioned quite a few people doing it with ElasticSearch, and somewhere (I think?) it was officially suggested to use AWS CloudSearch. The basic implementation idea would probably be sipping the transaction log and publishing directly to the search solution, so I'm pretty sure syncing should be much faster than 30s given the reactive ideas built into datomic.
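
A sketch of the publishing half of that idea, assuming an Elasticsearch cluster reachable inside the VPC and the clj-http and cheshire libraries; the es-url endpoint, index name, and document shape here are hypothetical, not a prescribed design:

(require '[clj-http.client :as http]
         '[cheshire.core :as json]
         '[clojure.string :as str])

(def es-url "http://search.internal:9200") ; hypothetical VPC endpoint

(defn bulk-index!
  "Bulk-indexes docs (maps with an :id key) into index-name."
  [index-name docs]
  (let [body (->> docs
                  (mapcat (fn [{:keys [id] :as doc}]
                            ;; the _bulk API wants alternating action and
                            ;; source lines of newline-delimited JSON
                            [(json/generate-string {:index {:_index index-name :_id id}})
                             (json/generate-string (dissoc doc :id))]))
                  (str/join "\n"))]
    (http/post (str es-url "/_bulk")
               {:headers {"Content-Type" "application/x-ndjson"}
                :body    (str body "\n")}))) ; bulk body must end with a newline

Batching through _bulk keeps the round trips per polled batch low, which helps with staying under a 30s freshness target.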

joshkh17:02:02

Can you elaborate on the reactive ideas built into datomic? I thought sipping the transaction log would be more akin to polling the log every n seconds in a loop.

Sam DeSota18:02:45

Right, that's what I'm curious about. I've seen a couple of examples of utilizing a polling thread with Ions to do subscriptions, but if there's some way to get a message queue of all Datomic transactions, that would be ideal.
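
A minimal sketch of that polling approach against the sync Client API, assuming a hypothetical index! function (for example, bulk-index! above after shaping the datoms into documents) and a starting watermark persisted elsewhere:

(require '[datomic.client.api :as d])

(defn follow-log!
  "Polls the transaction log from start-t onward, handing each
  transaction's datoms to index!, then sleeps and repeats."
  [conn index! start-t]
  (loop [t start-t]
    (let [;; default :limit is 1000; any remaining txs are picked up next pass
          txs    (d/tx-range conn {:start t})
          next-t (reduce (fn [_ {:keys [t data]}]
                           (index! data) ; publish this tx's datoms
                           (inc t))
                         t
                         txs)]
      (Thread/sleep 5000) ; poll interval well under the 30s target
      (recur next-t))))

Persisting next-t somewhere durable (e.g. in Datomic itself) would let the loop resume after a restart without re-reading the whole log.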

em19:02:56

Ahhh, yeah I think I mixed up on-prem and cloud; I've been watching too many old Rich Hickey Datomic videos for my source of datomic truth and not so much the documentation. 😛 I was thinking of tx-report-queue, which was a big idea on-prem for the Peers and which understandably isn't supported in the Client API for cloud. Now that Ions are out, though, and there must be some kind of internal implementation keeping the transaction log synced across query groups, is there a way to access this API? There's an old slack conversation https://clojurians-log.clojureverse.org/datomic/2018-06-27 between @currentoor and @stuarthalloway where it was mentioned: "we certainly understand the value of tx-report-queue, and will probably do something similar (or better) in Cloud at some point. That said, you should plan and build your app around what exists today." Was wondering if I missed an update since, or what kinds of "build your app around what exists today" workarounds people have found to work for them?
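
For contrast, the on-prem mechanism being discussed looks roughly like this; a sketch against the Peer API (datomic.api) with a hypothetical handle! callback, not available in the Cloud Client API:

(require '[datomic.api :as d])

(defn watch-transactions!
  "On-prem only: the peer pushes a tx-report map onto this queue after
  every transaction, so no polling loop is needed."
  [conn handle!]
  (let [queue (d/tx-report-queue conn)]
    (future
      (loop []
        (let [{:keys [tx-data db-after]} (.take queue)] ; blocks until the next report
          (handle! tx-data db-after))
        (recur)))))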

Joe Lane15:02:18

For both of you asking about search, I'm curious: what is the expected size of the ES cluster you will be running in the VPC?

Sam DeSota18:02:43

For me, it would just be used for a product database of about 100,000 user-generated products (title, description, tags) and an orders/customer database of ~5,000 orders a month, just for admin tasks.

Sam DeSota18:02:45

Running in the VPC

joshkh17:02:10

I want to upsert two entities with tuples in the same transaction, where the second entity's tuple references the first entity. An initial transaction works as expected:

(d/transact conn ; the client transact takes a connection, not a db value
            {:tx-data [{:db/id           "entitya"
                        :feature/id      "SomeId123"
                        :feature/type    "Gene"
                        :feature/type+id ["Gene" "SomeId123"]} ; <-- tuple for upsert

                       {:db/id                        "entityb"
                        :attribute/view               "Gene.primaryIdentifier"
                        :attribute/value              "MC3R"
                        :attribute/feature            "entitya" ; <-- ref back to the entitya temp-id above
                        :attribute/feature+view+value ["entitya" "Gene.primaryIdentifier" "MC3R"]}]})
=> success
But transacting the same tx-data again throws a Unique conflict caused by the second entity, even though I'm including the tuple attribute value (albeit with a temporary id):
(d/transact (client/get-conn)
            {:tx-data ...same as above})
            
Unique conflict: :attribute/feature+view+value, value: [47257009761812574 "Gene.primaryIdentifier" "MC3R"] already held by: 27971575810621538 asserted for: 31454828647415908
Should I expect a successful upsert here? (Edit: I'm on the latest version of Datomic Cloud, 8846, and client 0.8.81.)

favila19:02:10

On-prem has the same behavior. I too am curious whether this is by design, because one of our desired use cases for composite tuples was upserting composites.

favila19:02:26

You can use them for upserting only if you explicitly assert the final value of the composite in the transaction. Composites don't seem to be consulted for upserting tempid resolution, even if no part of the composite involves a ref.

favila19:02:02

e.g. transacting {:a eid-of-x :b eid-of-y} with a composite upsert attr :a+b defined may also produce a datom conflict instead of upserting.

favila19:02:33

Instead we always have to do {:a eid-of-x :b eid-of-y :a+b [eid-of-x eid-of-y]}. And we can't use tempids or lookup refs for eid-of-x or eid-of-y, only raw entity ids.
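
Applied to the schema in joshkh's example above, that workaround would look roughly like this; a sketch assuming the feature entity already exists and that :feature/type+id is :db.unique/identity (as the upsert behavior implies), so a lookup ref can resolve the raw entity id before the composite is asserted explicitly:

(require '[datomic.client.api :as d])

(let [db          (d/db conn)
      ;; resolve the real entity id up front; per the restriction above,
      ;; tempids and lookup refs can't appear inside the composite value
      feature-eid (:db/id (d/pull db [:db/id]
                                  [:feature/type+id ["Gene" "SomeId123"]]))]
  (d/transact conn
              {:tx-data [{:attribute/view               "Gene.primaryIdentifier"
                          :attribute/value              "MC3R"
                          :attribute/feature            feature-eid
                          ;; explicit final composite value, raw entity id only
                          :attribute/feature+view+value [feature-eid "Gene.primaryIdentifier" "MC3R"]}]}))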
