This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-08-17
Channels
- # announcements (7)
- # babashka (24)
- # beginners (11)
- # boot (16)
- # calva (46)
- # cider (5)
- # clara (3)
- # clj-kondo (2)
- # cljfx (5)
- # clojure (122)
- # clojure-brasil (26)
- # clojure-dev (20)
- # clojure-europe (20)
- # clojure-germany (1)
- # clojure-nl (1)
- # clojure-norway (54)
- # clojure-uk (2)
- # clojurescript (6)
- # core-matrix (23)
- # datomic (85)
- # graalvm (1)
- # honeysql (9)
- # hyperfiddle (31)
- # lsp (3)
- # malli (9)
- # nbb (2)
- # off-topic (15)
- # pathom (15)
- # pedestal (4)
- # polylith (5)
- # re-frame (5)
- # reitit (9)
- # releases (2)
- # shadow-cljs (63)
- # specter (4)
- # xtdb (7)
From https://clojurians.slack.com/archives/C061XGG1W/p1692258947029469 (translated to English/paraphrased): > Why is the new Datomic Local based on the Client API (which lacks entity) and not the Peer API? Someone at Cognitect could perhaps shed some light on the reasoning behind this decision.
Datomic with the client API is “only” better than any other database. With the peer API, you also get functional programming nirvana 😄

https://clojurians.slack.com/archives/C061XGG1W/p1692262410699429?thread_ts=1692258947.029469&cid=C061XGG1W (partially in English).
Some background may be useful: Datomic Local was formerly dev-local, and dev-local was/is the local development story for Datomic Cloud, which of course uses the Client API.
Possibly a dumb follow-up question: Why doesn't Datomic Cloud support the same awesome Peer library as "on-prem" Datomic Pro?
The peer library requires a large runtime and significant network resources (i.e. a peer with a direct connection to storage and a big object cache) to implement. The client library is designed for use over a network, where the compute and IO needed for query is not supplied by the client (i.e. the client is not a peer). You can use client in an embedded peer, as is done by ions and local, but the api's assumptions make it possible to run that same code somewhere that is not a peer. The peer api cannot do this because it cannot be implemented in a lightweight, not-peer way.
The entity api in particular is lazy and synchronous, and amortizes the cost of that laziness with the object cache and log index merging, both of which require lots of memory and network. The same api without those would be intolerably slow, as each key lookup would be a network round trip
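To make the laziness concrete, here is a minimal sketch of entity navigation using the peer API. The schema and lookup ref are hypothetical (not from this thread); each attribute access is a lazy lookup that the object cache amortizes on a peer:

```clojure
;; Hypothetical schema/entity for illustration; assumes a peer `conn`.
(require '[datomic.api :as d])

(let [db (d/db conn)
      e  (d/entity db [:user/email "alice@example.com"])]
  ;; Each keyword access below is a lazy lookup. On a peer it is
  ;; served from the local object cache; without that cache every
  ;; access could be a separate network round trip.
  (:user/name e)
  (-> e :user/org :org/name))
```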
In fact it can still be bad even in a peer: the temptation to entity-walk and map/filter to “query” instead of using d/q is real and can cause significant performance problems that are hard to detect. It is also unpredictably slow because it can actually do IO, but only sometimes!
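A sketch of the anti-pattern being described, contrasted with the d/q version (attributes are hypothetical, for illustration only):

```clojure
;; Entity-walk "query": lazy lookups that may do hidden IO per step.
(->> (d/q '[:find [?e ...] :where [?e :order/status]] db)
     (map #(d/entity db %))
     (filter #(= :shipped (:order/status %)))
     (map :order/total))

;; The same question as a single d/q: the query engine does the
;; filtering, with predictable IO at one boundary.
(d/q '[:find [?total ...]
       :where
       [?e :order/status :shipped]
       [?e :order/total ?total]]
     db)
```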
I always thought of Datomic Cloud as Datomic Serverless because of this 🙂 The peer is absolutely useless in a serverless/lambda setting
The entity api thus makes it hard to set IO boundaries where you need some guarantee that code won’t block (e.g. in a cpu-bound threadpool). It also leads to spaghetti dependency code where it’s hard to tell what data a call tree needs from the database (which complicates schema changes), and it’s not a normal map, so you can’t augment it with additional keys as you go.
So, the entity api is bad actually, it’s only nice in the very simplest, very-local circumstances
assuming your dataset mostly exists in the local peer cache (which is the case for the apps I’ve been working on), the reverse is true though. Instant lazy access to the entire working set 😻
and practically speaking, isn’t the client API essentially RPC-ing a peer that runs elsewhere? So you still have the architecture of a peer, with its caching, talking to underlying storage, maintaining a working-set LRU cache, etc etc?
It’s not that peers don’t exist, it’s that not everything that queries needs to be a peer
The client api makes it possible that you need not be a peer, but the peer api means you must be a peer
yeah, it doesn’t make a lot of sense to pull an entire database into RAM of a Lambda in AWS and then run queries 😄
So if you want a lambda, a lambda peer is always going to be slower if its lifetime is short, because every time it starts again its cache is cold
does that mean that Datomic Local has a peer available under the hood? If that’s the case, seems like enabling the peer API would be technically feasible, at least 🙂
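For reference, connecting to Datomic Local goes through the Client API, not the peer library. A minimal sketch (system and db names are placeholders):

```clojure
(require '[datomic.client.api :as d])

;; :server-type :datomic-local runs the whole system in-process;
;; "dev" and "example" are placeholder names.
(def client (d/client {:server-type :datomic-local
                       :system "dev"}))

(d/create-database client {:db-name "example"})
(def conn (d/connect client {:db-name "example"}))
```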
I have used the entity API extensively for years, and have not encountered the problems described above. I realize the possibility, it just hasn't been a practical issue for me. Organically navigating entities allows the code to make decisions about what data is needed at the right places (e.g., where it's needed). Pull and queries place the fetch at the top. This has been one of my favorite parts of working with datomic.
For context: our application is 8 years old and the database has over 10 billion datoms. At this age and scale the entity api is definitely a big liability.
Placing fetch at the top means: a fixed IO boundary (which can be instrumented, even better now with io-context), a declaration of data dependencies (for auditing or establishing module boundaries), and making it more evident when you are fetching too much and possibly should use d/q instead.
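A sketch of what "placing fetch at the top" looks like with a pull pattern (hypothetical attributes): the pattern is a single, inspectable declaration of everything this code path reads from the database, fetched at one IO boundary instead of lazily during traversal.

```clojure
;; Hypothetical attributes, for illustration.
(def user-view-pattern
  [:user/name
   :user/email
   {:user/org [:org/name]}])

;; One fetch at the boundary; downstream code works on plain maps.
(d/pull db user-view-pattern [:user/email "alice@example.com"])
```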
It also means (drawbacks): you wrote the attributes twice, and you have to keep dependencies in sync with what consumes them, which are often separated. (Macrology can help here by composing pull expressions statically from function trees, but that’s not out-of-the-box)
It’s also really hard to refactor into pull after using d/entity, which is what you want if your team or database gets larger. By that time it’s too late to do easily.
Thank you for the thoughtful and detailed contribution to this discussion, @U09R86PA4! If I understand correctly, and by reading between the lines, you generally advise against using the Peer library and Entity API, except perhaps for "the very simplest, very-local circumstances."
As Nubank/Cognitect focuses on improving the Client API, will directly using the Peer library be discouraged, and is it likely to be deprecated eventually?
I advise against using entity maps, not the peer api. If you are using on-prem, the peer api is what you want; you cannot actually use the client api in-process with on-prem, only via peer-server. Nubank uses datomic on-prem almost exclusively (and I’m pretty sure without peer-server), so I really don’t think the peer api is going anywhere.
However if you take entity maps out, there really aren’t that many differences in capability between the two; if you want maximum future flexibility, stick to the parts of the api that overlap.
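A sketch of what "sticking to the overlap" can look like in practice: q, pull, and transact exist in both libraries with compatible shapes, so code written against that subset can target either by swapping the required namespace (attribute names here are hypothetical).

```clojure
;; Peer:   (require '[datomic.api :as d])
;; Client: (require '[datomic.client.api :as d])
;; entity/entity-db exist only in the peer api, so they are avoided.
(defn shipped-orders
  "Works against either a peer db or a client db value."
  [db]
  (d/q '[:find ?e
         :where [?e :order/status :shipped]]
       db))
```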
>> Why is the new Datomic Local based on the Client API (which lacks entity) and not the Peer API?
The reason is historical: datomic local was made so that cloud customers could do something locally. Before this, anyone using cloud had to have a cloud instance running to do anything
Note this bullet point, from https://docs.datomic.com/cloud/datomic-local.html: > • Develop and test Datomic Cloud applications without connecting to a server and https://docs.datomic.com/cloud/datomic-local.html#divert-system.
that makes more sense! I thought you were saying peers were somehow problematic by design. But it sounds like a good idea to avoid the entity API to walk data if you have huge amounts of non-partitioned data
It wasn’t solving a problem on-prem users had, and it wasn’t primarily so people could have an embedded redistributable datomic (that was just a bonus)
and that also makes sense… 🙂 Maybe the primary reason for Datomic Local isn’t a redistributable Datomic with a peer etc, but a new name for the local version of Datomic Cloud
This is also why divert-system exists https://docs.datomic.com/cloud/datomic-local.html#divert-system
none of these are problems on-prem users have; you can already use it in-memory or on dev storage, and have been able to since day one.
I thought the transactor for on-prem wasn’t available in a redistributable version :thinking_face:
the historical order was on-prem first; then people wanted turnkey infrastructure, so they made cloud; then people wanted to do cloud-without-cloud, so they made dev-local
This was on-prem, just with a “special” storage called “free”, which was just dev storage
(I may or may not have used it in production for roughly a decade before getting pro)
but alas, free is gone now (?). So there’s no redistributable peer available these days
so, if I take a step back, the only thing that changed with Datomic Local is that we now also got a free and redistributable version of Datomic Cloud, essentially
and, there’s no reason to use the old H2-based free storage now, as it’s measurably better in every way to just pair your free Datomic Pro transactor with postgresql etc. instead, I suppose
ah right, if it’s all on one machine you can just point client/peer and transactor to the same file?
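For a single-machine setup with dev storage, the transactor and the peer both point at the same dev storage; a hedged sketch (host, port, and db name are placeholder values):

```clojure
;; transactor.properties (dev storage), placeholder values:
;;   protocol=dev
;;   host=localhost
;;   port=4334

;; The peer then connects through the transactor to the same storage:
(require '[datomic.api :as d])
(def conn (d/connect "datomic:dev://localhost:4334/my-db"))
```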
that took a while… phew! Thanks for all the pointers, I think I have it sorted in my head now 😄
@U01PE7630AC just want to confirm from a person on the Datomic team that what Francis said is spot on. The peer api is not going anywhere and will not be deprecated. It is the api used at Nubank for 90% + of Datomic services and is widely used by our enterprise customer base.
something really weird is happening. So inside the execution of the code, when I run a query, I get this exception:
{:type clojure.lang.ExceptionInfo
 :message "Unable to find data source: $__in__4 in: ($ $__in__2 $__in__3 $__in__4 $__in__5)"
 :data {:datomic.client-spi/request-id "313fe75e-5afb-422d-ba18-9fba209ba6d3",
        :cognitect.anomalies/category :cognitect.anomalies/fault,
        :datomic.client-spi/exception java.lang.Exception,
        :datomic.client-spi/root-exception java.lang.Exception,
        :cognitect.anomalies/message "Unable to find data source: $__in__4 in: ($ $__in__2 $__in__3 $__in__4 $__in__5)"}}
Which literally tells me nothing. Although if I run the query in the repl, without the main application running, the query completes fine without any issues. Anyone have any idea what is going on here?
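The reply that resolved this isn't included in the log, but for context: this anomaly commonly appears when a query's :in clause declares more inputs than arguments actually passed to d/q. A hedged reconstruction (hypothetical attributes; not necessarily the cause in this thread):

```clojure
;; :in declares three inputs ($ ?status ?min-total) ...
(d/q '[:find ?e
       :in $ ?status ?min-total
       :where
       [?e :order/status ?status]
       [?e :order/total ?t]
       [(>= ?t ?min-total)]]
     db
     :shipped)
;; ... but only two arguments are supplied, so the query engine
;; cannot resolve one of its positional data sources and reports
;; an "Unable to find data source: $__in__N" fault.
```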
Thanks!
Yes! That was it… Looking at it now, it seems kinda logical, but man, how I wish they would figure out how to give better feedback when things go wrong; that's valid for both Clojure and Datomic. Thanks for the reply @U9MKYDN4Q
I've been playing with Datomic for a while now, if anyone wants to check https://github.com/clojure-indonesia/pedestal-datomic, I'll update it using Datomic Local :smiling_face_with_3_hearts:

Did Datomic On-Prem get rebranded "Pro", again?
@U087E7EJW yes and peer apis are on maven.
Looking into how a #C7Q9GSHFV Electric server could be run on a Datomic Ion. And specifically how the websocket connection should work. Given API Gateway's websocket support, it seems like it should be possible to send and receive WS events from an HTTP Direct Ion. Only problem is that these WS events should only be handled by the same Ion instance that contains the running Electric session state. Is there any way to pin sessions somehow, ensuring an HTTP Direct Ion is called on the same instance each time (given the same API Gateway connection ID). (Or perhaps a way for the Ion to mark the message as being for a different instance.) Even sticky sessions on the load balancer side wouldn't quite fix this, as the WS events aren't coming from a client that could have a cookie but from API Gateway directly.
Theoretically the http direction ion could handle the event by putting it onto a queue that could be picked up by whichever ion instance is currently handling the particular session. :thinking_face:
The more I think about it, the more Redis pub/sub seems necessary. Unless there's some underlying Ion HTTP handling logic that's exposed in a way that would make this possible. (Though I'm assuming that layer is just a dumb load balancer.)
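A sketch of the Redis pub/sub routing idea using carmine. Everything here is hypothetical (channel naming scheme, `handle-ws-event`, `event-payload`); it only illustrates the shape of "any instance receives the event, the owning instance handles it":

```clojure
;; Assumes a reachable Redis and a per-session channel convention.
(require '[taoensso.carmine :as car])

(def conn-opts {:pool {} :spec {:uri "redis://localhost:6379"}})

;; The instance that owns the Electric session subscribes to its channel:
(def listener
  (car/with-new-pubsub-listener (:spec conn-opts)
    {"ws:session:abc123" (fn [[type _chan msg]]
                           (when (= type "message")
                             ;; hypothetical handler for the session's state
                             (handle-ws-event msg)))}
    (car/subscribe "ws:session:abc123")))

;; Whichever instance API Gateway happens to hit just republishes:
(car/wcar conn-opts
  (car/publish "ws:session:abc123" event-payload))
```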
I'm not sure about handling of auto scaling though. Specifically when an instance is being scaled down but still has active Electric sessions running. Is there draining logic defined somewhere?
Do Datomic ions provide any lifecycle hooks? (Specifically when autoscaling down?)
I don’t know if AWS API Gateway implemented (and Datomic Cloud added support for) WS Direct connections like it did for HTTP Direct connections. If it didn’t, then API Gateway mediates WS connections for you, and https://clojurians.slack.com/archives/CL85MBPEF/p1676057124117659?thread_ts=1675723956.514279&cid=CL85MBPEF would still be totally applicable in 2023.
As far as I'm aware, direct WS connections are not supported, so I'm assuming API Gateway WS event mediation is necessary
This is very relevant to me, and the subject of a conversation I had yesterday with @U06FTAZV3. Our informal conclusion was, IIRC, very similar to yours: you will need some shared session management (e.g. redis or memcached) plus some API GW configuration to terminate WS. I had forgotten about @U0514DPR7’s wonderfully documented exploration post where the WS<->HTTP translation is needed (unless you want to bypass the entire Ion mechanism and stand up a dedicated WS server). I really wish Cognitect had some guidance on this.
@U5JUDH2UE, I’m pretty sure there is nothing in the way of lifecycle hooks for an Ion, neither up nor down.
Hi #C03RZMDSH team! I’m looking to create a new Datomic Cloud system/stack with the latest release (`995-9204` of 2023/06/16), but there is no such version available on the “Software version” list. Is it already available / will it be added soon? Or should I just proceed with 990-9202 and simply point to the latest storage & compute template URLs later on?
Hey @U01CC82BJKU. With Datomic Cloud Free we are off marketplace completely. No markup fees and no marketplace subscription is required for Datomic Cloud. The latest release template can be had on our releases page: https://docs.datomic.com/cloud/releases.html#current I’ve been in conversations with the Marketplace team to have this plastered on our listing (we have to forever keep the older templates there, so taking the entire listing down is not an option).