This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-04-22
Channels
- # announcements (8)
- # babashka (4)
- # beginners (164)
- # calva (17)
- # cider (30)
- # cljdoc (4)
- # cljs-dev (6)
- # clojure (103)
- # clojure-europe (63)
- # clojure-nl (1)
- # clojure-norway (1)
- # clojure-portugal (1)
- # clojure-uk (3)
- # clojured (10)
- # clojuredesign-podcast (2)
- # clojurescript (16)
- # conjure (2)
- # core-async (9)
- # cursive (26)
- # datalevin (4)
- # datomic (155)
- # gratitude (1)
- # holy-lambda (8)
- # honeysql (9)
- # hoplon (6)
- # off-topic (55)
- # polylith (14)
- # portal (21)
- # reagent (5)
- # reitit (16)
- # releases (3)
- # shadow-cljs (87)
- # spacemacs (3)
- # tools-deps (25)
- # xtdb (9)
It seems datomic peer still ships with presto 348 judging by the changelog. Are there concrete plans to upgrade and/or migrate to trino?
Context: we have built a BI solution for a customer based on datomic analytics and metabase. Metabase used to implement a custom presto connector (using http directly). In the latest release they have replaced that with a jdbc based connector, but in the process also migrated to trino, so we are currently held back from upgrading metabase.
This is about the datomic analytics product. Prestosql is a sql query engine; datomic has a connector for it so you can sql-query a datomic db. PrestoSQL renamed itself to Trino to distinguish itself from the competing prestodb project: https://trino.io/blog/2020/12/27/announcing-trino.html 348 is a prestoSQL version number (released Dec 14 2020) from before its rename to Trino
I am trying to find reverse keys for an entity ("foreign" ref attributes that point to this entity). I have found the following "trick" via google:
=> (.touch ent)
=> (keys (.cache ent))
but it doesn't seem to work. Is there any other way to achieve that?
Get the entity db out using d/entity-db, then query [_ ?attr ?e] or (d/datoms db :vaet e)
I think touch used to realize Eavt and Vaet, but now only does Eavt. This is a hack anyway: entity is designed for when you already know your attributes in code. d/touch is for dev-time printing of entity maps and such
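The two approaches suggested above can be sketched like this. This is a rough sketch against the peer API; `ent` is assumed to be an entity, and the function names are mine:

```clojure
(require '[datomic.api :as d])

(defn reverse-attrs
  "Returns the idents of \"foreign\" ref attrs whose datoms point at
   entity e, using the :vaet reverse index."
  [db e]
  (into #{}
        (map #(d/ident db (:a %)))
        (d/datoms db :vaet e)))

;; The same thing with a datalog query:
(defn reverse-attrs-q
  [db e]
  (d/q '[:find [?ident ...]
         :in $ ?e
         :where [_ ?a ?e]
                [?a :db/ident ?ident]]
       db e))

;; Usage, starting from an entity as in the question:
;; (reverse-attrs (d/entity-db ent) (:db/id ent))
```

Note that :vaet only indexes datoms of ref-typed attributes, which is exactly what's wanted here.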
ah ok, thank you so much, @U09R86PA4!
@jarrodctaylor saw https://max-datom.com/ on the front page of HN. Nice job!
Apologies if this is a little nutty/premature, but if we have #clojuredart apps running natively on the desktop, is it a crazy idea to want to try to put a Datomic client into them so that apps can talk directly to DB / have direct access to the information model? I assume this is a ton of work, but curious if it makes sense as an architectural model in the first place. Curious if this could cut a lot of the intermediate infrastructure out of a frontend app. Any thoughts greatly appreciated!
This is theoretically possible already with clojurescript, but no one has wanted it enough to write a clojurescript client api library, so I’m not sure dart changes that.
I would say this makes sense as a debugging tool or replacement for the (not very good) datomic console, but architecturally you very quickly need more layers to enforce policy, and then the direct connection stops making sense because you need to reduce its power in some way, or allow for interception and transformation.
So I don’t see “datomic client in a dart desktop app” as a game changer; you will need an intermediate with more control before very long, and then you’re back to an intermediate framework or library (which there are already many good ones, e.g. fulcro or reframe)
yeah fascinating thank you! I was kind of hoping perhaps something like Datomic database functions had evolved to be, for example a higher-level intermediary/domain model for transactions against the DB but it sounds like that’s not really it/there yet. very interesting! will start looking into the intermediates, thanks!
By analogy with the SQL world, many sql dbs do have features which look like they might be enough to make them application platforms, ie authentication, stored procedures, and procedure+table/row/column-level access control
so even if datomic did grow similar features, I’m not sure it would be a good or popular choice to use the datomic client api as the client application’s primary api to interact with data
right — there’s usually some kind of application-level API in between the client and the DB. It’s interesting to me because sometimes the application API ends up being REST/RPC/GraphQL that looks almost exactly like the DB, but isn't the DB. It ends up being some kind of higher domain model, with auth: related to the low-level data model, but not exactly it…
haha indeed =/
even as a backend interface, using datomic directly is becoming a problem for us in some cases. sometimes we need to preserve an attribute with its meaning but not its implementation
okay, so if one needs that anyway, one might as well put it on a server and then have a client talk to that server
interesting
datomic has a really great attribute-centric data model, but, it is still at the end of the day an implementation specification not a data model
I’m looking at pathom3 very seriously as something that has the attribute model of datomic but more flexibility around evaluation and implementation. And yes, the vast majority of attributes just pass through to datomic
yeah very cool — structurally pathom looks kinda like a federated GraphQL gateway (e.g. https://apollographql.com) except with a logic-programming/datalog/prolog-y query engine instead of a GraphQL-nested-map-join engine. Both those, plus the upcoming HTTP QUERY method, make me think the API intermediary clients want is an “application-level set of API functions, but able to express queries more sophisticated than ‘GET resource’”
What does it mean to preserve an attribute with its meaning but not its implementation? to preserve the attribute meaning outside datomic?
Concrete examples: we needed for operational reasons to drop full text from some attributes. Datomic doesn’t let you do that: you need to make a new attr. At our data size this involves a migration (db is using two attrs at once for the same data for a time)
It would have been really nice to hide all this from the code and let it keep using the same attr. It would have saved weeks of dev time
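The migration shape being described can be sketched roughly like this. Attr names are illustrative; the point is that :db/fulltext cannot be altered on an installed attr, so a replacement attr plus a backfill is required:

```clojure
(require '[datomic.api :as d])

;; 1. install a replacement attr without fulltext
(def new-attr-tx
  [{:db/ident       :doc/body2
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}])

;; 2. backfill, copying :doc/body -> :doc/body2 for entities that
;;    don't have the new attr yet (in practice you'd batch this)
(defn backfill-tx
  [db]
  (for [[e v] (d/q '[:find ?e ?v
                     :where [?e :doc/body ?v]
                            (not [?e :doc/body2])]
                   db)]
    [:db/add e :doc/body2 v]))
```

During the migration window both attrs carry the same data, which is the "two attrs at once" cost mentioned above.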
Another example: datomic can (but really should not) store strings larger than a kB or two. The recommendation is to store a key to some other system. We end up with a hybrid encoding for latency where it’s in datomic if short enough. Now the same data is across two concrete attrs. Again, a migration was involved. Even worse, this introduces n+1 problems with the other store without ugly contortions.
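The hybrid encoding described here might look roughly like the following. All names are illustrative, and `blob-put!`/`blob-get` stand in for whatever external store is used:

```clojure
(def ^:private inline-limit 1024) ; illustrative threshold, in chars

(defn write-body
  "Returns tx-data for entity eid: short strings go inline in Datomic,
   long ones go to an external store, keeping only the key."
  [blob-put! eid s]
  (if (<= (count s) inline-limit)
    [[:db/add eid :doc/body-inline s]]
    [[:db/add eid :doc/body-ref (blob-put! s)]]))

(defn read-body
  "Resolves the hybrid encoding back to a string."
  [blob-get entity]
  (or (:doc/body-inline entity)
      (some-> (:doc/body-ref entity) blob-get)))
```

Reading now fans out to the external store per entity, which is the n+1 problem mentioned: a naive `read-body` over a result set issues one blob fetch per long value unless fetches are batched.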
Another example: we store stat aggregates (eg counts of x for y) and have them available as attrs, but not be forced to have them be concrete attrs in datomic all the time
All of these come down to: d/entity and d/pull have a fixed implementation that maps an attribute to a datomic attr, and if we want to use attrs as stable interfaces, we need some implementation flexibility that these don’t provide
got it thx, I guess this comes down to how much logic you want to keep in the db vs the app
I don’t think that’s quite right. Being at the mercy of the db can mean (re)writing a bunch more application than you started with
At the risk of taking the thread in a circle, does that mean it’s possibly a reasonable thing to want to put an abstract datalog interface in the client, whose persistent storage backend is an implementation detail? So there is a data-layer interface with auth directly in the client, where the abstract interface looks datomic/datalog-like but may or may not be directly implemented against datomic?
@U09R86PA4 yeah true
(okay awesome, that line of thought is coming together, thanks! Totally makes sense that the DB implementation itself is often usefully put behind an abstraction, e.g. for auth, policy, facading instead of migrating, abstracting over sharding, etc.)
(Reminds me of this old pattern from long ago: https://en.wikipedia.org/wiki/Data_access_object)
@U01KZDMJ411 yeah that’s my point. Datomic is not magic. Its implementation is fixed within certain boundaries, like any db.
attributes, pull exprs and datalog are great for data model expression, but d/pull, d/query and d/entity are not data models; they are implementations of them that map to a datomic storage engine in a fixed way
It’s such a low-friction abstraction, with such great sympathy for how Clojure models data, that it’s easy to make the mistake of thinking the datomic attrs and the data model are exactly the same
Also you can go a very long way before you hit a painful bit where you realize you need a little indirection
But you’ve already written a bunch of “boundaryless” code by that time and backfilling the abstraction layer you need becomes hard
yeah, the attractiveness of datomic with clojure is how you can keep using the same data model in both but you make a clear point how can the implementation limit the benefits of EAV flexibility
This is probably horrifying but theoretically if the Datomic client interface supported either pre/post hooks / CLOS metaobject-style extension / interceptor middleware / multimethod-or-protocol dispatch, then you could keep the calling code the same but add general-or-casewise behavior modifications Otherwise it seems like all the code needs to be written to an abstract interface/indirection just in case future behavior extension is needed, and otherwise it’s just an empty pass-through layer.
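One way to sketch the multimethod-dispatch flavor of that idea: an open multimethod in front of d/pull, so individual attrs can later be given custom resolution without changing calling code. All names here are illustrative, not from any real codebase:

```clojure
(require '[datomic.api :as d])

(defmulti resolve-attr
  "Open dispatch point: resolve one attr for one entity."
  (fn [_db _eid attr] attr))

;; default case: an empty pass-through straight to Datomic
(defmethod resolve-attr :default
  [db eid attr]
  (get (d/pull db [attr] eid) attr))

;; a later, case-wise override, e.g. a derived aggregate that has no
;; concrete datomic attr behind it
(defmethod resolve-attr :customer/story-count
  [db eid _]
  (count (seq (d/datoms db :vaet eid :story/customer))))
```

The default case is exactly the "empty pass-through layer" mentioned above; the override shows the payoff once one attr's implementation needs to change.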
Sort of. It's only an attribute/pull expr model (no datalog). You define “resolvers” which declare what they need as input attrs on an entity and what attrs they provide for that entity. You then query it by seeding with what data you have and a pull expr, and you get back a map of the same shape filled out with what you asked for
FWIW, "boundaryless" is what keeps me using datomic for personal stuff, like "look at all the stuff I don't have to write" but can see how that can become a problem at scale
It also has a “foreign interface” where you get an entire query subtree extracted for you (eg, all the datomic attrs that map 1-1) and you just need to return a map in the right shape. This makes it really easy to have the “fall through” cases, and is also a handy way to avoid n+1 problems across process boundaries
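A minimal Pathom 3 sketch of the resolver model described above (the attribute names are mine, and the derived attr stands in for the "pass through or compute" cases from the thread):

```clojure
(require '[com.wsscode.pathom3.connect.operation :as pco]
         '[com.wsscode.pathom3.connect.indexes :as pci]
         '[com.wsscode.pathom3.interface.eql :as p.eql])

;; inputs are declared by destructuring; the output attr is inferred
;; from the literal map the resolver returns
(pco/defresolver full-name
  [{:user/keys [first-name last-name]}]
  {:user/full-name (str first-name " " last-name)})

(def env (pci/register [full-name]))

;; seed with the data you have, query with a pull expr, get back a
;; map of the same shape filled out
(p.eql/process env
               {:user/first-name "Rich" :user/last-name "Hickey"}
               [:user/full-name])
;; => {:user/full-name "Rich Hickey"}
```

A Datomic-backed resolver would follow the same shape, taking an entity id as input and returning the attrs a d/pull provides.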
Interesting, so the benefits of not having impedance mismatch are somewhat erased by the implementation?
Maybe I'm misunderstanding, but I'm assuming there's no impedance mismatch between clojure and datomic in the data model which makes it sound like everything is going to be smooth sailing but the database restrictions don't make it so
when you say no one would make this mistake in SQL, is it because there you are forced to write some abstraction layer to isolate the application layer from the DB?
So in many cases you can represent your data model in datomic without much translation. The result of a d/pull is exactly what your domain models would have looked like.
so if one day your data model is not exactly like datomic, you probably didn’t write a layer of indirection in between your domain objects and your d/pulls already, so now you have to retrofit it in.
but in SQL world, the natural mode of expression in SQL is so different that you almost certainly have that layer built already
understood, thx, do you still prefer datomic's data model despite the implementation's lack of features / restrictions?
they are important though; things like the string limit and the rigidity of attrs in datomic look jarring
yeah something like TOAST for large values is a curious omission. But I don’t follow on the “rigidity of attrs”. In every way attrs seem more flexible than columns and tables
by rigidity of attrs I mean the implementation not the data model, like you can't disable fulltext
yep. and if datomic could do these things those sources of indirection-need would have been gone. but there’s still stuff like maintaining aggregates, maintaining computed/derived values (materialized or not), etc, that I’m not sure I can reasonably ask datomic to take care of.
There’s also an inherent cost to keeping all transacted data--some stuff really is just ephemeral and high-volume and storing it in datomic forever becomes a chore, and it’s a shame you need to give up the attribute model to do it.
these all speak to an occasional need for some indirection without faulting datomic for being the kind of db it is and not another one.
and there are dbs that support an attribute model but are quite different from datomic: datalevin, xtdb, and many flavors of datascript storage backend
yeah, had a little look at them; xtdb looks more like a document store, a different data model than datomic, and the other ones don't look too serious/ready for production use, but I can't tell
I guess for anything serious, like public webapps, a moment will come when you just have to run another DB too, besides datomic
FWIW at Shortcut we use one datomic database as our primary store with dynamo as the backing store, and additional dynamo tables and s3 for stuff that isn’t appropriate for datomic (high write volume, ephemeral data, large blobs)
and I love the peer model--scaling query load with peers is way easier than administering a cluster
we have a peer-server around, but we don’t use it for sustained load. Again, it’s an indirection problem: d/entity and d/pull can’t transparently be replaced for peer vs client
our biggest headache honestly is dynamodb + memcached. We have serious envy for the cloud’s 3-tier storage and wish on-prem had it too
gotcha. Curious if you can share: a nubank article scared me, saying they run more than 2400 transactors. Shortcut looks really cool, how many transactors are you running?
hah, no. to be fair, datomic is using dynamo as a blob store. for typical item sizes people use dynamo for (a few kb at most), dynamo may indeed have better variance.
but there are plenty of products that are dynamo/cassandra-like and exist pretty much only to guarantee lower latency and variance, e.g. https://www.scylladb.com/
not for shortcut, but I’ve used mysql as the storage on moderately sized datomic dbs in the past (3+ years ago). It was fine
cool (about mysql), I have set up datomic with postgres for now; since it's a single table, I'm wondering if datomic can really max out postgres, I guess it would require a lot of peers
anyway, if shortcut can run with one transactor which is impressive, then I think I'm going to be ok 😉
the docs indicate it should; curious how they abstract that, does it use some lowest-common-denominator standard SQL?
any sql that can store moderately sized binary blobs efficiently will do (a few kb to <2mb)
yeah those, pretty basic, I guess the SQL dialect doesn't change between DBs for the very basics
oh neat, sqlite, one machine, less processes, is the java driver solid? It feels like the java world favors stuff java based like h2 more than sqlite
I’ve used the xerial driver, it’s fine https://github.com/xerial/sqlite-jdbc
so the peers need to be on the same instance. That’s fine for bulk workloads but not much else
no? the transactor and peer are always separate processes. You just won’t have an extra storage process
there is something scary about h2, https://www.h2database.com/html/features.html#connection_modes
In embedded mode I/O operations can be performed by application's threads that execute a SQL command. The application may not interrupt these threads, it can lead to database corruption, because JVM closes I/O handle during thread interruption.
probably? h2 is only used by dev storage, which is special because the transactor itself exposes an additional port as the storage port (I believe using sql). And you won’t use dev in production anyway.
this sounds like it: http://www.h2database.com/html/features.html#auto_mixed_mode