does this bother anyone else? there are a number of libraries that normalize eql into a graph (fulcro, pyramid, doxa etc.). the ident is always a vec [table id], so if you return {:person/house {:house/id 1}} from pathom, the value of :person/house locally is [:house/id 1]. Now there are two types for the value of that attribute depending on the context, which does not seem right. I'm writing my own normalization logic now and I feel tempted to just keep it as a map, any reason not to do this?
there are tradeoffs. one reason people use idents is to be able to differentiate between a value and a reference. the libs you mention mostly use an ident? predicate to do so (which looks for a vector with 2 elements, where the first is a keyword). you can use something else, like a map with a single entry, and it can work the same way.
the point is, which one is more likely to give you a false reference? IME idents tend to have a better detection rate, but it of course can depend on what your data is like
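a rough sketch of the two detection strategies, to make the false-reference point concrete. the names ident-ref? and map-ref? are made up for this example (ident-ref? avoids shadowing clojure.core/ident?), and the predicates only approximate what the libs do:

```clojure
(defn ident-ref?
  "True for a 2-element vector whose first element is a keyword, e.g. [:house/id 1]."
  [x]
  (and (vector? x)
       (= 2 (count x))
       (keyword? (first x))))

(defn map-ref?
  "True for a single-entry map with a keyword key, e.g. {:house/id 1}."
  [x]
  (and (map? x)
       (= 1 (count x))
       (keyword? (key (first x)))))

(ident-ref? [:house/id 1])      ; a reference
(ident-ref? [:red 255])         ; also matches: a plain value shaped like an ident
(map-ref? {:house/id 1})        ; a reference
(map-ref? {:theme/color :red})  ; also matches: a one-attribute map that is not a reference
```

both predicates can misfire; which one misfires more depends on how often your data contains keyword-first pairs versus single-entry maps.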
i'm trying to think: why does a reference need to be part of the graph data itself? that knowledge could come from a schema, or you could have the client decide whether it should be treated as a reference like pathom
the point of normalization is to be able to use the same entity consistently across different parts of your local data. the only way to do that is to keep that entity in one place and make references to it from everywhere else
of course, I understand that. but xtdb, asami etc. are schemaless and don't need to encode references in a special way: all values can be references, and whether they're treated as references just depends on whether your query joins on them
e.g. you could return {:person/house {:house/id 1 :house/address "foo"}} and store that as {:person/house {:house/id 1} :house/id {1 {:house/address "foo"}}}
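a minimal sketch of that storage scheme, assuming the set of id attributes is known up front (here passed in as id-attrs). normalize, normalize-map and normalize-value are hypothetical names, not any library's API:

```clojure
(declare normalize-value)

(defn normalize-map
  "Normalizes every value of map m into db. Returns [db m'] where nested
  entities in m' are replaced by map references such as {:house/id 1}."
  [id-attrs db m]
  (reduce-kv (fn [[db acc] k v]
               (let [[db v'] (normalize-value id-attrs db v)]
                 [db (assoc acc k v')]))
             [db {}]
             m))

(defn normalize-value
  "Returns [db v'] for any value: entity maps move into db, vectors are
  normalized element by element, everything else is left untouched."
  [id-attrs db v]
  (cond
    (map? v)
    (let [[db m'] (normalize-map id-attrs db v)]
      (if-let [[id-attr id] (some #(find m' %) id-attrs)]
        ;; entity: merge its other attributes into the table, leave a map ref behind
        [(update-in db [id-attr id] merge (dissoc m' id-attr)) {id-attr id}]
        [db m']))

    (vector? v)
    (reduce (fn [[db acc] x]
              (let [[db x'] (normalize-value id-attrs db x)]
                [db (conj acc x')]))
            [db []]
            v)

    :else [db v]))

(defn normalize
  "Normalizes a root tree; root attributes stay at the top level of the db."
  [id-attrs tree]
  (let [[db root] (normalize-map id-attrs {} tree)]
    (merge db root)))

(normalize #{:house/id} {:person/house {:house/id 1 :house/address "foo"}})
;; => {:person/house {:house/id 1}, :house/id {1 {:house/address "foo"}}}
```

note that this version needs id-attrs to tell an entity from a plain map, which is the "knowledge comes from a schema" option rather than detection from the data shape.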
they still need to maintain an ID, or some way to reference the data. as databases they are more sophisticated about it, while Fulcro, Pyramid and others try to keep the format simple, so it fits in a single map that represents a self-sufficient graph of connected data
it's just about how you represent your data to fit your constraints
there is no wrong here, it's just that the model used by Fulcro and others has been proven to work nicely for a wide range of situations. then we are back to how to represent a link (and in the case of Fulcro and others, doing it without requiring a schema of any sort)
and a bit of history: all those models came from om.next, which was the first time (afaik) we used this kind of reference (via idents) to model graphs in Clojure, and David Nolen has said he got that idea from the Falcor library: https://netflix.github.io/falcor/
references in Falcor: https://netflix.github.io/falcor/documentation/jsongraph.html
the ident syntax definitely works fine in practice, but there's just that bit of extra mental overhead.
like how would you spec :person/house
depends where you are validating, you can think of two different models, one for the normalized data, and one for the denormalized one. what is the problem you are trying to work on?
eql -> reactive datalog client ui state for electric 🙂
this is just an abstract question though. doesn't that defeat the point of spec
you aren't supposed to have 2 different "spec contexts" for an attribute ever right
kind of, that's more a philosophical question IMO, and a result of embracing the dynamic nature of the data format, but I understand it makes things more complex
it's the tradeoff chosen in this case
well, if it's a tradeoff, there's a cost and a benefit. it's still not clear to me when [:house/id 1] is ever better than {:house/id 1} -- bit of performance maybe
in this normalization world, the same problem occurs for every identifier: inside an entity map its value is some atomic unit, while at the root its value is an index map
to me that tradeoff is to be able to more safely differentiate a reference from an entity
honestly, I can't say I've ever wanted that. I believe in pathom, xtdb, asami this concept of denoting a reference differently from an entity doesn't exist at all, right? I don't find it hard to reason about their graphs
they are all systems that work differently. Pathom doesn't need any because it never has to store these kinds of things, it's like a denormalizer entirely, while the more "database like" ones have their own ways (sometimes via schema, or some internal way to represent it...). it's a choice to solve a specific problem (in the case of "map dbs", how to share an entity across different parts of the same map), each chooses its tradeoffs
but I encourage you to try the map approach, see how it feels when you apply to some real system, report back the experience 🙂
oh I already have it working with the traditional ident syntax, now I feel a need to move to maps. so I will try 🙂
just feels bad breaking so much convention
yup, consistency is great, but sometimes we gotta explore, even if it's just to improve our understanding of the tradeoffs, if you've got the time to do so 🙂
• Gave it some more thought. there are three general approaches to modeling normalized state:
  ◦ normalized as {table {id {a v}}}, no further indexes (falcor, fulcro, pyramid)
  ◦ normalized as {e {a v}}, indexes on {a {e v}} and {v e} (datalog)
  ◦ normalized as {[table id] {a v}}, no further indexes (doxa)
• The first group wants to store references as [table id] because that's what uniquely identifies an entity. If it did {table id}, then not only is it no longer a simple get-in (and the whole point is for this path to be fast) but we could also have invalid representations of entities like {table id attr2 val2} -- that makes no sense as it doesn't correspond to a tree path.
• With the {e {a v}} databases that's a perfectly valid entity. I don't think I've ever needed the table index, so doxa's approach seems good if you want the simple map approach. I don't know why doxa normalizes entities into the [table id] format, maybe there are some internal optimizations for it.
• The drawback to using a datalog db is mainly the query performance, since with react you need to get all the props to pass down every render. This should be mostly irrelevant with reactivity, since you let the reactive engine maintain state and the db is only hit on mount/unmount.
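for comparison, a small sketch of the same data in the three layouts above. the data (person 7 living in house 1) and the names follow-ident and aev-index are made up, and the shapes only mirror the description here, not the libraries' actual internals:

```clojure
;; 1. {table {id {a v}}}: the ident [table id] doubles as a get-in path
(def table-db
  {:person/id {7 {:person/id 7 :person/house [:house/id 1]}}
   :house/id  {1 {:house/id 1 :house/address "foo"}}})

(defn follow-ident
  "Resolves an ident against a {table {id entity}} db."
  [db ident]
  (get-in db ident))

;; 2. {e {a v}}: any value can act as a reference, it only matters at join time
(def eav-db
  {7 {:person/house 1}
   1 {:house/address "foo"}})

(defn aev-index
  "Derives the {a {e v}} index from an {e {a v}} map."
  [eav]
  (reduce-kv (fn [idx e attrs]
               (reduce-kv (fn [idx a v] (assoc-in idx [a e] v))
                          idx
                          attrs))
             {}
             eav))

;; 3. {[table id] {a v}}: the whole ident is a single map key
(def ident-db
  {[:person/id 7] {:person/id 7 :person/house [:house/id 1]}
   [:house/id 1]  {:house/id 1 :house/address "foo"}})

;; following person 7's house in each layout
(follow-ident table-db (get-in table-db [:person/id 7 :person/house]))
(get eav-db (get-in eav-db [7 :person/house]))
(get ident-db (get-in ident-db [[:person/id 7] :person/house]))
```

in layouts 1 and 3 the stored reference has to carry the table, which is why [table id] falls out naturally; in layout 2 the bare entity id is enough.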