This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-09-27
Reportedly, Datomic can also be used quite well from Kotlin. (I know it has a Java API, but having an API does not say much about the experience of using it.) (I think I got that from https://www.youtube.com/watch?v=hicQvxdKvnc)
I’m used to understanding “value” as the content of a field in a database table, or a variable’s value - that is, something that can change. Now I’m told that a value is “something that does not change”. Although I’m fully on board with the notion of immutability, I’m not sure how to comprehend this re-definition of the term “value”. So far, the Hickey videos don’t really explain this.. P.S. If I buy into the new meaning, what should I call a field content?
I tend to think about values as elements of a domain. Say, 4 is an element of the domain of integers, or “aaa” is an element of the domain of strings.
Some people tend to call a mutable variable a ‘place’
I don't think you need a new term. I think adding the concept of time would be helpful though. At a certain point in time an attribute has a value, and at another point in time the attribute could have a different value, but the value at these two points in time cannot change.
The field content is a pointer to a value. The shift in thinking is to understand that while the field content can point to different values at different points in time, the value it points to (at any given time) never changes. This is what it means for values to be immutable.
> I’m used to understanding “value” as the content of a field in a database table, or a variable’s value
I think you said it yourself quite well. The thing that changes is the field in a database table or a variable, not the values that are their content.
An example. Let's say you ask a database, or an object if you're doing OOP, for the value of a field F1 and it returns 2. Now you tell it to update the field to 3. What happened? Did the database somehow change the number 2 into a 3? Will every place in your code that has a "2" now get a "3" instead? Of course not. Numbers are immutable, they're immutable values. But let's say you ask for another field F2 and it returns a string "abc". You tell it to append "d" to the end. What happened now? Did the value "abc" change into "abcd"? In some languages, that is exactly what happens. The value "abc" no longer exists, it got blown away by appending "d". The bad news is that any part of your program that referenced the old value "abc" now sees "abcd" instead. This is the cause of a lot of headaches, which immutable data structures simply rule out.
We could say that at time T1, the field F2 referenced the value "abc", while at time T2, it referenced the value "abcd". The value "abc" never changed.
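To make the T1/T2 picture concrete, here is a minimal Clojure sketch. The names `f2-at-t1` and `f2-at-t2` are made up for illustration and simply stand for what field F2 references at each point in time.

```clojure
;; What F2 references at time T1 (hypothetical name).
(def f2-at-t1 "abc")

;; str builds a NEW string; it does not modify its arguments.
(def f2-at-t2 (str f2-at-t1 "d"))

(println f2-at-t1) ; prints abc  (the old value is untouched)
(println f2-at-t2) ; prints abcd
```

Anything that still holds on to the T1 value keeps seeing "abc", no matter what was "appended" later.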
What you are saying is that numbers are immutable and strings mutable, right? This, to me, adds another layer to the whole thing.
No, I didn't say that, or at least I didn't make it clear enough. I said that in some languages, including Ruby and PHP, strings are mutable. This is bad, which is why Clojure, Java, Python and others made them immutable.
Now Clojure takes everything a step further and makes any kind of value immutable, not just the basic data types like integers and strings.
Which to me suggests that values are not automatically immutable; they are made so, or treated so, if one makes that decision. So to say “We should use values!” doesn’t automatically imply that the values are immutable. I keep coming back to the definition of a value - is it immutable by definition, which is how I understand Hickey, or can it be made immutable?
I maybe should point out here that to me this is not playing around with words - I feel it is at the core of trying to gain a better understanding of the whole thing..
Yes, having immutable values in Clojure was a design decision, maybe its most important one. Their immutability comes from the way they are implemented.
If Hickey were to say that (for example in the “Value of Values” video), it would certainly make more sense to me..
I think of immutable values this way: If X is some immutable value, then I can observe X at any point in time and always see the same thing. Anyone else can also observe X and always see the same thing. If I give you a reference to X, I don't have to copy the value of X before doing so, in fear of you "changing X" in some way. In OOP, we must fear the latter all the time because if I give you X and you do X.set(field, value), I can't rely on X being what I think it is.
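A small sketch of that point in Clojure, assuming a hypothetical function `darken` written by "someone else". The map `x` is handed over without any defensive copy, and it is still the same afterwards.

```clojure
;; x is an immutable map.
(def x {:name "green" :shade "light"})

;; Hypothetical function owned by another part of the program.
;; assoc returns a new map; it cannot alter the one it was given.
(defn darken [color]
  (assoc color :shade "dark"))

(def y (darken x))

(println (:shade x)) ; prints light  (x is what I think it is)
(println (:shade y)) ; prints dark   (the "change" lives in a new value)
```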
Which is the same thing that happens in a relational database, at least by default..
(when a field’s content is changed from one value to another)
Yes, exactly. In Datomic, you always run a query against a specific value of the database, giving the nice property that running the same query against the same database always gives the same result.
Now, a “value” becomes somewhat hard to differentiate from an entity I think. They are both stable “items”. One idea is that a value is an entity without attributes, which would mean that as soon as a value should have an attribute attached to it, it should become an entity, and vice versa (maybe never happens though) an entity that has no attribute will become a value. (trying ideas here..)
I don't think that's the right picture. An entity E is a collection of datoms, which are records of the fact that at a certain point in time, an attribute A had a value V.
Now it might happen that the value is a pointer to another entity. But that value (the pointer itself) is still immutable.
Then, could it be that the only difference between values and entities is that entities have attribute-value pairs to them?
For example, is “green” a value or an entity? It depends, right? If you want to attach the values “light” or “dark” to it (green), then green should be an entity instead of a value.
Maybe someone can answer this better at a more fundamental level, but I'll take a shot.
An entity is a collection of attribute-value pairs. A value is something measurable, a numerical quantity, a string etc., more precisely those described by valueType in Datomic.
"light green" (the concept, not the string) can definitely be an entity. It can be an entity whose color attribute is "green", and whose shade attribute is "light" (I'm making these attribute names up)
Or, the color attribute can be a ref to another entity, let's say C1. C1 could then have the attributes (I'll present it as a map attribute-name -> value)
{:name "green"
:rgb 0x00ff00}
OK, “green”, the literal string is a value and could be the name for an entity, right?
My initial impulse is to have the V position be an entity id all the time.
Which e.g. points to the entity with the name “green”.
Unless you need to know more facts about "the color green" (as an entity), and thus have it be an actual entity, there's no need to. Use literal values when you can.
@U02G1DKNWKT at some point they must bottom out to an actual value. If you care about modeling the life cycle of green (e.g. it can "change", or more precisely have a different set of attribute/values at certain points in time) feel free to do so
Yes, it must all bottom out in values eventually (unless you are doing something like only modelling the relationship between entities without knowing anything else about them)
You don’t see an obvious disadvantage of doing so here?
As previously said, entities are sets of attributes that evolve over time. Whether you model "green" as an entity or a value depends on what "green" actually is in your domain
Thank you for the discussion - I’m slowly moving towards understanding (I hope)..
Good luck, and keep asking if you have more questions. My encouragement is to use literal values as much as possible.
That makes me curious as to why you prefer literal values..
Yes it is @U0ESP0TS8
Kind of..
Concerning how to model “green” as a value or as an entity - I’m thinking that unless I make it an entity (the name of an entity), I will have the redundancy of many instances with the value of “green” all over the database instead of everyone pointing to one single centralized instance (the entity). I’m not sure if this applies to Datomic though..
@U02G1DKNWKT it certainly does apply to Datomic as well
OK, so making it a question of wanting redundancy or not is valid then..
Would you worry about the same if you have an orderQuantity field, with the numbers 1 and 2 recurring very often?
I actually am not quite sure about that - possibly..
That might be extreme..
The flip side is; sometimes you want the "redundancy", as you want the separate entities to hold different values at different times
Just create a new entity for it?
If you're coming from a SQL background this dilemma doesn't change significantly with regards to datomic
Good to know!
A big benefit of having immutable data structures is that it is always safe for many objects to reference the same value. Since strings (like other values) are immutable, the JVM can optimize for this, for example by storing only one copy of each string literal.
(actually FileMaker)
The memory thing is not an issue to me at this point..
In datomic, entities "change" over time, meaning they can hold a different set of attribute/values over time.
As this value will never change. What follows is that you can also consider a given entity of that database value as a value
By redundancy I mean the copying of the hard coded value like “green” in many instances without them being automatically connected..
Possibly. If “green” is regarded as one single “thing”, it should be represented as one single thing (entity) in the database as well.
You are in some sense asking "what is a color?" You will have to design this based on the needs of your app or domain.
For a drawing app, for instance, it would definitely make sense to give colors more consideration, and maybe model them as entities. User A has a :user/favorite-color referencing entity E1, where E1 has attributes
{:color/name "foo"
:color/hue ...
:color/saturation ...
:color/lightness ...}
New example. A date. Is a date a value or an entity? For instance, in FileMaker (a relational database), if I want to get an answer to the question “What events are connected to 2021-09-27?” I better have a DATE table with one record having the “name” “2021-09-27” (the date). Every time I need to connect some entity to a date, instead of having the actual date value in the column, I use a foreign key value to connect to the DATE table and a specific record there. In this way I have a non-redundant system where a specific date is centralized into one single entity. Again, thinking about applying this way of thinking to Datomic as well..
Datomic natively supports values of type java.util.Date; just make an attribute with a value type of :db.type/instant.
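As a rough sketch, the schema data for such an attribute could look like the following. The attribute name `:order/placed-at` and the var name `order-schema` are made up for illustration; the `:db/...` keys are the usual Datomic schema attributes. This is only the data you would hand to a transaction, not a complete setup.

```clojure
;; Schema data for a date-valued attribute.
;; :order/placed-at is a hypothetical attribute name.
(def order-schema
  [{:db/ident       :order/placed-at
    :db/valueType   :db.type/instant       ; stores java.util.Date values
    :db/cardinality :db.cardinality/one
    :db/doc         "When the customer placed the order"}])
```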
The cause of “converting” a date from being a value into being an entity is not to be able to attach any more attributes to it, but to centralize it.
I'd say there's nothing more centralized than values of immutable types. They are centralized by the language itself.
I kind of get a sense of what that means..
Would you say modelling dates as entities has no benefits in Datomic?
Every reference to the number 1 in your code is a reference to the same thing. It's a reference to the same underlying sequence of bits. The JVM does not make a copy of 00000001 every place you need it. In Clojure the same thinking should be applied to any kind of value: maps, lists, vectors, strings, booleans etc.
A part of data modeling will be to figure out what kind of things you should make entities, and what you keep as values.
Anything you can give some kind of identity should be an entity. Something that can have value X for some attribute at some point in time, but later that value might change to Y.
A date, for example, can never change. The date 2021-01-01 will never change into 2021-01-02, that's just nonsense! But today's date will advance over time.
An identity should only be given to something that has the capacity to change do you mean?
OK - are you saying that a date cannot have an identity?
Which means that you tend to think of a date as an entity?
Again, this depends on your domain. Are you making an app to show what happened on a given date in history? Then a date like 2010-12-24 could be an entity with facts about it
“Anything you can give some kind of identity should be an entity.”
Off the top of my head it is hard to think of something that has no identity in general.. Haven’t thought too much about it so I maybe shouldn’t say that..
Are you recording the time when a customer placed an order? That's most likely just a literal date. Using literals whenever you can gives you many benefits, for instance you can use any built-in function to compare or transform them.
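For instance, here is a minimal sketch of working with literal dates using only built-in functions. The names `placed-at` and `shipped-at` are made up; `#inst` is Clojure's reader literal for instants, which by default reads as a java.util.Date.

```clojure
;; Two literal instants (hypothetical order timestamps).
(def placed-at  #inst "2021-09-27T10:00:00Z")
(def shipped-at #inst "2021-09-28T08:30:00Z")

;; compare works because java.util.Date implements Comparable.
(println (neg? (compare placed-at shipped-at))) ; prints true

;; Plain Java interop works too.
(println (.before placed-at shipped-at))        ; prints true

;; Built-in sort orders them chronologically.
(def in-order (sort [shipped-at placed-at]))
(println (= placed-at (first in-order)))        ; prints true
```

None of this would be available for free if each date were an entity id pointing somewhere else.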
When I used the word "identity" above, it really meant what I said in the following sentence: "Something that can have value X for some attribute at some point in time, but later that value might change to Y."
I imagine asking the question “What happened on date X?” would be easier to answer if each date were modelled as entities..
Then, identity has to do with the ability to change (?) was my response to that.. Which seems odd to me..
Again, Fredrik (and Anders and others) - great thing to have the opportunity to discuss here..
Sorry for the confusion, I said "identity" when I instead meant "something whose attributes can change". In the end I guess you can give anything an identity.
@U02G1DKNWKT You keep mixing up a few concepts 🙂
I’m listening!
As long as it’s immutable, as long as it can be compared to other values, it is a value.
So the example you keep returning to: "light-green" vs {:color "green", :tint "light"}
A value is a piece of immutable data as long as it is immutable?
Yes, as long as the thing you’re talking about is immutable and can be compared it is a value.
So, again, just so we’re 100% clear: In the example of colors that you keep returning to, both of the options that you lay out are values.
You can model this a few ways, but the easiest is to give each entity a unique ID — just like you do with a SQL row!
If you give each entity an ID, then you can easily talk about changes over time for a given entity:
{:id 1 :color "green"}
-> {:id 1 :color "blue"}
-> {:id 1 :color "red"}
Now instead of inferring that we’re talking about the same entity over time, you know for sure that we are, because we use the same ID.
And that’s pretty much it: Values are immutable pieces of data. They can be solo things like strings, numbers, and dates. They can be collections of things like vectors, sets, and hashmaps. Entities are values changing over time. You usually want to have an ID attached to an entity so that you can see that you’re talking about the same entity (e.g. user, document, account) even though their values change over time.
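A minimal sketch of "an entity is a series of values tied together by an ID", in plain Clojure. This is not how Datomic stores anything; the `:t` key and the helper names `history` and `as-of` are made up for illustration.

```clojure
;; Each map is an immutable snapshot; :t is a made-up time marker.
(def snapshots
  [{:id 1 :t 1 :color "green"}
   {:id 2 :t 1 :color "white"}
   {:id 1 :t 2 :color "blue"}
   {:id 1 :t 3 :color "red"}])

;; All colors entity `id` has had, oldest first.
(defn history [snapshots id]
  (->> snapshots
       (filter #(= id (:id %)))
       (sort-by :t)
       (mapv :color)))

;; The snapshot of entity `id` that was current at time `t`.
(defn as-of [snapshots id t]
  (->> snapshots
       (filter #(and (= id (:id %)) (<= (:t %) t)))
       (sort-by :t)
       last))

(println (history snapshots 1))          ; prints [green blue red]
(println (:color (as-of snapshots 1 2))) ; prints blue
```

The ID is what lets us say the three snapshots are about the same thing; each snapshot on its own is just a value.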
I am for sure in the process of understanding..
Pausing to digest..
@U07S8JGF7 How can values change when they are immutable?
atom in clojure works like this. You can swap! in a new value to a memory location, but it’s the same memory location over time (i.e. the same entity over time).
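A short sketch of that, with a made-up atom named `user`. The atom plays the role of the entity; the maps it holds are the values.

```clojure
;; The atom is the stable "place"; the map inside it is a value.
(def user (atom {:id 1 :color "green"}))

;; Take hold of the current value.
(def before @user)

;; swap! puts a NEW map into the atom; the old map is not modified.
(swap! user assoc :color "blue")

(def after @user)

(println (:color before)) ; prints green  (the old value never changed)
(println (:color after))  ; prints blue   (the atom now refers to a new value)
```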
So to say that “Entities are values changing over time.” is a bit dangerous, right? (I get what the intent is though)
What is changing is the entity, not the values.
For an entity you choose a set of immutable values and when you “change” the value you are in fact choosing another value.
Not really. What is kind of odd is that I have yet to learn my first programming language.. I have a sense of what pointers are though - I think of them as references or foreign keys..
Probably better to consider them at a different time then. 🙂 They’re kinda related, but not at all necessary to understand entities and values.
Again, whether “green” is the name of an entity, or a literal value - is a modeling decision, right?
Yes. If you want to record facts in your database about the color green, then make it an entity
To disambiguate: name of an entity means {:name "green"} and literal value means "green"
There are some questions that can help you make that decision, but it has nothing to do w/ entities and values. It has to do with, “How is this used? What kinds of flexibility do you want to prepare for?”
Right.
Would you say that “Entities are values that might change over time” is more accurate than “Entities are values changing over time” ? It is not mandatory that there is change, just a possibility
I think much of what is talked about here actually has a philosophy aspect to it..;)
I mean, the question of what an entity is opens up a large philosophical discussion, similar to the question of identity. And talking about how Clojure or Datomic deals with these things is important, because it helps understand their design and how they differ from others, but because generally words mean different things in different contexts it makes it hard.
True. My mind immediately went, “Well, all things change given a long enough timescale,” which is probably not helpful in this discussion.
My feeling is that they should be addressed more in detail Fredrik.
For me as a newcomer it would certainly help..
It might help to remember that the two issues we've been discussing are separate from one another: Entities vs values, and immutable values.
Which, among other things, brings up the question of the definition of values. Is there such a thing as immutable values and mutable values, or are values always immutable? (outside of Datomic/Clojure)
In Clojure? All the data structures are immutable by default: Numbers, vectors, lists, hash maps, sets etc. are all immutable.
outside of Clojure
in the common understanding of what a value is in programming
Coming back to the example of variables and their values (outside Clojure) - am I changing the value of the variable or am I choosing another stable (immutable) value as the content for my variable? For me, this “nuance” makes a huge difference..
This depends on the language and what type of value we're talking about. Strings in Python are immutable, strings in Ruby are not. Furthermore, in both cases the "value of the variable" can have different interpretations, either meaning "the value of what it points to", or "the address in memory of the value it points to". Being immutable implies you can never change the former, only the latter.
@U02G1DKNWKT The definition of value that we’re talking about came from Rich.
Variables (e.g. var i = 0) in traditional programming languages are not values at all.
Does anyone have a sense of why the triple parts of the Datom are called Entity-Attribute-Value like in EAV and not Subject-Predicate-Object like in RDF (https://en.wikipedia.org/wiki/Resource_Description_Framework) ?
Has anyone here become acquainted with The Associative Model of Data ? (https://web.archive.org/web/20181219134621/http://sentences.com/docs/amd.pdf) It is also based upon triples but uses Source-Verb-Target instead of Entity-Attribute-Value, which in itself is interesting as a comparison.
I guess Rich was aware of most of the common previous research before starting with Datomic. Obviously there are similarities between RDF and Datomic, but also differences, like the time/transaction component. I'm not familiar with what query language is commonly used with RDF but I guess it is not datalog. RDF does not, AFAIK, have the idea of reified transactions. RDF also does not prescribe a certain data type or ordering for the tuples in the model, but seems to speak about them in more general, mathematical terms. Nothing wrong with that, but things like the transaction log and the entity view of the database are not very clearly outlined as concepts (at least not in the RDF spec).
For me, the interesting comparison between Datomic and RDF is the triple one (Entity-Attribute-Value vs. Subject-Predicate-Object). Considering the full fact (datom) - when presented with the time aspect on top of the basic triple, it is hard to understand why anyone would want to omit time awareness of the facts..
For instance, Subject and Object “feels” more like similar things than Entity and Value.
Going from Predicate to Association/Relationship “feels” more near than Attribute to Association/Relationship…
One thing I’m aiming for here in this reasoning is that Attribute could/should just as well be seen as an Association/Relationship - as a Value Type.
Howdy all! We have released dev-local 1.0.238 with today's release of Dev-tools 0.9.64. https://forum.datomic.com/t/cognitect-dev-tools-version-0-9-64-now-available/1957
Hi, any thoughts on multi-tenancy with datomic? I’ve been searching through the discussions and it seems like there have been changes with cloud that make multiple dbs ok. Any tips would be great. Thanks
The quick and dirty: Multi-tenancy in on-prem is a no-go. There is not an enforced limit on DBs in on-prem but there are operational considerations making it a poor fit. Chiefly because the transactor was designed to serve a single primary DB (some small secondary DBs are OK for operations type tasks), but the transactor has to hold in memory the sum of each DB's memory index. With large enough DBs this becomes a resource problem. Furthermore there are no per-DB stats in Datomic on-prem, all DBs compete for space in the object cache, queries and transactions compete for CPUs, and garbage collection pauses have an impact across all DBs. You can certainly run multiple DBs, but I recommend that any mission-critical DB have its own dedicated transactor and peer processes.
Multi-tenancy in Cloud is fully supported and you can have 100s to thousands of separate DBs on a Datomic cloud system. There are still operational impacts to having so many DBs but you can scale compute nodes to optimize performance, utilize query groups to offload read per DB and have the ability to scale. If you are planning on going this route, I'd love to have a call with you to discuss your specific needs. I can bring along another member of the Datomic team and we can make sure we understand your specific use-case.
If that is something that interests you, let me know and we can arrange a meeting to discuss. Or if you prefer to work async you can write in your questions to [email protected]. Cheers!
Great to know! I planned to use datomic on-prem with multi tenancy 😅 Perhaps good we settled on psql so I didn't run into this. It would be nice if the docs included this (or do they?)
Yeah this is covered to some extent in on-prem docs here: https://docs.datomic.com/on-prem/operation/capacity.html#multiple-databases
@U1QJACBUM thanks so much for the reply. That’s great news. If we need to scale to say 20 or so tenants with a use case of an inventory system for small restaurants (to give a sense of scale) would we need to do anything manually on datomic cloud? Or would that load likely be handled out of the box? Once we get to that level we would have better resources to then start tuning however necessary
@U0AJQJCQ1 Yeah it should absolutely handle that kind of scale easily. And scaling cloud is as easy as adding compute node resources or moving up instance sizes. Caveat: I am imagining the total datom throughput / total size being small for these restaurants. I am happy to chat about specifics.