#datomic
2021-09-27
Jakub Holý (HolyJak)07:09:19

Reportedly, Datomic can also be used quite well from Kotlin. (I know it has a Java API, but having an API does not say much about the experience of using it.) (I think I got this from https://www.youtube.com/watch?v=hicQvxdKvnc)

Tobias Sjögren11:09:33

I’m used to understanding “value” as the content of a field in a database table, or a variable’s value - that is, something that can change. Now I’m told that a value is “something that does not change”. Although I’m fully in support of the notion of immutability, I’m not sure how to comprehend this redefinition of the term “value”. So far, the Hickey videos don’t really explain this. P.S. If I buy into the new meaning, what should I call a field’s content?

Lennart Buit11:09:23

I tend to think about values as elements of a domain. Say, 4 is an element of the domain of integers, or “aaa” is an element of the domain of strings.

Lennart Buit11:09:40

Some people tend to call a mutable variable a ‘place’

tvaughan11:09:40

I don't think you need a new term. I think adding the concept of time would be helpful though. At a certain point in time an attribute has a value, and at another point in time the attribute could have a different value, but the value at these two points in time cannot change.

Fredrik11:09:02

The field content is a pointer to a value. The shift in thinking is to understand that while the field content can point to different values at different points in time, the value it points to (at any given time) never changes. This is what it means for values to be immutable.

Leaf Garland11:09:37

> I’m used to understanding “value” as the content of a field in a database table, or a variable’s value

I think you said it yourself quite well. The thing that changes is the field in a database table or a variable, not the values that are their content.

Fredrik12:09:40

An example. Let's say you ask a database, or an object if you're doing OOP, for the value of a field F1 and it returns 2. Now you tell it to update the field to 3. What happened? Did the database somehow change the number 2 into a 3? Will every place in your code that has a "2" now get a "3" instead? Of course not. Numbers are immutable, they're immutable values. But let's say you ask for another field F2 and it returns a string "abc". You tell it to append "d" to the end. What happened now? Did the value "abc" change into "abcd"? In some languages, that is exactly what happens. The value "abc" no longer exists, it got blown away by appending "d". The bad news is that any part of your program that referenced the old value "abc" now sees "abcd" instead. This is the cause of a lot of headaches, which immutable data structures simply don't allow to happen.

Fredrik12:09:00

We could say that at time T1, the field F2 referenced the value "abc", while at time T2, it referenced the value "abcd". The value "abc" never changed.
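To make the F2 example concrete, here is a minimal Clojure sketch. The names `f2-at-t1` and `f2-at-t2` are invented for illustration and simply stand for what the field references at the two points in time.

```clojure
;; Hypothetical names: what field F2 references at time T1 and at time T2.
(def f2-at-t1 "abc")

;; `str` builds a new string; it does not modify the existing one.
(def f2-at-t2 (str f2-at-t1 "d"))

(println f2-at-t1) ; the value referenced at T1 is still "abc"
(println f2-at-t2) ; the value referenced at T2 is "abcd"
```

Anything that was handed `f2-at-t1` earlier keeps seeing the same string, no matter what is derived from it later.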

Tobias Sjögren12:09:14

What you are saying is that numbers are immutable and strings mutable, right? This, to me, adds another layer to the whole thing.

Fredrik12:09:20

No, I didn't say that, or didn't make it clear enough. I said that in some languages, which include Ruby and PHP, strings are mutable. This is bad, which is why Clojure, Java, Python and others made them immutable.

Fredrik12:09:33

Now Clojure takes everything a step further and makes any kind of value immutable, not just the basic data types like integers and strings.

Tobias Sjögren12:09:23

Which to me suggests that values are not automatically immutable; they are made so, or treated so, if one makes that decision. So to say that “We should use values!” doesn’t automatically imply that the values are immutable. I keep coming back to the definition of a value - is it immutable, which is how I understand Hickey, or can it be made immutable?

Tobias Sjögren12:09:44

I maybe should point out here that to me this is not playing around with words - I feel it is at the core of trying to gain a better understanding of the whole thing..

Fredrik12:09:57

Yes, having immutable values in Clojure was a design decision, maybe its most important one. Their immutability comes from the way they are implemented.

Tobias Sjögren12:09:37

If Hickey were to say that (for example in the “Value of Values” video), it would certainly make more sense to me..

Fredrik12:09:18

I think of immutable values this way: If X is some immutable value, then I can observe X at any point in time and always see the same thing. Anyone else can also observe X and always see the same thing. If I give you a reference to X, I don't have to copy the value of X before doing so, for fear of you "changing X" in some way. In OOP, we must fear the latter all the time because if I give you X and you do X.set(field, value), I can't rely on X being what I think it is.
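A small Clojure sketch of the "no defensive copy" point. The function `repaint` and the map `x` are made up for illustration; `repaint` plays the part of code you hand X to without trusting it.

```clojure
;; A hypothetical function that tries to "change" whatever it is given.
(defn repaint [color]
  (assoc color :name "red")) ; assoc returns a new map

(def x {:name "green"})

;; x is passed as-is, without copying it first.
(def y (repaint x))

(println x) ; x is unchanged and still has the name "green"
(println y) ; y is a different value with the name "red"
```

Because `assoc` cannot alter `x`, the caller never needs to copy before sharing.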

Tobias Sjögren12:09:48

Which is the same thing that happens in a relational database, at least by default..

Tobias Sjögren12:09:43

(when a field’s content is changed from one value to another)

Fredrik12:09:32

Yes, exactly. In Datomic, you always run a query against a specific value of the database, giving the nice property that running the same query against the same database always gives the same result.

Tobias Sjögren12:09:10

Now, a “value” becomes somewhat hard to differentiate from an entity I think. They are both stable “items”. One idea is that a value is an entity without attributes, which would mean that as soon as a value should have an attribute attached to it, it should become an entity, and vice versa (maybe never happens though) an entity that has no attribute will become a value. (trying ideas here..)

Fredrik12:09:56

I don't think that's the right picture. An entity E is a collection of datoms, which are records of the fact that at a certain point in time, an attribute A had a value V.
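As a rough illustration of "an entity is a collection of datoms", the sketch below models datoms as plain vectors and derives an entity's attribute map at a point in time. This is a simplified model, not Datomic's API: the entity id 42, the attribute names, the tx numbers, and the helper `entity-as-of` are all invented here.

```clojure
;; Each datom is sketched as [entity attribute value tx added?].
(def datoms
  [[42 :color/name  "green" 1000 true]
   [42 :color/shade "light" 1000 true]
   [42 :color/shade "light" 1001 false]  ; retraction in tx 1001
   [42 :color/shade "dark"  1001 true]])

(defn entity-as-of
  "Builds the attribute->value map of entity e from all datoms up to and
  including tx t. Assumes single-valued attributes and datoms ordered by tx."
  [datoms e t]
  (reduce (fn [m [de a v tx added?]]
            (cond
              (or (not= e de) (> tx t)) m             ; other entity, or too new
              added?                    (assoc m a v)  ; assertion
              (= v (get m a))           (dissoc m a)   ; matching retraction
              :else                     m))
          {}
          datoms))

(println (entity-as-of datoms 42 1000))
(println (entity-as-of datoms 42 1001))
```

The list of datoms only grows; the two results are two different immutable views of the same entity.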

Fredrik12:09:50

Now it might happen that the value is a pointer to another entity. But that value (the pointer itself) is still immutable.

Tobias Sjögren12:09:35

Then, could it be that the only difference between values and entities is that entities have attribute-value pairs attached to them?

Tobias Sjögren12:09:40

For example, is “green” a value or an entity? It depends, right? If you want to attach the values “light” or “dark” to it (green), then green should be an entity instead of a value.

Fredrik12:09:55

Maybe someone can answer this better at a more fundamental level, but I'll take a shot. An entity is a collection of attribute-value pairs. A value is something measurable, a numerical quantity, a string etc., more precisely those described by valueType in Datomic.

Fredrik12:09:05

"green" (the literal string) is a value

Fredrik12:09:32

"light green" (the concept, not the string) can definately be an entity. It can be an entity whose color attribute is "green", and whose shade attribute is "light" (I'm making these attribute names up)

Fredrik12:09:18

Or, the color attribute can be a ref to another entity, let's say C1. C1 could then have the attributes (I'll present it as a map attribute-name -> value)

{:name "green"
 :rgb 0x00ff00}

Tobias Sjögren12:09:54

OK, “green”, the literal string is a value and could be the name for an entity, right?

Fredrik12:09:27

Yes! It could be the value of a name attribute.

Tobias Sjögren13:09:06

My initial impulse is to have the V position be an entity id all the time.

Tobias Sjögren13:09:13

Which e.g. points to the entity with the name “green”.

Fredrik13:09:57

Unless you need to know more facts about "the color green" (as an entity), and thus have it be an actual entity, there's no need to. Use literal values when you can.

anders13:09:00

@U02G1DKNWKT at some point they must bottom out to an actual value. If you care about modeling the life cycle of green (e.g. it can "change", or more precisely have a different set of attribute/values at certain points in time) feel free to do so

Fredrik13:09:16

Yes, it must all bottom out in values eventually (unless you are doing something like only modelling the relationship between entities without knowing anything else about them)

Tobias Sjögren13:09:19

You don’t see an obvious disadvantage of doing so here?

anders13:09:41

as previously said; entities are sets of attributes that evolve over time. Whether you model "green" as an entity or a value depends on what "green" actually is in your domain

Tobias Sjögren13:09:00

Thank you for the discussion - I’m slowly moving towards understanding (I hope)..

Fredrik13:09:56

Good luck, and keep asking if you have more questions. My encouragement is to use literal values as much as possible.

Tobias Sjögren13:09:10

That makes me curious as to why you prefer literal values..

anders13:09:00

that is kinda like saying 'why do you prefer columns over tables' in sql

Fredrik13:09:20

Because they are inherently simpler

👍 1
anders13:09:26

it depends on what your modeling requirements are

vlaaad13:09:16

Wasn't it "value of values" talk about these concepts?

vlaaad13:09:55

Values, references, identities..

Fredrik13:09:59

I think we've been trying to unpack a small part of it.

Tobias Sjögren13:09:11

Concerning how to model “green” as a value or as an entity - I’m thinking that unless I make it an entity (the name of an entity), I will have the redundancy of many instances with the value of “green” all over the database instead of everyone pointing to one single centralized instance (the entity). I’m not sure if this applies to Datomic though..

anders13:09:56

@U02G1DKNWKT it certainly does apply to Datomic as well

Tobias Sjögren13:09:52

OK, so making it a question of wanting redundancy or not is valid then..

Fredrik13:09:26

Would you worry about the same if you have an orderQuantity field, with the numbers 1 and 2 recurring very often?

Tobias Sjögren13:09:58

I actually am not quite sure about that - possibly..

Tobias Sjögren13:09:13

That might be extreme..

anders13:09:17

The flip side is; sometimes you want the "redundancy", as you want the separate entities to hold different values at different times

Tobias Sjögren13:09:14

Just create a new entity for it?

anders13:09:23

If you're coming from a SQL background this dilemma doesn't change significantly with regards to datomic

anders13:09:33

This is a modeling exercise

Fredrik13:09:50

A big benefit of having immutable data structures is that it is always safe for many objects to reference the same value. Since strings (like other values) are immutable, the JVM can optimize for this, for example by keeping a single shared copy of each string literal.

Tobias Sjögren13:09:55

(actually FileMaker)

Tobias Sjögren13:09:39

The memory thing is not an issue to me at this point..

anders13:09:09

In datomic, entities "change" over time, meaning they can hold a different set of attribute/values over time.

Fredrik13:09:15

My mistake, I'm not understanding what you mean by redundancy then?

anders13:09:37

With datomic, you can get a hold of the database at a given point in time.

anders13:09:58

by doing so, you hold the database at that given point in time as a value

anders13:09:44

This value will never change. It follows that you can also consider a given entity in that database value to be a value

Tobias Sjögren13:09:54

By redundancy I mean the copying of the hard coded value like “green” in many instances without them being automatically connected..

Fredrik13:09:20

Are you worried about equality semantics?

anders13:09:23

This is possible as Datomic accretes new facts, but does not forget old facts

Fredrik13:09:34

As in, "do these have the same color"?

Tobias Sjögren13:09:26

Possibly. If “green” is regarded as one single “thing”, it should be represented as one single thing (entity) in the database as well.

Fredrik13:09:41

You are in some sense asking "what is a color?" You will have to design this based on the needs of your app or domain.

👍 1
Fredrik13:09:01

For a drawing app, for instance, it would definitely make sense to give colors more consideration, and maybe model them as entities. User A has a :user/favorite-color referencing entity E1, where E1 has attributes

{:color/name "foo"
 :color/hue ...
 :color/saturation ...
 :color/lightness ...}
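The two modeling options could look roughly like this as schema data. This is only a sketch: the `:db/*` keys are standard Datomic schema attributes, but the choice of `:db.type/string` for `:color/name` is an assumption, and the elided color attributes are left out because their types are not given above.

```clojure
;; Option 1: the color is an entity and the user points to it with a ref.
(def color-as-entity-schema
  [{:db/ident       :color/name
    :db/valueType   :db.type/string   ; assumed type
    :db/cardinality :db.cardinality/one}
   {:db/ident       :user/favorite-color
    :db/valueType   :db.type/ref      ; value is another entity's id
    :db/cardinality :db.cardinality/one}])

;; Option 2: the color is a literal value stored directly on the user.
(def color-as-literal-schema
  [{:db/ident       :user/favorite-color
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one}])
```

With option 1 you can later record more facts about a color; with option 2 the attribute holds the string itself and nothing else needs to exist.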

Tobias Sjögren13:09:25

New example. A date. Is a date a value or an entity? For instance, in FileMaker (a relational database), if I want to get an answer to the question “What events are connected to 2021-09-27?” I better have a DATE table with one record having the “name” “2021-09-27” (the date). Every time I need to connect some entity to a date, instead of having the actual date value in the column, I use a foreign key value to connect to the DATE table and a specific record there. In this way I have a non-redundant system where a specific date is centralized into one single entity. Again, thinking about applying this way of thinking to Datomic as well..

Fredrik13:09:45

Datomic natively supports values of type java.util.Date: just make an attribute with a value type of :db.type/instant.
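A sketch of what that might look like, assuming a made-up attribute `:order/placed-at`. The second half uses only plain Clojure on the JVM to show that dates are ordinary comparable values.

```clojure
;; Hypothetical attribute holding an instant (a java.util.Date).
(def order-schema
  [{:db/ident       :order/placed-at
    :db/valueType   :db.type/instant
    :db/cardinality :db.cardinality/one}])

;; #inst literals are read as java.util.Date by default.
(def d1 #inst "2021-09-27T00:00:00.000-00:00")
(def d2 #inst "2021-09-28T00:00:00.000-00:00")

(println (instance? java.util.Date d1)) ; true
(println (.before d1 d2))               ; true
(println (neg? (compare d1 d2)))        ; true
```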

Tobias Sjögren13:09:25

The cause of “converting” a date from being a value into being an entity is not to be able to attach any more attributes to it, but to centralize it.

Fredrik13:09:52

I'd say there's nothing more centralized than values of immutable types. They are centralized by the language itself.

Tobias Sjögren13:09:17

I kind of get a sense of what that means..

Tobias Sjögren13:09:07

Would you say modelling dates as entities has no benefits in Datomic?

Fredrik13:09:58

Every reference to the number 1 in your code is a reference to the same thing. It's a reference to the same underlying sequence of bits. The JVM does not make a copy of 00000001 every place you need it. In Clojure the same thinking should be applied to any kind of value: maps, lists, vectors, strings, booleans etc.

Fredrik13:09:28

A part of data modeling will be to figure out what kind of things you should make entities, and what you keep as values.

Fredrik13:09:48

Anything you can give some kind of identity should be an entity. Something that can have value X for some attribute at some point in time, but later that value might change to Y.

Fredrik13:09:42

A date, for example, can never change. The date 2021-01-01 will never change into 2021-01-02, that's just nonsense! But today's date will advance over time.

Tobias Sjögren13:09:52

An identity should only be given to something that has the capacity to change do you mean?

Fredrik13:09:16

No, I don't think so.

Tobias Sjögren14:09:43

OK - are you saying that a date cannot have an identity?

Fredrik14:09:02

No, I would definitely say a date has an identity

Tobias Sjögren14:09:36

Which means that you tend to think of a date as an entity?

Fredrik14:09:38

Again, this depends on your domain. Are you making an app to show what happened on a given date in history? Then a date like 2010-12-24 could be an entity with facts about it

Tobias Sjögren14:09:58

“Anything you can give some kind of identity should be an entity.”

Fredrik14:09:00

I should have said, "and for which the built-in literals don't suffice"

Tobias Sjögren14:09:34

Off the top of my head it is hard to think of something that has no identity in general.. Haven’t thought about it too much, so maybe I shouldn’t say that..

Fredrik14:09:46

Are you recording the time when a customer placed an order? That's most likely just a literal date. Using literals whenever you can gives you many benefits, for instance you can use any built-in function to compare or transform them.

Fredrik14:09:50

When I used the word "identity" above, it really meant what I said in the following sentence: "Something that can have value X for some attribute at some point in time, but later that value might change to Y."

Tobias Sjögren14:09:56

I imagine asking the question “What happened on date X?” would be easier to answer if each date were modelled as an entity..

Fredrik14:09:47

You can use < and > directly in a query in Datomic to compare dates
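For instance, "what happened on 2021-09-27?" can be phrased as a range check over literal dates. The query below is only written down as data and is not run here, since running it needs a Datomic database; `:event/date` is a made-up attribute assumed to be of type `:db.type/instant`.

```clojure
;; Datalog query as plain data, using comparison predicates on instants.
(def events-in-range
  '[:find ?e ?date
    :in $ ?start ?end
    :where
    [?e :event/date ?date]
    [(<= ?start ?date)]
    [(< ?date ?end)]])

;; The day 2021-09-27 expressed as a half-open range of instants.
(def start #inst "2021-09-27")
(def end   #inst "2021-09-28")
```

With the peer API this would presumably be invoked as something like `(d/q events-in-range db start end)`, where `db` is a database value obtained elsewhere.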

Tobias Sjögren14:09:34

Then, identity has to do with the ability to change (?) was my response to that.. Which seems odd to me..

Tobias Sjögren14:09:54

Again, Fredrik (and Anders and others) - great thing to have the opportunity to discuss here..

Fredrik14:09:11

Sorry for the confusion, I said "identity" when I instead meant "something whose attributes can change". In the end I guess you can give anything an identity.

potetm14:09:21

@U02G1DKNWKT You keep mixing up a few concepts 🙂

potetm14:09:48

A value is a piece of immutable data. That’s it.

potetm14:09:19

It could be a string, a number, a date, a collection—a list, a set, a hashmap.

Tobias Sjögren14:09:41

I’m listening!

potetm14:09:03

As long as it’s immutable, as long as it can be compared to other values, it is a value.

potetm14:09:58

So the example you keep returning to: "light-green" vs {:color "green", :tint "light"}

Tobias Sjögren14:09:59

A value is a piece of immutable data as long as it is immutable?

potetm14:09:15

Both of those are values.

potetm14:09:35

Yes, as long as the thing you’re talking about is immutable and can be compared it is a value.

potetm14:09:00

So you can compare 2 hashmaps by, say, looking at the key-value pairs.

potetm14:09:56

So, again, just so we’re 100% clear: In the example of colors that you keep returning to, both of the options that you lay out are values.

potetm14:09:34

Entities build on top of values. You make entities out of values.

potetm14:09:59

An entity is a series of values.

potetm14:09:38

{:color "green"} -> {:color "blue"} -> {:color "red"}

potetm14:09:19

So that^ is an entity that takes on three colors over time.

potetm14:09:52

You can model this a few ways, but the easiest is to give each entity a unique ID — just like you do with a SQL row!

potetm14:09:29

SQL rows are entities. Each row is a value that changes over time.

potetm14:09:23

If you give each entity an ID, then you can easily talk about changes over time for a given entity: {:id 1 :color "green"} -> {:id 1 :color "blue"} -> {:id 1 :color "red"}

potetm14:09:56

Now instead of inferring that we’re talking about the same entity over time, you know for sure that we are, because we use the same ID.

potetm14:09:52

And that’s pretty much it: Values are immutable pieces of data. They can be solo things like strings, numbers, and dates. They can be collections of things like vectors, sets, and hashmaps. Entities are values changing over time. You usually want to have an ID attached to an entity so that you can see that you’re talking about the same entity (e.g. user, document, account) even though their values change over time.
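The summary above can be sketched in plain Clojure by writing an entity's history down as a vector of values. The name `history` is made up; the maps are the ones from the earlier messages.

```clojure
;; One entity over time: a series of immutable maps sharing the same :id.
(def history
  [{:id 1 :color "green"}
   {:id 1 :color "blue"}
   {:id 1 :color "red"}])

;; The shared :id is what tells us this is one entity, not three.
(println (distinct (map :id history)))

;; The "current" state is just the latest value in the series.
(println (last history))

;; Earlier values are still there, unchanged.
(println (first history))
```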

potetm14:09:03

Does that clarify anything?

Tobias Sjögren14:09:14

I am for sure in the process of understanding..

Tobias Sjögren14:09:25

Pausing to digest..

Tobias Sjögren16:09:59

@U07S8JGF7 How can values change when they are immutable?

Fredrik16:09:41

The values themselves never change.

potetm16:09:57

You change from one value to another value.

potetm16:09:12

But it’s the same entity.

potetm16:09:57

atom in clojure works like this. You can swap! in a new value to a memory location, but it’s the same memory location over time (i.e. the same entity over time).
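A minimal sketch of that, with made-up names `entity` and `before`:

```clojure
;; The atom is the stable reference (the "entity");
;; the maps are the immutable values it points to over time.
(def entity (atom {:id 1 :color "green"}))

;; Keep hold of the value the atom points to right now.
(def before @entity)

;; swap! computes a new value from the old one and installs it in the atom.
(swap! entity assoc :color "blue")

(println before)  ; the old value is unchanged
(println @entity) ; the same atom now points to a new value
```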

Tobias Sjögren16:09:07

So to say that “Entities are values changing over time.” is a bit dangerous, right? (I get what the intent is though)

potetm16:09:39

No, I think it’s accurate, but perhaps easy to misconstrue.

potetm16:09:08

More precisely: entities are a series of values over time.

Tobias Sjögren16:09:12

What is changing is the entity, not the values.

Tobias Sjögren16:09:13

For an entity you choose a set of immutable values and when you “change” the value you are in fact choosing another value.

Tobias Sjögren16:09:57

I can notice some progress here..

bananadance 1
Fredrik16:09:59

Are you btw familiar with how pointers work in C or C++?

Tobias Sjögren16:09:32

Not really. What is kind of odd is that I have yet to learn my first programming language.. I have a sense of what pointers are though - I think of them as references or foreign keys..

potetm16:09:45

Probably better to consider them at a different time then. 🙂 They’re kinda related, but not at all necessary to understand entities and values.

Tobias Sjögren16:09:04

Again, whether “green” is the name of an entity, or a literal value - is a modeling decision, right?

Fredrik16:09:52

Yes. If you want to record facts in your database about the color green, then make it an entity

potetm16:09:25

To disambiguate: name of an entity means {:name "green"} and literal value means "green"

potetm16:09:44

And yeah, just a modeling decision. It should be made based on your needs.

potetm16:09:16

There are some questions that can help you make that decision, but it has nothing to do w/ entities and values. It has to do with, “How is this used? What kinds of flexibility do you want to prepare for?”

Tobias Sjögren16:09:32

Would you say that “Entities are values that might change over time” is more accurate than “Entities are values changing over time” ? It is not mandatory that there is change, just a possibility

Fredrik16:09:07

Yes, that's entirely possible

potetm16:09:58

Without delving into philosophy, yeah that sounds right to me 😄

Tobias Sjögren16:09:48

I think much of what is talked about here actually has a philosophy aspect to it..;)

Fredrik16:09:52

I mean, the question of what an entity is opens up a large philosophical discussion, similar to the question of identity. Talking about how Clojure or Datomic deals with these things is important, because it helps in understanding their design and how they differ from others, but it is hard because words generally mean different things in different contexts.

potetm16:09:48

True. My mind immediately went, “Well, all things change given a long enough timescale,” which is probably not helpful in this discussion.

Tobias Sjögren16:09:47

My feeling is that they should be addressed more in detail Fredrik.

Tobias Sjögren16:09:44

For me as a newcomer it would certainly help..

Fredrik16:09:14

It might help to remember that the two issues we've been discussing are separate from one another: Entities vs values, and immutable values.

Tobias Sjögren16:09:25

Which, among other things, brings up the question of the definition of values. Is there such a thing as immutable values and mutable values, or are values always immutable? (outside of Datomic/Clojure)

Fredrik16:09:26

In Clojure? All the data structures are immutable by default: Numbers, vectors, lists, hash maps, sets etc. are all immutable.

Tobias Sjögren16:09:51

outside of Clojure

Tobias Sjögren16:09:07

in the common understanding of what a value is in programming

Tobias Sjögren16:09:00

Coming back to the example of variables and their values (outside Clojure) - am I changing the value of the variable or am I choosing another stable (immutable) value as the content for my variable? For me, this “nuance” makes a huge difference..

Fredrik16:09:52

This depends on the language and what type of value we're talking about. Strings in Python are immutable, strings in Ruby are not. Furthermore, in both cases the "value of the variable" can have different interpretations, either meaning "the value of what it points to", or "the address in memory of the value it points to". Being immutable implies you can never change the former, only the latter.

potetm17:09:11

@U02G1DKNWKT The definition of value that we’re talking about came from Rich.

potetm17:09:30

Everyone else uses the term loosely or not at all.

potetm17:09:35

Variables (e.g. var i = 0) in traditional programming languages are not values at all.

potetm17:09:01

Like Fredrik said, whether that variable points to a value depends on context.

Tobias Sjögren13:09:04

Does anyone have a sense of why the triple parts of the Datom are called Entity-Attribute-Value like in EAV and not Subject-Predicate-Object like in RDF (https://en.wikipedia.org/wiki/Resource_Description_Framework) ?

Tobias Sjögren17:09:39

Has anyone here become acquainted with The Associative Model of Data ? (https://web.archive.org/web/20181219134621/http://sentences.com/docs/amd.pdf) It is also based upon triples but uses Source-Verb-Target instead of Entity-Attribute-Value, which in itself is interesting as a comparison.

Linus Ericsson18:09:07

I guess Rich was aware of most of the common previous research before starting with Datomic. Obviously there are similarities between RDF and Datomic, but also differences, like the time/transaction component. I'm not familiar with what query language is commonly used with RDF but I guess it is not datalog. RDF does AFAIK not have the idea of reified transactions. RDF also does not prescribe a certain data type or ordering for the tuples in the model, but seems to speak about them in more general, mathematical terms. Nothing wrong with that, but things like the transaction log and the entity view of the database are not very clearly outlined as concepts (at least not in the RDF spec).

Tobias Sjögren20:09:15

For me, the interesting comparison between Datomic and RDF is the triple one (Entity-Attribute-Value vs. Subject-Predicate-Object). Considering the full fact (datom) - when presented with the time aspect on top of the basic triple, it is hard to understand why anyone would want to omit time awareness of the facts..

Tobias Sjögren20:09:37

For instance, Subject and Object “feels” more like similar things than Entity and Value.

Tobias Sjögren20:09:56

Going from Predicate to Association/Relationship “feels” more near than Attribute to Association/Relationship…

Tobias Sjögren20:09:36

One thing I’m aiming for here in this reasoning is that Attribute could/should just as well be seen as an Association/Relationship - as a Value Type.

jaret18:09:03

Howdy all! We have released dev-local 1.0.238 with today's release of Dev-tools 0.9.64. https://forum.datomic.com/t/cognitect-dev-tools-version-0-9-64-now-available/1957

👍 2
az19:09:34

Hi, any thoughts on multi-tenancy with datomic? I’ve been searching through the discussions and it seems like there have been changes with cloud that make multiple dbs ok. Any tips would be great. Thanks

jaret14:09:49

The quick and dirty: Multi-tenancy in on-prem is a no-go. There is no enforced limit on DBs in on-prem, but there are operational considerations that make it a poor fit. Chiefly, the transactor was designed to serve a single primary DB (some small secondary DBs are OK for operations-type tasks), and the transactor has to hold in memory the sum of each DB's memory index. With large enough DBs this becomes a resource problem. Furthermore, there are no per-DB stats in Datomic on-prem, all DBs compete for space in the object cache, queries and transactions compete for CPUs, and garbage collection pauses have impact across all DBs. You can certainly run multiple DBs, but I recommend that any mission-critical DB have its own dedicated transactor and peer processes.

Multi-tenancy in Cloud is fully supported and you can have hundreds to thousands of separate DBs on a Datomic Cloud system. There are still operational impacts to having so many DBs, but you can scale compute nodes to optimize performance and utilize query groups to offload reads per DB.

If you are planning on going this route, I'd love to have a call with you to discuss your specific needs. I can bring along another member of the Datomic team and we can make sure we understand your specific use-case.

jaret15:09:15

If that is something that interests you, let me know and we can arrange a meeting to discuss. Or if you prefer to work async you can write in your questions to [email protected]. Cheers!

Jakub Holý (HolyJak)19:09:14

Great to know! I planned to use datomic on-prem with multi tenancy 😅 Perhaps good we settled on psql so I didn't run into this. It would be nice if the docs included this (or do they?)

jaret13:09:15

Yeah this is covered to some extent in on-prem docs here: https://docs.datomic.com/on-prem/operation/capacity.html#multiple-databases

🙏 1
az16:09:44

@U1QJACBUM thanks so much for the reply. That’s great news. If we need to scale to say 20 or so tenants with a use case of an inventory system for small restaurants (to give a sense of scale) would we need to do anything manually on datomic cloud? Or would that load likely be handled out of the box? Once we get to that level we would have better resources to then start tuning however necessary

jaret16:09:41

@U0AJQJCQ1 Yeah it should absolutely handle that kind of scale easily. And scaling cloud is as easy as adding compute node resources or moving up instance sizes. Caveat: I am imagining the total datoms throughput/total size being small for these restaurants. I am happy to chat about specifics.