
Does anyone have any advice on managing application lifecycle events with Ions? I want to be able to start and stop Kafka consumers in a reliable way.


Is there an idiomatic way of managing services, e.g. RDB connection pools?
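For what it's worth, since Ions don't currently expose start/stop lifecycle hooks, one common workaround is lazy, once-only initialization via `defonce` + `delay`: the delay body runs on first deref, and `defonce` survives code reloading. A minimal sketch (hypothetical names: `start-consumer!`, `stop-consumer!`, `process`, and the config map are placeholders, not a real API):

```clojure
;; Hypothetical sketch: start a stateful service lazily, exactly once.
;; start-consumer!/stop-consumer!/process are placeholders for your own fns.
(defonce kafka-consumer
  (delay (start-consumer! {:topic "events"})))

;; Best-effort cleanup on JVM shutdown (no guarantee it runs on instance kill).
(defonce shutdown-hook
  (.addShutdownHook (Runtime/getRuntime)
                    (Thread. #(when (realized? kafka-consumer)
                                (stop-consumer! @kafka-consumer)))))

(defn event-handler
  "Ion entry point; forcing the delay starts the consumer on first invocation."
  [{:keys [input]}]
  (process @kafka-consumer input))
```

The same pattern works for connection pools: wrap pool construction in a `defonce`'d `delay` and deref it wherever a connection is needed.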


Q: I had prod downtime with Datomic Cloud today. first time ever. I’m still trying to figure out what went wrong but killing the instances made it come back up. Trying to learn how to diagnose this…


the only thing I can see is that indexer memory is high, but normally it drops down when it reaches its ceiling. after the restart it jumped straight back to the previous level. Is this normal, or should it restart at a min level (similar to jvm free)?


other than that, I’m still combing through logs to try and find a signal that explains it


just found “Too many open files” warnings in the logs. I am using aws client lib in Ions. could this be some kind of leak?


pretty sure it’s a socket leak. probably due to aws client mis-use on my part. I wonder if there is a way to measure open “files” as a metric so I can monitor this
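One way to measure this from inside the process: on Linux/macOS HotSpot JVMs, the `OperatingSystemMXBean` is a `com.sun.management.UnixOperatingSystemMXBean`, which exposes open and max file descriptor counts. A sketch you could poll periodically and publish as a custom CloudWatch metric:

```clojure
(import '[java.lang.management ManagementFactory])

;; Returns the process's open/max file descriptor counts on Unix-like
;; JVMs, or nil on platforms where the bean doesn't expose them.
(defn fd-stats []
  (let [os (ManagementFactory/getOperatingSystemMXBean)]
    (when (instance? com.sun.management.UnixOperatingSystemMXBean os)
      {:open (.getOpenFileDescriptorCount os)
       :max  (.getMaxFileDescriptorCount os)})))
```

With `:max` available too, you could alarm (or trigger instance replacement) well before the "Too many open files" ceiling is actually hit.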


any advice you might have on how to find the leak would be much appreciated


another useful idea would be a way to cause cloud instances to be killed automatically when the “Too many open files” point is reached. suggestions for this would also be much appreciated


I will try and find the leak myself but suggestions are welcome


I’m assuming this is an aws client problem, but is there any other reason that “Too many open files” warnings can happen in cloud?


Some folks encountering this issue on the forums have had success running a file leak detector on the box and isolating that box to a standalone query group.


I'd also put in that you have access to me with Datomic support. I'd be happy to help diagnose and troubleshoot. If you want to log a case, shoot an e-mail to [email protected] or [email protected]


thanks @U1QJACBUM I’ve logged a ticket


I thought databases are values…

(let [db (get-db ...)]
  (= (d/as-of db #inst "2020")
     (d/as-of db #inst "2020")))
=> false


(let [conn (get-conn ...)
      db1 (d/db conn)
      db2 (d/db conn)]
  [(= db1 db2)
   (= (.-client db1) (.-client db2))
   (= (.-conn db1) (.-conn db2))
   (= (.-info db1) (.-info db2))])
=> [false true true true]


Which Datomic are you using? datomic-free, datomic-pro, client?


it confirms, though doesn’t explain, the behavior


The point of db as a value is that data retrieval from it is (mostly) repeatable. I’m not sure what the value of db equality would be because what kind of equality you want depends on what you are doing. Eg suppose dbs could compare, what would you use that for?


maybe caching - if you asked for this value/query on the same db value, don't ask datomic we already know


Dbs have a unique id, basis-t, isHistory, and optional as-of-t, since-t, and filter. You could say that if these are all equal, the dbs are equal, but that means some “equivalent” dbs are not equal (e.g. if one uses as-of to travel back to the same t as another unfiltered later db)
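That "compare all observable properties" idea could be sketched with the on-prem peer API accessors (the client API exposes similar info differently). This is illustrative only, not an official equality; it deliberately ignores filters and assumes both dbs come from the same database:

```clojure
(require '[datomic.api :as d]) ; on-prem peer API

;; Hypothetical property-wise db equality: two dbs from the same
;; database with identical t coordinates and history flag hold the
;; same datoms.
(defn db-signature [db]
  {:basis-t  (d/basis-t db)
   :as-of-t  (d/as-of-t db)
   :since-t  (d/since-t db)
   :history? (d/is-history db)})

(defn db= [a b]
  (= (db-signature a) (db-signature b)))
```

Note this still exhibits the caveat above: an as-of db and a plain db at the same effective t compare unequal under `db=` even though they contain the same datoms.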


i mean yeah - equals in the more general sense should return equals if all observable properties of a thing are definitely equal


Maybe caching, but how likely is it that you have an equal but not identical db in this caching context?


its more or less accepted that if you don't know if they are equal or not - or don't want to guarantee it - they aren't


but if all those things are equal you would know for sure they are equal so its odd


> how likely is it that you have an equal but not identical db in this caching context?
let's say they query like graphql


But the basis-t in a production db is constantly advancing


user_as_of(year: 2020) { ... }

user_as_of(year: 2020) {
   friends { ... }
}


you could get db/as-of 2020 for each of these


big caveat is that i'm stupidly new to this so i'm not sure what basis-t is exactly


Basis-t is the latest T the db contains


okay, so if a db always contains that, the utility might go down somewhat


A db must necessarily contain that


i dunno how likely it is to do 2 (d/db ...) calls before a write happens


but it could be likely enough to warrant caching - idk


Depends on your system, but I’d say not likely enough to make caching useful


okay - but i think you'll agree that there is at least hypothetically a use


and datomic isn't widely or openly developed on, so it's kinda hard to get a feel for probabilities


Well I’ll put it this way, I’ve used datomic on-prem in production for over 5 years on massive 10bil datom+ databases. I think this equality makes sense in an abstract way, but I haven’t encountered any practical use for it. In practice, you’ll have execution contexts share an identical db anyway


also if you read the clojure docs through as to "what is a value" and then see this "value" be not-a-value it would throw you for a loop


I think there’s the opposite problem too, which is they may assume too much of db inequality or equality


Eg that equal dbs produce equal projections always (they may not due to non-determinism in how the projections are made) or that unequal dbs produce unequal values


> non-determinism in how the projections are made
Wait, what


Eg pull expressions limits, calling impure functions in your queries, hash changes among versions, that sort of thing


Equal db guarantees that the datom sets you would get are the same, but most of what we do (query and pull) is projecting out of datoms and not part of the db per se, and the guarantees get weaker for all the usual reasons that (f x)=>y may not produce the exact same y for all time in all places


btw graphql and d/as-of is exactly the right context for my question


The as-of of a “fresh” db (the result of (d/db conn)) is always nil.


and the basis-t of the db advances with each write


are you using an as-of db with graphql resolvers? My expectation is that a d/db is established in the resolver context at the beginning of the request and is the same through the entire request
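That per-request pattern can be sketched like this (hypothetical: `resolve-user` and `resolve-friends` stand in for your own resolver fns):

```clojure
(require '[datomic.api :as d]) ; peer API; client API is analogous

;; Snapshot one db value when the request arrives and thread it through
;; every resolver, so the whole request sees a single consistent basis.
(defn handle-request [conn request]
  (let [db (d/db conn)]
    {:user    (resolve-user db request)
     :friends (resolve-friends db request)}))
```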


I do “changelog” that returns a series of items from different as-of dbs


so no, in my case db is not in a context


I haven’t implemented it yet, but I think it might be possible for me to ensure identical as-of dbs for different items sharing the same tx…


Could you explicitly use T as a cache key? Note that the “effective T” of a database is (or (d/as-of-t db) (d/basis-t db)) iff since-t, isHistory, and filter are all unset and as-of-t >= basis-t
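A minimal sketch of that cache-key idea, under the stated assumptions (no since-t, no history, no filters); the cache shape and `cached-q` are hypothetical, not a Datomic API:

```clojure
(require '[datomic.api :as d]) ; peer API

;; "Effective T" as described above: as-of-t when set, else basis-t.
(defn effective-t [db]
  (or (d/as-of-t db) (d/basis-t db)))

;; Hypothetical query cache keyed on [query effective-t]. Unbounded;
;; a real one would want eviction (e.g. core.cache).
(defonce query-cache (atom {}))

(defn cached-q [query db & args]
  (let [k [query (effective-t db)]]
    (or (get @query-cache k)
        (let [res (apply d/q query db args)]
          (swap! query-cache assoc k res)
          res))))
```

This would make the changelog case work: two as-of dbs built from the same tx hit the same cache entry even though the db values themselves don't compare equal.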


Strange that there's no explanation for that, because there is even a talk by Rich, "Database as a Value"…

greg: At the top of the file defining the schema, the authors made a note about "enum attributes" and "super-enum attributes". Sounds like a best practice. I read the definition posted there, in that file, but I can't get it. What is the difference between these two? Could you give me some examples?


In another repo, I've found two dataset files that refer to the same terminology. Looking at these files, it looks like there are only two differences:
• super enums are just more numerous than simple enums
• simple enums hold only a name attribute, while super-enums hold more of them
Is that the correct distinction between these two?


Super enums are "global" entities referenced by several types. For instance, :artist/country and :label/country can point to the same country entity. Each regular enum is only referenced by a single type; for instance, :artist/gender is only referenced by an artist.
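To illustrate with some hypothetical transaction data (the idents and attribute names here are made up for the example, not taken from the actual schema file):

```clojure
;; :country/gb acts as a "super enum": one shared entity that both
;; artist and label entities reference. :artist/gender idents would be
;; regular enums, referenced only from artists.
[{:db/ident :country/gb :country/name "United Kingdom"}

 {:artist/name    "Radiohead"
  :artist/country :country/gb}   ; artist -> shared country entity

 {:label/name    "XL Recordings"
  :label/country :country/gb}]   ; label -> the very same country entity
```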

👍 2

Hi, I know I can return a collection of values from a query with [?name …] , but what do I do if I have two such variables and want the concatenated result? Illustrated by this contrived example:

[:find [?friend-name ?sibling-name ...]  ;; Just one of the many alternatives I've tried
 :in $ ?person-id
 :where
 [?person :person-id ?person-id]
 [?person :friend ?friend]
 [?person :sibling ?sibling]
 [?friend :name ?friend-name]
 [?sibling :name ?sibling-name]]
In other words I’m trying to follow two many-relationships and return the same attribute from all those entities. Any help appreciated as I’m out of variations to try and I only seem to find documentation for the single variable variant.


:find ?friend-name ?sibling-name will return a set of [friend-name sibling-name] tuples, is that what you’re looking for?


Hmm, thanks, that was simple. Maybe I’ve been experimenting with bad data here. So, I will get a tuple of two collections that I can then concat afterwards, right?


you’ll get an array of tuples: [[f1 s1], [f1 s2], [f2 s2]... and so on


But all in all it will be the names of all the people that are either a sibling, friend or both of the person identified by person-id?


This will return all the tuples [friend-name sibling-name]


It's maybe not what you want?


It’s always tricky to use a contrived example to try to illustrate. What I need is to follow two or more refs from an entity, where both refs are cardinality many. From the entities those links/edges leads to I need to extract an attribute.


What if friend and sibling both point to the same person? Do you want a query that excludes this possibility?


No, exclusion is not needed.


The refs essentially represent different reasons for targeting entities. Adding another example might make it worse, but here goes: a PC technician gets the task to fix some computers. The task entity has refs to computer entities. Refs can be e.g. :broken-down or :user-complaint, but all the tech needs is to get :serial from all the referenced computers to know which ones to work on. In this case a broken-down computer could have a user complaint, but both lead to the same serial.


Could you use an or clause?

(d/q '[:find [?id ...]
       :in $ ?person
       :where
       (or [?person :person/sibling ?friend-or-sibling]
           [?person :person/friend ?friend-or-sibling])
       [?friend-or-sibling :person/id ?id]]
     db (d/entid db [:person/id "1"]))


This will find those entities which satisfy either of the clauses (or both), and returns a vector of their values for some attribute (in this case :person/id)


Thanks, that looks like exactly what I was after. Reads better as well

👍 2