#asami
2022-01-25
lilactown17:01:20

I maintain a library called pyramid which provides the ability to do Datomic pull-ish queries on Clojure maps, with some extra addons. I recently added the ability to extend the pull query engine via protocols to other types. Here's an example of extending it to DataScript db values: https://github.com/lilactown/pyramid/blob/main/scratch.clj#L107-L120

lilactown17:01:06

the query syntax is a little different than datomic pull. it's based on https://github.com/edn-query-language/eql which is a standard used by several other libraries, e.g. pathom

quoll17:01:00

I recall you pointing me towards it

lilactown17:01:06

I know that asami doesn't have a pull API yet. I was thinking of working on a separate package to extend pyramid to asami, wdyt?

quoll17:01:54

Various people have been asking for a pull API, so it’s about time it happens

quoll17:01:40

I’ve been taking a break from Asami since late last year (just… tired, so I’ve been doing fun things for a little while), but I need to get back into it myself

lilactown17:01:04

programming should be fun, right? 😄

lilactown17:01:04

i.e. I only work on my OSS projects when I'm feeling it. I totally get it.

quoll17:01:02

I’m actually working on extending cljs-math to implement BigInteger (as a stepping stone to implementing BigDecimal)

quoll17:01:19

so I have a weird definition of “fun”

lilactown17:01:41

y'know, some people play golf. some people read the IEEE floating point spec and implement a cross-platform math library

lilactown17:01:38

the one thing pyramid needs is the ability to take a lookup ref, like [:person/id 1], and turn that into a map representing that entity. it doesn't need to be deep, as long as it has refs contained in the data. does asami have an API for doing that?

quoll17:01:01

asami.graph/resolve-triple (part of the Graph protocol) is the best way to do this. For in-memory graphs, it turns into a map lookup. For local storage, it does the appropriate lookups in the maps and converts the data into what’s appropriate.

quoll17:01:56

The Graph protocol is a reasonably direct mechanism for accessing storage. “reasonably” is the caveat here because it does run a function across the args to map it into a pattern for looking up the correct function

quoll17:01:29

Thinking about it… it may be possible to have the multimethod dispatch to named functions, so that use cases (like this) that know exactly which function they want can skip the dispatch step

lilactown17:01:02

I'll start there and see if maybe I can unroll it for my specific use case

lilactown17:01:32

is there a way to create an in-memory asami db w/o a connection?

quoll17:01:01

Yes. Give me a minute to get back to my keyboard

lilactown17:01:39

ok, there's no rush on this at all 🙂 company-wide meeting this morning so I'm multitasking 😂

quoll18:01:09

asami.index/empty-graph. It’s an object. Just assert statements into it and you have your graph. Databases and Connections are wrappers around this. If you want a Database object, then it comes with a Connection (sorry). You can do the wrapping with: asami.core/as-connection

quoll18:01:54

There’s a diagram that explains what Databases and Connections are doing to wrap graph objects: https://github.com/threatgrid/asami/wiki/Dev:-1.-Code-Layout#asamimemory

quoll18:01:58

You’ll see that Connection and Database are actually really light. The history vector looks like it’s large, but it’s actually just keeping pointers to each Database, which are updates to previous Databases, which in turn point to Graph instances that are updates to previous Graph versions. These are immutable objects with structural sharing, so it’s not expensive to keep that history

lilactown18:01:46

hmm asami's information model is very different than datascript & pyramid. have to think a bit how I want to bridge the two

quoll18:01:00

TBH, I need to spend more time on Datascript to see how that works

quoll18:01:13

I worked very hard to keep to the sort of semantics that we usually see from immutable data structures. The same happens with durable storage

quoll18:01:48

One unexpected side-effect is that historical graphs can be added to. This is explicitly prevented in durable storage, but there is no reason it can’t happen.

quoll18:01:45

I didn’t think anyone would care about it, but people have expressed interest in treating it like git. i.e. multiple branches

quoll18:01:18

If that happened, then merging or rebasing would be an interesting project

quoll18:01:04

Do Datascript or Pyramid allow for something like that @U4YGF4NGM?

lilactown18:01:11

I'm not sure. I don't think that datascript has anything for automatically reconciling changes, other than transacting new data.

lilactown18:01:11

pyramid is all about indexing data into and selecting data out of maps. it's not really meant to completely replace something like asami

lilactown18:01:01

you'd need some strategy for reconciling conflicts over time a la CRDTs in either case. def outside the purview of datascript and pyramid

quoll19:01:55

This is exactly why it’s interesting. Having a different model allows for doing different things 🙂

lilactown19:01:48

in datascript and pyramid, there's a schema which allows someone to assert that [:person/id 0] is a reference to some entity. usually it also has an index to look it up quickly. asami being schemaless obviously doesn't have such a thing. makes it trickier to load and query the same data that I would in pyramid and ds

quoll19:01:05

The main difference is that looking up [:person/id 0] is not guaranteed to be unique

quoll19:01:20

Though it can still be fast

quoll19:01:15

I mean… it is fast

quoll19:01:53

You look up the POS index with (-> pos (get :person/id) (get 0))
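
A minimal sketch of the lookup described above, using plain Clojure maps as a stand-in for an in-memory POS (predicate, object, subject) index. The `index-pos` helper, the node names, and the nested-map shape are assumptions for illustration, not Asami's actual implementation. The innermost value is a set because, as noted above, a lookup such as [:person/id 0] is not guaranteed to be unique.

```clojure
;; Hypothetical sample data: [subject predicate object] triples.
(def triples
  [[:tg/node-1 :person/id 0]
   [:tg/node-1 :person/name "Rachel"]
   [:tg/node-2 :person/id 1]
   [:tg/node-2 :person/name "Marco"]])

(defn index-pos
  "Builds a predicate -> object -> #{subjects} map from [s p o] triples.
  A simplified stand-in for a POS index, not Asami's real structure."
  [triples]
  (reduce (fn [pos [s p o]]
            (update-in pos [p o] (fnil conj #{}) s))
          {}
          triples))

(def pos (index-pos triples))

;; Two map lookups find every subject whose :person/id is 0.
(-> pos (get :person/id) (get 0))
;; => #{:tg/node-1}
```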

quoll19:01:13

When I return to it, I have a project that I’m working on that I want to finish, but then the next thing I thought I should pick up was working on schemas

lilactown19:01:22

pyramid's query engine assumes that looking up a reference returns a single entity

lilactown19:01:40

trying to decide if I should bend pyramid or how people write queries 😛

lilactown19:01:05

a pull query needs a place to "start from." I guess you could do something like [{[:db/ident :tg/node-26575] [:person/id :person/name]}]

quoll19:01:14

For now, I’ve been planning on temporary schemas (they apply during transactions). But I can store them too, then load and enforce them if they’re present

quoll19:01:37

Well, I’d just go with the presumption that your identifying property is unique. The pull operation then retrieves it, and then just works with the first one returned. If you broke your implicit schema and added more than one thing, then you’ll be getting a random object, but if you stick to your own rules, then it’ll work without incident

lilactown19:01:43

asami's schemaless-ness is interesting and makes it stand out from other dbs like datomic, datascript and its derivatives

lilactown19:01:51

temporary schemas sound cool

quoll19:01:04

Schemas don’t matter too much, except if you’re putting in data that is unique for an object. In that case, you need to see if the property already exists, and if so, issue a delete/insert to replace it. Right now, that’s controlled by an annotation on the attribute, but if the schema is described in a structure, then just look inside the structure instead of looking for the annotation.
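
The delete/insert behaviour described above might look roughly like this when the schema is a plain structure rather than an annotation. This is a sketch over a set of triples, not Asami's transaction code; `assert-triple` and the `unique-attrs` set are hypothetical names introduced for illustration.

```clojure
(defn assert-triple
  "Adds [s p o] to a set of triples. When p is listed in unique-attrs
  (a hypothetical schema structure), any existing [s p _] triple is
  removed first, so the new value replaces the old one."
  [triples unique-attrs [s p o]]
  (let [kept (if (contains? unique-attrs p)
               (remove (fn [[es ep _]] (and (= s es) (= p ep))) triples)
               triples)]
    (conj (set kept) [s p o])))

;; Only :person/name is declared unique per entity; :person/nick is not.
(def schema #{:person/name})

(-> #{[:n1 :person/name "Rachel"] [:n1 :person/nick "Rach"]}
    (assert-triple schema [:n1 :person/name "Rae"])
    (assert-triple schema [:n1 :person/nick "R"]))
;; => #{[:n1 :person/name "Rae"] [:n1 :person/nick "Rach"] [:n1 :person/nick "R"]}
```

Attributes missing from the schema simply accumulate values, which matches the stated plan to keep schemas optional.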

quoll19:01:21

I was planning on providing a schema as an attribute on the tx-data map (datomic documents that it’s a map, but the only field they support is tx-data. Why not allow for more? 🙂 )

quoll19:01:10

But when a connection is established, why not read a schema and keep it in memory?

quoll19:01:29

so a schema could be transacted in as well, without much change

quoll19:01:41

Regardless, I plan to: a) keep schemas optional b) default everything to multi-arity c) default to untyped attributes

lilactown19:01:25

so what I have so far then is (scratch code):

(ag/resolve-triple (a/graph (a/db conn)) '?e :person/id 0)
;; => ([:tg/node-26575])

(ag/resolve-triple (a/graph (a/db conn)) :tg/node-26575 '?a '?v)
;; => ([:person/id 0] [:person/name "Rachel"] [:tg/owns :tg/node-26581] [:tg/owns :tg/node-26583] [:tg/owns :tg/node-26585] [:tg/owns :tg/node-26577] [:tg/owns :tg/node-26576] [:tg/owns :tg/node-26579] [:friend/list :tg/node-26576] [:db/ident :tg/node-26575] [:tg/entity true])

lilactown19:01:47

I think I'll elide the tg/owns attributes. I then need to look through each value and discern if it's a reference to another node or not, and if so resolve that

quoll19:01:14

I’d filter out the :tg/owns properties. They’re internal book-keeping to connect to nested objects
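
A minimal sketch of that filtering step, working on attribute/value pairs shaped like the resolve-triple result quoted above. `pairs->entity` is a hypothetical helper, and it assumes each remaining attribute appears only once; if an attribute repeats, the last value wins.

```clojure
(defn pairs->entity
  "Folds [attribute value] pairs into a map, dropping the internal
  :tg/owns book-keeping attributes. Assumes each remaining attribute
  appears once; a repeated attribute keeps its last value."
  [pairs]
  (into {}
        (remove (fn [[a _]] (= a :tg/owns)))
        pairs))

;; Abbreviated from the result shown earlier in the conversation.
(def pairs
  [[:person/id 0]
   [:person/name "Rachel"]
   [:tg/owns :tg/node-26581]
   [:tg/owns :tg/node-26576]
   [:friend/list :tg/node-26576]
   [:db/ident :tg/node-26575]
   [:tg/entity true]])

(pairs->entity pairs)
;; => {:person/id 0, :person/name "Rachel", :friend/list :tg/node-26576,
;;     :db/ident :tg/node-26575, :tg/entity true}
```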

quoll19:01:38

sorry… I didn’t type as fast as you did 🙂

lilactown19:01:42

in this case, :friend/list is a collection, not an entity. so I'd need to look up the value of :tg/node-26576 and determine that it's a collection, and then resolve the collection

quoll19:01:48

Is it a list?

quoll19:01:21

This is where a schema (implicit or explicit) is useful, since it tells you to just look for :tg/contains

lilactown19:01:25

ah ok so I don't need to walk the :tg/first :tg/rest chain?

quoll19:01:04

correct. That’s why that property was created 🙂
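
To illustrate the shortcut, here is a sketch over plain triples. It assumes the collection node carries one :tg/contains edge per member alongside the :tg/first / :tg/rest chain; the element node names (:tg/node-100, :tg/node-101), the chain node :tg/node-26577, and the `members` helper are made up for illustration. A set is returned because :tg/contains alone does not carry list order.

```clojure
;; Hypothetical storage of a two-element list rooted at :tg/node-26576.
(def triples
  [[:tg/node-26576 :tg/first :tg/node-100]
   [:tg/node-26576 :tg/rest :tg/node-26577]
   [:tg/node-26577 :tg/first :tg/node-101]
   [:tg/node-26576 :tg/contains :tg/node-100]
   [:tg/node-26576 :tg/contains :tg/node-101]])

(defn members
  "Returns the set of values reachable from node via :tg/contains,
  without walking the :tg/first / :tg/rest chain."
  [triples node]
  (into #{}
        (keep (fn [[s p o]]
                (when (and (= s node) (= p :tg/contains)) o)))
        triples))

(members triples :tg/node-26576)
;; => #{:tg/node-100 :tg/node-101}
```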