asami

lilactown 2022-01-25T17:24:20.002600Z

I maintain a library called pyramid which provides the ability to do Datomic pull-ish queries on Clojure maps, with some extra addons. I recently added the ability to extend the pull query engine via protocols to other types. Here's an example of extending it to DataScript db values: https://github.com/lilactown/pyramid/blob/main/scratch.clj#L107-L120

lilactown 2022-01-25T17:26:06.003700Z

the query syntax is a little different than datomic pull. it's based on https://github.com/edn-query-language/eql which is a standard used by several other libraries, e.g. pathom

quoll 2022-01-25T17:27:00.004400Z

I recall you pointing me towards it

lilactown 2022-01-25T17:30:06.006600Z

I know that asami doesn't have a pull API yet. I was thinking of working on a separate package to extend pyramid to asami, wdyt?

quoll 2022-01-25T17:30:54.007300Z

Various people have been asking for a pull API, so it’s about time it happens

quoll 2022-01-25T17:31:40.008200Z

I’ve been taking a break from Asami since late last year (just… tired, so I’ve been doing fun things for a little while), but I need to get back into it myself

lilactown 2022-01-25T17:32:04.008400Z

programming should be fun, right? 😄

lilactown 2022-01-25T17:33:04.009300Z

i.e. I only work on my OSS projects when I'm feeling it. I totally get it.

quoll 2022-01-25T17:36:29.009700Z

This is why https://github.com/quoll/remorse exists.

quoll 2022-01-25T17:36:44.009900Z

And https://github.com/quoll/cljs-math

quoll 2022-01-25T17:38:02.010900Z

I’m actually working on extending cljs-math to implement BigInteger (as a stepping stone to implementing BigDecimal)

quoll 2022-01-25T17:38:19.011200Z

so I have a weird definition of “fun”

lilactown 2022-01-25T17:43:41.011900Z

y'know, some people play golf. some people read the IEEE floating point spec and implement a cross-platform math library

lilactown 2022-01-25T17:46:38.013300Z

the one thing pyramid needs is the ability to take a lookup ref, like [:person/id 1], and turn that into a map representing that entity. it doesn't need to be deep, as long as it has refs contained in the data. does asami have an API for doing that?

quoll 2022-01-25T17:53:01.014900Z

asami.graph/resolve-triple (part of the Graph protocol) is the best way for doing this. For in-memory graphs, it turns into a map lookup. For local storage, it does the appropriate lookups in the maps, and converts the data into what’s appropriate.

quoll 2022-01-25T17:54:56.016400Z

The Graph protocol is a reasonably direct mechanism for accessing storage. “reasonably” is the caveat here because it does run a function across the args to map it into a pattern for looking up the correct function

quoll 2022-01-25T17:56:29.017400Z

Thinking about it… it may be possible to have the multimethod dispatch to named functions, so that use cases (like this) that know exactly which function they want can skip the dispatch step

lilactown 2022-01-25T17:57:02.017800Z

I'll start there and see if maybe I can unroll it for my specific use case

lilactown 2022-01-25T17:57:32.018200Z

is there a way to create an in-memory asami db w/o a connection?

quoll 2022-01-25T17:58:01.018800Z

Yes. Give me a minute to get back to my keyboard

lilactown 2022-01-25T17:59:39.019500Z

ok, there's no rush on this at all 🙂 company-wide meeting this morning so I'm multitasking 😂

quoll 2022-01-25T18:03:09.021300Z

asami.index/empty-graph. It’s an object. Just assert statements into it and you have your graph. Databases and Connections are wrappers around this. If you want a Database object, then it comes with a Connection (sorry). You can do the wrapping with: asami.core/as-connection

quoll 2022-01-25T18:07:54.022500Z

There’s a diagram that explains what Databases and Connections are doing to wrap graph objects: https://github.com/threatgrid/asami/wiki/Dev:-1.-Code-Layout#asamimemory

quoll 2022-01-25T18:12:58.025100Z

You’ll see that Connection and Database are actually really light. The history vector looks like it’s large, but it’s actually just keeping pointers to each Database, which are updates to previous Databases, which in turn point to Graph instances that are updates to previous Graph versions. These are immutable objects with structural sharing, so it’s not expensive to keep that history

lilactown 2022-01-25T18:45:46.026500Z

hmm asami's information model is very different than datascript & pyramid. have to think a bit how I want to bridge the two

quoll 2022-01-25T18:49:00.026900Z

TBH, I need to spend more time on Datascript to see how that works

quoll 2022-01-25T18:50:13.027700Z

I worked very hard to keep to the sort of semantics that we usually see from immutable data structures. The same happens with durable storage

quoll 2022-01-25T18:51:48.027900Z

One unexpected side-effect is that historical graphs can be added to. This is explicitly prevented in durable storage, but there is no reason it can’t happen.

quoll 2022-01-25T18:52:45.028100Z

I didn’t think anyone would care about it, but people have expressed interest in treating it like git. i.e. multiple branches

quoll 2022-01-25T18:53:18.028300Z

If that happened, then merging or rebasing would be an interesting project

quoll 2022-01-25T18:54:04.028500Z

Do Datascript or Pyramid allow for something like that @lilactown?

lilactown 2022-01-25T18:55:11.028700Z

I'm not sure. I don't think that datascript has anything for automatically reconciling changes, other than transacting new data.

lilactown 2022-01-25T18:57:11.028900Z

pyramid is all about indexing data into and selecting data out of maps. it's not really meant to completely replace something like asami

lilactown 2022-01-25T18:59:01.029100Z

you'd need some strategy for reconciling conflicts over time a la CRDTs in either case. def outside the purview of datascript and pyramid

quoll 2022-01-25T19:19:55.047100Z

This is exactly why it’s interesting. Having a different model allows for doing different things 🙂

lilactown 2022-01-25T19:02:48.031300Z

in datascript and pyramid, there's a schema which allows someone to assert that [:person/id 0] is a reference to some entity. usually it also has an index to look it up quickly. asami being schemaless obviously doesn't have such a thing. makes it trickier to load and query the same data that I would in pyramid and ds

quoll 2022-01-25T19:04:05.032500Z

The main difference is that looking up [:person/id 0] is not guaranteed to be unique

quoll 2022-01-25T19:04:20.033Z

Though it can still be fast

quoll 2022-01-25T19:05:15.033900Z

I mean… it is fast

quoll 2022-01-25T19:05:53.034600Z

You look up the POS index with (-> pos (get :person/id) (get 0))

quoll 2022-01-25T19:07:13.036100Z

When I return to it, I have a project that I’m working on that I want to finish, but then the next thing I thought I should pick up was working on schemas

lilactown 2022-01-25T19:07:22.036300Z

pyramid's query engine assumes that looking up a reference returns a single entity

lilactown 2022-01-25T19:07:40.036900Z

trying to decide if I should bend pyramid or how people write queries 😛

lilactown 2022-01-25T19:08:05.037500Z

a pull query needs a place to "start from." I guess you could do something like [{[:db/ident :tg/node-26575] [:person/id :person/name]}]

quoll 2022-01-25T19:08:14.037700Z

For now, I’ve been planning on temporary schemas (they apply during transactions). But I can store them too, then load and enforce them if they’re present

quoll 2022-01-25T19:10:37.040400Z

Well, I’d just go with the presumption that your identifying property is unique. The pull operation then retrieves it, and then just works with the first one returned. If you broke your implicit schema and added more than one thing, then you’ll be getting a random object, but if you stick to your own rules, then it’ll work without incident
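
A minimal sketch of that presumption, using only clojure.core. It assumes the POS index can be modelled as nested maps (predicate, then object, then a set of subjects), which is what the `(-> pos (get :person/id) (get 0))` lookup mentioned earlier suggests. The `pos` data, node names, and the `subjects` and `lookup-ref` helpers are hypothetical and are not part of asami's API.

```clojure
;; Hypothetical POS index: predicate -> object -> set of subjects.
(def pos
  {:person/id   {0 #{:tg/node-1}
                 1 #{:tg/node-2 :tg/node-3}}   ; id 1 was asserted on two nodes
   :person/name {"Rachel" #{:tg/node-1}}})

(defn subjects
  "Every subject carrying the given attribute/value pair, or nil if none."
  [pos attr value]
  (-> pos (get attr) (get value)))

(defn lookup-ref
  "Treats [attr value] as unique and returns one subject, or nil.
  If the implicit schema was broken, which subject comes back is arbitrary."
  [pos [attr value]]
  (first (subjects pos attr value)))

(lookup-ref pos [:person/id 0])
;; => :tg/node-1
```

Nothing here enforces uniqueness; `lookup-ref` simply takes whatever comes first, so the caller is the one keeping to the rule.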

lilactown 2022-01-25T19:11:43.041100Z

asami's schemaless-ness is interesting and makes it stand out from other dbs like datomic, datascript and its derivatives

lilactown 2022-01-25T19:11:51.041300Z

temporary schemas sound cool

quoll 2022-01-25T19:14:04.043100Z

Schemas don’t matter too much, except if you’re putting in data that is unique for an object. In that case, you need to see if the property already exists, and if so, issue a delete/insert to replace it. Right now, that’s controlled by an annotation on the attribute, but if the schema is described in a structure, then just look inside the structure instead of looking for the annotation.

quoll 2022-01-25T19:15:21.044500Z

I was planning on providing a schema as an attribute on the tx-data map (datomic documents that it’s a map, but the only field they support is tx-data. Why not allow for more? 🙂 )

quoll 2022-01-25T19:16:10.045300Z

But when a connection is established, why not read a schema and keep it in memory?

quoll 2022-01-25T19:16:29.045600Z

so a schema could be transacted in as well, without much change

quoll 2022-01-25T19:17:41.046700Z

Regardless, I plan to: a) keep schemas optional b) default everything to multi-arity c) default to untyped attributes

lilactown 2022-01-25T19:20:25.047600Z

so what I have so far then is (scratch code):

(ag/resolve-triple (a/graph (a/db conn)) '?e :person/id 0)
;; => ([:tg/node-26575])

(ag/resolve-triple (a/graph (a/db conn)) :tg/node-26575 '?a '?v)
;; => ([:person/id 0] [:person/name "Rachel"] [:tg/owns :tg/node-26581] [:tg/owns :tg/node-26583] [:tg/owns :tg/node-26585] [:tg/owns :tg/node-26577] [:tg/owns :tg/node-26576] [:tg/owns :tg/node-26579] [:friend/list :tg/node-26576] [:db/ident :tg/node-26575] [:tg/entity true])

lilactown 2022-01-25T19:21:47.048700Z

I think I'll elide the tg/owns attributes. I then need to look through each value and discern if it's a reference to another node or not, and if so resolve that

quoll 2022-01-25T19:22:14.049300Z

I’d filter out the :tg/owns properties. They’re internal book-keeping to connect to nested objects
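
The filtering step could look roughly like the sketch below, written against plain data rather than a live graph. The `pairs` vector copies part of the scratch output above (the `:tg/owns` entries are shortened), and `internal-attrs` and `pairs->entity` are hypothetical names, not asami functions.

```clojure
;; Attribute/value pairs taken from the scratch output above
;; (only two of the :tg/owns entries are kept here).
(def pairs
  [[:person/id 0]
   [:person/name "Rachel"]
   [:tg/owns :tg/node-26581]
   [:tg/owns :tg/node-26576]
   [:friend/list :tg/node-26576]
   [:db/ident :tg/node-26575]
   [:tg/entity true]])

(def internal-attrs
  "Attributes treated as internal book-keeping in this sketch."
  #{:tg/owns})

(defn pairs->entity
  "Builds a map from [attr value] pairs, skipping internal attributes.
  Assumes each remaining attribute appears once; a later duplicate
  would overwrite an earlier one."
  [pairs]
  (into {} (remove (fn [[a _]] (contains? internal-attrs a))) pairs))

(pairs->entity pairs)
;; => {:person/id 0, :person/name "Rachel", :friend/list :tg/node-26576,
;;     :db/ident :tg/node-26575, :tg/entity true}
```

The `:friend/list` value is still a node reference at this point, so a pull implementation would have to resolve it in a further step.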

quoll 2022-01-25T19:22:38.049700Z

sorry… I didn’t type as fast as you did 🙂

lilactown 2022-01-25T19:23:42.050700Z

in this case, :friend/list is a collection, not an entity. so I'd need to look up the value of :tg/node-26576 and determine that it's a collection, and then resolve the collection

quoll 2022-01-25T19:25:48.051100Z

Is it a list?

lilactown 2022-01-25T19:32:20.051500Z

yeah

quoll 2022-01-25T19:33:21.052200Z

This is where a schema (implicit or explicit) is useful, since it tells you to just look for :tg/contains

lilactown 2022-01-25T19:35:25.052600Z

ah ok so I don't need to walk the :tg/first :tg/rest chain?

quoll 2022-01-25T19:38:04.053Z

correct. That’s why that property was created 🙂
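
To make the difference concrete, here is a small sketch over hand-written triples. It assumes the list is stored as a `:tg/first` / `:tg/rest` chain and that the head node also carries one `:tg/contains` triple per element; the node ids, the data, and the helper names are all made up for illustration.

```clojure
;; Hypothetical triples for a three-element list.
(def triples
  [[:list/n1 :tg/first "Ann"]
   [:list/n1 :tg/rest :list/n2]
   [:list/n2 :tg/first "Bob"]
   [:list/n2 :tg/rest :list/n3]
   [:list/n3 :tg/first "Cy"]
   [:list/n1 :tg/contains "Ann"]
   [:list/n1 :tg/contains "Bob"]
   [:list/n1 :tg/contains "Cy"]])

(defn values-of
  "All values v such that [node attr v] is in triples."
  [triples node attr]
  (for [[e a v] triples :when (and (= e node) (= a attr))] v))

(defn walk-list
  "Follows the :tg/first / :tg/rest chain from head, keeping element order."
  [triples head]
  (loop [node head, acc []]
    (if (nil? node)
      acc
      (recur (first (values-of triples node :tg/rest))
             (into acc (values-of triples node :tg/first))))))

(defn contained
  "Reads every member with one attribute lookup on the head node.
  Returned as a set, since this sketch assumes no ordering from :tg/contains."
  [triples head]
  (set (values-of triples head :tg/contains)))
```

In this model `walk-list` needs a lookup per element while `contained` needs only one, which is the saving being described. If element order matters, the chain is still where that information lives in this sketch.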