#asami
2020-12-05
jfntn20:12:06

👋 was wondering if asami had support for datalog rules?

quoll20:12:24

It’s done through Naga

quoll20:12:49

Asami was originally part of Naga. It got extracted as a graph DB as it gained more features

jfntn20:12:30

Nice, are there any examples on how to integrate the two?

quoll20:12:18

By default, it’s going to use Asami

quoll20:12:36

It takes a little effort to ask for Datomic, though it works with that as well

quoll20:12:08

It’s not really a command line tool. It’s designed as an API

quoll20:12:59

The Command line program was built as an example of how to use it. Have a look at the run-all function for this: https://github.com/threatgrid/naga/blob/main/src/naga/cli.clj#L74

jfntn21:12:23

Interesting, do I understand correctly that in this example the IDB facts derived by Naga get transacted back to Asami and will be available for both a) querying IDB facts directly in Asami, and b) deriving more facts in another pass of Naga?

jfntn21:12:14

Also curious about time-space complexities, I assume each naga execution will compute rules against all facts in asami? I.e. there’s no rete-style network that would only consider facts inserted since the previous run?

quoll21:12:43

On the first question…
a) The data is transacted back into Asami and can be accessed via querying.
b) It iterates until completion. You won’t get any more facts unless you input more data.

That second one raises an important issue: Naga extends Datalog to allow declarations of new entities. This means that you can shoot yourself in the foot if you create a loop. For instance, don’t say:

Parent(X,Y) :- Person(X).
Person(Y) :- Parent(X, Y).
i.e. All people have parents, and all parents are people. This will result in an OOM
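
To make the runaway concrete, here is roughly how those iterations would unfold from a single starting fact (the entity names are illustrative, not actual Naga output). Because Y in the head Parent(X,Y) is unbound in the body, each pass mints a fresh entity, which re-satisfies the other rule’s body, so a fixpoint is never reached:

Person(alice).                   % starting fact
% pass 1: Parent(alice, _:b1)    % Y unbound => new entity _:b1
%         Person(_:b1)
% pass 2: Parent(_:b1, _:b2), Person(_:b2)
% pass 3: Parent(_:b2, _:b3), Person(_:b3)
% ... every pass adds new entities, until memory runs out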

quoll21:12:29

> I assume each naga execution will compute rules against all facts in asami? I.e. there’s no rete-style network that would only consider facts inserted since the previous run?
This is incorrect. There is a rete-style network. However, instead of populating each node of the network with data that flows through it, it uses the indexes to do this work instead. i.e. The memory required for each node in the network is actually a slice out of one of the indexes

jfntn21:12:41

> There is a rete-style network. However, instead of populating each node of the network with data that flows through it, it uses the indexes to do this work instead
Very cool, is this what the queues and constraint data achieve with the dirty tracking?

quoll23:12:23

I gather that you’re talking about the code in naga.engine?

quoll23:12:35

Quick overview…
1. The initialization finds the connections between the productions (or heads) of rules and the dependencies (or bodies) of other rules.
2. Every rule gets scheduled by putting it on the queue, with each element of its body marked as “dirty”.
3. The engine picks up the head of the queue and checks each element in the body to see if it’s dirty. If the queue is empty, then the engine is finished.
4. If everything is clean, then the engine moves on to the next rule.
5. Otherwise, the engine checks the dirty elements to see if the data they rely on has actually changed (“dirty” just means that it potentially changed).
6. All “dirty” elements that haven’t experienced a change are marked as clean. If there have been no changes, then move on to the next rule.
7. If there were any changes, then execute the rule, inserting the productions.
8. Mark all downstream dependencies as dirty, and schedule those rules by adding them to the queue.
9. Return to step 3.
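
The steps above can be sketched as a toy Clojure loop. This is illustrative only, not Naga’s actual code: the rule names, the `:body`/`:head` keywords, and the `versions` map (standing in for the index state that step 5 inspects) are all invented for the sketch.

```clojure
(def rules
  ;; two toy rules: Person => Parent, Parent => Ancestor
  {:r1 {:body #{:person} :head :parent}
   :r2 {:body #{:parent} :head :ancestor}})

(defn run-rules [rules initial-versions]
  (loop [queue    (vec (keys rules))                     ; step 2: schedule all rules
         dirty    (set (mapcat :body (vals rules)))      ; every body element starts dirty
         versions initial-versions                       ; stands in for the index state
         seen     {}                                     ; versions each rule last observed
         fired    []]
    (if (empty? queue)                                   ; step 3: empty queue => done
      fired
      (let [r          (first queue)
            queue      (subvec queue 1)
            body       (:body (rules r))
            dirty-body (filter dirty body)]
        (if (empty? dirty-body)
          (recur queue dirty versions seen fired)        ; step 4: all clean, next rule
          (let [changed (filter #(not= (versions % 0)
                                       (get-in seen [r %] -1))
                                dirty-body)              ; step 5: did data really change?
                dirty   (reduce disj dirty dirty-body)   ; step 6: mark them clean
                seen    (reduce #(assoc-in %1 [r %2] (versions %2 0)) seen body)]
            (if (empty? changed)
              (recur queue dirty versions seen fired)
              (let [head       (:head (rules r))         ; step 7: fire, insert production
                    versions   (update versions head (fnil inc 0))
                    downstream (for [[n rl] rules
                                     :when ((:body rl) head)]
                                 n)]
                (recur (into queue downstream)           ; step 8: reschedule dependents
                       (conj dirty head)                 ;         and mark them dirty
                       versions seen (conj fired r))))))))))

(run-rules rules {:person 1})   ; => [:r1 :r2], then :r2 is found clean and the queue drains
```

Real Naga compares counts taken from the index slices rather than a version map, but the queue discipline is the same: a rule only executes when a dirty body element has genuinely changed.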

quoll23:12:16

There are a couple of features not being used…
• Salience. “Way back when…” I was thinking that it would be nice to offer some rules priority over other rules. This is the “salience” value. Higher values get inserted into the queue before anything of a lower value. This works, but for now everything has the same salience value, so it just works as a FIFO.
• Productions. At the moment the only kind of production is to insert data. The plan was always to offer a function that could be called with the data that led to the production. The most flexible thing here would be a message queue or a pub/sub on Redis. This should still happen, but I’m still waiting for someone to ask for it 🙂

quoll23:12:52

I gave a description of how all of this works at Clojure/conj 2016: https://youtu.be/8rRzESy0X2k?list=PLZdCLR02grLofiMKo0bCeLHZC0_2rpqsz

jfntn15:12:06

Thanks for the detailed explanation! What’s the best way to learn about what’s supported in rules? I was looking at the schemas in zuko but I’m still not quite sure if what I’m trying to do is supported.

jfntn15:12:41

(I was looking to implement a rule whose head would increment a counter matched in the body, but not sure those eval style patterns are valid)

quoll15:12:52

Zuko was just a place to put internal libraries that were shared between Naga and Asami so that there wasn’t an interdependency between them (no one wants to load up Asami if they’re doing rules on Datomic)

quoll15:12:13

That is possible, actually

quoll15:12:21

Using the rule macro (so not in predicate form):

[?entity :value' ?new-v] :- [?entity :type "counter"] [?entity :value ?v] [(inc ?v) ?new-v]

quoll15:12:36

Pulling that apart:

[?entity :type "counter"] [?entity :value ?v]

This part just finds the entity (I just chose to use “type counter”, but you may have an ID, or something else), and picks up the numerical value in ?v.

[(inc ?v) ?new-v]

This increments the value of ?v and binds it to ?new-v. Then the head of the clause uses:

[?entity :value' ?new-v]

Note how the :value property is annotated with a trailing quote character? This means “update” the property, rather than just asserting it. If it were simply asserted, then the entity would have 2 :value attributes (which is legal), and you’d have to look for the max of them. But by using the update annotation it will remove the old value and add the new one.
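
A small illustration of the difference, written as triples (the entity ID and values here are assumed for the example, not actual Naga output):

```clojure
;; before the rule fires, one triple exists:
;;   [1 :value 1]
;;
;; plain assert, head [?entity :value ?new-v], leaves both triples:
;;   [1 :value 1]  [1 :value 2]     ; legal, but you'd have to query for the max
;;
;; update annotation, head [?entity :value' ?new-v], replaces the old triple:
;;   [1 :value 2]                   ; old value removed, new one added
```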

quoll15:12:57

The problem here is how often the rule runs! This rule will be triggerable by:
• another rule setting an entity to be a counter (unlikely)
• an entity being given a new value

quoll15:12:56

Updates break the seminaïve reasoning mechanism, so if anyone else updates their :value then it won’t retrigger this rule (hey, that’s lucky!)

quoll15:12:39

(when I said “entity being given a new value” I meant a value that didn’t exist before. Updates won’t affect it)

quoll15:12:54

In general, you could only expect that rule to be run one time per engine-invocation. I’m just explaining the edge cases to cover my bases 😉