#asami
2021-03-19
quoll02:03:35

Alpha5 is now done. Same as Alpha4, but queries are significantly faster on large datasets

Craig Brozefsky13:03:47

putting Alpha5 thru some paces

😬 4
👍 4
quoll13:03:38

Well, it’s alpha so we can find the big problems and address them before it’s called a “release” 🙂

Craig Brozefsky13:03:04

gosh I'm rusty at clojure 8^)

Craig Brozefsky13:03:48

also, do not buy Brach's 24 Flavours tiny Jelly Beans

Craig Brozefsky13:03:25

their attempt at Jelly Belly knockoffs... It's like they didn't realize that the flavors must harmonize when you shovel a handful into your maw

😆 4
Craig Brozefsky13:03:37

ok, think I broked it

quoll13:03:51

Yup? What’s happened?

Craig Brozefsky13:03:58

Locked up importing a few thousand objects

Craig Brozefsky13:03:12

I'll break up the txn

Craig Brozefsky13:03:31

and we'll see what's happening. I must first eliminate my own stupidity...

quoll13:03:46

Actually, breaking up the transaction is a bad thing to do. How big is the file that you’re importing?

quoll14:03:40

(bad, because you end up expanding the indexes significantly)

quoll14:03:02

Also, if you’ve made a mistake, I’d like to know about that too. I should document gotchas, and mitigate some of the more obvious ones

Craig Brozefsky14:03:43

yah, so it's not locked up, but just slow. Mind you I'm throwing a lot of large complex objects at it

Craig Brozefsky14:03:54

I'll get data to you shortly

quoll14:03:50

“large complex” is going to be an issue. Zuko (the module that breaks it up into triples) is now faster than it used to be, but there’s still a lot of work for it to do

Craig Brozefsky14:03:08

Yah, I'm thinking it's a chance to instrument the whole thing with metrics data

Craig Brozefsky14:03:31

Importing o1 into asami
Importing to asami
"Elapsed time: 53402.866849 msecs"
Imported 4503
Importing o2 into asami
Importing to asami
"Elapsed time: 644524.77233 msecs"
Imported 41484

Craig Brozefsky14:03:22

FAIL in (load-test) (core_test.clj:21)
Test loading and parsing
expected: (= (count o1) (count (:tempids tx1)))
  actual: (not (= 271 4503))
lein test :only netgrok.core-test/load-test
FAIL in (load-test) (core_test.clj:22)
Test loading and parsing
expected: (= (count o2) (count (:tempids tx2)))
  actual: (not (= 2570 41484))

Craig Brozefsky14:03:51

So the failures are me expecting the entity count to be the input object count. The difference tells you just how complex some of the objects are, with many nested entities...
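A rough way to see why the two counts diverge is to count the maps nested inside one input object. The sketch below is plain Clojure and does not call Asami. The `packet` value and the `count-maps` helper are hypothetical, and Asami may allocate nodes for other structures as well, so this only gives an intuition and does not reproduce its numbers.

```clojure
;; Hypothetical helper: count every map inside a nested structure,
;; including the top-level one.
(defn count-maps [obj]
  (count (filter map? (tree-seq coll? seq obj))))

;; Hypothetical input shaped like a small packet dump.
(def packet
  {"frame"  {"len" "60"}
   "layers" {"ip"  {"ip.flags" "0x00000040"}
             "tcp" {"port" "443"}}})

;; One input object, but five maps once the nesting is counted.
(println "maps in one object:" (count-maps packet))
```

Each nested map is a separate entity once the object is broken into triples, which is consistent with 271 input objects producing thousands of IDs.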

quoll14:03:28

It creates lots of temporary IDs, but unless you ask, I would generally think they’d match the provided objects. :thinking_face:

quoll14:03:55

I’m assuming that some or all of this data can be shared?

Craig Brozefsky14:03:11

not sure. it's packet dumps from my home network

Craig Brozefsky14:03:10

I will find some representative data

Craig Brozefsky14:03:38

The :tempids in the tx would include the nested objects right?

quoll14:03:32

I didn’t think so (unless you asked it to). So unless I’ve forgotten something it’s a problem

quoll14:03:34

It creates lots of IDs, but that map is supposed to just be for the top level entities, and things you’ve provided your own temporary IDs to

Craig Brozefsky14:03:44

yah, so that seems wrong then

Craig Brozefsky14:03:58

I provided no temp IDs for anything

Craig Brozefsky14:03:14

So, I ran the same thing with in mem db

Craig Brozefsky14:03:16

lein test netgrok.core-test
Preparing o1
"Elapsed time: 0.005525 msecs"
Preparing o2
"Elapsed time: 8.82E-4 msecs"
Importing o1 into asami
Importing to asami
"Elapsed time: 262.289293 msecs"
Imported 4503
Importing o2 into asami
Importing to asami
"Elapsed time: 2622.369329 msecs"
Imported 41484

Craig Brozefsky14:03:32

Is zuko involved in that too?

quoll14:03:06

It’s a library that pulls entities apart into triples

Craig Brozefsky14:03:14

ok, so it's not in zuko then eh

quoll18:03:34

BTW, the large number of entities in tempids was expected, but I’m revisiting it, and I think they should not be included.

quoll18:03:43

So I’m going to update Zuko to remove them

Craig Brozefsky13:03:50

New times:
lein test netgrok.core-test
Importing o1 into asami
Importing to asami
"Elapsed time: 19520.553168 msecs"
Imported 0
Importing o2 into asami
Importing to asami
"Elapsed time: 208748.131329 msecs"
Imported 0

Craig Brozefsky13:03:04

So yah, I can confirm your estimate on perf win with Alpha6

quoll13:03:03

I’m guessing that you’re saying “imported 0” to mean a count of tempids?

Craig Brozefsky13:03:03

yah, just ignored that this time since I haven't updated my tests yet -- still drinking first cup of coffee

quoll13:03:25

Have a look at the count on tx-data. That’s the number of statements inserted

quoll13:03:07

If you want the number of entities inserted… do a count on your input 😊

quoll13:03:48

The tempids map is there so you can provide a negative number for :db/id on an entity; it will generate an ID for you and tell you what your negative number got mapped to (like Datomic)
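A minimal sketch of the points above, assuming the `asami.core` namespace with its `create-database`, `connect`, and `transact` functions, and an in-memory URI invented for this example. The entity contents are hypothetical.

```clojure
(require '[asami.core :as d])

;; Hypothetical in-memory database URI, used only for this sketch.
(def db-uri "asami:mem://tempids-sketch")
(d/create-database db-uri)
(def conn (d/connect db-uri))

;; One top-level entity with a user-supplied negative :db/id
;; and one nested entity.
(def tx
  @(d/transact conn {:tx-data [{:db/id -1
                                :name "router"
                                :interface {:mac "00:00:00:00:00:01"}}]}))

;; Number of statements (triples) inserted by this transaction.
(println "statements:" (count (:tx-data tx)))

;; Map from the temporary ID to the node ID generated for it.
(println "tempids:" (:tempids tx))
```

Which other entries show up in `:tempids` depends on the Asami and Zuko versions in use, as the conversation above notes.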

Craig Brozefsky13:03:17

I have some utility functions for exploring the shape of the data and the schema

Craig Brozefsky13:03:51

next step is to do some query clause generators

Craig Brozefsky13:03:22

for functional composition of where clauses...
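One possible shape for such clause generators, sketched in plain Clojure: queries are ordinary data, so functions can return groups of where clauses and another function can splice them into a `[:find ... :where ...]` vector. All function names here are hypothetical, and the attribute names are borrowed from the packet data purely for illustration.

```clojure
;; Each generator returns a vector of where clauses.
(defn has-value
  "Clauses matching entities whose `attr` equals `value`."
  [entity-var attr value]
  [[entity-var attr value]])

(defn linked
  "Clauses binding `to-var` to whatever `from-var` points at via `attr`."
  [from-var attr to-var]
  [[from-var attr to-var]])

(defn all-of
  "Combine several clause groups into one conjunction."
  [& groups]
  (vec (apply concat groups)))

(defn build-query
  "Assemble a vector-form query from find variables and where clauses."
  [find-vars clauses]
  (vec (concat [:find] find-vars [:where] clauses)))

(def flagged-packets
  (build-query '[?packet]
               (all-of (linked '?packet "layers" '?layers)
                       (has-value '?layers "ip.flags" "0x00000040"))))

(println flagged-packets)
```

The resulting vector could then be passed to the query function along with a database value.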

Craig Brozefsky13:03:48

just doing export-data a bunch helped me grok what is happening

quoll14:03:39

export-data gives you a view of everything, but if you insert individual things (or just small numbers of entities) then have a look at the contents of tx-data in the results of the transaction. That shows you the triples that were generated and inserted.

quoll14:03:17

I’m curious how many triples you got from your data that took 3m28s to load. (I have to work to improve this)

Craig Brozefsky14:03:20

data coming up...

Craig Brozefsky14:03:50

lein test netgrok.core-test
Importing o1 into asami
Importing to asami
"Elapsed time: 19489.224782 msecs"
Imported 31881 statements
Importing o2 into asami
Importing to asami
"Elapsed time: 209601.057757 msecs"
Imported 296271 statements

quoll14:03:29

Thanks for that

Craig Brozefsky14:03:02

heading out for brunch

quoll13:03:41

For anyone wondering, Craig is allowed to push me around in here. He’s no longer at Cisco, but it was his bright idea that I write my own graph database.

Craig Brozefsky15:03:35

So I am coercing string keys in JSON to keywords... Out of.. habit?

Craig Brozefsky15:03:51

Would I be violating any assumptions of Asami if I did not do that?

quoll15:03:12

or… I hope not 🙂

Craig Brozefsky17:03:21

I need to check my understanding here:

Craig Brozefsky17:03:38

[:tg/node-929806 "ip.flags" "0x00000040"]
 [:tg/node-623767 "layers" :tg/node-623768]
 [:tg/node-623767 :db/ident :tg/node-623767]
 [:tg/node-623767 :tg/entity true]

Craig Brozefsky17:03:58

netgrok.core> (d/entity (d/db (conn)) :tg/node-623767)
{}

Craig Brozefsky17:03:05

I would not expect that to be an empty entity

Craig Brozefsky17:03:06

the triples are from: (d/export-data (d/db (conn)))

Craig Brozefsky17:03:15

this is asami alpha5 running in memory

Craig Brozefsky17:03:08

(conn) is just (d/db-connect URI) ...

Craig Brozefsky17:03:26

so I'm making a new connection using the DB uri, and a new DB...

Craig Brozefsky18:03:39

Ah, ok, if I don't coerce keys to keywords in the structs, entity loading fails

Craig Brozefsky18:03:43

So I think that's a boog?

Craig Brozefsky18:03:38

this is in memory DB
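For reference, coercing string keys to keywords before transacting can be done with the standard `clojure.walk/keywordize-keys`. This is a sketch with a hypothetical `packet` value, not the code used in this conversation.

```clojure
(require '[clojure.walk :as walk])

;; Hypothetical parsed-JSON value with string keys.
(def packet {"layers" {"ip" {"ip.flags" "0x00000040"}}})

;; Recursively convert every string map key to a keyword.
(def keyworded (walk/keywordize-keys packet))

(println keyworded)
```

Only map keys are converted; string values such as "0x00000040" are left unchanged.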

quoll18:03:41

I’m working with the cleaned up data right now, so the attributes are all keywords. I’ll try with the strings shortly

quoll18:03:51

Oh, that’s interesting too

Craig Brozefsky18:03:25

yah, since the entity func is basically per storage...

Craig Brozefsky18:03:40

export-data is my new pal

Craig Brozefsky18:03:08

yah, so the file I sent you, I believe has string keys

quoll18:03:50

it does, yes

Craig Brozefsky18:03:47

:smiling_face_with_3_hearts: this is a pleasant way to explore data

Craig Brozefsky21:03:15

ok, gotten more familiar with the query language. Able to identify all the devices on my network, and start digging into their behavior

Craig Brozefsky22:03:53

The query language is impressive, Paula

💖 3
Craig Brozefsky22:03:13

calling it a day tho, so I stop obsessing over it