aatree 2015-12-20 | Slack Archive

micha03:12:38

hello aatree world

micha03:12:36

is there an example application that uses aatree? or rather calf etc

laforge4903:12:32

Yeah, look in the test files.

laforge4903:12:25

You want to look at yearling. this is the database that builds over virtual aatree structures. An OODBMS for EDN.

laforge4903:12:03

@micha Thanks for asking.

micha03:12:04

this is something i have been looking for for a long time

laforge4903:12:47

Using the Datomic analogy, I want the hoplon clients to be the peers with their own caches and 80% of the database code running in the client. Unfortunately, it is the 80% that has not yet been converted from java.

micha03:12:58

dude yes

micha03:12:13

so can i ask a sort of general kind of question?

micha03:12:25

maybe off topic

laforge4903:12:25

of course

micha03:12:32

it seems to me that we use relational databases to store our application state only because storing only facts in sorted sets achieves the laziness, like you don't need to fetch an entire tree or sequence of things from storage to perform a query

micha03:12:01

but in all other ways relational databases are terrible, because you need to build your trees and maps and whatnot via joins

laforge4903:12:09

Even my sequences are indexed!

micha03:12:12

and the semantic of the join cannot exist in the database itself

micha03:12:21

it's just something you have in your head

micha03:12:39

so people make orms as a way to form abstractions that encompass the joins
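
A minimal sketch of the idea micha describes above: facts kept in a sorted set let a query touch only a small slice of storage, while the "join" has to be spelled out in application code. The fact layout, the entity names, and the helpers `facts_for` and `follow` are all hypothetical illustrations, not part of aatree, yearling, or Datomic.

```python
import bisect

# Hypothetical facts as (entity, attribute, value) tuples, kept sorted.
facts = sorted([
    ("ad/1", "name", "spring sale"),
    ("ad/1", "priority", "prio/7"),
    ("ad/2", "name", "winter sale"),
    ("prio/7", "channel", "chan/3"),
    ("prio/7", "name", "house ads"),
])

def facts_for(index, entity):
    """Return only the facts about one entity.

    Two binary searches locate the slice, so the rest of the index
    is never scanned.
    """
    lo = bisect.bisect_left(index, (entity,))
    # entity + "\x00" is the smallest string that sorts after `entity`.
    hi = bisect.bisect_left(index, (entity + "\x00",))
    return index[lo:hi]

def follow(index, entity, attribute):
    """Follow one reference by hand; this is the 'join' living in code."""
    for _, a, v in facts_for(index, entity):
        if a == attribute:
            return v
    return None

priority = follow(facts, "ad/1", "priority")   # "prio/7"
channel = follow(facts, priority, "channel")   # "chan/3"
```

Nothing in the index itself says that `priority` points at another entity, which is the gap an ORM usually papers over.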

laforge4903:12:10

My db is very close to datomic, with the same indexes and everything. (my java db, not what I have in clojure so far)

laforge4903:12:33

My starting point was to build a db for semantic relationships.

laforge4903:12:49

A knowledge db, rather than a data db.

micha03:12:13

what's the distinction?

laforge4903:12:46

everything is about entities, attributes, values and time.

laforge4903:12:53

knowledge exists in time.

laforge4903:12:21

attributes are dynamic

laforge4903:12:53

attributes are largely used to define relationships between entities.

laforge4903:12:10

So it can be viewed as a graph database too.
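
As a rough illustration of "entities, attributes, values and time", here is a small sketch of facts that carry a time and can be read as of a given moment, plus the graph view that falls out when a value is itself an entity. The `Datom` tuple, the sample data, and both helper functions are assumptions made for this example; they do not show laforge49's actual schema.

```python
from collections import namedtuple

# Hypothetical fact shape: entity, attribute, value, time.
Datom = namedtuple("Datom", "e a v t")

log = [
    Datom("alice", "works-for", "acme", 1),
    Datom("acme", "located-in", "berlin", 2),
    Datom("alice", "works-for", "globex", 5),
]

def value_as_of(log, e, a, t):
    """Latest value asserted for (e, a) at or before time t, else None."""
    matches = [d for d in log if d.e == e and d.a == a and d.t <= t]
    if not matches:
        return None
    return max(matches, key=lambda d: d.t).v

def edges(log):
    """Facts whose value is itself an entity form the edges of a graph."""
    entities = {d.e for d in log}
    return [(d.e, d.a, d.v) for d in log if d.v in entities]
```

With this data, `value_as_of(log, "alice", "works-for", 3)` gives `"acme"` and the same query at time 5 gives `"globex"`, so the attribute's history stays queryable.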

micha03:12:30

do you do extra indexing of the graph edges?

laforge4903:12:33

Dealing with knowledge always seems to require multiple perspectives.

laforge4903:12:52

nothing more in that area than datomic.

laforge4903:12:00

but I keep a lot more metadata!

micha03:12:19

alan and i experimented with building a thing on top of datomic

micha03:12:25

to store clojure data

laforge4903:12:31

i.e. relationships are also entities!

micha03:12:51

we ran into the need to run our own indexes of which refs point where

micha03:12:55

and stopped

laforge4903:12:01

Datomic makes it difficult to deal with changes over time, though I think it can still be done.

laforge4903:12:53

I'm doing the reverse. First implement virtual clojure data. Then add support for changes over time. Then build the indexes on top of that.

laforge4903:12:27

This gives me a rich mix. I can have some things that are not tracked over time, for example.

micha03:12:06

we had a specific problem at hand

micha03:12:14

and we had a datomic license so we thought why not

laforge4903:12:29

sounds like a lot of fun

micha03:12:34

the problem was this huge php application that we needed to debug in production

laforge4903:12:44

OUCH!

micha03:12:46

legacy code, 250K lines of it

micha03:12:59

so we wanted to log data to datomic

laforge4903:12:09

wild

micha03:12:12

so we could get a historical view of the data

micha03:12:22

and automate as much as possible

micha03:12:32

so we made entities for all the types

micha03:12:46

and had a system for interning them

micha03:12:55

and making refs to build the things

laforge4903:12:04

relationships should also be entities.

micha03:12:13

ah interesting!

micha03:12:27

we couldn't avoid following refs in code

laforge4903:12:33

gives you the edges.

micha03:12:37

this would allow you to avoid that
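
One way to picture "relationships should also be entities": each relationship gets its own id and its own attributes, and an index over those relationship entities answers "which refs point where" without walking refs in application code. The attribute names `from`, `to`, `type`, and `since`, and the function `build_edge_indexes`, are invented for this sketch.

```python
from collections import defaultdict

# Hypothetical facts: "rel/1" and "rel/2" are relationships stored as entities.
facts = [
    ("rel/1", "type", "employs"),
    ("rel/1", "from", "acme"),
    ("rel/1", "to", "alice"),
    ("rel/1", "since", 2014),
    ("rel/2", "type", "employs"),
    ("rel/2", "from", "acme"),
    ("rel/2", "to", "bob"),
]

def build_edge_indexes(facts):
    """Group facts per entity, then index relationship entities by endpoint."""
    attrs = defaultdict(dict)
    for e, a, v in facts:
        attrs[e][a] = v
    outgoing = defaultdict(list)  # source entity -> relationship ids
    incoming = defaultdict(list)  # target entity -> relationship ids
    for rel, m in attrs.items():
        if "from" in m and "to" in m:
            outgoing[m["from"]].append(rel)
            incoming[m["to"]].append(rel)
    return attrs, outgoing, incoming

attrs, outgoing, incoming = build_edge_indexes(facts)
```

Because the edge is an entity, metadata such as `since` has a natural home on it.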

laforge4903:12:54

Now I am very excited about doing the om next thing, but with hoplon and my own db. And put the application code in the browser!

micha03:12:55

do you think you'll be able to write a "peer" that can run in a browser?

laforge4903:12:02

why not?

micha03:12:10

i know datomic can't do it

laforge4903:12:20

it just needs to keep a cache.

micha03:12:34

browsers don't have a big cache

laforge4903:12:39

but my transactor/storage IS the web server.

micha03:12:43

you can count on maybe 5M

laforge4903:12:49

And it has the big cache.

laforge4903:12:56

the client just needs a little.

micha03:12:17

this would be the final piece of the puzzle

laforge4903:12:40

a mobile client is very focused so it does not need a lot of cache

laforge4903:12:52

--small working set

micha03:12:06

right now the thing that takes our dev time is incidental complexity involved with shuffling things in and out of memory in the client, making ajax requests etc

micha03:12:18

it's like a virus infecting all your abstractions

micha03:12:26

because you can't keep a reference to anything

laforge4903:12:34

I am a systems dev, not an app dev.

laforge4903:12:46

so my approach is a tad different

laforge4903:12:23

you can keep entity names, attribute names, times.

micha03:12:33

in the client?

laforge4903:12:58

yes. some of them would be constants for a given app.

laforge4903:12:36

I did access control. So users can have a private part of the database for managing their apps.

laforge4903:12:10

And it would all exist virtually in the mobile. 😄

micha03:12:12

or an application on the server that prunes the tree, making a view basically

laforge4903:12:24

why?

micha03:12:24

like a proxy between the client and db

laforge4903:12:38

Put the app logic in the mobile, except for updates!

micha03:12:46

interesting

micha03:12:14

this is the cqrs approach

micha03:12:21

which is A+

laforge4903:12:31

And even updates on the server would be generic and access controlled. So it is really all on the mobile. the logic anyway. most of the logic is data that can be moved anywhere.

micha04:12:06

have you seen castra by the way?

laforge4904:12:20

Needless to say, I need to build a team to do all this. I am just happy that clojure is denser than java.

laforge4904:12:46

I updated castra chat today and put it in my github repository.

micha04:12:53

oh right, derp

laforge4904:12:02

--rebuilt it on top of the new templates.

laforge4904:12:20

I dug into the default app. Tomorrow I do chat.

micha04:12:41

so is the aatree interface very different than that?

micha04:12:54

i mean just the interface between server and client

micha04:12:17

i.e. you have opaque functions you can call to update the state

laforge4904:12:31

aatree is missing a layer--the entities/attributes/values and indexes are all built on top.

micha04:12:32

and a repository of state that may change from time to time

laforge4904:12:51

But at least what I have is persistent.

laforge4904:12:05

yeah I have that much

laforge4904:12:17

but I still need to add logging among other things.

micha04:12:17

i mean you know, in your vision

laforge4904:12:28

always been there, yes

micha04:12:13

i will clone the repo and see how far i can get with the tests

micha04:12:59

i would enjoy learning about how databases work

laforge4904:12:13

With hoplon methinks I need to add subscriptions to notifications soon. To allow clients to maintain a cache. 😄

micha04:12:45

here is a thing i ran into in the app i'm working on now

micha04:12:51

it's a real puzzle

micha04:12:10

at adzerk we have a pretty good api, and people can use it to create a lot of a thing

micha04:12:28

like if someone wants to make 100k ads that's ok with us

micha04:12:44

so the ui has to deal with that

micha04:12:49

which is a really interesting problem

micha04:12:02

consider a simple dropdown menu

laforge4904:12:10

so you had to introduce some abstractions, hmm?

micha04:12:26

we have these things called channels, and those have many priorities

laforge4904:12:58

I've always had forward and reverse pagination for all the query results in my ui.
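
Forward and reverse pagination over sorted query results can be sketched with a cursor key rather than an offset. This is only an illustration over an in-memory sorted list; `page_forward`, `page_backward`, and the sample keys are hypothetical and say nothing about how laforge49's UI does it.

```python
import bisect

def page_forward(sorted_keys, after, size):
    """Next `size` keys strictly after the cursor `after` (None = start)."""
    start = 0 if after is None else bisect.bisect_right(sorted_keys, after)
    return sorted_keys[start:start + size]

def page_backward(sorted_keys, before, size):
    """Previous `size` keys strictly before the cursor `before` (None = end)."""
    end = len(sorted_keys) if before is None else bisect.bisect_left(sorted_keys, before)
    return sorted_keys[max(0, end - size):end]

keys = ["ad-01", "ad-02", "ad-03", "ad-04", "ad-05", "ad-06", "ad-07"]
first = page_forward(keys, None, 3)          # ad-01 .. ad-03
second = page_forward(keys, first[-1], 3)    # ad-04 .. ad-06
back = page_backward(keys, second[0], 3)     # ad-01 .. ad-03 again
```

The cursor is the last (or first) key of the page just shown, so each page costs a binary search plus one short slice regardless of how many results exist.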

micha04:12:11

yeah but you can't paginate a tree

micha04:12:20

i mean not in a nice way

laforge4904:12:00

you have trees that get large?

micha04:12:05

huge yes

micha04:12:16

our solution was to split it into two parts

micha04:12:29

the channel and the priority as separate typeahead widgets

micha04:12:37

but one is dependent on the other

laforge4904:12:48

so something of a matrix view

micha04:12:57

well it's a tree

laforge4904:12:59

or matrix selection actually

micha04:12:05

but you can't keep the tree in memory

laforge4904:12:11

a partitioned tree?

micha04:12:19

so if you want to validate that the priority really belongs to the channel, you can't

micha04:12:51

i mean you can, but you need to make a call to the backend to fetch the part of the tree where it lives

laforge4904:12:59

sounds like you really want vm

vm?

virtual memory

oh right

yes

or virtual tree structures
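
A rough sketch of what a "virtual tree" could look like from the client's side: children are fetched from the backend only when first needed and held in a small least-recently-used cache, so checking that a priority belongs to a channel never requires the whole tree in memory. The `VirtualTree` class, the `BACKEND` dict standing in for a server call, and the node names are all assumptions for this example; this is not the aatree or yearling API.

```python
from collections import OrderedDict

# Stand-in for the backend; a real client would make a network call here.
BACKEND = {
    "chan/1": ["prio/10", "prio/11"],
    "chan/2": ["prio/20"],
}

def fetch_children(node):
    return list(BACKEND.get(node, []))

class VirtualTree:
    def __init__(self, fetch, capacity):
        self.fetch = fetch
        self.capacity = capacity
        self.cache = OrderedDict()  # node -> children, oldest first
        self.fetches = 0

    def children(self, node):
        if node in self.cache:
            self.cache.move_to_end(node)    # mark as recently used
            return self.cache[node]
        self.fetches += 1
        kids = self.fetch(node)
        self.cache[node] = kids
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return kids

    def belongs_to(self, priority, channel):
        """Validate a parent/child pair, loading only that part of the tree."""
        return priority in self.children(channel)

tree = VirtualTree(fetch_children, capacity=1)
```

With a capacity of 1, repeated checks against the same channel cost one fetch, and switching channels evicts the old subtree, which mirrors the "small working set" argument made earlier in the conversation.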

laforge4904:12:20

😄

micha04:12:24

it seems like aatree aims to provide this?

laforge4904:12:35

already does in yearling

laforge4904:12:48

it just needs a whole lot of love

laforge4904:12:23

but as it stands, it ONLY supports trees!

laforge4904:12:37

graphs come in the next layer

micha04:12:38

haha that's all we need in lispland

laforge4904:12:50

well enjoy then

laforge4904:12:06

If you like it, I can provide support.

laforge4904:12:23

--I am not expensive when it comes to working on my own stuff!!!

laforge4904:12:35

Or I can just help you learn it.

laforge4904:12:42

--no charge

laforge4904:12:58

I am full time on this

micha04:12:26

i would enjoy helping out wherever i can

laforge4904:12:34

cool

micha04:12:45

maybe i could help with the hoplon stuff

laforge4904:12:55

fer sure!

micha04:12:40

is it a goal of the project to eventually support many peers?

micha04:12:53

and large datasets?

micha04:12:00

i mean reasonably large

micha04:12:30

i guess i meant to ask about the intended use

laforge4904:12:30

I've added you as a collaborator

laforge4904:12:52

it depends on the interests of the team we need to put together.

laforge4904:12:09

I want to restrict initial use to 1TB on a single host.

micha04:12:36

cool, that's reasonably large

laforge4904:12:35

Once we have a reasonable team, we can fork out in different directions. With developers contributing to multiple forks. One size, after all, does not fit all. I think that is necessary to achieve real scale.

laforge4904:12:38

My basic design for the dbs is composites of bags. keeps things flexible. A VERY open architecture!

micha04:12:40

a lot of documentation here

laforge4904:12:04

I hate docstrings. Of all the code, docstrings are the most difficult to refactor.

laforge4904:12:30

There are at least tools for updating variable names!

micha04:12:55

hahah

laforge4904:12:12

Documents, on the other hand, are allowed to be a bit more abstract so they do not need to be constantly updated.

micha04:12:39

i'm going to make a thing tomorrow that will let me generate markdown api docs

micha04:12:45

for boot

laforge4904:12:54

Super!

micha04:12:03

docstrings are nice there because you can attach them to vars

laforge4904:12:28

and it is a good time. things are getting more stable

micha04:12:53

boot has a ton of functions in the api

laforge4904:12:11

I'm sure

micha04:12:17

so it seems like it's finally necessary to generate api docs

laforge4904:12:47

but don't let it get like lein--which has more issues than the next 100 most popular projects combined.

micha04:12:51

i don't like the things that generate html because while it might look fancy, it's harder to version with the code

micha04:12:19

yeah i mean boot is different than lein, because it's just a library really

micha04:12:37

a collection of functions and macros you can use to make a build program

laforge4904:12:49

with javadocs I always turned off the timestamps so versioning would be more reasonable.

micha04:12:25

yeah i was thinking i'd generate the api docs as a precommit hook

micha04:12:40

so you can go to a branch or a tag in github and see api docs for that commit

laforge4904:12:00

wild

micha04:12:01

the docs would be committed in the repo of course

micha04:12:37

i guess then you could even git diff the api docs and see a changelog
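
The idea of generating plain markdown API docs from docstrings, so they can be committed and diffed alongside the code, can be sketched as follows. Python's `inspect` module stands in for Clojure var metadata here; `markdown_api_docs` and the `demo` module are hypothetical, and this is not the tool micha went on to write for boot.

```python
import inspect
import types

def markdown_api_docs(module):
    """Render the public functions of a module as a markdown string."""
    lines = [f"# {module.__name__}", ""]
    for name, fn in sorted(vars(module).items()):
        if name.startswith("_") or not inspect.isfunction(fn):
            continue
        lines.append(f"## `{name}{inspect.signature(fn)}`")
        lines.append("")
        lines.append(inspect.getdoc(fn) or "(undocumented)")
        lines.append("")
    return "\n".join(lines)

# A tiny made-up module to document.
demo = types.ModuleType("demo")

def task(name, verbose=False):
    """Run the named build task."""

def helper():
    pass

demo.task = task
demo.helper = helper

docs = markdown_api_docs(demo)
```

Writing `docs` to a file from a pre-commit hook would keep the rendered docs in step with each commit, and because the output is sorted and carries no timestamps, `git diff` on it stays meaningful.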

micha04:12:39

heh

laforge4904:12:07

I wasted a lot of space in the past as I included a copy of the javadocs in each release.

micha04:12:27

clojure is much better for docs

micha04:12:39

with java you really need all the hyperlinks and frames and whatnot

laforge4904:12:51

yeah 😞

micha04:12:51

because you have hundreds of classes in various packages

laforge4904:12:00

tell me about it!

laforge4904:12:12

I got so tired of it all

micha04:12:21

yeah it makes me angry

micha04:12:30

like honestly angry

laforge4904:12:32

I am happy with clojure

micha04:12:43

clojure is great!

laforge4904:12:49

and excited about all the doors hoplon can open for me

micha04:12:15

i've been obsessed with making web apps for a while now

laforge4904:12:21

Using pure clojurescript would be much harder than .hl.

micha04:12:28

because it seems so simple, like too simple to even be computer science

micha04:12:42

but look at any real project, most of the work is wasted on trying to get a ui working

laforge4904:12:25

so let's close the loop and give the client in the browser access to the virtual data structures.

micha04:12:34

yeah that's the missing part

micha04:12:00

also i have been trying to design a sort of "state machine zoo"

micha04:12:22

a way to form abstractions around distributed systems problems you encounter in the client

laforge4904:12:24

I am impressed with no-virtual-dom reactive web--so light weight!

laforge4904:12:52

I think we will have a lot of fun

laforge4904:12:04

complementary strengths and all that

micha04:12:05

by expressing them as a combination of datastructures stored on some persisted storage and the state machines that mediate access and modification of them

micha04:12:28

like submitting a form in a web page

micha04:12:44

that's really part of a state machine that is manipulating some data structure

micha04:12:53

and the semantics of retries, failure modes, etc
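
To make the "form submission is really a state machine" point concrete, here is a minimal sketch with explicit states, allowed transitions, and a bounded retry policy. The state names, the events, `FormMachine`, and `submit` are all invented for illustration; they are one possible encoding, not a design from the conversation.

```python
# Allowed transitions: (current state, event) -> next state.
TRANSITIONS = {
    ("editing", "submit"): "submitting",
    ("submitting", "ok"): "saved",
    ("submitting", "error"): "retrying",
    ("retrying", "retry"): "submitting",
    ("retrying", "give_up"): "failed",
}

class FormMachine:
    def __init__(self, max_retries=2):
        self.state = "editing"
        self.retries = 0
        self.max_retries = max_retries

    def fire(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} not allowed in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state

def submit(machine, send):
    """Drive the machine; `send` returns True on success, False on failure."""
    machine.fire("submit")
    while True:
        if send():
            return machine.fire("ok")
        machine.fire("error")
        if machine.retries >= machine.max_retries:
            return machine.fire("give_up")
        machine.retries += 1
        machine.fire("retry")

# Example: the request fails twice, then succeeds.
responses = iter([False, False, True])
machine = FormMachine()
final_state = submit(machine, lambda: next(responses))
```

Keeping the transition table as data makes the failure modes visible in one place, and an event that is illegal in the current state fails loudly instead of silently corrupting the UI.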

laforge4904:12:04

I did some distributed stuff. Once you get the basics down, it is pretty simple. Just a tad subtle. Quorum logic on the bottom is key.

micha04:12:38

i have this feeling that there are a relatively small number of state machines that can describe any kind of distributed system interaction

laforge4904:12:51

And google has done some good work in this area.

micha04:12:24

what i'm specifically interested in is developing a set of abstractions that clearly define the tradeoffs involved

laforge4904:12:44

better to say that there are a relatively small number of state machines that can be used to describe any kind of distributed interaction. I.E. You only need a few.

micha04:12:19

the key thing is to clearly delineate the tradeoffs and ensure correctness within the constraints of those

micha04:12:58

it would be great if anyone could use these in a simple and easy to understand way

laforge4904:12:58

But for the foreseeable future I plan to focus on plain old client server. Too much stuff left to migrate from Java.

laforge4904:12:21

Yes. Constraints are what it is all about.

micha04:12:46

clojure is the inspiration, with the stm

laforge4904:12:46

Most developers chafe against them, but really they are enablers.