#asami
2020-11-17
quoll16:11:21

The plan at this point is to implement the block abstraction over various types of storage. Mapped block files are specifically for a single-machine setup on the JVM. IndexedDB is how we’re doing a single-machine setup in JavaScript.

quoll16:11:34

My hope is that if we implement blocks on a system that does distribution, then scaling out gets managed for us.

borkdude16:11:04

I'm not deeply familiar with it, just wondered, as this is used by datahike

quoll16:11:25

Nope 🙂 But I can

quoll16:11:06

The design was informed by changes I was trying to make in Mulgara, along with some thinking about what Datomic does with provisioning

quoll16:11:13

One issue is that querying has been written synchronously. That’s making working with things like IndexedDB awkward. I see that konserve is doing everything asynchronously

mpenet17:11:40

Yes datahike is full async by default.

mpenet17:11:02

If I recall correctly they have flags to make the runtime sync/async (at hh-tree level at least), then use macros to do error handling & co since they cannot rely on just one way of doing things

mpenet17:11:28

Add cljs support into the mix and that can get quite confusing (but necessary)

mpenet17:11:22

Doing io with some async facade is a good approach in this context tho

whilo19:11:51

Hey @quoll, @borkdude and @mpenet. This is a good point to introduce myself. I am Christian, one of the architects of Datahike. We, the team at lambdaforge, recently had a look at your storage layout whitepaper, and your work looks very good. We have some questions about your design choices and would love to compare and combine your AVL ideas with the hitchhiker-tree concepts to mutually improve our storage layers. Regarding konserve, we designed it to be exactly a minimum viable portable abstraction for asynchronous storage. Since you map your AVL tree directly to binary layouts, konserve's edn serialization facilities might not be needed, but other than that it should be usable and is a potential point of collaboration. As @mpenet has pointed out, to port our stack we have introduced a restricted, core.async-based, monadic, async/await-like DSL that we can compile away on the JVM, because synchronous IO turns out to be much faster there. Based on this abstraction @grounded_sage has had a running prototype of Datahike in cljs since yesterday, and we are very interested in your thoughts and work on asynchronous programming in cljs.

whilo19:11:05

Since we seem to have very similar long-term goals in terms of reach and functionality and our core objective is not the promotion of the current implementation details of Datahike, we are in fact open to collaborate on any level you see fit including a potential joint project and shared funding.

👍 12
quoll19:11:38

I haven’t been on the CLJS work recently. That’s being worked on by @noprompt. I’m trying to wrap up the JVM code so that I can join him

quoll19:11:11

Well, in my case, a lot of it grew out of frustration with the Mulgara codebase (since it is all Java), and wanting to make some changes to the design. I have a strong bias to basing this work on the Mulgara architecture, since that implementation has a record of being very fast. In particular, the AVL trees turned out to be a very effective choice

quoll19:11:18

I saw David Greenberg’s talk on hitchhiker trees at Strangeloop, and I’ve wanted to try them out. I haven’t yet though

whilo19:11:47

That makes sense. @mpenet has helped to improve the performance of the hitchhiker-tree quite a bit btw.

quoll19:11:21

Right now, my opportunity to work on these has been fortuitous for me. I was doing it in my evenings and weekends. But then I mentioned it at work, and my manager, and eventually HIS manager thought it sounded like a great idea, and told me to work on it

whilo19:11:36

Since we already also have our implementation we can compare it with the AVL tree.

👍 3
quoll19:11:24

It stores the node balance in the top two bits of the “left” pointer. I’m thinking I should probably change this to the topmost bit of both the left and the right. That’s because these numbers will never be negative, so there’s no issue with using those bits, and it makes it easier to switch between 64 bits and 32 bits if we want to
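
A minimal sketch of the "balance in the top two bits" idea, for readers following along. This is not Asami's actual encoding: the names `pack-left`, `left-pointer`, and `node-balance` are hypothetical, and it assumes a 64-bit JVM long whose pointer value is non-negative and fits in the low 62 bits.

```clojure
;; Hypothetical layout: bits 63-62 hold the AVL balance, bits 61-0 hold the pointer.
(def ^:const balance-shift 62)
(def ^:const pointer-mask 0x3FFFFFFFFFFFFFFF)

(defn pack-left
  "Packs a non-negative pointer and a balance in #{-1 0 1} into one long."
  [pointer bal]
  (bit-or (bit-and pointer pointer-mask)
          (bit-shift-left (bit-and (long bal) 3) balance-shift)))

(defn left-pointer
  "Recovers the pointer by masking off the two balance bits."
  [packed]
  (bit-and packed pointer-mask))

(defn node-balance
  "Recovers the balance. The arithmetic shift sign-extends,
  so the bit pattern 11 reads back as -1."
  [packed]
  (bit-shift-right packed balance-shift))
```

This sketch is JVM-only, since ClojureScript bit operations work on 32-bit integers. Splitting the balance into one bit on each of the left and right pointers, as suggested above, would use the same masking approach with a one-bit field per pointer.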

whilo19:11:05

Does this mean you are constrained by your management in what Asami can become as a project?

quoll19:11:46

Sort of. It’s a weird situation 🙂

quoll19:11:12

The story is that I was building a rules engine, and showed my manager. He loved it and told me to work on it during working hours. I said that it was Open Source, and he was happy to do that. That became Naga. He also said that he wasn’t interested in using a commercial database (I was planning on Datomic as my first back-end). “Can you build your own?” So I built a minimal in-memory store. Then he asked for more features. One of them was to port to ClojureScript. Then more features. And more. Eventually, I decided to pull it out and turn the storage into its own project. That is Asami. Other members of the team use Asami, and I try to be as responsive as possible to them. Working with them it became clear that they were more familiar with the Datomic API, and so I wrapped the Graph/query API in a new namespace that started to present something similar to what Datomic looks like. Most of this is being driven by what I want to do next. But my primary focus is always to make it useful for my team.

whilo19:11:24

Ok, that sounds reasonable to me.

quoll19:11:30

Right now, the primary focus is durability. As soon as I can get a release out for that, I’ll be moving back to the public API, for functions like with, and also to clear the bug/feature backlog

whilo19:11:24

Would you be interested then in comparing the durability bits and see whether we can help each other there?

whilo19:11:30

Which format of discussion would be most appropriate in your opinion? We are currently doing a lot of shared programming/discussion sessions and we could do one of those together, for example.

quoll19:11:58

It will depend on timing 🙂 I’m on the East coast of the USA (near Washington DC) (UTC-5)

noprompt19:11:43

West Coast PST

whilo19:11:33

I am in Vancouver, BC, (PST) and the rest of the team is in Europe (CET). So mornings in PST work well at the moment, e.g. 8 or 9 am.

whilo19:11:26

@noprompt Would this work for you?

noprompt19:11:39

Unfortunately, no. I have 3 children I’m responsible for at those times. 🙂

noprompt20:11:27

However, I am content to communicate asynchronously here and elsewhere. Paula and I are also on the same team.

grischoun20:11:29

Hi. I am Chrislain and also a member of the lambdaforge team. I’ll be happy to join the call.

noprompt20:11:01

Will konserve stick to a hard dependency on core.async?

whilo21:11:34

The interface can easily be made callback-based (which is general), also with core.async. If you do not want the internals to use core.async then we would need to rewrite everything in a CPS/callback style. Which programming model would you prefer?

noprompt21:11:16

Personally, I prefer CPS/callback style and in particular the pattern of using promise style [resolve reject] as it is trivial to implement combinators, etc. while minimizing logic common when using “handlers” i.e.

(fn [error value] (if error ,,,))
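
To illustrate the [resolve reject] style, here is a small sketch that assumes an async operation is simply a function taking a `resolve` and a `reject` callback. The names `fetch`, `then`, and `chain` are hypothetical and are not part of konserve or any other library mentioned here.

```clojure
(defn fetch
  "Hypothetical async read over an in-memory map.
  Returns an operation: a function of [resolve reject]."
  [store k]
  (fn [resolve reject]
    (if (contains? store k)
      (resolve (get store k))
      (reject (ex-info "missing key" {:key k})))))

(defn then
  "Combinator: applies f to the result of op. Errors pass straight through."
  [op f]
  (fn [resolve reject]
    (op (fn [v] (resolve (f v))) reject)))

(defn chain
  "Combinator: feeds the result of op into f, which returns another operation."
  [op f]
  (fn [resolve reject]
    (op (fn [v] ((f v) resolve reject)) reject)))
```

With this shape, `((then (fetch {:a 1} :a) inc) println println)` passes the incremented value to the first callback, and neither combinator needs an `(if error ,,,)` branch of its own.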

whilo21:11:34

I see. Have you had bad experiences with core.async?

noprompt22:11:46

Yes, however, those experiences are few.

noprompt22:11:04

My perspective is that of a consumer. It is occasionally not desirable, as a consumer, to take on core.async as a dependency.

borkdude22:11:35

Fully agree!

borkdude22:11:13

Build the tooling in a low level way that optionally people can build core.async on top of that. Callbacks are fine for this.

noprompt22:11:44

Also, CPS/Callbacks merely rely on functions and assume little else which is very inclusive.

👍 3
borkdude22:11:44

I made the "mistake" to couple babashka.pods async functions to core.async, but I changed my mind and switched to callbacks. I don't want to force core.async on consumers.

👍 3
borkdude22:11:40

Core.async is quite heavy, you're pulling in tools.analyzer, etc. And who knows, a few years from now there's going to be another Clojure async thing. Callbacks will still be there.

borkdude22:11:59

Maybe project loom will bring interesting things in this regard

mpenet22:11:52

Loom is JVM only. I quite like core.async personally, but for konserve callbacks might be good enough. Callbacks are ok as long as you're not doing a lot of composition with async values; when you are, things become hairy fast imho.

mpenet22:11:25

That said, konserve just needs some form of Promise; it could be CompletableFuture on the JVM and js/Promise on cljs. Or just callbacks
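
As a rough sketch of how a callback interface can be lifted into a promise type on the JVM: the helper `op->future` below is hypothetical, and it assumes an operation is a function of `resolve` and `reject` callbacks. `java.util.concurrent.CompletableFuture` and its `complete` and `completeExceptionally` methods are standard JDK API.

```clojure
(import 'java.util.concurrent.CompletableFuture)

(defn op->future
  "Runs a callback-style operation (a fn of [resolve reject]) and
  returns a CompletableFuture that completes with its outcome."
  [op]
  (let [fut (CompletableFuture.)]
    (op (fn [v] (.complete fut v))
        ;; reject must be called with a Throwable, e.g. from ex-info
        (fn [e] (.completeExceptionally fut e)))
    fut))
```

For example, `(.get (op->future (fn [resolve _] (resolve 42))))` returns the resolved value. On cljs the same wrapping could be done with the `js/Promise` constructor, which already takes a function of resolve and reject.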

borkdude22:11:34

I quite like core.async btw, that's not the point

noprompt22:11:46

Speaking from experience, I was confronted with a decision earlier this year to base a workflow interpreter we use on top of promises but ultimately decided that functions accepting resolve and reject callbacks gave us the most cross-platform flexibility because, again, just functions.

noprompt22:11:17

Yeah, that’s my feeling too @borkdude. It’s a great dependency for an app but not necessarily for a library unless, say, it was specifically meant to augment/enhance it.

mpenet22:11:14

@borkdude not sure what you mean, I think we're all saying essentially the same thing

mpenet22:11:06

I think on hh-tree the discussion about this same issue was that we actually needed better ways to compose async values, but that's not needed at konserve level, hh-tree/datahike could turn whatever konserve returns into what it wants.

whilo05:11:50

Let me first provide some background on my position before replying to async for konserve. I agree that callbacks are more portable and can be made composable (continuations are a universal framework of computation), but they still require a lot of things to be defined to have similar semantics between libraries and runtimes, i.e. error handling, the way the event loop schedules them (on the JVM a threadpool integration) and how synchronisation happens atomically between two different processes, which is still implicitly the responsibility of the library maintainer porting code. I think the relationship of the Clojure community to core.async is a bit contradictory in this respect, because these things are language-level abstractions that need to be standardized, or you always need to manually glue the interfaces together and make sure that the different forms of parallelism between libraries do not deadlock or screw each other over. In that sense I think Clojure should have got first-class async semantics in its compiler instead of bailing out to CPS as the general interface. Even the language of "lean" callbacks, JS, has now done that with async and await. The biggest annoyance with callbacks for me is still the inversion of control flow that is forced on me as a programmer (few programmers like to program directly in CPS style) and, most importantly, the forced sequential nature of a monadic promise-based approach. With core.async I can write down nested compositions of functions doing aggregated immutable reads concisely, which I did a lot for replikativ and kept the codebase very concise and easy to verify that way, e.g. for fetching tree fragments over the wire or mutating durable storage in a critical section, while the callbacks would have cluttered a lot of my code. In that sense I am not happy with having to use callbacks as a library developer.
I know this suggestion has been made by the Clojure core team, and I think it is debatable whether this is a good approach, because applications tend to factor out and become libraries. I think it is just evading the problem from their side, to be honest, and I initially thought 7 years ago, when it came out, that core.async would have been more deeply integrated by now.

whilo05:11:51

Having said all this, konserve was initially implemented deliberately with callbacks. By now the control flow in the filestore has become a bit more complicated with the use of the asynchronous NIO API, so we decided to use core.async internally to make the error handling easier and to have more understandable sequential code. But if you would like to not have the core.async dependency in general, we can definitely remove it from the interface and either factor out the filestore with the dependency on it or maybe @noprompt can help me to port it to promises, if this makes sense.

whilo19:11:21

@noprompt, @U8KKDKPG8 Besides removing the core.async dependency we might need to add additional protocols to facilitate low-level block based access to konserve.

alekcz19:11:39

Removing core.async shouldn't be an issue. The code is structured well enough for it to be done painlessly. I think it'd be worthwhile, though it would be good to provide an optional namespace that offers that kind of interface for current users.

alekcz19:11:16

@whilo @noprompt I'd need some guidance with the block based storage bit. I'm not yet familiar with how asami's storage works.

whilo19:11:00

Yes, we should discuss this together, I think.

grounded_sage21:11:02

I’m happy to join the call as well.

grounded_sage21:11:49

@noprompt as far as I am aware we are open to async alternatives. Our main focus has been keeping the codebase cross-platform with as few divergences as possible and good error handling. But I would defer to @whilo for a more detailed answer to your question.

👍 3