#squint
2022-08-27
lilactown01:08:35

I'm all in on lazy iterables. forget transducers 😄

😎 1
borkdude09:08:32

Now running JSX tests with babel/core :)

borkdude09:08:20

That made me wonder: I think it would be useful if we would have a babel plugin for clava perhaps

borkdude10:08:58

@lilactown I went ahead and merged your work on lazy seqs. Since I also made a few improvements, I thought it would be too complex to wait with merging. Now we can incrementally move forward. E.g.: partition and partition-all should be made lazy now

Prabhjot Singh11:08:07

Hi @U04V15CAJ I can work on this tomorrow if no one has picked it up yet.

borkdude11:08:11

Since all tests still worked, I thought it would be ok to merge and go forward

borkdude12:08:52

I guess we should document the caveat that a lazy iterator, when used multiple times, is re-calculated (similar to eduction):

(def seq1 (map (fn [i] (prn i) i) [1 2 3]))

(prn seq1)

(def seq2 (map inc seq1))

(prn seq2)

1
core.js:287 2
core.js:287 3
core.js:287 [1,2,3]
core.js:287 1
core.js:287 2
core.js:287 3
core.js:287 [2,3,4]
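
The re-calculation caveat can be reproduced in plain JavaScript. This is a minimal sketch, assuming the lazy wrapper stores a generator function and starts a fresh generator on every iteration. The `LazyIterable` class and `map` below are simplified stand-ins written for illustration, not the library's actual code.

```javascript
// Hypothetical stand-in for the lazy wrapper: it keeps a generator function
// and starts a new generator each time it is iterated.
class LazyIterable {
  constructor(gen) {
    this.gen = gen;
  }
  [Symbol.iterator]() {
    return this.gen();
  }
}

// Lazy map: f is not called until the result is iterated.
function map(f, coll) {
  return new LazyIterable(function* () {
    for (const x of coll) yield f(x);
  });
}

// Record every call to the mapping function instead of printing.
const calls = [];
const seq1 = map((i) => { calls.push(i); return i; }, [1, 2, 3]);

const firstPass = [...seq1];                     // realizes seq1 once
const secondPass = [...map((i) => i + 1, seq1)]; // realizes seq1 again

console.log(firstPass, secondPass, calls);
```

Here `calls` ends up holding six entries, because the mapping function runs once per element on each pass. Copying into an array once (the equivalent of `vec`) and reusing that array avoids the second pass.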

lilactown14:08:35

since we don't cache them, their underlying (mutable) structure can change and it will reflect those changes

lilactown14:08:41

we could add caching

borkdude14:08:13

If we would add caching it would behave less surprisingly for CLJS users maybe. But I don't know what would be the cost of adding that

borkdude14:08:24

Could add an array which contains the realized elements or so?

borkdude14:08:54

And grow that array when a non realized element is requested

borkdude14:08:00

But that would result in GC problems, because even when you don't use the front of the seq anymore you would hold on to the entire array, which is bad

borkdude14:08:22

Unless you splice the array etc. Not sure how performant this is compared to a linked list

borkdude14:08:44

And before you know it we're re-implementing all of CLJS (which might not be that bad if you get back good treeshaking and improved interop)

borkdude14:08:03

But just leaving this as it is now would also be fine maybe

borkdude14:08:13

With the current implementation you can decide where to cache yourself by using vec

lilactown15:08:01

rest also caches, since destructuring returns an array

lilactown15:08:33

let [_, ...rest] = iterable(coll)

borkdude15:08:19

@lilactown That seems like a bug to me (with the recent changes):

$ ./node_cli.js -e '(first (rest (range)))'
hangs
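
The hang is consistent with a destructuring-based `rest`: a rest element in array destructuring drains the whole iterator before returning. The sketch below uses a finite counting generator so the difference can be measured. `countingRange`, `eagerRest`, `lazyRest` and `first` are illustrative names, not the library's functions.

```javascript
// Counts how many elements consumers pull from the source.
let pulled = 0;
function* countingRange(n) {
  for (let i = 0; i < n; i++) {
    pulled++;
    yield i;
  }
}

// Eager rest: the rest element copies everything that remains into an array.
function eagerRest(coll) {
  const [_, ...others] = coll;
  return others;
}

// Lazy rest: skip the first element, hand out the others on demand.
function* lazyRest(coll) {
  let skipped = false;
  for (const x of coll) {
    if (skipped) yield x;
    else skipped = true;
  }
}

// Returns the first element and stops iterating.
function first(coll) {
  for (const x of coll) return x;
}

pulled = 0;
const eagerResult = first(eagerRest(countingRange(1000)));
const eagerPulled = pulled;

pulled = 0;
const lazyResult = first(lazyRest(countingRange(1000)));
const lazyPulled = pulled;

console.log({ eagerResult, eagerPulled, lazyResult, lazyPulled });
```

Both calls return 1, but the eager version pulls all 1000 elements while the lazy one pulls 2. With an unbounded source the eager version never finishes.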

didibus15:08:47

If you cache, I believe you need locals clearing to make it work, no?

borkdude16:08:35

does CLJS have locals clearing?

didibus20:08:34

Seems it's rare that in-browser people process large data, so holding on to the head and consuming more memory than needed might not be a big problem

Chris McCormick12:08:24

> Seems it's rare that in-browser people process large data
When procedurally generating audio I frequently create gigantic arrays of data. Just one use-case to be aware of. Another place is crypto stuff.

borkdude12:08:27

I think when dealing with these gigantic mutable arrays you would usually not use them in combination with lazy seqs anyway

Chris McCormick13:08:28

Yes probably true I don't think I've done that.

Chris McCormick13:08:07

The smart thing to do is probably create a fixed Float32Array or similar and write into it, but I think I have used vec before too.

borkdude13:08:43

yes, (update-in! matrix [0 0 0 1] inc) :)

Chris McCormick13:08:31

Wow that's going to take me a few days to parse. Will get back to you!

Chris McCormick13:08:49

Presumably another borkdude tip that flips my coding practice on its head!

Chris McCormick13:08:35

Ok wait I thought matrix was a function here. It's getting late. 😅

Chris McCormick13:08:03

I should go to sleep ha ha. 👋

borkdude13:08:39

hehe. This does work right now with clavascript:

$ ./node_cli.js -e '(prn (update-in [[0 0] [0 0]] [0 0] inc))'
[[1,0],[0,0]]
but it creates new arrays, which isn't that performant. This is why we also should have update-in!
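
The difference between the copying and the in-place variant can be sketched in JavaScript. Both helpers are hypothetical: `updateIn` and `updateInMut` are names chosen for this illustration, and they only handle nested arrays with a non-empty path.

```javascript
// Copying variant: allocates a new array for every level along the path.
function updateIn(coll, path, f) {
  if (path.length === 0) return f(coll);
  const [k, ...ks] = path;
  const copy = [...coll];
  copy[k] = updateIn(coll[k], ks, f);
  return copy;
}

// In-place variant: walks to the parent of the target and mutates it.
function updateInMut(coll, path, f) {
  if (path.length === 0) throw new Error("path must not be empty");
  let target = coll;
  for (let i = 0; i < path.length - 1; i++) target = target[path[i]];
  const last = path[path.length - 1];
  target[last] = f(target[last]);
  return coll;
}

const inc = (x) => x + 1;
const matrix = [[0, 0], [0, 0]];

const copied = updateIn(matrix, [0, 0], inc);     // matrix is left untouched
const sameRef = updateInMut(matrix, [0, 0], inc); // matrix itself is changed

console.log(copied, matrix, sameRef === matrix);
```

The copying variant allocates one new array per level of the path, which is where the cost comes from on large nested data. The in-place variant allocates nothing but gives up value semantics.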

Chris McCormick13:08:27

Very nice. I'm amazed by how fast you are all moving on this.

borkdude15:08:52

maybe rest should be something like:

function* rest(x) { let iter = iterator(x); iter.next(); yield* iter; }
but this has the problem that the first element will always be held on to - or not?

lilactown15:08:21

that has the problem that it would return an iterator, which would be mutable
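
The concern about returning an iterator can be shown with a small sketch: a generator object is consumed by its first traversal, so a second traversal sees nothing. `restOnce` mirrors the generator proposed above, using `x[Symbol.iterator]()` in place of the library's `iterator` helper (an assumption made here). `restRepeatable` is a hypothetical wrapper added for contrast.

```javascript
// Generator version: each call returns a one-shot generator object.
function* restOnce(x) {
  const iter = x[Symbol.iterator]();
  iter.next(); // drop the first element
  yield* iter;
}

const once = restOnce([1, 2, 3]);
const onceFirstPass = [...once];  // consumes the generator
const onceSecondPass = [...once]; // the generator is already exhausted

// Wrapping the generator function in an object with [Symbol.iterator]
// gives a value that starts over on every traversal.
function restRepeatable(x) {
  return {
    [Symbol.iterator]: () => restOnce(x),
  };
}

const repeatable = restRepeatable([1, 2, 3]);
const repFirstPass = [...repeatable];
const repSecondPass = [...repeatable];

console.log(onceFirstPass, onceSecondPass, repFirstPass, repSecondPass);
```

The second pass over `once` yields an empty array, while `repeatable` yields the same elements both times, at the price of redoing the work on each traversal.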

borkdude15:08:05

well, you have a solution for that, I guess. the thing I was wondering about is how to "let go" of the first

lilactown15:08:35

sorry, not trying to nit pick

lilactown15:08:42

export function rest(coll) {
  return new LazyIterable(function* () {
    let first = true;
    for (const x of iterable(coll)) {
      if (first) first = false;
      else yield x;
    }
  });
}
I think this might work

lilactown15:08:27

it's basically the same as dropping the first

borkdude15:08:46

hmm nice. how can we verify that GC works as it should?

lilactown15:08:42

there's probably ways to inspect GC in chrome

lilactown15:08:13

I think that we should look at LazyIterable as a whole. this would I imagine have the same behavior as any other iteration

borkdude16:08:48

@lilactown I've done this test in node:

// run with: node --expose-gc mem.js

import { rest, first } from './core.js';

var x = new Array(10000000).fill(0); // about 85mb

global.gc();

console.log(process.memoryUsage());

var y = [x,1];

y = rest(y);

console.log(first(y));

x = null;

global.gc();

console.log(process.memoryUsage());
Unfortunately, the memory of the big array doesn't seem to be freed

borkdude17:08:39

But it isn't freed in cherry either, so I could be doing something wrong here

borkdude17:08:59

it does work as expected in cherry when I use a cons so it might be the wrapper array not being garbage collected:

var y = cons(new Array(10000000).fill(0), cons(1, null));

borkdude21:08:39

I played with this in cherry: to use immutable.js instead of the clojure data structures to get the same value semantics, while having a smaller bundle (65kb or so vs 300kb). I didn't go through with that since immutable.js probably has some differences to CLJS, but for something like clavascript this difference might be ok (since we don't promise compatibility). If we would adopt that as the standard thing, we could get back a lot of things that we had in CLJS (immutability, value semantics). The major thing that bothers me about that is interop: you can't just pass an immutable map to another API, but you first have to convert it with toJS etc

borkdude21:08:46

The immutableJS lib has a (lazy) seq data structure that we might want to borrow, even if maps will still be mutable in clava. (See GC issue above)

borkdude22:08:13

I tried it and the best you can get is all 60kb. Maybe it's best to go back to the state we were in before and base stuff on concrete arrays, in order to stay as close to JS as possible, without surprises (GC/recalculations when you use a result multiple times, etc) and without a complex/laborious + big stdlib. I bet most of these lodash-like libs do this. Thoughts, @lilactown?

borkdude22:08:31

iterable in -> array out. just need to document why we went back to this approach, if we do, so we remember why we did so

borkdude22:08:44

that, or we could go back to the "map returns iterable you can only use once" approach

lilactown22:08:17

I've used ImmutableJS in apps and hated it exactly for the whole toJS fromJS dance

lilactown22:08:41

I think we should try adding a cache to the LazyIterable

lilactown22:08:51

the only "surprising" (?) thing there is if you mutate the underlying collection, e.g.

let a = [1, 2, 3]

let b = rest(a);

prn(vec(b));

a.push(4)

prn(vec(b));
will print [2, 3] both times. whereas if you put the .push before the first realization of it, they would both print [2, 3, 4]
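
A cached variant along these lines might look like the sketch below: an array of realized elements that grows only when an element that has not been realized yet is requested. `CachedLazyIterable` is a hypothetical name and this is not the library's implementation. It also keeps every realized element alive, which is the GC concern raised earlier in the thread.

```javascript
// Hypothetical cached lazy wrapper.
class CachedLazyIterable {
  constructor(gen) {
    this.gen = gen;    // generator function producing the elements
    this.iter = null;  // underlying generator, started on first use
    this.cache = [];   // elements realized so far
    this.done = false; // true once the underlying generator is exhausted
  }
  *[Symbol.iterator]() {
    let i = 0;
    while (true) {
      if (i < this.cache.length) {
        yield this.cache[i++]; // serve from the cache
      } else if (this.done) {
        return;
      } else {
        // Realize one more element and append it to the cache.
        if (this.iter === null) this.iter = this.gen();
        const step = this.iter.next();
        if (step.done) this.done = true;
        else this.cache.push(step.value);
      }
    }
  }
}

function rest(coll) {
  return new CachedLazyIterable(function* () {
    let first = true;
    for (const x of coll) {
      if (first) first = false;
      else yield x;
    }
  });
}

// Mutation after the first realization is not observed.
const a = [1, 2, 3];
const b = rest(a);
const before = [...b];
a.push(4);
const after = [...b];

// Mutation before the first realization is observed, and then cached.
const a2 = [1, 2, 3];
const b2 = rest(a2);
a2.push(4);
const early1 = [...b2];
const early2 = [...b2];

console.log(before, after, early1, early2);
```

In this sketch `before` and `after` both contain 2 and 3, while `early1` and `early2` both contain 2, 3 and 4, matching the behavior described above.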

didibus23:08:34

Maybe it would be easier to start with transducers, no? Iterables are closer to transducers, it seems. With transducers it's all just adding up the set of transforms, and when ready the iterable is passed to the chain and collected into a concrete collection of the user's choosing.
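
The transducer approach can be sketched in a few lines of JavaScript: transforms are composed up front, then an iterable is run through the chain once and collected into a collection the caller picks. All names here (`mapping`, `filtering`, `comp`, `transduce`, `conj`) are illustrative, and the sketch leaves out early termination (`reduced`) and completion steps.

```javascript
// A transducer takes a reducing function and returns a new reducing function.
const mapping = (f) => (rf) => (acc, x) => rf(acc, f(x));
const filtering = (pred) => (rf) => (acc, x) => (pred(x) ? rf(acc, x) : acc);

// Compose so that the first transform listed sees each element first.
const comp = (...xforms) => (rf) =>
  xforms.reduceRight((acc, xf) => xf(acc), rf);

// Run any iterable through the composed transform, collecting into `init`.
function transduce(xform, rf, init, coll) {
  const step = xform(rf);
  let acc = init;
  for (const x of coll) acc = step(acc, x);
  return acc;
}

// Reducing function that collects into an array.
const conj = (arr, x) => {
  arr.push(x);
  return arr;
};

// Increment every element, then keep only the even results.
const xf = comp(
  mapping((x) => x + 1),
  filtering((x) => x % 2 === 0)
);

const result = transduce(xf, conj, [], [1, 2, 3, 4]);
console.log(result);
```

Since the source is traversed exactly once and the output is a concrete array, the re-calculation and caching questions do not come up with this design. The trade-off is that nothing is produced lazily.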