#asami
2020-11-19
quoll07:11:32

That’s Thanksgiving Day in the USA. My family would kill me!

whilo08:11:55

Oh, I am sorry, I was not aware of that. Thanksgiving is on a different day in Canada. Should we do it the week after then?

quoll14:11:20

I should be available, though later in the day would be better, since @U06MDAPTP can’t attend if it’s early (he is in California and has young children to get to school)

noprompt17:11:35

I’d say just ping me and I will join if I can.

whilo05:11:25

Ok, I can do later in the day, it is just a bad time for others in Europe. Let me ask who else wants to join, and then we can see whether we can do it later.

quoll07:11:03

Meanwhile… I’m a little disappointed at the speed, but right now I’m able to take a document, split it by spaces, and then index every resulting string:
• Document size: 725060 bytes
• 117797 strings
• Index size: 885455 bytes
On my notebook computer:
• Time to index: 21 seconds (yes, this is disappointing)
• Time to rebuild document from the index: 3 seconds
This is my first attempt. I want to tweak the index a little, to see what speed/size changes I get. The tree nodes are currently large, which I thought would be OK, but maybe not.
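
For readers following along, the round trip described above (split, index every string, rebuild from the index) can be sketched roughly as follows. This is only an illustration, not Asami's code: the real index is tree-based, whereas this sketch assumes a plain in-memory dict, and `StringPool`, `find_or_insert`, `index_document` and `rebuild_document` are hypothetical names.

```python
class StringPool:
    """Toy in-memory stand-in for a data pool (assumption: dict instead of a tree index)."""

    def __init__(self):
        self._ids = {}       # string -> ID
        self._strings = []   # ID -> string (the ID is the list position)

    def find_or_insert(self, s):
        # Return the existing ID if the string is already indexed...
        existing = self._ids.get(s)
        if existing is not None:
            return existing
        # ...otherwise store it and hand out the next ID.
        new_id = len(self._strings)
        self._ids[s] = new_id
        self._strings.append(s)
        return new_id

    def lookup(self, id_):
        return self._strings[id_]


def index_document(text, pool):
    # Splitting on a single space keeps empty strings for runs of spaces,
    # so the document can be rebuilt exactly.
    return [pool.find_or_insert(token) for token in text.split(" ")]


def rebuild_document(ids, pool):
    return " ".join(pool.lookup(i) for i in ids)


pool = StringPool()
doc = "the quick brown fox jumps over the lazy dog the end"
ids = index_document(doc, pool)
assert rebuild_document(ids, pool) == doc
```

Repeated strings such as "the" share one ID, so the pool holds each distinct string only once.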

quoll07:11:26

This is exercising the Data Pool. Now that this works correctly, I’m moving on to the Triple Store (OK, it’s a quad store. Shh)

quoll11:11:09

To explain the operation above… for every string to be inserted, the code looked it up in the index. If it was there, then the appropriate ID for the string was returned. If not, it was inserted, and a new ID was created. This was done at a rate of about 5600 times per second, or ~180μs each time.
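
The rate quoted above follows from the earlier figures (117797 strings indexed in 21 seconds). A quick check of the arithmetic, using only those numbers (variable names are just for illustration):

```python
strings = 117797          # strings indexed
index_seconds = 21        # time to index
doc_bytes = 725060        # document size
index_bytes = 885455      # index size

rate = strings / index_seconds                   # look-up-or-insert operations per second
micros_each = index_seconds / strings * 1e6      # microseconds per operation
size_ratio = index_bytes / doc_bytes             # index size relative to the document

print(f"{rate:.0f} ops/s, {micros_each:.0f} µs each, index is {size_ratio:.2f}x the document")
# → 5609 ops/s, 178 µs each, index is 1.22x the document
```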

quoll11:11:13

This is a little bit vague, because sometimes a string was converted into a number instead of being stored. I am thinking of turning that feature off to get a benchmark on the storage.
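
One way such a "number instead of a stored string" shortcut could work is sketched below. This is a guess at the general idea, not how Asami does it: the tag bit `INLINE_FLAG`, the `allow_inline` switch and the function names are all hypothetical. Only canonical non-negative integers are inlined, so decoding returns the exact original text.

```python
INLINE_FLAG = 1 << 62  # hypothetical tag bit: "the value lives in the ID itself"


def encode(token, ids, strings, allow_inline=True):
    """Return an ID for token, inlining canonical integers when allowed."""
    if (allow_inline and token.isascii() and token.isdigit()
            and str(int(token)) == token and int(token) < INLINE_FLAG):
        return INLINE_FLAG | int(token)   # nothing is written to the pool
    if token in ids:
        return ids[token]
    new_id = len(strings)                 # pool IDs are small list positions
    ids[token] = new_id
    strings.append(token)
    return new_id


def decode(id_, strings):
    if id_ & INLINE_FLAG:
        return str(id_ & ~INLINE_FLAG)    # recover the number from the ID bits
    return strings[id_]


ids, strings = {}, []
encoded = [encode(t, ids, strings) for t in ["42", "007", "fox"]]
assert [decode(i, strings) for i in encoded] == ["42", "007", "fox"]
assert strings == ["007", "fox"]          # "42" never reached the pool
```

Passing `allow_inline=False` forces every token through the pool, which is the kind of setting that would give a cleaner storage benchmark.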
