Fork me on GitHub
#parinfer
<
2016-02-23
>
cfleming00:02:46

Actually, that case is not bad because the paren trail doesn’t get extended over the unmatched ]. But I suspect a case like (foo (bar)]]) would cause problems. I have all the indent mode tests working now so there are no cases like that, I’ll try to write some around the edge cases.

cfleming02:02:32

@shaunlebron: I have all the test cases passing with the delta algorithm now.

cfleming02:02:42

I’ll tidy it up and send a PR.

cfleming03:02:45

This is parinfer master on my machine:

~/d/p/lib (master)> node test/perf.js
Processing long_map_with_strings : 303 lines, 4380 chars
indent: 10.331ms
paren: 4.725ms

Processing really_long_file : 2865 lines, 112431 chars
indent: 15.639ms
paren: 21.972ms

Processing really_long_file_with_unclosed_paren : 2865 lines, 112432 chars
indent: 11.439ms
paren: 9.827ms

Processing really_long_file_with_unclosed_quote : 2865 lines, 112433 chars
indent: 8.420ms
paren: 8.317ms

cfleming03:02:16

This is the delta algorithm:

~/d/p/lib (master)> node test/perf.js
Processing long_map_with_strings : 303 lines, 4380 chars
indent: 11.210ms
paren: 6.438ms

Processing really_long_file : 2865 lines, 112431 chars
indent: 16.363ms
paren: 19.291ms

Processing really_long_file_with_unclosed_paren : 2865 lines, 112432 chars
indent: 10.691ms
paren: 9.168ms

Processing really_long_file_with_unclosed_quote : 2865 lines, 112433 chars
indent: 9.460ms
paren: 9.965ms

cfleming03:02:26

It’s actually about the same speed - I’m surprised.

shaunlebron05:02:51

i’m not understanding how the cases you mentioned would cause problems

shaunlebron06:02:43

every edit essentially introduces a new coordinate space

shaunlebron06:02:46

so an edit should be in the coordinate space that results from the previous edit

shaunlebron06:02:39

the edits wouldn’t be in source order; they would just be in edit order

shaunlebron06:02:33

in the case of (foo (bar)]]), the edits would be:

(foo (bar)]])
          ^ 1. remove index 10

(foo (bar)])
          ^ 2. remove index 10

(foo (bar))
         ^^ 3. replace indices 9-10 with `))`   (oops, we shouldn't replace this if it's already there)

shaunlebron06:02:28

i may not be understanding your delta algorithm though, so I’ll look at the code when you’re done

cfleming07:02:07

No, at the moment, all edits are in source document space.

cfleming07:02:14

It might have been easier to do them with a new coordinate space, I’m not sure.

cfleming07:02:55

I’ll check tomorrow if the algorithm could be easily modified to do that.

cfleming07:02:23

I actually still have a single breaking indent case, I had some tests commented out accidentally.

cfleming09:02:46

I remember the problem with the coordinate space per edit idea.

cfleming09:02:39

The issue is that you’re always reading from the source coordinate space, since the edits are not actually applied. So somewhere you have to maintain the delta between source and destination spaces, and that depends on where the edits have been made already.

cfleming09:02:09

What I have now works. Basically I make sure that the edits are always in source document order, and the only cases I’ve found where they overlap is the case I showed earlier where individual unbalanced close parens are inside a paren trail that is later removed. So when making a replacement, I just remove any current edits which are completely overwritten by the new replacement. So if the whole paren trail is replaced, the individual close paren deletions are simply removed because they’re completely overwritten by the later change.

cfleming09:02:21

I’ll fix the one remaining case tomorrow morning and push it to my repo, and you can take a look and see what you think. TBH I’m not sure that this change makes sense for the main parinfer implementation, but it does for me - I think I’ll base the Cursive implementation on it.

cfleming09:02:57

I don’t know enough about other editor implementations to know whether applying deltas would work better or not.

cfleming09:02:10

One nice property of this new algorithm for me is that it translates naturally to using lexer tokens instead of chars. I profiled this, and nearly all the time is spent in processChar and updateParenTrailBounds - this would be faster using the lexer probably since the paren trail would only be updated on token boundaries.

snoe20:02:24

So I worry that the delta approach sounds similar to the stateful change diff api that parinfer originally had. In that the structure and application of the diff was built on applying it to code mirror, so porting to vim involved adding a lot of code to handle the structure and build what code mirror gave you for free. This isn't to say that it wouldn't be worth it but any editor that accepts a different way of encoding deltas (if they do at all as in vims case), would require, potentially, a good chunk of code.

cfleming22:02:49

@snoe: What is the complication with vim? Does it not have a function to replace from char x to char y with string z?

cfleming22:02:30

Or do you have to handle it manually for some reason?

cfleming22:02:46

I also have a function which you can use to get the complete replacement text, given the original text and the deltas, so at the worst with one more function call you can get what exists right now.

cfleming22:02:49

(I’m not familiar with parinfer’s original interface)

snoe22:02:01

Not through the rpc interface that my neovim plugin uses, I suspect doing deltas would either involve building out the edit commands to execute the delta as if typed or some regex magic. Having that function to maintain the existing functionality would be desirable from my perspective

cfleming22:02:59

@snoe: Ok, thanks, that’s interesting.