Fork me on GitHub
#parinfer
<
2016-04-04
>
sekao13:04:25

@shaunlebron: the indentation behavior you describe in the video is something i’ve been thinking about a lot lately. currently my editors are just indenting/dedenting by two spaces but i’ve definitely noticed a need for something more intelligent. i’m going to tackle that next.

cfleming22:04:22

There’s a couple of bugs in the current release, I’m hoping to make a release fixing them today.

shaunlebron22:04:09

I’ll put that together tonight if we can agree it’s okay

cfleming22:04:30

@shaunlebron: In the example there, the let binding vector should not be a tab stop, I think.

cfleming22:04:47

It’s shadowed by the (foo bar)

cfleming22:04:41

Here’s how I calculate it:

cfleming22:04:12

Mine is interactive on the tab action, so I work backwards from the caret line until I find a line with content at offset 0, i.e. a top level form.

shaunlebron22:04:22

(corrected the example)

cfleming22:04:39

Then I work out the list of tab stops for that line.

cfleming22:04:13

Then, for subsequent lines I trim all tab stops that are greater than the indent level for that line, and add any new tab stops.

cfleming22:04:37

Once I reach the caret line, that’s my list of stops.

shaunlebron22:04:01

> I trim all tab stops that are greater than the indent level for that line

shaunlebron22:04:11

that seems wrong I think

shaunlebron22:04:27

unless I’m misreading

cfleming22:04:12

Isn’t that the rule? When would you want a tab stop that appears to the right of the subsequent line’s indent?

shaunlebron22:04:50

nevermind, yeah that’s it

cfleming22:04:25

I also return extra stops as I mentioned in my comment, they’re all trivially able to be calculated except for the one where the user may want to align parameters.

shaunlebron22:04:04

as in checking the previous sibling line?

cfleming22:04:14

here’s an example:

shaunlebron22:04:18

or subsequent sibling line too

cfleming22:04:37

(println x
         y)

cfleming22:04:52

Here, you want a tab stop where the parameters appear.

shaunlebron22:04:55

oh yeah, first arg alignment

cfleming22:04:16

That’s the only case the parinfer caller won’t be able to trivially add themselves.

cfleming22:04:45

Adding another inside the collection for the various open delimiters is trivial, obviously.

shaunlebron22:04:52

i suppose they could run a regex for the first space after the open-paren I would return them

shaunlebron22:04:11

well, first space outside a string

shaunlebron22:04:32

and outside any collection in the first position...

shaunlebron22:04:40

yes, non-trivial

cfleming22:04:48

Yeah, I guess that’s trickier for you, I’m doing this based on a lexer so I just look to see if after the delimiter I have symbol/keyword then whitespace then something else on the same line.

cfleming22:04:05

And if so, I add the start offset of the something else.

shaunlebron22:04:37

awesome, so you’re using cursive’s lexer here

cfleming22:04:53

I’m actually about to re-work the Cursive lexer, the current one is a horrible JFlex monstrosity.

cfleming22:04:18

Once I have it, I think it’ll actually be some pretty simple Java code. It might be worth porting to JS and using that for parinfer.

shaunlebron22:04:42

yeah, we should talk about standardizing the method here

cfleming22:04:44

When I profiled parinfer, a lot of the time was spent updating all the state per char - by using a lexer, you could do it per token.

cfleming22:04:09

And you also would remove all the inString, inCode, inComment stuff totally.

shaunlebron22:04:26

right, I consider that stuff a mistake

cfleming22:04:39

Well, you have to do it, but if you used a lexer it wouldn’t be required.

shaunlebron22:04:03

have to do what now?

cfleming22:04:33

Since you’re processing per-char, you have to track your current state - are you in a string, or a comment, etc.

shaunlebron22:04:03

I wonder how switching to a lexer would affect compatibility with different lisps

cfleming22:04:15

Yeah, that would be an issue.

shaunlebron22:04:37

I straight up don’t care about anything except strings and collections and comments

cfleming22:04:48

Parinfer actually broke my code the other day, I haven’t had time to reproduce it, but it was using an sexp comment.

shaunlebron22:04:04

sexp comment?

cfleming22:04:21

I’m not sure if that’s a bug in the Cursive implementation or not, I would have thought that vanilla parinfer should have handled it.

cfleming22:04:36

I had some code I had commented out with #_#_

cfleming22:04:50

That pushed the code over to the right, and indent mode broke it.

cfleming22:04:17

What I’m not sure of is why paren mode didn’t fix it - I wanted to reproduce it to see what was going on there.

shaunlebron22:04:45

#_(defn foo [a b])
  (+ a b)

shaunlebron23:04:12

yeah, that is actually the case that made me start thinking about paren mode

cfleming23:04:17

But I didn’t make the change in parinfer, it was old code, so the paren mode cleanup should have fixed it.

cfleming23:04:00

So either it wasn’t run correctly, or it didn’t fix it for some reason.

shaunlebron23:04:03

paren mode cleanup, not sure what you mean byt that

cfleming23:04:13

When paren mode is run upon opening the file.

cfleming23:04:32

To correct the indentation.

shaunlebron23:04:49

oh, yeah, paren mode should fix something that would look like this:

#_(defn foo [a b]
  (+ a b))

shaunlebron23:04:06

before indent mode ever corrupts it

cfleming23:04:27

It didn’t, for some reason, but I haven’t had time to investigate why.

shaunlebron23:04:51

just pasted that code into the demo editor just to verify

shaunlebron23:04:15

so I wonder what happened

shaunlebron23:04:24

let me know if you see it again I guess

cfleming23:04:41

It’s entirely possible it’s a bug in my integration.

shaunlebron23:04:43

(demo editor corrected it after pasting)

cfleming23:04:59

BTW I’ve been thinking about how a potential hybrid mode might work.

cfleming23:04:14

I’m going to try to implement this once I get the current implementation working.

cfleming23:04:40

I think one of the biggest problems for people with parinfer is that the changes have to be global.

cfleming23:04:05

Several people have commented to me that they can’t use it at work because it modifies the whole file, not just where they’re working.

shaunlebron23:04:35

oh sorry, before we talk about this, is the tabstops implementation an agreeable thing, since @sekao will be needing something like that too?

cfleming23:04:43

And in particular, the fact that paren mode tidies up hanging closing parens is a problem.

cfleming23:04:20

Yes, definitely - if you can provide that I think it will be very helpful. I definitely think you’ll need the first parameter align, but I’m not sure how to implement it as parinfer is right now.

cfleming23:04:50

I think a regex is probably your best option, ideally parinfer would do that internally.

cfleming23:04:02

There are lots of edge cases there.

shaunlebron23:04:16

I’ll have to leave that out until a lexer makes it into the implementation I think

shaunlebron23:04:32

but I wouldn’t be returning the extra tab stops for 1-space or 2-space indentation after open-parens

cfleming23:04:51

Yeah, that’s fine - those are easy to add, and may depend on the user’s config in the editor anyway.

shaunlebron23:04:02

that’s entering into convention territory that parinfer isn’t touching right now

shaunlebron23:04:26

alright, I’ll add that in

cfleming23:04:38

i.e. whether you want one or two space indents for lists. I’m not handling that case right now, nor am I handling the two-space-for-everything indent case.

cfleming23:04:48

But they’re both quite easy to add.

shaunlebron23:04:44

interesting that you mention people commenting on global changes

shaunlebron23:04:56

valuable feedback

cfleming23:04:08

Yes, it’s an issue.

shaunlebron23:04:55

yeah, and I suppose paren mode is sort of a linter when it comes to hanging close-parens

cfleming23:04:03

In particular, people use dangling closing parens… right.

cfleming23:04:22

I was going to ask if that was an essential part of the algorithm, but indent mode would tidy them anyway.

cfleming23:04:56

So here’s what I’m planning.

shaunlebron23:04:10

I’d have to revisit to see if it’s essential, but go ahead, yeah

cfleming23:04:14

In Cursive, when changes happen to the document, I get before and after ranges for each modification. I use that to determine the total range affected by a particular action, but I’m not using that final range right now.

cfleming23:04:40

However using the lexer, I can use that range to find the innermost form which completely covers the change.

cfleming23:04:21

If the change is to top-level forms there may be multiple forms affected by a change, but that’s fine. Basically, I get a minimal range of the document I have to run parinfer over.

cfleming23:04:03

On any change, I’m going to run a slightly modified paren mode, so the indentation will always be maintained.

cfleming23:04:22

And the number of doc changes I’ll have to make will be minimal.

cfleming23:04:20

I can actually check - if none of the before or after ranges for a particular action contain any of ()[]{}”\n then the change only affects a single line and doesn’t affect indentation - I can optimise that case and just insert spaces.

cfleming23:04:45

This covers the very common cases of typing, backspace, delete etc.

cfleming23:04:31

If any of the changes do contain those characters, then I’ll run Cursive’s indentation on the range of lines affected by the innermost form I worked out previously.

cfleming23:04:40

So the indentation will be semantically accurate.

shaunlebron23:04:49

can you describe the before and after ranges more?

cfleming23:04:35

If you type a space, I’ll get a change saying something like: at offset 10, old-length 0, new-length 1, old-text “”, new-text “ "

cfleming23:04:14

If you then backspace over that, I’ll get the opposite change: at offset 10, old-length 1, new-length 0, old-text “ “, new-text “"

shaunlebron23:04:04

perfect, yeah

cfleming23:04:26

Or if you select a word and paste a new word in it’s place, you might get: at offset 10, old-length 4, new-length 3, old-text “foos”, new-text “bar"

shaunlebron23:04:55

that’s inline with the change object in codemirror, cool

cfleming23:04:25

There may be multiple changes like that for a particular action, but I use those deltas to find the total range in the resulting document affected by the change.

cfleming23:04:10

Using that and the lexer, I can then fairly trivially find the innermost form enclosing the whole change.

cfleming23:04:38

After any change, I’ll run my modified paren mode, so I’ll maintain indentation with minimal changes.

shaunlebron23:04:12

I’m with you so far

cfleming23:04:16

Then I’ll have explicit indent and dedent actions, using tab and shift-tab as I have now.

cfleming23:04:39

Those actions also affect a range, since they affect either the current line or the range of selected lines.

cfleming23:04:21

So on those actions, I’ll work out that range. I’ll then run vanilla paren mode over it, to ensure that the indentation is within the bounds required for indent mode.

cfleming23:04:35

Note that the indentation only has to be correct for that range.

cfleming23:04:48

And I can correct it on the action, not at file open.

cfleming23:04:31

If the indentation requires correction, I’ll do that instead of the indent action, and provide an “indentation corrected” action to the user.

cfleming23:04:52

Otherwise I’ll indent the lines, and run indent mode over the range to correct the parens.

cfleming23:04:35

So I’m basically treating parens as the source of truth, but allowing the user to explicitly use indentation to adjust scope when they want to.

shaunlebron23:04:36

this is intereseting

cfleming23:04:52

And only affecting the part of the doc that the user is touching.

snoe23:04:00

@cfleming: the cursive lexer would be amazing to have in js, but I'm afraid it's kind of your secret sauce simple_smile

shaunlebron23:04:02

so tab will dispatch to paren mode if indentation is incorrect, but if it’s correct, it’ll run indent mode

cfleming23:04:07

It only makes very local changes

cfleming23:04:39

@snoe: I’m planning to OSS it, it’s not that secret and it would be really useful for this sort of thing.

cfleming23:04:31

I think that will provide the best of both worlds in a single mode, and also address a lot of the complaints I’ve seen about parinfer.

shaunlebron23:04:32

I think this is promising. I wonder about auto-closing of parens

cfleming23:04:55

Yeah, there are probably edge cases to be considered, I’ll try it out.

shaunlebron23:04:13

it’s weird because indent mode is really doing two things

shaunlebron23:04:44

correcting structure based on indentation, and auto-closing parens as you type them (based on indentation)

cfleming23:04:44

You do lose some things from vanilla parinfer: being able to indent using spaces, and the slurping/barfing.

shaunlebron23:04:05

(and of course, it’s good at breaking expressions inside a line in certain cases…)

cfleming23:04:39

I can probably fix the first by checking if the user has hit space, backspace or delete inside the leading indent of the line, and doing the indent/dedent action in that case.

cfleming23:04:30

I’m also starting to think that a lot of this functionality should be in composable pieces rather than all-in-one modes.

cfleming23:04:47

Currently I have a parinfer mode, a paredit mode, and a none mode.

cfleming23:04:12

But I had to make a change because users missed some of the electric handling of open parens when in parinfer mode.

shaunlebron23:04:25

oh, I see. yeah, it was my hope that parinfer would be compatible with paredit

cfleming23:04:39

And really, parinfer mode is mostly the backspace handling over closing delimiters at that point, since the actions work in any mode.

shaunlebron23:04:07

electric, like moving close-parens when pressing enter?

cfleming23:04:16

So with this proposed change, the actions would always work anywhere, and users would just choose the options they want.

cfleming23:04:39

In parinfer mode, if you backspace before a closing paren, it jumps over it rather than deleting it.

cfleming23:04:58

Which is wonderful or maddening, depending on preference.

shaunlebron23:04:27

end-of-line close-parens, yes

shaunlebron23:04:39

which I why I think it’s important to dim them in the editor to communicate their nature

cfleming23:04:56

Sorry, I meant in paredit mode there.

shaunlebron23:04:13

parinfer too, in indent mode

cfleming23:04:23

That’s something I explicitly added rather than something that falls out of the parinfer algorithm.

shaunlebron23:04:46

it works that way in the demo editor

cfleming23:04:18

So instead of the mode switcher I have right now, I’m imagining that popping up a palette of options - want tab bound to scope manipulation, want auto-indent, want clever close-paren handling, etc

cfleming23:04:26

And there are no explicit modes at all.

cfleming23:04:03

Users will be able to mix and match the parts they like of each mode.

shaunlebron23:04:46

so, to recap:

cfleming23:04:14

I’d need to try it - I suspect that removing the global nature of parinfer is a good thing, but I’m not sure until I experiment.

shaunlebron23:04:45

it’s all about representing the intention of the user

shaunlebron23:04:51

in parinfer, that intention is in modes

shaunlebron23:04:28

with this new mode, intentions are represented from a mapping of keys -> actions

shaunlebron23:04:45

tab or shift-tab -> scope manip

shaunlebron23:04:12

or, not so much mapping of keys as mapping of the types of changes being made by the user

shaunlebron23:04:41

so I insert ( -> auto-insert ) somewhere

shaunlebron23:04:08

of if I add or remove spaces at the leading indentation point of a line -> scope manip

shaunlebron23:04:39

so you can mix and match these things

cfleming23:04:12

Hopefully, yes

cfleming23:04:51

I think it will be possible to get something which is as intuitive for the user as parinfer right now, but without some of the confusing edge cases and without the whole-doc modification.

shaunlebron23:04:18

about the whole-doc modification

cfleming23:04:20

I do have some users who don’t like any sort of paren matching at all, I’d have to ensure that even when the doc is fairly broken these operations do something sensible.

shaunlebron23:04:34

I thought the diffs might be a problem for people at work, yeah

shaunlebron23:04:03

but I also thought it was making the code follow uncontroversial standards across all Lisps

cfleming23:04:25

One thing I have learned about lisps is that there are no uncontroversial standards simple_smile

shaunlebron23:04:27

it’s like running a linter on your code base, there’s be some diffs, but only once

cfleming23:04:01

So for example, one user wrote to me saying that they have some big file which for whatever reason is only ever added to.

shaunlebron23:04:02

I mean, hanging parens, that’s done mostly out of convenience for line insertion

cfleming23:04:34

So they always maintain a blank line at the end so that they don’t get diff conflicts when someone adds to it.

cfleming23:04:51

Parinfer messed that up, so the guy always had to remember to turn it off before opening that file.

cfleming23:04:26

Another common problem is forms with hanging parens with comments inside them, ; new tests go here or whatever

shaunlebron23:04:26

yeah, @chrisoakman was thinking about that when proposing standard comment directives for disabling parinfer

cfleming23:04:47

The parens get pulled over the comments.

shaunlebron23:04:04

yeah, that’s a tricky one

cfleming23:04:48

The other nice thing about all this from the implementer’s point of view is that it all happens as a direct result of user action. The async stuff plays hell with undo-redo.

shaunlebron23:04:47

oh, I remember warning about managing the history stack with parinfer

cfleming23:04:57

Yeah, it’s really hard.

shaunlebron23:04:19

because of the async?

shaunlebron23:04:27

if the ops were done in sync, maybe not?

cfleming23:04:53

I think Chris had problems with that, and it was killing me in IntelliJ. I actually found a much better way of handling the changes which alleviated that somewhat, but it’s very editor dependent.

cfleming23:04:09

Right, depending on the editor it’s easier. I found a listener in IntelliJ which allows me to insert new changes into an undoable change so they all get synchronously packaged up.

shaunlebron23:04:17

i think it was something like replacing the top of the undo stack when parinfer runs, or something

cfleming23:04:20

And the fact that it’s synchronous helps responsiveness a lot.

shaunlebron23:04:27

but I’m missing the subtleties obviously

cfleming23:04:51

Well, I’d guess it’s editor dependent if you can even do that.

cfleming23:04:05

I don’t think IntelliJ allows it, but I may be wrong.

cfleming23:04:24

Anyway, think about it a bit, feedback very welcome.

cfleming23:04:49

I’ll try to implement that at some point relatively soon, although I have some other fixes that need some attention too.

cfleming23:04:18

I’m going to re-work the lexer soon and benchmark it against the old one, and I’ll OSS that - that’ll be Java but porting to JS should be easy.

shaunlebron23:04:45

not sure I can use a clojure lexer for parinfer

shaunlebron23:04:05

it’ll be interesting to see though

cfleming23:04:07

Yeah, I guess it’ll depend on whether you want to support other lisps or not.

shaunlebron23:04:18

definitely do

shaunlebron23:04:31

it’s working for racket, and guile so far

cfleming23:04:32

In that case it’s difficult, yeah.

cfleming23:04:56

Do no other lisps have tricky string handling?

shaunlebron23:04:25

they seem to have the same strings I think

cfleming23:04:26

i.e. allowing strings delimited by single quotes, or something?

shaunlebron23:04:33

that would suck

cfleming23:04:40

Or chars like ‘c'

shaunlebron23:04:10

luckily I think that’s reserved for quoting forms

cfleming23:04:23

Oh, of course

cfleming23:04:42

(no tea yet this morning)

shaunlebron23:04:03

well, I just burned my last hour at work thinking about parinfer

shaunlebron23:04:20

I’ll keep thinking about the auto-mode, I think it sounds cool

shaunlebron23:04:26

can’t wait to try

shaunlebron23:04:01

i’ll add the tabstops thing too

cfleming23:04:29

(Full disclosure requires me to point out that it’s basically what the Emacs parinfer-criticism doc suggested)

shaunlebron23:04:37

oh, and especially the locally-bound changes sounds really useful