Fork me on GitHub
#parinfer
<
2016-02-04
>
chrisoakman00:02:03

thanks for setting that up

chrisoakman00:02:14

I was planning on playing around with it, but will focus on something else

chrisoakman00:02:03

haha - I like that you kept lots-o-clojure

cfleming00:02:12

Definitely simple_smile

chrisoakman00:02:14

man - JS is beating the pants off this JVM version

chrisoakman00:02:32

but now we know that with confidence

chrisoakman00:02:34

and can optimize

cfleming00:02:43

What’s the timing for JS on that test?

cfleming00:02:57

Is this the equivalent of the really_long_file values?

chrisoakman00:02:06

indent mode for really_long_file is around 12ms, 23ms for paren mode

cfleming00:02:15

Interesting

cfleming00:02:29

There are definitely improvements to be made, I’m working on them now

chrisoakman00:02:54

I am going to set up travis-ci to run the test suite on every commit

chrisoakman00:02:00

I think that should be not-too-hard to set up

chrisoakman00:02:10

wonder what happens to the performance if you put SENTINEL_NULL back in?

cfleming00:02:36

Hmm, not sure

cfleming00:02:51

I’d be surprised if it made much difference, but I was surprised with JS too simple_smile

cfleming00:02:05

You could run the bench against the original version I guess

chrisoakman00:02:39

I might give that a try; going to set up travis-ci first

cfleming00:02:46

~/d/p/parinfer-test (master)> lein with-profile bench run
Evaluation count : 4920 in 60 samples of 82 calls.
             Execution time mean : 12.537784 ms
    Execution time std-deviation : 436.185958 µs
   Execution time lower quantile : 11.935113 ms ( 2.5%)
   Execution time upper quantile : 13.447069 ms (97.5%)
                   Overhead used : 1.707955 ns
Evaluation count : 5280 in 60 samples of 88 calls.
             Execution time mean : 11.775496 ms
    Execution time std-deviation : 408.468619 µs
   Execution time lower quantile : 11.253086 ms ( 2.5%)
   Execution time upper quantile : 12.753575 ms (97.5%)
                   Overhead used : 1.707955 ns

Found 5 outliers in 60 samples (8.3333 %)
	low-severe	 5 (8.3333 %)
 Variance from outliers : 20.6507 % Variance is moderately inflated by outliers

cfleming00:02:47

I replaced a few of your functions with stdlib equivalents that are more efficient, especially around repeated String concatenation.

cfleming00:02:59

Perhaps V8 is better at optimising that?

chrisoakman00:02:21

oh man 😄 😄 😄

chrisoakman00:02:30

this is quite exciting

cfleming00:02:28

Is it true that http://result.ch is always a character, or empty?

cfleming00:02:11

Thinking about it, parinfer doesn’t do anything special for #( and #{, right?

chrisoakman00:02:29

I'm not sure; Shaun would know the answer to that question

chrisoakman00:02:37

I know he had to make some change in order to support Racket

cfleming00:02:38

Are there tests around those cases?

chrisoakman00:02:47

there might be

chrisoakman00:02:06

it's his first week working at Stripe, so he's not around as much

chrisoakman00:02:20

although I spoke with him last night; he is pretty excited about parinfer-jvm as well as the JS speed-up

cfleming00:02:20

Fair enough simple_smile

cfleming00:02:44

Here’s hoping Stripe will soon be running on Clojure and CLJS simple_smile

chrisoakman00:02:02

here's hoping

cfleming00:02:05

Then I can sell them tons of licences - booohahahahaha

chrisoakman00:02:16

gotta add parinfer to cursive first!

chrisoakman00:02:23

how hard do you think that will be?

cfleming00:02:27

I’m getting there, I’m getting there...

chrisoakman00:02:36

there can be some trickiness adding parinfer to editors

cfleming00:02:06

I can probably do a naive impl relatively easily, but there’s a few bits around the edges making it user friendly.

cfleming00:02:24

And I suspect I can make it more efficient with the lexer I use internally

chrisoakman00:02:28

yeah - are you familiar with the "parent expression" hack that atom-parinfer uses?

chrisoakman00:02:56

Shaun and I were talking about this last night; with the performance we're at now we might be able to just take it out

chrisoakman00:02:00

it's so fast already

cfleming00:02:32

Yeah, I am. I’m actually planning to benchmark just parsing the top level forms from the top of the file until, say, the last one in really_large_file, and then parinfer’ing that.

chrisoakman00:02:55

haha - mental slip

cfleming00:02:56

So no hack, but parsing to find the top level forms is faster.

chrisoakman00:02:59

you named a folder "paredit"

chrisoakman00:02:21

I will fix it; I have to commit something in order to test travis-ci

chrisoakman00:02:03

there might be a flurry of small commits in order to test that actually - nothing major

chrisoakman00:02:29

is it a convention to use the performance namespace?

chrisoakman00:02:01

I was going to change that to parinfer-perf/core ?

cfleming00:02:18

Sure, go for it.

chrisoakman00:02:31

wasn't sure if there was a convention there; I assume there is for tests

cfleming00:02:49

does have DOUBLE_SPACE assigned to it, so it’s not just a char.

chrisoakman00:02:16

it would be faster if it were just a Char on the JVM?

cfleming00:02:11

Not sure, I assume so, since that’s basically just a native int, not an object

cfleming00:02:57

Looks like that’s only used for replacing tabs

cfleming00:02:00

Ok, lunchtime, back in a bit.

cfleming01:02:43

I just pushed the perf improvements.

chrisoakman01:02:57

that test takes forever to run on my systme

chrisoakman01:02:02

but I get similar results

chrisoakman01:02:27

and your commit incidentally tested the build script

chrisoakman01:02:33

which worked perfect the first time

chrisoakman01:02:42

so we now have an automated build / test

cfleming01:02:32

Yeah, the benchmark takes ages to ensure that the JIT has stabilised

chrisoakman01:02:19

should we mark a v0.2.0 with your changes?

cfleming01:02:18

Yeah, that’s probably most of the low-hanging fruit.

chrisoakman01:02:22

do I need to add that number to build.gradle ?

chrisoakman01:02:43

in JS and Python people just include the file in their own builds

chrisoakman01:02:58

in Java people use actual packages and .jar files, right?

cfleming01:02:16

version '1.0-SNAPSHOT'

cfleming01:02:22

Is what you need to change

chrisoakman01:02:36

and is it ok to follow semver format there?

chrisoakman01:02:45

version '0.2.0'

cfleming01:02:56

Sure, that’s what Maven usually uses

chrisoakman01:02:14

I suppose another question I have is: what should the public API look like?

chrisoakman01:02:25

right now it's ParinferKt.indentMode()

cfleming01:02:29

I’m actually not sure why it started at 1.0 and didn’t use three digits

chrisoakman01:02:31

drop the Kt ?

cfleming01:02:45

Yeah, I think so - no-one using it cares

chrisoakman01:02:00

because it has to pass the type system I suppose

chrisoakman01:02:05

that sort of thing is pretty important in JS

chrisoakman01:02:17

also I went with 3-arity options instead of a hashmap of options

cfleming01:02:21

In fact, that class name is automatically generated from the filename, looks like.

chrisoakman01:02:24

since Java doesn't have literal data types

cfleming01:02:38

I’ll check what the best way to expose that API is.

cfleming01:02:47

(I’m still fairly new to Kotlin too)

cfleming01:02:08

And while we’re confessing, that was the first Gradle file I ever wrote simple_smile

chrisoakman01:02:14

I guess the question is: if this already existed and you were just adding it to Cursive, would what you want the API to look like?

chrisoakman01:02:31

lots of firsts for us both here simple_smile

cfleming01:02:39

I think that the concept (a public object with two static methods) is fine.

cfleming01:02:05

I think we should probably create the object explicitly so we can control the name, rather than have it be autogenerated.

chrisoakman01:02:15

yeah - I saw there was a way to do that in Kotlin

chrisoakman01:02:26

but I couldn't figure it out

cfleming01:02:55

No, I think we can just use:

public object Parinfer {
  … api goes here …
}

cfleming01:02:31

I’ll double check and make that change if you like.

cfleming01:02:01

It’s interesting that in the V8 version, the paren mode is much slower

cfleming01:02:09

I wonder why

chrisoakman01:02:18

yeah - I was going to look at that tonight

chrisoakman01:02:37

that discovery yesterday gave me the itch to squeeze out more performance

chrisoakman01:02:48

it's not too often that that sort of thing is the best way to spend your time

chrisoakman01:02:58

working on perf like that vs working on features or design

chrisoakman01:02:12

(at least in my experience)

chrisoakman01:02:15

but in this case it's worth it

chrisoakman01:02:20

parinfer has to run constantly as you're typing

chrisoakman01:02:29

speed is essential

cfleming01:02:11

Yeah. I’ll have to check how fast it is in Cursive because it’s not just manipulating strings, it’s fiddling with documents that have locks and so forth.

chrisoakman02:02:05

Will you make that API change? I think I prefer just Parinfer to ParinferKt

cfleming02:02:14

Yeah, I’ll make it.

cfleming03:02:31

In the end, this is a cleaner solution:

@file:JvmName("Parinfer")

cfleming03:02:07

Then from Clojure it’s just (Parinfer/indentMode …)

cfleming03:02:21

And from Java Parinfer.indentMode(…)

cfleming03:02:44

Unfortunately getting this into Cursive is a little trickier than I thought.

cfleming03:02:57

I’ll need some help from someone at JetBrains I think.

cfleming04:02:23

In Atom, are files ever changed in the background? There’s nothing like refactorings or anything like that, right?

cfleming04:02:42

So these changes are only ever applied to a document while a user is interactively editing it?

cfleming04:02:59

Actually, what about if a file is updated from VCS, which is effectively in the background as far as Atom is concerned?

shaunlebron04:02:46

hey guys, just started catching up on messages here

shaunlebron04:02:36

@snoe: I like that idea you had yesterday! tracking here: https://github.com/shaunlebron/parinfer/issues/89

shaunlebron04:02:02

@thomas: by “block comment” do you mean “comment the selected lines” or “comment the lines of the cursor’s sexp”?

shaunlebron04:02:57

@cfleming: parinfer looks at all parens outside of comments, strings, and character literals. so it defaults to treating #{, #[ and #( correctly.

shaunlebron04:02:14

it doesn’t actually know anything about the preceding #, so it’s treated as if it wasn’t there

shaunlebron05:02:33

and secondly what you said here is good

shaunlebron05:02:35

I’m actually planning to benchmark just parsing the top level forms from the top of the file until, say, the last one in really_large_file, and then parinfer’ing that.

shaunlebron05:02:24

like the activity happening here!

chrisoakman05:02:52

Colin is kicking total JVM ass; the benchmark on parinfer-jvm is cut in half from my original implementation

shaunlebron05:02:23

I saw it was around 40ms for indent-mode in really_long_file?

chrisoakman05:02:51

that's old news; now it's down to ~12ms

shaunlebron05:02:24

that is good news

shaunlebron05:02:30

i can’t find the confetti emoji

chrisoakman05:02:40

I'm playing around with more JS optimizations

chrisoakman05:02:48

I'm a little obsessed with it now

chrisoakman05:02:06

little worried that the "clean implementation" is going to have more JS-specific stuff

chrisoakman05:02:12

that might not translate as easily to other languages

chrisoakman05:02:22

but that's not really a big deal; we can always link people to an older version

shaunlebron05:02:34

i think it’s fast enough

chrisoakman05:02:54

just seeing very little difference between indentMode and parenMode on the JVM

chrisoakman05:02:04

and thinking that probably that can be achieved in JS too

shaunlebron05:02:49

I think in reality locating the parent expression is the important part

shaunlebron05:02:09

and I feel like that circuit is finally closing

shaunlebron05:02:26

colin can just use his lexer to locate that

shaunlebron05:02:40

emacs already has fast support for that as well

shaunlebron05:02:19

and for editors without their own native support for locating it, I can put the fast reader back in the API

shaunlebron05:02:04

parinfer without all the transformation business can run through really_long_file in about 7ms

shaunlebron05:02:59

idk why i didn’t think of this earlier. chatting with the emacs guy today helped I think

snoe05:02:21

@shaunlebron: I added a couple examples to the issue.

chrisoakman05:02:08

@cfleming: in atom, if a file is changed in the background, atom-parinfer won't run until the user does something to trigger it, like press any key on the keyboard

snoe05:02:27

( 🎉 btw)

shaunlebron05:02:29

ah, thanks it’s :tada: 🎉

shaunlebron05:02:50

I think we’re moving into paredit territory

shaunlebron05:02:04

from your example @snoe, you want an indent-next-sexp function, not a indent-next-line function, since it indents the child expressions with it

shaunlebron05:02:43

secondly, you want it to indent two spaces

shaunlebron05:02:31

the offset from the tabstop would have to be configurable

shaunlebron05:02:11

I think the parinfer API can provide some primitives that allow parinfer plugins to handle this behavior themselves

chrisoakman05:02:24

I just figured out the problem with paren mode simple_smile

chrisoakman05:02:43

parinfer.js is about to get faster

snoe05:02:28

I agree, maybe parinfer could return something like indentMode(..., {...}) => {:text "..." :tab-stops [1 4 8 11]}?

snoe05:02:21

or maybe a separate function 😕

snoe05:02:57

The make edit, goto next line, select form, indent/dedent selection dance is still something I find myself doing all the time. I wonder if others do the same thing?

chrisoakman05:02:55

that must have been preventing some v8 optimizations

shaunlebron05:02:06

@snoe: I think that info will be enough for one-space indentation

shaunlebron05:02:11

you’ll need more info for ( to be two-space, and for (expr arg to be n-space indent

snoe05:02:29

yeah, the important part is making the structural change. formatting can/should be handled outside.

shaunlebron05:02:36

and whatever rules required for determining that stuff

shaunlebron05:02:34

I see what you’re saying, so do a separate auto-indent operation after parinfer moves it to one-space indentation

shaunlebron05:02:34

@snoe, seems like we need either indent-selected-lines option instead of a single line

snoe05:02:56

right, I'm not sure if there's a real difference between forms and lines

shaunlebron05:02:58

or some way to specify the lines of an sexp

snoe05:02:13

Maybe : 1) go to next line -> 2) find first char -> 3) keep going down until you find a char before step 2

chrisoakman05:02:10

gah - I am super-happy about that find

chrisoakman05:02:33

I had a gut feel it was a JS thing

chrisoakman05:02:46

parinfer.py and then parinfer-jvm showed that the algorithm should be returning similar result times for both functions

chrisoakman05:02:53

something was clearly happening in the JS version

chrisoakman05:02:27

I'm not sure we're going to get faster than that; I was playing with replacing the if/else with dispatch and such

chrisoakman05:02:39

but doing object key lookup in JS isn't free; in most cases it was hurting perf

shaunlebron05:02:44

@chrisoakman: cool, I wish we had something Criterium for node

chrisoakman05:02:44

it's v8 optimizations

chrisoakman05:02:51

that's what this boils down to

shaunlebron05:02:03

i can’t figure out why the last two tests are any faster than the previous

chrisoakman05:02:09

I considered changing the perf.js to run multiple iterations and then show the average, etc

chrisoakman05:02:31

I only included one result on that PR, but ran it several times on my machine

shaunlebron05:02:45

it might be saving some data from the first test

chrisoakman05:02:47

and it was consistently faster when Object.preventExtension was removed

shaunlebron05:02:47

some caching I mean

chrisoakman05:02:09

to be honest, I "discovered" that earlier today on a Windows machine running an older version of node

snoe05:02:10

Have you guys run parinfer.js through the the chrome profiler?

chrisoakman05:02:18

saw a huge bump when I removed that

chrisoakman05:02:34

yes - but I can't seem to make heads or tails out of the .log files it produces

chrisoakman05:02:56

in both of these last cases of "discovering" big speed improvements, it's clear they are v8 optimizations

chrisoakman05:02:13

not fundamental algorithm or data structure changes

chrisoakman05:02:29

I did see some perf improvement in parinfer.py by changing that if/else chain to a Dict lookup

chrisoakman05:02:52

but when I did that in JS it was slower across the board

chrisoakman05:02:16

I'm looking at Python string concatenation now

snoe05:02:05

@chrisoakman: you were asking about the next language to tackle maybe if you do C/C++ you could then use emscripten for js (or c node module)

chrisoakman06:02:11

Shaun already went down that road a bit

chrisoakman06:02:18

and decided to scrap it

chrisoakman06:02:27

because the JS version is fast enough now

chrisoakman06:02:36

I'm thinking emacs lisp should be the next target

chrisoakman06:02:52

I don't use emacs, nor terribly wish to write the plugin

chrisoakman06:02:00

but I think writing the parinfer algorithm would be fun

chrisoakman06:02:22

I also want to do it in F#

chrisoakman06:02:29

but only because I want to do something in F# 😉

chrisoakman06:02:57

I don't know what editor would benefit from having an F# implementation; I don't think there are too many people using MS Visual Studio to edit Lisp code

snoe06:02:32

simple_smile it could be the reason they start

chrisoakman06:02:08

another benefit of writing Parinfer in emacs lisp is that I get to use parinfer to write it in !

chrisoakman06:02:26

which is cool and recursive

shaunlebron06:02:48

not sure if I added enough context about finding the parent expression in a file

shaunlebron06:02:37

but cursive, emacs, and parinfer stripped down to its reader form only, all of these can locate a parent expression very quickly and reliably

shaunlebron06:02:08

so it’s like using atom-parinfer’s parent expression hack, except without the corner cases

chrisoakman06:02:37

I am seriously considering dropping the parent expression hack for files under a certain N length

shaunlebron06:02:19

is what I said clear

chrisoakman06:02:29

I mean - parinfer.js can do a 2800 line file in under 10ms

chrisoakman06:02:35

that's the debounce rate

chrisoakman06:02:48

just to put that in perspective

snoe06:02:01

yeah for what it's worth I haven't bothered with expressions in vim because the whole file is pretty much fast enough

chrisoakman06:02:39

I am willing to wager that 99% of lisp files that people regularly work with are under 1000 lines of code

chrisoakman06:02:55

there are obviously exceptions to that, like if you're working on cljs.core

shaunlebron06:02:36

what i’m saying is that it’s now possible to hit 100% reliably and quickly

chrisoakman06:02:07

are you going to add another function to the API?

shaunlebron06:02:44

is what I said about the parent expression thing clear?

shaunlebron06:02:26

you use regex

shaunlebron06:02:06

emacs and cursive can do the same as your regex, but reliably and as fast

shaunlebron06:02:21

so can parinfer’s reader

chrisoakman06:02:32

by just keeping a running data structure of the whole file?

shaunlebron06:02:49

i’m not sure how emacs and cursive do it

shaunlebron06:02:23

but emacs’ paredit and smarparens plugins have fast implementations for it

shaunlebron06:02:51

the point is that data is available for parinfer to use

chrisoakman06:02:17

I think atom uses regular expressions for the grammar engine

chrisoakman06:02:21

actually, I know it does

chrisoakman06:02:33

for syntax highlighting and whatnot

chrisoakman06:02:49

which is probably a decent decision for most languages, but probably the wrong one for lisp-based languages

chrisoakman06:02:57

where it would be cheaper to just keep the actual AST

shaunlebron06:02:37

right, that will not work for parinfer

shaunlebron06:02:23

breaking syntax highlighting will not corrupt your code, only its coloring

shaunlebron06:02:36

i want to test that in vim to see how well it works

shaunlebron06:02:05

heading out for the night

shaunlebron06:02:15

@snoe, I’ll keep thinking about the indent thing

chrisoakman06:02:21

did you bump parinfer.js on npm?

shaunlebron06:02:30

@chrisoakman: thanks for speed improve, will publish

chrisoakman06:02:55

I don't think we're going to get too much faster than this, btw

chrisoakman06:02:25

would be interesting to see how Chakra or SpiderWhatever performs

dominicm13:02:17

Has anyone got parinfer on emacs yet?

dominicm14:02:34

Last I saw, the strategy was to rewrite parinfer in emacs lisp, instead of using the lib

thomas14:02:24

I thought there was a link on the parinfer web site… but not sure what it said

dominicm14:02:49

There is. The first is a very limited rewrite in emacs lisp, the other is a node server of some kind, which you've got to start to use.

dominicm14:02:06

Maybe I'm just spoiled by @snoe's excellent neovim integration simple_smile