clojure-dev

2023-03-23T16:42:35.529909Z

Hi, I'm not sure this is the right place, so forgive me if it isn't. Given that the reader adds its own meta to the forms it reads, programs that need to know whether they read any meta have to subtract the reader's meta. The only option seems to be enumerating it, like the ClojureScript compiler does here: https://github.com/clojure/clojurescript/blob/master/src/main/clojure/cljs/analyzer.cljc#L4186-L4204 which I think couples programs to the reader in a bad way, since you can't add new meta info to the reader without breaking who knows what. So it's kind of a breaking change by addition. Shouldn't the reader provide a mechanism for this, so programs can remove the reader's meta without enumerating it in their source code? I was experimenting with something that adds a new key to the meta in LispReader.java while testing it with the ClojureScript compiler, and noticed that it broke in a very weird way. It took me some time to figure out I had to add my new key to that elision list in the ClojureScript source code.
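
A minimal sketch of the enumeration approach described above and of why it is brittle. The key set and the helper name `user-meta` are assumptions made for illustration, not the ClojureScript compiler's actual code, and `:coord` stands in for any key a patched reader might add. Reader output is simulated with `with-meta`.

```clojure
;; Assumed set of keys the reader is known to add; the consumer has to hard-code it.
(def known-reader-keys #{:file :line :column :end-line :end-column :source})

(defn user-meta
  "Hypothetical helper: the metadata left after removing the enumerated
  reader keys, or nil if nothing is left."
  [form]
  (not-empty (apply dissoc (meta form) known-reader-keys)))

;; Simulate forms as they might come out of the reader.
(def plain-form   (with-meta '(f x) {:line 1 :column 1}))
(def patched-form (with-meta '(f x) {:line 1 :column 1 :coord [0]}))

(user-meta plain-form)   ; nil, so the consumer concludes "no user metadata"
(user-meta patched-form) ; {:coord [0]}, so the new reader key is mistaken for user metadata
```

Any consumer written this way has to be updated every time the reader starts emitting a new key.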

2023-03-24T11:32:00.638449Z

here is an example of the problem I was describing yesterday, in the Clojure compiler itself: https://github.com/clojure/clojure/blob/1e835a9d9fb4498a5cf643df861565c07701b18b/src/jvm/clojure/lang/Compiler.java#L4094-L4097 if you add any extra meta at the reader, the compiler will fail in an obscure way unless you remember to add it to these enumerations, because there is logic depending on them, which is kind of brittle. There is one more here: https://github.com/clojure/clojure/blob/1e835a9d9fb4498a5cf643df861565c07701b18b/src/jvm/clojure/lang/LispReader.java#L1125-L1128 and there is also, I think, an attempt to deal with this stuff? not sure: https://github.com/clojure/clojure/blob/1e835a9d9fb4498a5cf643df861565c07701b18b/src/jvm/clojure/lang/Compiler.java#L287

borkdude 2023-03-24T11:33:31.760829Z

I have similar issues in SCI: it should also know when to evaluate metadata vs metadata it added itself when reading. But why is this your problem?

2023-03-24T11:35:43.802549Z

I'm hitting this because I'm working on a dev compiler, which is a patch on top of the Clojure compiler that emits instrumented code. I had to extend the LispReader to provide, on top of line and column, a sexp coordinate (coord in the tree), and as soon as I added the extra meta the compiler didn't work anymore

2023-03-24T11:36:58.135249Z

and after fixing those two places I hit it again, this time breaking the ClojureScript compiler, all by just adding an extra meta key at the reader

2023-03-24T11:37:21.219289Z

does it make sense?

borkdude 2023-03-24T11:39:16.651929Z

yes. maybe you could use a custom reader :)

2023-03-24T11:39:46.233369Z

wdym?

2023-03-24T11:40:23.473209Z

I mean, I have everything working now, just reporting that this is kind of brittle

borkdude 2023-03-24T11:40:34.808529Z

ah so if you add metadata during the read phase, the compiler breaks. yeah, that won't work with a custom reader then either, without changing the compiler

2023-03-24T11:40:50.244579Z

exactly

borkdude 2023-03-24T11:41:07.881879Z

I think the idea is that the compiler works in sync with the reader. making changes in one place will also mean changes in another place. it's kind of expected?

2023-03-24T11:43:38.959279Z

well, IMHO adding something as simple as an extra meta key shouldn't break completely unrelated stuff (in a way that is hard to figure out) in an open map world system. and second, it also broke the ClojureScript compiler, and would break any other project that removes reader keys to figure out if read returned any meta

borkdude 2023-03-24T11:44:36.001329Z

can you tell what exactly broke? What I would expect is that the metadata would be evaluated. Maybe adding extra quoting would help, if the expression contains symbols or lists.

2023-03-24T11:45:39.633649Z

you mean what broke in the Clojure compiler, or in other projects like ClojureScript?

borkdude 2023-03-24T11:46:03.110539Z

yes, both

2023-03-24T11:46:49.598519Z

for example, if you add an extra key at the reader phase then this goes wrong: https://github.com/clojure/clojure/blob/1e835a9d9fb4498a5cf643df861565c07701b18b/src/jvm/clojure/lang/Compiler.java#L4094-L4097

borkdude 2023-03-24T11:47:47.471219Z

what do you mean by "goes wrong", what is the error you are seeing?

2023-03-24T11:48:19.066319Z

oh, I'll have to make it happen again, let me see

borkdude 2023-03-24T11:48:49.653209Z

what I would expect is that the fn has metadata, so it will be evaluated at runtime, but adding extra quoting can prevent this

borkdude 2023-03-24T11:49:17.359449Z

user=> (meta ^{:x (+ 1 2 3)} (fn []))
{:x 6}

borkdude 2023-03-24T11:49:33.644389Z

user=> (meta ^{:x x} (fn []))
Syntax error compiling at (REPL:1:7).
Unable to resolve symbol: x in this context

borkdude 2023-03-24T11:51:08.263859Z

user=> (meta ^{:x 'x} (fn []))
{:x x}

2023-03-24T11:54:14.596979Z

by adding an extra key to the reader I got:

Exception in thread "main" java.lang.ExceptionInInitializerError
	at clojure.main.<clinit>(main.java:20)
Caused by: Syntax error compiling at (clojure/core.clj:340:44).
	at clojure.lang.Compiler.analyze(Compiler.java:6955)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3894)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7284)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3970)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7284)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3970)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7284)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3970)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7284)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3970)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7284)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:6241)
	at clojure.lang.Compiler$LetExpr$Parser.parse(Compiler.java:6573)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7282)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7270)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:6241)
	at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5559)
	at clojure.lang.Compiler$FnExpr.parse(Compiler.java:4111)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7280)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7270)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.access$1(Compiler.java:6895)
	at clojure.lang.Compiler$DefExpr$Parser.parse(Compiler.java:587)
	at clojure.lang.Compiler.analyzeSeq(Compiler.java:7282)
	at clojure.lang.Compiler.analyze(Compiler.java:6936)
	at clojure.lang.Compiler.analyze(Compiler.java:6892)
	at clojure.lang.Compiler.eval(Compiler.java:7364)
	at clojure.lang.Compiler.load(Compiler.java:7826)
	at clojure.lang.RT.loadResourceScript(RT.java:381)
	at clojure.lang.RT.loadResourceScript(RT.java:372)
	at clojure.lang.RT.load(RT.java:459)
	at clojure.lang.RT.load(RT.java:424)
	at clojure.lang.RT.<clinit>(RT.java:338)
	... 1 more
Caused by: java.lang.RuntimeException: No such var: clojure.core/apply
	at clojure.lang.Util.runtimeException(Util.java:221)
	at clojure.lang.Compiler.resolveIn(Compiler.java:7573)
	at clojure.lang.Compiler.resolve(Compiler.java:7543)
	at clojure.lang.Compiler.analyzeSymbol(Compiler.java:7504)
	at clojure.lang.Compiler.analyze(Compiler.java:6915)
	... 48 more

borkdude 2023-03-24T11:55:06.057479Z

and what does the extra key contain?

2023-03-24T11:55:12.128339Z

the clojurescript issue was even worse. but let me try with a clojure clone from scratch, because I have other stuff modified

2023-03-24T12:05:01.188849Z

hmm I'm not able to reproduce it now with that simple change, but you get the idea: there is logic in different applications that depends on knowing the exact keys the reader is using, which I think is brittle

2023-03-24T12:05:56.150789Z

I won't spend too much time on this because it isn't a problem I have now, it's just something I noticed and wanted to report here

borkdude 2023-03-24T12:08:40.074889Z

@jpmonettas yes, but the reason I asked for the exact error/repro is because I think it can be fixed without changing the LispReader/Compiler, by adding quoting. But maybe I'm wrong ;)

2023-03-24T12:13:01.068789Z

not sure I follow, I'm not working with Clojure, everything is in Java, I'm modifying the compiler itself, so I don't see how something like quoting fits. And what I'm talking about here is a more general problem. If an underlying library that provides you data (in this case the reader) is going to add to your data (in this case line, column, etc.), and it doesn't provide a way for you to tell the original data apart from the enhanced data, and you need to tell them apart, you will have to enumerate the keys, which is brittle because it will break when more keys are added

2023-03-24T12:16:28.350439Z

don't know if that makes sense, but thanks a lot for your time!

2023-03-24T12:16:46.271299Z

and all the good stuff you write btw!

borkdude 2023-03-24T12:35:07.196429Z

thanks :) can you give an example of what kind of stuff you are adding on top of the existing metadata?

2023-03-24T12:37:39.168299Z

sure, so it is the form's coordinates, so if you have (+ a b (* 3 c)) the c will have :coord [3 2] while the (* 3 c) list will have :coord [3] etc
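
A rough sketch of the coordinate idea in plain Clojure, rather than inside LispReader.java as described above. The helper `with-coords` is hypothetical, and the inner list is assumed to be `(* 3 c)`, which is consistent with `c` having `:coord [3 2]`. Each coordinate is the path of indices from the top-level form down to the sub-form.

```clojure
(defn with-coords
  "Hypothetical helper: attach a :coord path to every sub-form that can
  carry metadata (lists and symbols here; numbers are returned unchanged)."
  ([form] (with-coords form []))
  ([form path]
   (let [form' (if (seq? form)
                 ;; Rebuild the list with coordinated children, keeping its metadata.
                 (with-meta
                   (apply list (map-indexed
                                (fn [i child] (with-coords child (conj path i)))
                                form))
                   (meta form))
                 form)]
     (if (instance? clojure.lang.IObj form')
       (vary-meta form' assoc :coord path)
       form'))))

(def tagged (with-coords '(+ a b (* 3 c))))

(:coord (meta (nth tagged 3)))         ; [3]   for the inner list
(:coord (meta (nth (nth tagged 3) 2))) ; [3 2] for c
```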

borkdude 2023-03-24T12:39:13.313929Z

alright, so just vectors and numbers, no lists or symbols?

2023-03-24T12:39:31.153669Z

then, by modifying macroexpansion to preserve that meta, you can link the expanded code to the original one

2023-03-24T12:39:36.303599Z

yeah, just a vector of numbers

borkdude 2023-03-24T12:40:17.627479Z

ok, then my theory doesn't hold

borkdude 2023-03-23T16:50:41.311239Z

@jpmonettas FWIW, edamame has a postprocess hook in which you can choose what metadata and location info gets added to the read form. You can also configure the location keys separately. It doesn't add *file* metadata to read forms.

user=> (map meta (e/parse-string "[x y]" {:row-key :foo}))
({:foo 1, :col 2, :end-row 1, :end-col 3} {:foo 1, :col 4, :end-row 1, :end-col 5})
user=> (map meta (e/parse-string "[x y]" {:location? (constantly false)}))
(nil nil)
user=> (e/parse-string "[x y]" {:postprocess (fn [{:keys [loc obj]}] [loc obj])})
[{:row 1, :col 1, :end-row 1, :end-col 6} [[{:row 1, :col 2, :end-row 1, :end-col 3} x] [{:row 1, :col 4, :end-row 1, :end-col 5} y]]]

2023-03-23T16:55:29.026419Z

nice that you can configure that there. I guess those are the two options: either tell the reader which keys you want in the meta, or have a way to ask it to subtract or return the ones it added

2023-03-23T16:56:39.384779Z

but I think it is important to cover that in the main reader so it can grow without breaking other programs

Alex Miller (Clojure team) 2023-03-23T17:14:30.638149Z

why do you need to subtract the reader's meta?

2023-03-23T17:16:56.517789Z

I don't, but if you look at the ClojureScript compiler code I linked, they are doing it, and I guess you do it if you want to check whether the forms you are reading come with any meta, and you are probably not interested in the reader's meta

2023-03-23T17:20:08.646329Z

in most cases I guess you can go the other way around, and check for the meta you are interested in, but there will be no way of answering "give me all the meta in the form I just read (not counting the reader's)"

Alex Miller (Clojure team) 2023-03-23T17:26:06.580509Z

the forms are coming with meta

Alex Miller (Clojure team) 2023-03-23T17:27:43.243249Z

"what you are interested in" is entirely contextual

2023-03-23T17:30:02.762419Z

I mean, if I'm reading some forms, it can happen that I want to know if the original data contains any meta (I think that is why projects like the ClojureScript compiler are subtracting the reader's)

2023-03-23T17:43:09.882399Z

well I guess you can work around it by not providing a LineNumberingPushbackReader, since that is what the reader uses to add line and column
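
The workaround above can be sketched as follows. The function names `read-plain` and `read-numbered` are made up for this example; the reader classes are the standard `java.io.PushbackReader` and `clojure.lang.LineNumberingPushbackReader`.

```clojure
(import '(java.io PushbackReader StringReader)
        '(clojure.lang LineNumberingPushbackReader))

(defn read-plain
  "Read one form through a plain PushbackReader (no position tracking)."
  [s]
  (read (PushbackReader. (StringReader. s))))

(defn read-numbered
  "Read one form through a LineNumberingPushbackReader."
  [s]
  (read (LineNumberingPushbackReader. (StringReader. s))))

(meta (read-plain "^:private (a b)"))    ; {:private true}
(meta (read-numbered "^:private (a b)")) ; :private plus :line and :column
```

This is all or nothing: a program that wants the position info still has no way to tell it apart from metadata written in the source.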

Alex Miller (Clojure team) 2023-03-23T17:46:53.179419Z

I understand where you're coming from, but this is a world of open maps where tools and users are free to add things in the process. Reading is an operation that annotates what is read. If anything, I think an enhancement to add an option to the reader to not annotate might be a possibility, not sure exactly what that would need to cover

2023-03-23T17:50:24.672769Z

yeah, so I'm in full support of the open map world, and I just came here to report an issue with this not being the case, because I added an extra key to the LispReader meta and by doing that I broke the ClojureScript compiler

Alex Miller (Clojure team) 2023-03-23T17:51:13.755539Z

so is your complaint with the Clojure reader or the ClojureScript compiler? If the latter, then #cljs-dev is a better place to discuss

2023-03-23T17:53:08.970989Z

I figured out that it is because the ClojureScript compiler is enumerating the reader's keys, so I first thought of creating an issue on ClojureScript, but then wondered how they would fix it if the reader doesn't provide a way for you to tell apart the keys it read from the keys it added

2023-03-23T17:53:14.771179Z

does it make sense?

Alex Miller (Clojure team) 2023-03-23T17:57:42.880259Z

it makes sense, but still unclear what options we could consider

2023-03-23T18:07:07.215719Z

I think for the nicer one it's already too late, which is maybe namespaced keys. And a quick low-risk one could be to maintain that enumeration inside the reader, but yeah, not easy, since in the example of the ClojureScript compiler they are subtracting other keys like :source, :end-line, :end-column that aren't coming from the LispReader, so don't know 🤷‍♂️
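
To illustrate the namespaced-keys idea: if reader-added keys lived in one namespace, a consumer could strip them without listing each key. This is purely a sketch. The `reader` namespace and the helper `strip-ns-keys` are hypothetical; the actual reader emits unqualified keys such as `:line` and `:column`.

```clojure
(defn strip-ns-keys
  "Hypothetical helper: drop every keyword key in the given namespace from a
  metadata map; returns nil if nothing is left."
  [m ns-str]
  (not-empty
   (into {}
         (remove (fn [[k _]] (and (keyword? k) (= ns-str (namespace k)))))
         m)))

;; Metadata as an imagined reader with namespaced keys might produce it.
(def form-meta
  {:reader/line 3 :reader/column 7 :reader/coord [3 2] :private true})

(strip-ns-keys form-meta "reader") ; {:private true}
```

A key added later under the same namespace, such as `:reader/coord` here, would be removed without any change to the consumer.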

2023-03-23T18:07:31.428959Z

Thanks anyway @alexmiller

bronsa 2023-03-23T19:48:46.675959Z

FYI when using tools.reader, you can control whether it adds or not the meta by using (or not using) an indexing-reader as the source

2023-03-23T21:34:59.865749Z

@bronsa nice, I guess in that respect it is similar to LispReader.java, where you use (or don't use) a LineNumberingPushbackReader, but that is more of an accident than a designed feature. And there is still no way of asking for the meta (in case you need it) while also being able to tell it apart from the read data's meta, without enumerating it in your source code