‘lo all. Looks like there’s a regression in clj-yaml 0.7.109 and 0.7.110 (current). Number-like strings aren’t quoted:
% clj -Sdeps '{:deps {clj-commons/clj-yaml {:mvn/version "0.7.108"}}}' -M -e "(require '[clj-yaml.core :as yaml]) (doseq [x [\"083\" {:x \"083\"}]] (print (yaml/generate-string x)))"
'083'
{x: '083'}
% clj -Sdeps '{:deps {clj-commons/clj-yaml {:mvn/version "0.7.109"}}}' -M -e "(require '[clj-yaml.core :as yaml]) (doseq [x [\"083\" {:x \"083\"}]] (print (yaml/generate-string x)))"
083
{x: 083}
% clj -Sdeps '{:deps {clj-commons/clj-yaml {:mvn/version "0.7.110"}}}' -M -e "(require '[clj-yaml.core :as yaml]) (doseq [x [\"083\" {:x \"083\"}]] (print (yaml/generate-string x)))"
083
{x: 083}
Well, that was fun.
fucking hell
You read my mind.
A couple of things: • I'm wondering if it's worth my time trying to parse the Yaml 1.1 spec. • I'm wondering if we should look at using snakeyaml-engine, which is supposed to be Yaml 1.2 compliant, and what the Yaml 1.2 spec says about this case (and what other surprises await there).
(I guess that's four things, not just a couple)
Another thing: • Do the custom thing where we preserve the pre-1.30 snakeyaml behavior
I'd be fine with checking out snakeyaml-engine but there might be other breaking changes we'd introduce. Perhaps clj-yaml 2.0 then
Also: if clj-yaml chooses to keep snakeyaml with its Yaml 1.1 "compliance", I'm wondering whether clj-yaml should use a custom resolver to "patch" this regression. For my particular use case with babashka, I don't think I can provide a custom resolver in the script itself: those are Java classes, I believe.
yes, we could make that an option
and I personally would always use that option
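A minimal sketch of what such an opt-in "always quote number-like strings" policy could look like on the emit side. This is plain Java with no snakeyaml calls: the class `ConservativeQuoting`, the `NUMBER_LIKE` pattern, and `emitScalar` are hypothetical names for illustration, not clj-yaml or snakeyaml API, and the pattern is intentionally broader than either YAML spec so that over-quoting is the failure mode.

```java
import java.util.regex.Pattern;

final class ConservativeQuoting {
    // Hypothetical, deliberately broad pattern: an optional sign, then either
    // digit-led text built from digits, '_', '.', ':', 'e'/'E', '+', '-',
    // or a 0b/0o/0x radix prefix followed by hex digits or underscores.
    static final Pattern NUMBER_LIKE = Pattern.compile(
        "[-+]?(\\.?[0-9][0-9_.:eE+-]*|0[box][0-9a-fA-F_]+)");

    static boolean isNumberLike(String s) {
        return NUMBER_LIKE.matcher(s).matches();
    }

    // Only covers the number-like case. A real emitter must also quote
    // other ambiguous scalars (booleans, nulls, strings with ": ", etc.).
    static String emitScalar(String s) {
        return isNumberLike(s) ? "'" + s + "'" : s;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"083", "1e3", "0x1F", "hello"}) {
            System.out.println(emitScalar(s));
        }
    }
}
```

Quoting too much costs a few characters; quoting too little changes the type a reader sees, which is why the sketch errs on the broad side.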
I don't think anyone is really interested in yaml 1.1 and yaml 1.2: just use a subset and be done with the fucking yaml
Yeah, snakeyaml-engine would likely mean a new lib. And then, the babashka case: include both? Replacement would likely mean other behavioral differences.
I think it would be worth investigating the 2.0 option and see how many breakages there would be in practice. I will only include 1 yaml library
I won't spend any more megabytes on this bullshit
As you might have noticed, YAML really pisses me off every time
Yeah. I'm surprised both how these kinds of breaking changes are tolerated in many communities and how violently I now react against them.
we could make a snakeyaml 2.0 pod as well or have the other one as a pod
this is always an option
Me, too. By far my primary use case for yaml is working with AWS Cloudformation templates. A subset of that is Typescript CDK.
Re: Yaml 1.1, I think this is the controlling text (https://yaml.org/spec/1.1/#id865585): > Tag resolution is specific to the application, hence a YAML processor should provide a mechanism allowing the application to specify the tag resolution rules. It is recommended that nodes having the “`!`” non-specific tag should be resolved as “`tag:yaml.org,2002:seq`”, “`tag:yaml.org,2002:map`” or “`tag:yaml.org,2002:str`” depending on the node's kind. This convention allows the author of a YAML character stream to exert some measure of control over the tag resolution process. By explicitly specifying a plain scalar has the “`!`” non-specific tag, the node is resolved as a string, as if it was quoted or written in a block style. Note, however, that each application may override this behavior. For example, an application may automatically detect the type of programming language used in source code presented as a non-plain scalar and resolve it accordingly.
I'm not sure what is relevant in this blob of text?
I can read that as saying "whether or not it's quoted, if it doesn't include a tag, it should be interpreted as a seq, map, or string, depending on its structure"
There's a lot of wiggle room in there, too.
This is not how I wanted to spend my day.
Hi there! I was just driving by on #babashka and like an accident on the highway I could not look away from the unfolding chaos. Just wanted to send some virtual hugs your way @grzm and @borkdude - I've dealt with yaml issues before and they have scarred me for life! 🤗
FWIW, my particular use case does appear to hinge on a difference between YAML 1.1 and YAML 1.2. I’m emitting YAML via babashka/clj-yaml/snakeyaml (YAML 1.1) and reading it with js-yaml v4 (https://github.com/nodeca/js-yaml), a YAML 1.2 processor whose migration guide notes the change in behavior from v3 to v4 (https://github.com/nodeca/js-yaml/blob/ab31bba6b41f58390f431123ffec5031b986edf5/migrate_v3_to_v4.md#loading-in-v4-documents-previously-dumped-in-v3).
I haven’t found yet if v3 is specifically YAML 1.1.
Actually, it looks like v3 is supposed to be YAML 1.2 as well? https://github.com/nodeca/js-yaml/tree/v3
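To make the 1.1 versus 1.2 difference concrete, here is a small sketch comparing integer resolution for plain scalars. The regular expressions are transcribed from the YAML 1.1 `int` type definition and the YAML 1.2 core schema as I understand them; real processors such as snakeyaml and js-yaml may implement something slightly different, and the class name `IntResolution` is made up for this example. If the transcription is right, `083` is not an integer under YAML 1.1 (8 is not an octal digit, and decimal integers may not start with 0) but is a decimal integer under YAML 1.2.

```java
import java.util.regex.Pattern;

final class IntResolution {
    // YAML 1.1 int type: binary, octal, decimal, hex, sexagesimal.
    static final Pattern YAML_1_1_INT = Pattern.compile(
        "[-+]?0b[0-1_]+"
        + "|[-+]?0[0-7_]+"
        + "|[-+]?(0|[1-9][0-9_]*)"
        + "|[-+]?0x[0-9a-fA-F_]+"
        + "|[-+]?[1-9][0-9_]*(:[0-5]?[0-9])+");

    // YAML 1.2 core schema: decimal, octal (0o prefix), hex.
    static final Pattern YAML_1_2_INT = Pattern.compile(
        "[-+]?[0-9]+|0o[0-7]+|0x[0-9a-fA-F]+");

    static boolean isInt11(String s) {
        return YAML_1_1_INT.matcher(s).matches();
    }

    static boolean isInt12(String s) {
        return YAML_1_2_INT.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Print how each plain scalar would be classified under both rule sets.
        for (String s : new String[] {"83", "083", "010", "0o10"}) {
            System.out.printf("%-5s 1.1 int: %-5b 1.2 int: %b%n",
                s, isInt11(s), isInt12(s));
        }
    }
}
```

Under these assumptions, an emitter that follows the 1.1 rules can leave the string `083` unquoted, while a 1.2 reader would load that same unquoted text as the number 83.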
have you tried json/generate-string? apparently it generates valid yaml 1.2 since json is a subset of yaml
Been thinking about that, and want to explore it more. While it is a subset of yaml, it does have different usability characteristics.
true
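A rough sketch of the JSON route, assuming the reader is a YAML 1.2 processor (YAML 1.2 was designed so that JSON documents are also valid YAML). The point is only that a JSON emitter always double-quotes strings, so `"083"` cannot be mistaken for a number. `JsonEmit` is a hypothetical hand-rolled helper for illustration; in practice an existing JSON library would do this, and non-finite numbers are not handled here.

```java
import java.util.List;
import java.util.Map;

final class JsonEmit {
    // Double-quote a string, escaping the characters JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    static String toJson(Object v) {
        if (v == null) return "null";
        if (v instanceof String) return quote((String) v);
        if (v instanceof Number || v instanceof Boolean) return v.toString();
        if (v instanceof Map) {
            StringBuilder sb = new StringBuilder("{");
            String sep = "";
            for (Map.Entry<?, ?> e : ((Map<?, ?>) v).entrySet()) {
                sb.append(sep).append(quote(String.valueOf(e.getKey())))
                  .append(": ").append(toJson(e.getValue()));
                sep = ", ";
            }
            return sb.append("}").toString();
        }
        if (v instanceof List) {
            StringBuilder sb = new StringBuilder("[");
            String sep = "";
            for (Object item : (List<?>) v) {
                sb.append(sep).append(toJson(item));
                sep = ", ";
            }
            return sb.append("]").toString();
        }
        throw new IllegalArgumentException("unsupported type: " + v.getClass());
    }

    public static void main(String[] args) {
        System.out.println(toJson(Map.of("x", "083")));
    }
}
```

The usability trade-off mentioned above still applies: the output is unambiguous but loses YAML's comments and block styles.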
Throwing this incomplete thought out there (considering things might be on the table): my core use-cases for YAML are
• AWS CloudFormation templates (mostly reading) For reading AWS CloudFormation I’m currently using https://github.com/owainlewis/yaml in Clojure because of its yaml.reader/passthrough-constructor (https://github.com/owainlewis/yaml/blob/master/src/yaml/reader.clj#L11-L15), which allows me to easily handle AWS’s intrinsic functions (in particular, the short forms). https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/intrinsic-function-reference.html I’d love to see this in clj-commons/clj-yaml and babashka.
• Docker Dockerfile (though I currently mostly manage these manually rather than programmatically)
• JavaScript/TypeScript interop (haven’t found good EDN parser in JavaScript (not ClojureScript), though I haven’t looked extensively) I should probably use JSON here instead. This is the case where I’m having issues right now with the snakeyaml 1.29 to 1.30 behavior changes.
• Kubernetes resource file generation (more in the future, and yes I’m aware there is some k8s file Clojure library out there)
@grzm I think we should support passthrough constructor
issue + PR welcome
if you're doing JavaScript/TypeScript + YAML, you could also consider #nbb + a Node.js yaml lib
I could also expose some of these YAML classes in bb. But I don't want to have both snakeyaml and snake-yaml-engine in bb because of the size
I don't think those projects build on the same sources
so continuing with the current lib is maybe best + some options to make it behave saner
> so continuing with the current lib is maybe best + some options to make it behave saner That’s my gut reaction, too.
That, and continue to minimize my YAML exposure. It’s like lead or mercury, right? It accumulates and becomes increasingly toxic?
We should invent noml. Which is the same as EDN, but just with a hyped marketing site
Well, here's something for PassthroughConstructor: https://github.com/clj-commons/clj-yaml/pull/38
Haven't really wrapped my head around the quoting issue I first raised.
I opened https://github.com/clj-commons/clj-yaml/issues/35 to track.
I suspect it’s in the upstream snakeyaml library, but haven’t confirmed.
you could try to bump that in the newest to confirm?
I’ll give it a shot.
Looks like the current version of clj-yaml (0.7.110) uses the latest version of snakeyaml (1.32)
maybe post an issue here? https://bitbucket.org/snakeyaml/snakeyaml/issues?status=new&status=open
i’m seeing the same behavior on 110 and 109. And the quoted behavior on 108
maybe this commit? https://bitbucket.org/snakeyaml/snakeyaml/commits/89ba8dd5716d03a45489af5beeaecf781018cf97
@dpsutton Thanks for confirming.
Looks like the regression was introduced between 1.29 and 1.30
exactly
maybe it's JavaScript semantics or so? 083 in a Node REPL is just 83
or YAML spec weirdness
YAML spec weirdness, I suspect.
not sure, online yaml converters do not just automatically change strings into numbers
(and whatever is going on with the node REPL is just being bad)
I think the leading 0 being octal is just Java
The leading 0 is just a red herring. It works (or rather doesn’t) with other numbers, too. Actually, maybe it’s not a red herring.
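For what it's worth, here is how the JDK itself treats a leading zero, independent of snakeyaml. This only illustrates standard `java.lang.Integer` behavior and says nothing about how snakeyaml resolves scalars; `LeadingZero` and `decodeOrError` are names invented for the example.

```java
final class LeadingZero {
    // Integer.decode treats a leading 0 as an octal prefix, so digits 8 and 9
    // make the input invalid. Return the error name instead of throwing.
    static String decodeOrError(String s) {
        try {
            return String.valueOf(Integer.decode(s));
        } catch (NumberFormatException e) {
            return "NumberFormatException";
        }
    }

    public static void main(String[] args) {
        // parseInt is always base 10 unless told otherwise: leading zeros are ignored.
        System.out.println(Integer.parseInt("083"));  // 83
        // decode reads "010" as octal.
        System.out.println(decodeOrError("010"));     // 8
        // decode rejects "083" because 8 is not an octal digit.
        System.out.println(decodeOrError("083"));     // NumberFormatException
    }
}
```

So whether `083` counts as a number in Java depends entirely on which parsing rule is applied, which is much the same ambiguity the YAML versions disagree on.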
I think it's worth posting an issue about in snakeyaml
I meant YAML spec weirdness in that it’s so lax that people are often surprised by its behavior and as a result screw up their parsers/generators.
My Java is so freakin’ weak. What’s the quickest way to make a repro for a Java library?
I think your best bet is to look into the clj-commons library and just inline all the java interop into one blob
Oh, that I can do. What I’ll have trouble doing is building the durned thing 🙂
building? oh right
what I do:
javac --classpath $(clojure -Spath) Foo.java and then java --classpath $(clojure -Spath):. Foo
nowadays java also supports running a .java file (since java 11)
Coolio. Yeah, that sounds good. I was thinking of adding a test to their suite.
can you post a link in the clj-yaml issue when you've created one? going to sleep now, good luck!
For reference:
% cat NumberLikeString.java
package com.example;

import org.yaml.snakeyaml.Yaml;

class NumberLikeString {
    public static void main(String[] args) {
        // Dump the first command-line argument as a YAML document.
        String data = args[0];
        Yaml yaml = new Yaml();
        String output = yaml.dump(data);
        System.out.print(output);
    }
}
% java -classpath $HOME/.m2/repository/org/yaml/snakeyaml/1.29/snakeyaml-1.29.jar NumberLikeString.java 083
'083'
% java -classpath $HOME/.m2/repository/org/yaml/snakeyaml/1.30/snakeyaml-1.30.jar NumberLikeString.java 083
083
% java -classpath $HOME/.m2/repository/org/yaml/snakeyaml/1.32/snakeyaml-1.32.jar NumberLikeString.java 083
083
That wasn’t terrible. Thanks for the reminder. https://bitbucket.org/snakeyaml/snakeyaml/issues/550/regression-in-handling-number-like-strings