I'm working on a full EBNF grammar for the edn.c reader implementation, starting by enriching the current EDN specification: https://github.com/DotFox/edn.c/blob/abaa31d06ad765711842c5100cf4b035cc132a07/docs/grammar/edn_spec.org It would be great if someone could take a look, in case I missed something.
I found this interesting case which doesn't seem to be documented as part of the edn spec:
> (read-string "asdlkfj::asasdf")
Invalid token: asdlkfj::asasdf
> (clojure.edn/read-string "asdfa::asdfadsf")
Invalid token: asdfa::asdfadsf
It is mentioned by https://clojure.org/reference/reader#_symbols
https://ask.clojure.org/index.php/14787/double-colon-in-the-middle-or-at-the-end-of-identifier
@nbtheduke that's the lisp reader which is different than edn.
if the EDN spec doesn't mention it, I guess it's undefined for EDN, while Clojure defines it as not working. undefined behavior isn't something one should rely on
@smith.adriane sure but they share nearly identical code, so any answer to one will no doubt apply to the other
right, but clojure.edn/read-string doesn't parse asdf::adfa
even though it seems to be valid edn.
I'm preparing an extension to this spec that documents all the differences between plain edn and how it is implemented in clojure :)
You may also be interested in https://github.com/sogaiu/tree-sitter-clojure as another data point (even though it's clojure syntax rather than plain edn)
ah, the EDN spec says:
> Additionally, : # are allowed as constituent characters in symbols other than as the first character.
double colon is actually documented. however, I don't remember where I found it
@delaguardo the reader spec:
> A symbol can contain one or more non-repeating ':'s.
non-repeating
right! there)
right, but that's a reference for clojure syntax. I don't believe it's mentioned in the edn spec.
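To make the two quoted rules concrete, here is a minimal sketch in Python that checks only the colon rule and nothing else about symbol syntax. The function names are hypothetical, and the logic models the two sentences quoted in this thread, not the actual edn.c, clojure.edn, or Clojure reader code.

```python
def colons_ok_edn_spec(symbol: str) -> bool:
    """Colon/hash rule as quoted from the EDN spec: ':' and '#' are allowed
    as constituents, just not as the first character."""
    return not symbol.startswith((":", "#"))


def colons_ok_clojure_reference(symbol: str) -> bool:
    """Colon rule as quoted from the Clojure reader reference: a symbol may
    contain ':' characters, but they must be non-repeating."""
    return not symbol.startswith(":") and "::" not in symbol


if __name__ == "__main__":
    for token in ["asdlkfj::asasdf", "a:b:c", "plain"]:
        print(token,
              colons_ok_edn_spec(token),
              colons_ok_clojure_reference(token))
```

Under these assumptions `asdlkfj::asasdf` passes the EDN-spec wording but fails the Clojure reference wording, which is the gap discussed above.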
correct. the EDN spec only says: https://clojurians.slack.com/archives/C06E3HYPR/p1764689192016559?thread_ts=1764675433.212249&cid=C06E3HYPR
clojure's edn is not documented at all
is this not the documentation? https://github.com/edn-format/edn
sadly, no
yeah clojure.edn is purely implementation-based, is not bound or limited by the edn specification, and extends it in multiple ways (ratios, etc)
about the difference:
EdnElement = Nil
| Boolean
| Symbolic (* ##Inf, ##-Inf, and ##NaN *)
| Character (* Additional named characters and octal form *)
| String (* Additional octal escape sequence *)
| Integer (* Additional integer forms and minor deviations *)
| Float
| Ratio (* Rational ratios *)
| Symbol (* Many small additions and deviations *)
| Keyword (* Same as Symbols *)
| List
| Vector
| Map (* Additional syntax for namespaced maps *)
| Set
| TaggedElement
| DiscardSequence
| MetadataSequence (* Metadata literal for some Elements *)
;
here is the list I have so far.
In case you haven't seen it, this self-PR contains the generative tests that found most of the open issues on fast-edn: https://github.com/madclj/fast-edn/pull/1/files It evolved over several meetups; also see earlier commits in the branch.
in fact it's unusably specific as we had to work around all ~25 issues in the same test. earlier versions are better.
I'll see if I can port it to edn.c
Looks amazing so far, thanks 🙂
is the goal of fast-edn to match clojure.edn or to match the edn-format spec?
np! @ericnormand found the basic algorithm if you want to start over, which is: deleting a random character in a string should result in equivalent edn in fast/clojure.edn OR an error
oh that's clever
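A rough sketch of that deletion check in Python, for anyone porting it. It assumes "equivalent OR an error" means the two readers must either both return equal values or both fail. All names here are hypothetical, and `int` / `strict_int` are stand-ins for the real readers (fast-edn, clojure.edn, edn.c), not their APIs.

```python
import random
from typing import Any, Callable, Tuple

Reader = Callable[[str], Any]


def outcome(reader: Reader, text: str) -> Tuple[str, Any]:
    """Run a reader and normalise the result to ('ok', value) or ('error', None)."""
    try:
        return ("ok", reader(text))
    except Exception:
        return ("error", None)


def delete_char_at(text: str, index: int) -> str:
    """Return text with the character at index removed."""
    return text[:index] + text[index + 1:]


def random_deletion(text: str, rng: random.Random) -> str:
    """Delete one randomly chosen character (text must be non-empty)."""
    return delete_char_at(text, rng.randrange(len(text)))


def readers_agree(text: str, reader_a: Reader, reader_b: Reader) -> bool:
    """True if both readers return equal values or both raise."""
    return outcome(reader_a, text) == outcome(reader_b, text)


def strict_int(text: str) -> int:
    """Stand-in reader: accepts plain ASCII digits only."""
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"Invalid token: {text}")
    return int(text)


if __name__ == "__main__":
    rng = random.Random()
    mutated = random_deletion("1_234", rng)
    print(repr(mutated), readers_agree(mutated, int, strict_int))
```

One caveat with this sketch: values that are not equal to themselves (a NaN, for example) would be reported as a disagreement, so a real port would need its own equivalence function.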
while we are talking about edn vs. clojure.edn
I found an interesting case: clojure does not support 10+ dimensional arrays 😄
This is definitely a minor problem, but it's weird to see that String/9 works but String/12 doesn't
(I also extended it to randomly add commas, whitespace and newlines, which found many weird bugs around numbers)
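For reference, the insertion variant could look roughly like this in Python. The helper names and the filler set are assumptions based on the message above, not the code from the fast-edn branch. The mutated string would then be fed to both readers and their results compared.

```python
import random

# Assumed filler set: comma, space, newline, as mentioned in the message.
FILLERS = [",", " ", "\n"]


def insert_filler_at(text: str, index: int, filler: str) -> str:
    """Return text with filler inserted before position index."""
    return text[:index] + filler + text[index:]


def random_filler_insertion(text: str, rng: random.Random) -> str:
    """Insert one random filler at a random position, including both ends."""
    index = rng.randrange(len(text) + 1)
    return insert_filler_at(text, index, rng.choice(FILLERS))


if __name__ == "__main__":
    rng = random.Random()
    # A filler landing inside a number splits the token, e.g. "1,23" or "12\n3".
    print(repr(random_filler_insertion("[123 4.5e6]", rng)))
```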