#clojure-spec
2017-03-21
Alex Miller (Clojure team)00:03:46

@stathissideris @seancorfield regarding your conformer question above, use (s/conformer val) not second

Alex Miller (Clojure team)00:03:42

Or if you're feeling bold you can wrap it in the semi-hidden (and possibly going away) (s/nonconforming ...)

seancorfield00:03:53

Ah, yes, it’s a MapEntry not a vector.
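For context: conforming an s/or returns a MapEntry of [tag value], which is why val works. A minimal sketch of the pattern (assuming clojure.spec.alpha; at the time of this log the namespace was clojure.spec, and ::port is a made-up spec name):

(require '[clojure.spec.alpha :as s])

(s/def ::port
  (s/and (s/or :n int? :s string?)
         (s/conformer val)))    ;; strip the [tag value] entry down to just the value

(s/conform ::port 8080)    ;; => 8080, rather than [:n 8080]
(s/conform ::port "8080")  ;; => "8080"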

seancorfield00:03:16

I would use s/nonconforming but you keep saying it might go away 😛

seancorfield00:03:42

So why does it matter whether we use val or second?

seancorfield00:03:04

(I’m happy to “do the right thing” but want to know why my current code is “wrong”)

Alex Miller (Clojure team)00:03:06

val is direct field lookup

seancorfield00:03:16

Ah, so it’s faster.

Alex Miller (Clojure team)00:03:26

second will convert to a seq and pull two values

seancorfield00:03:56

I suspect I use second in other places where val would be better...

Alex Miller (Clojure team)00:03:37

Just don't ever use second :)

seancorfield00:03:13

Yeah, I have a whole bunch of (->> … (group-by something) (map second) …) forms!

seancorfield00:03:39

What about destructuring where you likely have (map (fn [[k v]] …) some-map)

seancorfield00:03:10

better to do (map (fn [me] … (key me) … (val me) …) some-map)?

Alex Miller (Clojure team)00:03:49

I'd use the one that you find more readable

Alex Miller (Clojure team)00:03:38

I'd have to look at the code to guess at any perf difference

Alex Miller (Clojure team)00:03:24

If destructuring uses nth then it's probably a wash

stathissideris06:03:32

@alexmiller thanks, I wasn't aware that using second instead of val for map entries is less performant
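A small sketch of the variants discussed here, on an ordinary map:

(def m {:a 1 :b 2})

(map val m)                               ;; => (1 2), direct field lookup on each entry
(map second m)                            ;; => (1 2), but each entry is seq'd first
(map (fn [[k v]] [k (inc v)]) m)          ;; destructuring goes through nth
(map (fn [e] [(key e) (inc (val e))]) m)  ;; key/val on the entry directly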

yonatanel12:03:32

I made a custom tagless s/or adapted from the original, but had to copy half of spec's private functions like specize, pvalid?, explain-1 etc. Am I missing some public util functions? https://github.com/yonatane/spec-grind/blob/master/src/spec_grind/impl.clj

iGEL13:03:46

A spec newbie question: how do I create an expectation about non-namespaced keyword keys of a map?

iGEL13:03:03

(do
  (s/def ::customer-id int?)
  (s/def ::category (s/keys :req [::customer-id]))

  (s/explain-data ::category {:customer-id 2}))

danielneal13:03:17

You can use req-un and opt-un

danielneal13:03:32

There are some details on the spec about page, I think: https://clojure.org/about/spec

iGEL13:03:53

Thanks. We missed that you still need to pass namespaced keywords to :req-un, but the expected map keys are non-namespaced. Now it works.
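A sketch of the working version, following the snippet above (:req-un names the namespaced spec, but matches the unqualified key):

(require '[clojure.spec.alpha :as s])

(s/def ::customer-id int?)
(s/def ::category (s/keys :req-un [::customer-id]))

(s/valid? ::category {:customer-id 2})          ;; => true
(s/explain-data ::category {:customer-id "x"})  ;; => problem reported for :customer-id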

Alex Miller (Clojure team)13:03:13

@yonatanel no, those are internal and subject to violent change without warning :)

iGEL13:03:12

Another question: I want to use spec in my tests, where I expect most keys in my map to have exact values, but others, like the id or created_at, which I can't predict, should just conform to a predicate. For example, given this map:

{:id 4
 :customer_id 5
 :section "lala"}
I want to see that customer_id is exactly 5 and section is "lala", but for id I just want to know that it's an integer. Finally, it would be nice if the spec failed when the result has other keys, but that's not required. Is that possible, and how?

moxaj13:03:55

@igel for the first part, you can use sets as predicates, like #{5} or #{"lala"}. For the second part, you can do (s/and (s/keys ...) #(every? #{:your :keys} (keys %)))

iGEL14:03:44

Thanks 🙂

moxaj14:03:05

(but it's generally advised against)
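A sketch putting both suggestions together for the map above; the spec names are made up:

(require '[clojure.spec.alpha :as s])

(s/def ::id int?)            ;; only needs to be an integer
(s/def ::customer_id #{5})   ;; a set as a predicate: exactly 5
(s/def ::section #{"lala"})

(s/def ::result
  (s/and (s/keys :req-un [::id ::customer_id ::section])
         ;; closed key set - generally advised against, as noted above
         #(every? #{:id :customer_id :section} (keys %))))

(s/valid? ::result {:id 4 :customer_id 5 :section "lala"})       ;; => true
(s/valid? ::result {:id 4 :customer_id 5 :section "lala" :x 1})  ;; => false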

Yehonathan Sharvit14:03:15

Another question related to (s/keys …) Imagine I run this:

Yehonathan Sharvit14:03:17

(s/def ::my-map (s/keys))
(s/valid? ::my-map {:foo/core "aa"})

Yehonathan Sharvit14:03:16

If :foo/core somehow has a spec defined in my registry, then that spec will run when calling (s/valid? ::my-map {:foo/core "aa"})

Yehonathan Sharvit14:03:31

But it is not my intent at all

Yehonathan Sharvit14:03:53

foo/core might be a namespace defined by a library that I am requiring

Yehonathan Sharvit14:03:03

Is there a security risk?

Yehonathan Sharvit14:03:19

Unexpected code can run based on user input

Yehonathan Sharvit14:03:47

imagine I parse some JSON like {"foo/core" "aa"} and decide to keywordize the keys
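A sketch of the scenario being described, where :foo/core stands in for a spec registered by some required library and a sleep stands in for expensive work:

(require '[clojure.spec.alpha :as s])

;; pretend this came from a library you required
(s/def :foo/core (s/and string? (fn [s] (Thread/sleep 5000) true)))

(s/def ::my-map (s/keys))    ;; no keys listed at all

(keyword "foo/core")         ;; => :foo/core - keywordizing can mint qualified keys

(s/valid? ::my-map {:foo/core "aa"})
;; runs the :foo/core spec (and its sleep) even though ::my-map never mentions that key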

yonatanel14:03:44

@viebel that's why it's recommended to use namespaces you own, otherwise you break others and others break you.

yonatanel14:03:18

@viebel But yes, that global registry is a pain in the ass.

Yehonathan Sharvit14:03:01

The problem is that once I require a library, I cannot prevent it from adding specs to the registry

Yehonathan Sharvit14:03:32

So when I parse the JSON, I must be careful when keywordizing the keys !!!?!?!

yonatanel14:03:58

@viebel I'm not sure spec is meant for portable data validation and coercion at all.

moxaj14:03:38

@viebel you can still preprocess the keywordized map, before validating it

yonatanel14:03:59

Because of how it's built. I'm using it for JSON all the time but it feels wrong.

mpenet14:03:21

I do the same and I don't have any issue with it. Not sure I follow

bronsa14:03:38

I think viebel has a point

yonatanel14:03:54

@mpenet I don't like prefixing all my keywords with a company name for example.

bronsa14:03:54

suppose you have a spec whose conformer does heavy-ish work

bronsa14:03:09

if you conform user input against a spec that uses keys

bronsa14:03:41

it will force that conformer to run

bronsa14:03:58

might be an opportunity for dos attack

bronsa14:03:17

forcing cpu bound work while parsing user data

mpenet14:03:36

if you accept input with keywordized keys and use spec you must follow some rules yes (proper namespaces), otherwise just use the *-un variants

mpenet14:03:55

(I do the latter)

mpenet14:03:21

the endpoint dictates the context already, no need for namespaces usually

bronsa14:03:21

well, the point is that if you're using keys and accepting user data you don't have control over what the user will pass you

bronsa14:03:48

and might maliciously pass namespaced keys that force a conform using an expensive conformer

mpenet14:03:52

if you take edn directly, yes, that would matter. But aren't we talking about JSON?

mpenet14:03:30

I think so

Yehonathan Sharvit14:03:39

we are talking about JSON

bronsa14:03:49

well, point still stands

bronsa14:03:56

suppose I was talking about edn/transit :)

Yehonathan Sharvit14:03:58

that you parse while keywordizing the keys, which is quite common practice

mpenet14:03:28

you then get un-namespaced keywords

mpenet14:03:34

so no risk for "injection"

yonatanel14:03:56

Unless you have data like {"event/type" "blabla"}

bronsa14:03:57

still, if you do something like (comp clojure.walk/keywordize-keys cheshire.core/parse-string) over your input data

bronsa14:03:11

& then conform

mpenet14:03:11

ah right, didn't realize that (keyword "event/type") would actually create a ns keyword, bummer

mpenet14:03:31

another argument for never keywordizing json I guess

bronsa14:03:54

(doesn't solve this issue with edn/transit over the wire)

mpenet14:03:41

this kind of sucks indeed, I wonder why keyword allows this at all

yonatanel14:03:44

@bronsa I guess you shouldn't blindly use (s/keys) on user input, or do anything blindly with user input.

mpenet14:03:55

there's the 2 arity version for ns stuff already

bronsa14:03:06

@mpenet it's been like that since forever and it's used quite often

mpenet14:03:17

still very unfortunate

bronsa14:03:26

it's quite useful

yonatanel14:03:59

@mpenet I read somewhere that keyword accepts anything exactly for this reason, to keywordize user input.

bronsa14:03:12

@yonatanel meh, i don't buy the argument "you should be careful with what you do while parsing user input" tho -- i'm conforming user input to check for valid data

mpenet14:03:15

that said you could write a safe keywordize-keys

mpenet14:03:25

using find-keyword
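One possible reading of that suggestion, as a sketch: only turn a string key into a keyword if that keyword is already interned, so arbitrary user strings don't mint new keywords (this alone doesn't stop already-registered qualified specs from being triggered):

(require '[clojure.walk :as walk])

(defn safe-keywordize-keys [m]
  (let [kf (fn [k] (if (string? k) (or (find-keyword k) k) k))]
    (walk/postwalk
     (fn [x]
       (if (map? x)
         (into {} (map (fn [[k v]] [(kf k) v])) x)
         x))
     m)))

(safe-keywordize-keys {"event/type" "x" "never-seen-before" 1})
;; => {:event/type "x", "never-seen-before" 1} when :event/type is already interned
;;    (e.g. some spec mentions it); otherwise both keys stay strings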

bronsa14:03:34

i don't want to manually check for invalid data and then conform it

bronsa14:03:51

what if I do want to use namespaced keys in my user data tho

bronsa14:03:26

then it's either live with the possibility of people exploiting this to overload the CPU while validating user input, or don't use namespaced keys at all

bronsa14:03:40

I would consider this a significant limitation

mpenet14:03:26

I guess this is a +1 for the "closed" key-set definition people are asking for

mpenet14:03:34

"a la Schema"

bronsa14:03:50

or using a private registry

idoa0114:03:53

Hey, new here and new to clojure-spec. I was discussing this issue with @viebel before he came here.

idoa0114:03:00

the thing that struck me is that (s/valid?) will call unexpected code on various inputs. As we saw here, it might be used to run malicious code on the server

bronsa14:03:53

I wouldn't say malicious code as the user can't really inject code

bronsa14:03:10

undesirable code, tho, yes

idoa0114:03:11

I think the openness is a huge problem here, (s/valid?) shouldn't run anything that I didn't ask for

idoa0114:03:49

it could be malicious if the code came from a library designed to inject that (s/def)

bronsa14:03:02

you're in control of what libs you're using tho

idoa0114:03:07

the library has a lot of useful code, and a backdoor

mpenet14:03:15

well that library would also write files and what not 🙂

bronsa14:03:42

if you're pulling in a backdoored lib spec is the last of your problems

mpenet14:03:46

but yes, the auto-validation of namespaced keywords even when not in the spec is odd

idoa0114:03:55

but sure, the more common problem is a bug in the library, or just cpu-intensive code

bronsa14:03:17

I don't think that (the openness of keys) is ever going to change

bronsa14:03:19

for good reasons

mpenet14:03:55

openness is fine; auto-validation of namespaced keys not present in the spec, not so much imho

idoa0114:03:12

then maybe add a (s/secure-valid?) for user input

yonatanel14:03:49

more like (s/keys :only [...])

mpenet14:03:52

the famous (s/keys) that triggers validation on every single key of the map

bronsa14:03:53

i don't see the point of openness w/o auto conform/validation

bronsa14:03:59

@yonatanel yeah that's never going to happen

bronsa14:03:17

it's been proposed a bunch of times and rejected with good reasons

yonatanel14:03:29

Cool. any links to that?

bronsa14:03:41

rich's last clojure/conj talk explains why closed specs make it harder to evolve systems

yonatanel14:03:04

Yeah, I watched it.

yonatanel14:03:27

So what's the problem really?

idoa0114:03:34

in that case, the library has to have a way to clean the input

bronsa14:03:43

> Namespaced keyword specs can be checked even when no map spec declares those keys
>
> This last point is vital when dynamically building up, composing, or generating maps. Creating a spec for every map subset/union/intersection is unworkable. It also facilitates fail-fast detection of bad data - when it is introduced vs when it is consumed.

yonatanel14:03:19

You're asking spec to sanitize your inputs and conformers to be your lossy coercions. From what I gather it's not meant to do that (I do use it that way though :))

idoa0114:03:53

I'm asking spec to not run code I didn't specifically intend to run.

dergutemoritz14:03:11

By that reasoning you'd also have to prohibit polymorphism across libraries

idoa0114:03:44

that's taking it too far, and when I use polymorphism, I intend to use polymorphism.

bronsa14:03:45

don't think that's the same at all

dergutemoritz14:03:19

It's similar in a way that you potentially run code that you didn't know you were going to run from the point of calling a function

bronsa14:03:37

this issue implies that you can't safely run s/valid? on user-provided input without validating it first (which is what s/valid? is for?). Now suppose one of your libs defines a spec for some internal validation and it's bugged in a way that causes certain inputs to never terminate parsing. Now somebody discovers it and maliciously provides those inputs to you, and you have no way of avoiding that spec being run

bronsa14:03:47

I think it's a very valid issue to raise

bronsa14:03:07

let's wait to see what @alexmiller thinks about it

dergutemoritz14:03:11

Yeah, that spec is not good for running on untrusted inputs is a good observation

bronsa14:03:04

this to me feels similar to the problem that clojure.edn solved for parsing user input

Alex Miller (Clojure team)14:03:43

but you are asking spec to run validation using s/keys, which will validate keys

Alex Miller (Clojure team)14:03:47

if you don’t want that, don’t do it

dergutemoritz14:03:06

Good point, hehe

idoa0114:03:31

then what is the recommended way to validate a map?

idoa0114:03:00

or more specifically, a JSON

Alex Miller (Clojure team)14:03:09

I think you’re implicitly conflating multiple things into one question

Alex Miller (Clojure team)14:03:33

spec has capabilities for validating data

Alex Miller (Clojure team)14:03:24

one of them is to define a spec for a map as a collection of attributes

Alex Miller (Clojure team)14:03:26

recognizing that many current maps have unqualified keys, s/keys provides :req-un and :opt-un options for validating unqualified keys

Alex Miller (Clojure team)14:03:57

and that is one approach to validating json

idoa0114:03:15

but a map validated with :req-un and :opt-un can still contain fully qualified keys, which will cause the same issue

Alex Miller (Clojure team)14:03:41

so don’t rely on spec to solve every problem for you

Alex Miller (Clojure team)14:03:24

it's your responsibility as an application developer to think about how you handle untrusted inputs from the external world

bronsa14:03:28

> but you are asking spec to run validation using s/keys, which will validate keys

right, but even if I'm not doing that while validating input data, I might be doing it in some internal function that my valid input data gets threaded through

bronsa15:03:36

> it's your responsibility as an application developer to think about how you handle untrusted inputs from the external world

which is why I'm validating it with spec :) it feels a bit odd to me that we have to be careful about passing user data to validating functions

bronsa15:03:11

if "spec is not the right tool at this level" is the answer I guess I'll live with that but I can imagine a lot of people will be confused by this answer?

Alex Miller (Clojure team)15:03:03

I’m not saying it’s not the right tool, just that you should think critically about what you’re doing. sorry not sorry about that.

bronsa15:03:07

esp because people are using schema for doing that right now and will be tempted to switch from schema to spec w/o realizing the implication

Alex Miller (Clojure team)15:03:23

you control the libraries you’re using, the code you’re running, and when and how you validate data

Alex Miller (Clojure team)15:03:15

all your specs call predicate functions, maybe from other libraries - how is that any different?

bronsa15:03:37

that's all true, it still doesn't make me feel any less weird about having to validate my data in order to validate it through spec

bronsa15:03:49

it's not different, conform vs predicate functions is not the issue

idoa0115:03:16

it's a bit like adding eval to the code without telling the non-expert user that you're doing it. I was really surprised when @viebel pointed this feature out to me.

bronsa15:03:33

the issue is it appears it's not safe to validate/conform user-provided data with spec w/o.. validating it before passing it to spec

idoa0115:03:42

so it may work especially well for anyone in the know, but will shoot the regular developer in the foot.

Alex Miller (Clojure team)15:03:27

can you give an example of a spec that would be “not safe” in this way?

bronsa15:03:17

no, safety is not an issue, I don't think. I'm thinking of some malformed spec that causes non-termination with very simple inputs, which is not unlikely to happen

Alex Miller (Clojure team)15:03:47

if it’s a bug in a predicate or spec, then you would fix it, just like you would with any bug that has that effect in your code

Alex Miller (Clojure team)15:03:54

I’m questioning the premise of this problem

bronsa15:03:04

correct, it would be a bug in a spec

bronsa15:03:10

which I don't have control over

Alex Miller (Clojure team)15:03:17

you do - you chose to load it

Alex Miller (Clojure team)15:03:22

you can choose to not load it

bronsa15:03:33

yes, but I might not know about the bug in this spec

bronsa15:03:44

nor might the library author

Alex Miller (Clojure team)15:03:46

and you might not know about a bug in a function in a library you call

bronsa15:03:52

but an attacker might discover it

Yehonathan Sharvit15:03:01

The weird thing is that even if the spec is just there - defined in the lib - without even being in use, it will be in the registry and corrupt my code

Alex Miller (Clojure team)15:03:14

it will not be unless you load the code

Alex Miller (Clojure team)15:03:23

you control the registry

Yehonathan Sharvit15:03:36

Should I check all the specs defined in all the libs I load?

Alex Miller (Clojure team)15:03:48

should you check all the functions defined in all the libs you load?

bronsa15:03:52

right, but it seems much more likely to be exploitable in a spec than to find the specific data that causes an invocation of that function to trigger the bug

Yehonathan Sharvit15:03:56

All the functions I use

Yehonathan Sharvit15:03:03

But not the functions I don't use

Alex Miller (Clojure team)15:03:12

all the functions your functions call?

Alex Miller (Clojure team)15:03:28

then yes, you should check all the specs too

idoa0115:03:54

say I'm using a common library (say cheshire) and it's upgraded with a buggy spec. Now anyone who uses that lib is exposed to that bug, without explicitly calling the spec, just by loading cheshire

Alex Miller (Clojure team)15:03:14

yes, exactly like if cheshire had a function that was buggy

bronsa15:03:17

I understand what you're saying, I don't agree that the two have the same severity tho

Yehonathan Sharvit15:03:28

I cannot discover this bug with any kind of unit tests

Yehonathan Sharvit15:03:42

(Even using test.check !)

Alex Miller (Clojure team)15:03:53

that does not make sense

bronsa15:03:59

I have control over what functions I use and how I use them, I don't have control over what (loaded) specs some user data will invoke unless I validate that data before I spec/validate it (or don't use keys)

rauh15:03:22

At the easiest, you can trigger a stack overflow if someone has a recursive spec defined somewhere.
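A sketch of that failure mode, with a made-up recursive spec and deeply nested input:

(require '[clojure.spec.alpha :as s])

(s/def ::node (s/keys :opt [::node]))

(def deep (reduce (fn [m _] {::node m}) {} (range 100000)))

;; validation walks the nesting recursively and can blow the stack
;; instead of returning a result:
;; (s/valid? ::node deep)   ; may throw StackOverflowError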

bronsa15:03:40

anyway, it seems like the answer is "be careful", not sure it satisfies me but I can live with it

rauh15:03:48

I'd love to vote on a ticket if one is created, because until 10min ago I didn't realize s/keys checked other keys implicitly.

bronsa15:03:58

that's in s/keys docstring

yonatanel15:03:10

using an empty (s/keys) beats defining every object in even a medium system.

Alex Miller (Clojure team)15:03:10

@rauh if the spec throws an error, then it is doing its job in telling you that it's invalid data

idoa0115:03:24

the implicitness of (s/keys) opens the code up to trouble. An in-between solution would be for (s/keys) to invoke only locally namespaced specs, and not library-defined specs.

Alex Miller (Clojure team)15:03:59

those are not two differentiable things

donaldball15:03:06

Every once in a while I wish for a s/keys version that allowed a whitelist of spec keywords or namespaces that it checks

bronsa15:03:25

they are if you allow using a local registry rather than a global one

bronsa15:03:03

the local registry could be defined as a subset of the global one, becoming effectively a whitelist

bronsa15:03:23

(not sure if it's a good idea)

idoa0115:03:54

used with (s/keys :ns ... )

Alex Miller (Clojure team)15:03:37

s/keys combines multiple things. enhancement requests to do individual parts of that might be worth considering.

Alex Miller (Clojure team)15:03:38

separating required/optional key checks from validation of specified keyed attributes from validation of all keyed attributes

rauh15:03:12

@alexmiller A StackOverflowError which can easily be triggered by user input isn't something most programmers catch, and valid? is certainly expected to return true/false. A StackOverflowError doesn't even get caught with (a pretty wide) (catch Exception _)

rauh15:03:40

It goes under VirtualMachineError and will likely kill the thread.

Alex Miller (Clojure team)15:03:21

again, how is this different from having a bad function that throws a StackOverflowError?

Alex Miller (Clojure team)15:03:49

if it’s a bug, fix it

idoa0115:03:50

the difference is that the code only needs to be loaded, not referenced anywhere

Alex Miller (Clojure team)15:03:05

which is not at all different than multimethods or protocol extensions

Alex Miller (Clojure team)15:03:23

there are several Clojure constructs that create runtime state on load

rauh15:03:11

Because the attack is much easier.

Alex Miller (Clojure team)15:03:48

how is it any different than a bug in the validation function you were going to hand write instead?

Alex Miller (Clojure team)15:03:42

the attack (from outside) is identical

idoa0115:03:46

the question is: if a programmer who isn't in this Slack room uses clojure-spec, will he think about sanitizing the JSON before passing it to (s/valid?)? If not, then the buggy behavior will be widespread.

idoa0115:03:11

has this behavior of (s/keys) been discussed in regard to this issue before? If not, I think it qualifies as a "surprising" side effect of clojure-spec

Alex Miller (Clojure team)15:03:41

it’s been discussed many times. it’s discussed in the guide, and in the spec rationale, and in the doc string.

idoa0115:03:56

everyone who uses clojure-spec to validate REST API calls

Alex Miller (Clojure team)15:03:05

no, I mean for a particular application

Alex Miller (Clojure team)15:03:22

if someone passes bad input, it could yield an error response

Alex Miller (Clojure team)15:03:27

how is that different than any bad input

Alex Miller (Clojure team)15:03:36

that’s the whole point of validation?

idoa0115:03:46

because I'm using clojure-spec to validate the input is not bad.

idoa0115:03:02

so clojure-spec should tell me it is.

idoa0115:03:16

not run arbitrary code that might harm the machine

Alex Miller (Clojure team)15:03:25

how can it harm the machine?

Alex Miller (Clojure team)15:03:32

you chose to load the “arbitrary code”

Alex Miller (Clojure team)15:03:44

this is not code some attacker is supplying to you

idoa0115:03:50

stackoverflow, memory consumption, cpu overload. you name it.

Alex Miller (Clojure team)15:03:11

all things that can happen from an invalid input also sent to a “bad” validation function

idoa0115:03:26

it all depends on what is written in a faulty spec, that i didn't say i want to use.

Alex Miller (Clojure team)15:03:05

I don’t see any way to actually cause memory consumption or cpu overload in dangerous ways. stackoverflow maybe, although I don’t have an example of that either.

bronsa15:03:37

i think the claim being made is that the surface area of the possible "attack" (for lack of a better word) is potentially way larger with bugged specs than other bugged constructs

Alex Miller (Clojure team)15:03:42

I have no examples of “faulty specs” that can cause improper machine resource usage.

Alex Miller (Clojure team)15:03:59

what are the alternatives?

Alex Miller (Clojure team)15:03:19

1) don’t validate and don’t be aware of invalid data

Alex Miller (Clojure team)15:03:34

2) validate by using functions in either your code or libs

Alex Miller (Clojure team)15:03:01

1 does not seem better and 2 does not seem effectively different to me

idoa0115:03:02

allow me to not run any arbitrary spec code implicitly if i don't want to 😕

Alex Miller (Clojure team)15:03:12

and why is that harmful?

mpenet15:03:20

a spec predicate that does something over the network (call a db?), parses string content (or whatever resource heavy operation you can think of)

idoa0115:03:25

why would I want to run it?

mpenet15:03:42

not your average predicate, but not too crazy either

rauh15:03:18

IMO it'd be nice to have multiple registries. For instance, I'm using :db/id, :object/id and :file/id in my application code. Down the road when many libraries are spec'ed this will get trampled and lead to issues. Or am I missing something?

mpenet15:03:27

that's not so much of an issue with ns aliases + proper namespacing

mpenet15:03:47

it's the same problem with project names/ns/package. That said, multiple registries would be nice for other reasons

Alex Miller (Clojure team)15:03:42

@rauh you are missing the use of proper namespacing :)

yonatanel15:03:19

@alexmiller Would you say datomic attributes should be qualified with a namespace you own, or is :album/year enough?

yonatanel15:03:50

I see spec and datomic go hand in hand. Correct me if I'm wrong.

Alex Miller (Clojure team)16:03:37

@yonatanel same advice as spec. if you're providing data for use with others, you must use a qualifier that you "control" (reverse domain, trademarked entity, etc). if the data will be used only in an environment that you control, it must only be "sufficiently unique"

Alex Miller (Clojure team)16:03:37

so in a generic open source library, use a qualifier you control. If confined in your app, do whatever you want. If in an organizational context, you might need to ensure uniqueness in your organization.

moxaj16:03:41

@alexmiller care to comment on the snippet above? How should one defend against an 'attack' like this?

Alex Miller (Clojure team)16:03:13

Don't call valid?, don't load this spec, don't use s/keys, or use select-keys to pre-filter what you look at

Alex Miller (Clojure team)16:03:29

Check whether your input contains 10000 nested maps
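A sketch of the select-keys pre-filter, with made-up spec names:

(require '[clojure.spec.alpha :as s])

(s/def ::a int?)
(s/def ::my-spec (s/keys :opt [::a]))

(defn valid-input? [m]
  ;; only the keys ::my-spec actually talks about reach the spec
  (s/valid? ::my-spec (select-keys m [::a])))

(valid-input? {::a 1 :foo/core {"deeply" {"nested" "payload"}}})
;; => true - the :foo/core value is never handed to any registered spec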

Alex Miller (Clojure team)16:03:14

Again, also compare this with what you would do without spec too - is that prone to the same issue?

moxaj16:03:15

well, without spec, I wouldn't check for an ::x key within my input, so no

Alex Miller (Clojure team)16:03:55

so is it better to notice the bad input or to pass 10000 nested maps around your system?

dergutemoritz16:03:55

(provided your JSON parser didn't blow up with a StackOverflowError before that point anyway, hehe)

dergutemoritz16:03:25

(or whatever your input format is)

tbaldridge16:03:28

And "attack" is a bit of a strong word here, considering it's an exception, not a granting of root privileges or something like that.

moxaj17:03:31

It isn't a bad input though. My spec says it might have an 'a' key and that's it. I do not care about the rest, if I did, I would have specified that in my spec.

moxaj17:03:52

@tbaldridge I agree about the attack part, hence the quotes (I'm pulling a trump here lol)

Alex Miller (Clojure team)17:03:13

@moxaj that’s not what spec says that spec means

Alex Miller (Clojure team)17:03:00

that spec says "a map that might have an ::a key, and where all values are valid according to the spec of their keys"

sparkofreason17:03:17

Is there any way to define coll-of specs such that the associated generator would yield a collection type such as PersistentQueue, sorted-set, etc? I assume I could do it with a custom generator, just wondering if there's a shorter path.

Alex Miller (Clojure team)17:03:29

@moxaj if you want what you said, then just map? is sufficient

Alex Miller (Clojure team)17:03:54

(s/def ::c (s/coll-of int? :into (sorted-set)))
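Usage matching the question, assuming the clojure.spec.gen.alpha namespace (clojure.spec.gen at the time of this log); note the caveat further down that non-standard :into values may not be officially supported:

(require '[clojure.spec.gen.alpha :as gen])

(gen/sample (s/gen ::c) 3)
;; => e.g. (#{} #{-1 0 1} #{0 2 5}) - each generated collection is a sorted set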

moxaj17:03:20

The keys spec is perfect for me, except for the implicit part. But I guess that's not subject to change, so no point in arguing :)

Alex Miller (Clojure team)17:03:43

well as I said above, an enhancement ticket that separates parts of keys seems like a reasonable idea

Alex Miller (Clojure team)17:03:37

I do not know how Rich would react to it, but "decomplecting" is usually a good thing :)

otfrom17:03:19

I do a lot of ETL work where I start with a pile of strings and turn them into some sort of seq of data structures. The strings encode things like a map of key to string, or key to set of numbers. I've had some reasonable success using spec to check that the strings are in a format that is coercible to something, e.g. "100" -> 100 or "100,101,102" -> #{100 101 102}. Is this a terrible, no good, please-stop use of spec?
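A sketch of that kind of coercing spec (names made up), subject to the caveats Alex raises below about conformers that transform data:

(require '[clojure.spec.alpha :as s]
         '[clojure.string :as str])

(s/def ::int-string
  (s/conformer #(if (and (string? %) (re-matches #"\d+" %))
                  (Long/parseLong %)
                  ::s/invalid)))

(s/def ::int-set-string
  (s/conformer #(if (and (string? %) (re-matches #"\d+(,\d+)*" %))
                  (into #{} (map (fn [n] (Long/parseLong n))) (str/split % #","))
                  ::s/invalid)))

(s/conform ::int-string "100")              ;; => 100
(s/conform ::int-set-string "100,101,102")  ;; => #{100 101 102}
(s/conform ::int-set-string "nope")         ;; => :clojure.spec.alpha/invalid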

Alex Miller (Clojure team)17:03:10

are you checking that they are coercible or actually doing the coercion?

otfrom17:03:36

I've got a function that checks s/valid? using the spec and then s/conform using the same spec

Alex Miller (Clojure team)17:03:05

so you are transforming the data via the conform

otfrom17:03:14

(which I think is the bad bit)

Alex Miller (Clojure team)17:03:28

so the general caveat for stuff like that is not that it’s necessarily bad but that it has consequences that you should understand

Alex Miller (Clojure team)17:03:48

namely that users of that spec cannot recover the original data (as you could via s/unform normally)

Alex Miller (Clojure team)17:03:08

and if you put that spec in the registry, you’ve made that choice for consumers of the spec

Alex Miller (Clojure team)17:03:26

if "consumers of the spec" == you, then that's up to you :)

Alex Miller (Clojure team)17:03:40

if you’re using conformers, and if the function is reversible, you can supply an unform function in conformer to retain the reciprocal nature of that (assuming it is worth the trouble for you, which it may not be)

otfrom17:03:29

I've not seen an unform example that does that yet. That would be interesting. After talking to seancorfield about it quite a bit I've been building up quite a few generators for testing things that have done a good job of driving out some bugs in my functions.

otfrom17:03:02

would I be able to use an unform function like that inside a generator for testing?

otfrom17:03:22

(and this is gold dust to me so thx. I'm glad it is a trade off I should think about rather than a terrible idea)

otfrom17:03:52

(and I realise this spark/hadoop/ETL strings -> data is not every domain)

sparkofreason17:03:01

@alexmiller Thanks, that worked. Docs make it sound like :into is limited to [], (), {}, #{}, should have just tried it.

Alex Miller (Clojure team)17:03:13

user=> (s/def ::s (s/conformer #(str "hi " %) #(subs % 3)))
:user/s
user=> (s/conform ::s "bruce")
"hi bruce"
user=> (s/unform ::s "hi bruce")
"bruce"

Alex Miller (Clojure team)17:03:24

it’s just functions yo

otfrom17:03:41

:hugging_face:

Alex Miller (Clojure team)17:03:51

Rich would also caution you against treating spec as a transformation engine rather than something that validates declarative statements about your data (i.e. "it's not a meat grinder").

Alex Miller (Clojure team)17:03:07

but that capability is there for your abuse :)

otfrom17:03:24

I hadn't seen that conformer done that way

Alex Miller (Clojure team)17:03:44

@dave.dixon actually, I may have led you into the path of undefined behavior there

Alex Miller (Clojure team)17:03:11

I think you’re right that the intention was to only support that fixed set in :into for the time being

otfrom17:03:11

alexmiller this is the "use clojure.core for this kind of thing" comment I've seen floating about

otfrom17:03:53

I think I'm still searching for a better way of turning other peoples strings into data.

otfrom17:03:29

(and dividing the invalid from the valid ones and reporting on the invalid ones well)

otfrom17:03:52

I quite liked the error msgs that I could get out of spec around what had failed when I tried to validate

Alex Miller (Clojure team)17:03:45

@dave.dixon there is some ambiguity (I think due to impl churn) about whether :kind is supposed to be a function or a spec. Only a function works now, but the doc implies a spec (which could itself have a custom generator). There is a ticket regarding this issue and I haven't yet gotten Rich to respond to me about it. :)

otfrom17:03:03

thx alexmiller

sparkofreason17:03:06

@alexmiller Thanks, that answers my next question. Will probably just go with it for now since it seems to work, and the only other option would appear to be to write a custom generator for every "non-standard" collection spec.

yonatanel18:03:18

What's that book/paper spec was inspired by?

hiredman18:03:08

are you talking about the parser?

hiredman18:03:21

racket has a contract system which, while I don't think I've heard anyone on the core team say was an inspiration, a lot of people just assume it must have been

hiredman18:03:41

the way spec validates sequential data structures is by "parsing" them using a parser based on parsing with derivatives (also a racket connection there)

yonatanel19:03:09

yes, parsing with derivatives. Thanks!

Yehonathan Sharvit19:03:04

@yonatanel Here is an interactive article that I wrote about the basics of "parsing with derivatives"

Yehonathan Sharvit19:03:13

With interactive clojure code snippets

yonatanel19:03:33

@viebel Yeah it's one of the first google results for parsing with derivatives...