Fork me on GitHub
#core-typed
<
2022-05-01
>
didibus03:05:19

Random thought. How difficult would it be do design a static type checker (or maybe it's a kind of analyzer) which would track the keys and values of maps statically across an application so that: 1. It can assert that a map will have certain keys inside a given function 2. It can assert that a map might have certain keys inside a given function 3. It can be asked to tell you what keys a map at a particular place in the code could contain with a list of options

didibus03:05:42

It wouldn't even need to track the type of keys or values, that be a bonus if it also can.

ambrosebs03:05:45

Without annotations?

didibus03:05:05

I guess it would need to track the movement of the map through the code, and also track all instances of assoc or update on it where the key is constant. And somehow it would need to figure out that possibly unreachable code makes the key optional. I wouldn't expect it to track dynamic keys, like computed keys, though maybe indirect like a constant inside a variable maybe it could as well.

didibus03:05:52

I guess part of my question is if it is possible at all, and how little annotations could it get away with

ambrosebs03:05:58

IIRC ocaml does something like this with row variables. I barely remember the details but you can imagine that it's held together delicately and has many restrictions (IIRC you can't use the same field in two unrelated objects in the same file).

Ben Sless04:05:04

Seems like a rather arbitrary restriction, no?

ambrosebs15:05:58

Global inference is a whole other world, I think if you're trying to nail down the fields of a record and their types using global analysis it might be a necessary restriction. But take it with a grain of salt, haven't thought about this in 5 years.

Ben Sless15:05:58

The basic model for type inference is sets, right?

ambrosebs18:05:51

Yeah sets of constraints. It's a delicate balance of expressivity to make them solvable.

Ben Sless18:05:46

Can't maps be inferred then by treating them as a union of constraints regarding the occurrence of their entries?

Ben Sless18:05:36

Maybe that just breaks everything, I'm no expert

ambrosebs19:05:07

I think you're right in terms of inference, but if you have 2 different records with the same field name, the type system won't know which record you're referring to.

(defn foo [m1] (inc (:a m1))
(defn bar [m2] (name (:a m2))
(foo {:a 1})
(bar {:a 'a})
If you're just trying to find the "type" of :a entries, is it Int or Sym or (U Int Sym)? That's the basic problem IIRC.

ambrosebs19:05:07

You still need (inc (:a m1)) to type check. So you need to choose. And IIRC it's simply forbidden so the choice is made implicitly.

Ben Sless03:05:56

Why can't they be different sets? Their Union succeeds and their intersection fails. Their inference is tracked just like any set propagating through the calculation. What I'm missing is the need to have a global state associated with it

ambrosebs14:05:33

Good point, I went back to the papers to stop guessing. You're right, the functions are inferred separately with polymorphic rows. See sections 2.1, 2.2 in https://caml.inria.fr/pub/papers/garrigue-structural_poly-fool02.pdf

ambrosebs14:05:23

I misremembered, the fields on an object are inferred on its definition, not its usages. see first example in https://caml.inria.fr/pub/papers/remy_vouillon-objective_ml-tapos98.pdf

didibus21:05:35

I'm thinking not even doing anything with the types really, so lets assumes keys and values are just of type ANY. The question would be more:

(defn bazz [m] (get m :name))
Will :name exist on m in all cases, in only some cases, or in no cases.

didibus21:05:14

And could you ask the type checker to list out all cases of the set of keys m could have from all the callsite to `bazz?

didibus21:05:51

(bazz {:name 10 :age 10})
(bazz {:age 23})
(bazz (if wtv? {:hello true} {})
For example, you would get something like:
bazz is called with:
a :name key AND an :age key
or
a :age key
or
a :hello key
or
with no keys

didibus22:05:57

So similarly, you could tell the type checker to assert that bazz is called with both a :name and :age key, and it could tell you if that's not the case. And maybe it could help you find where in the code woulds it get called without those keys.

didibus22:05:06

And finally, I know some places would be undecidable, like say:

(bazz (hash-map (keyword (slurp "foo.txt")) "John"))
For those, I'd imagine the type checker would tell you that there are some unknown keys the map can have, and it would help you find where in the code those could be added.

didibus22:05:10

And possibly it could let you add a runtime assertion and maybe the type checker could understand that or be told about it, for example:

(bazz (assert-key (hash-map (keyword (slurp "foo.txt")) "John") :name))

didibus22:05:10

So now even though its dynamic, the checker would know that this should make it so the map contains the :name key as well.

didibus22:05:40

Maybe calling it a type checker would be wrong, it be more like a static key checker.

ambrosebs03:05:31

obviously that's one end of the spectrum (hindley-milner)

ambrosebs03:05:28

I think you end up with the same problem as automatically inferring Typed Clojure annotations. You'd need to track maps thru sequence fn's and transducers etc

ambrosebs03:05:08

looking at my dissertation again, I think this is a kind of Higher-order Control Flow Analysis https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.36.6128