#malli
2023-09-15
Stuart Nath 19:09:45

Hello everyone, I'm new to Malli and am trying to use it for referential integrity checks on my data. The following approach works, but is slow compared to other approaches, such as a http://tech.ml join. The dataset is a vector of maps, where each map represents a row. There are 170k rows. I am trying to run a referential integrity check on the "Item" key, which has 2,100 unique values. This is what I did:

(def enum-statement  (into [:enum] set-of-items))
(def m-schema [:map ["Item" enum-statement] ["ShipmentDate" :some] ["Tons" :double]])
(time (frequencies (pmap #(m/validate m-schema %) data))) ;; => 40 seconds
(time (frequencies (map #(m/validate m-schema %) data))) ;; => 90 seconds
Is there a faster way to do this in Malli? If not, is there a way to speed up the current approach? I might still end up using it because it is pretty elegant, but wanted to ask here in case there is an easy performance boost available.

Stuart Nath 19:09:18

Using the validator function we got the time down to 1.7 seconds:

(time
  (let [valid? (m/validator m-schema)]
    (frequencies (pmap #(valid? %) demand-data))))
This has solved our speed issue. We will keep experimenting with it.

ikitommi 11:09:44

Yes, m/validator returns a pure and (mostly) optimized function for this. I would assume map would be faster than pmap here if the data is already read into memory, e.g. (map valid? demand-data)

Stuart Nath 13:09:13

That did turn out to be the case: map was 3x faster than pmap.
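
For readers following the thread, here is a minimal sketch of the pattern discussed above: build the validator once with m/validator and reuse it for every row, instead of calling m/validate with the raw schema data on each row. The item set and the two sample rows are made up for illustration; only the schema shape comes from the thread.

```clojure
(require '[malli.core :as m])

;; Hypothetical stand-ins for the real data (2,100 items, 170k rows in the thread).
(def set-of-items #{"A" "B"})

(def rows
  [{"Item" "A" "ShipmentDate" "2023-09-01" "Tons" 1.5}
   {"Item" "Z" "ShipmentDate" "2023-09-02" "Tons" 2.0}]) ; "Z" is not a known item

;; Same schema shape as in the thread.
(def m-schema
  [:map
   ["Item" (into [:enum] set-of-items)]
   ["ShipmentDate" :some]
   ["Tons" :double]])

;; Compile the schema into a predicate once, outside the loop.
(def valid? (m/validator m-schema))

;; Reuse the compiled predicate for every row.
(frequencies (map valid? rows))
;; => {true 1, false 1}
```

Passing the schema as plain data to m/validate means the work of turning that data into a validator is repeated on each call, which is presumably where most of the original 90 seconds went.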

Eli Pinkerton 23:09:56

Hello all, I'm new to Malli (as well as Clojure!) and am using it via gungnir (https://kwrooijen.github.io/gungnir/model.html) to define my db models. This is very cool. However, I have a problem: the data source for the objects that I want to put into my db doesn't always have matching field names. For example, the db column for a table might be named "foo", but the JSON field that I want to store in that column is named "foobar". I want to map "foobar" to "foo", and potentially run some conversion functions on the value (to change its type from an int to a string, for example). It looks like Malli supports transformers (https://github.com/metosin/malli#value-transformation), but I'm a bit confused as to how to use them. Would appreciate some guidance here!
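
One possible way to approach this with plain Malli is sketched below. It is not taken from the thread and does not use gungnir. It assumes that mt/key-transformer can rename the incoming keys and that a schema property named after a custom transformer can convert the value. The names Row, rename-key, incoming->db and the :db transformer name are hypothetical and chosen only for this example.

```clojure
(require '[malli.core :as m]
         '[malli.transform :as mt])

;; Hypothetical model: the db column is :foo and should hold a string.
;; The :decode/db property is looked up by the transformer named :db below.
(def Row
  [:map
   [:foo [:string {:decode/db str}]]])

;; Hypothetical mapping from incoming JSON field names to db column names.
(def rename-key {"foobar" :foo})

;; Chain two transformers: first rename map keys, then apply :decode/db properties.
(def incoming->db
  (mt/transformer
    (mt/key-transformer {:decode (fn [k] (get rename-key k k))})
    {:name :db}))

(m/decode Row {"foobar" 42} incoming->db)
;; => {:foo "42"}
```

Keys that are not in rename-key are passed through unchanged, so the decoded value should still be checked with m/validate (or a validator built with m/validator) before it is written to the database.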