This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-08-23
Hello! I ran into some kind of performance issue with the code below. The time to execute seems to increase exponentially for every m<nr> I add on the LHS. The code below takes about 10 seconds to execute. Any ideas?
(m/rewrite {0 {:id 0
               :m1 {:a {:id :a, :x 0}}
               :m2 {:b {:id :b, :x 0}}
               :m3 {:c {:id :c, :x 0}}}
            1 {:id 1
               :m1 {:a {:id :a, :x 0}}
               :m2 {:b {:id :b, :x 0}}
               :m3 {:c {:id :c, :x 0}}}}
  (m/map-of _
            {:id !id
             :m1 {& (m/seqable [!m1-ids !m1s] ..!m1-cnt)}
             :m2 {& (m/seqable [!m2-ids !m2s] ..!m2-cnt)}
             :m3 {& (m/seqable [!m3-ids !m3s] ..!m3-cnt)}})
  [{:id !id
    :m1-ids [!m1-ids ..!m1-cnt]
    :m2-ids [!m2-ids ..!m2-cnt]
    :m3-ids [!m3-ids ..!m3-cnt]}
   ...])
I spent some time narrowing it down to test-m2 being slow compared to the others below:
;; fast
(defn test-m1 [x]
  (m/rewrite x
    (m/map-of _
              {:id !id
               :m1 [!m1s ...]
               :m2 [!m2s ...]
               :m3 [!m3s ...]
               :m4 [!m4s ...]})
    [{:id !id}
     ...]))
;; slow
(defn test-m2 [x]
  (m/rewrite x
    (m/map-of _
              {:id !id
               :m1 [[!m1-ids !m1s] ...]
               :m2 [[!m2-ids !m2s] ...]
               :m3 [[!m3-ids !m3s] ...]
               :m4 [[!m4-ids !m4s] ...]})
    [{:id !id}
     ...]))
;; fast
(defn test-m3 [x]
  (m/rewrite x
    {:id !id
     :m1 [[!m1-ids !m1s] ...]
     :m2 [[!m2-ids !m2s] ...]
     :m3 [[!m3-ids !m3s] ...]
     :m4 [[!m4-ids !m4s] ...]}
    [{:id !id}
     ...]))
test-m1 is faster than test-m2 because it has fewer memory variables. Repeating the pattern up to m8 in test-m1 makes it perform like test-m2. test-m3 does not show the same behaviour. So my working conclusion is that it is related to map-of and the number of memory variables.
I guess this could be a similar issue to #234 (https://github.com/noprompt/meander/issues/234#issue-1292196144). I will leave it at that for now and try a different solution.
Hello! I'm starting to learn my way through meander and I hoped someone could help me with (what I believe is) a simple use case I'm struggling with: I have 2 CSVs (vectors of vectors); one holds, in each row, a list of ids that refer to rows in the other CSV (tags). I want to collect all the tags that match the relevant item. Something like this, but I haven't wrapped my head around matching / collecting / spreading:
(defn data-mapper [data]
  (m/search data
    {:data (m/scan _ ?product)
     :tags (m/scan _ (m/pred #(contains? (split (?product 36) ";") (% 0)) tag?))}
    {:products {"name" (?product 12)
                "description" (?product 13)
                "price" (js/parseInt (?product 17))
                "media" {"data" {"src" (?product 24)}}
                "product_tags" {"data" [{"name" (tag? 3)} ...]}}}))
If anyone can point me in the right direction that would be great ^_^
Hi Lidor, are you able to provide a sample of your input data and expected output data? Also, note that it is only safe to use ?product in the m/pred function like that for small maps, e.g. PersistentArrayMap.
Well, it will be pretty hard because the input is quite dirty; it is basically a parsed CSV (vector of vectors) with 54 columns, the first vector being the headers, so something like this:
{:products [["name" "some" "unimportant" "values" ... "description" ... "price" ... "media" ... "tags" ...]
            ["some-name" "bla" "bla" "bla" ... "some long description" ... "42" ... "" ... "238;239;756;785;1111;" ...]
            ...]
 :tags [["Category ID" ... "Category Name" ... "Parent" ...]
        ["238" ... "catcat" ... "catcat's mom"]
        ["239" ... "catcat's mom" ... ""]
        ...]}
And the output should look something like this:
[{"name" "some-name"
  "description" "some long description"
  "price" 42
  "media" {"data" {"src" ""}}
  "product_tags" [{"name" "catcat" "parent" "catcat's mom"}
                  {"name" "catcat's mom" "parent" ""}
                  ...]}
 ...]
I'm guessing I didn't use meander right for this task, I'm just starting to learn its deeper powers 😅
I could clean the input before passing it into meander, but I was hoping to be able to use meander for that as well...
For the CSV, personally, I would zipmap the fields or pluck them out with nth, etc. in a preprocessing step and then run them through meander.
I would also index the tags as well. You can then do the joins much more easily (and more legibly).
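A minimal sketch of that preprocessing step in plain Clojure, with no meander involved yet. The sample data is a hypothetical, cut-down version of the CSVs described above (three columns instead of 54), and the helper names rows->maps, index-by and product-tags are made up for illustration:

```clojure
(require '[clojure.string :as str])

;; Hypothetical sample shaped like the parsed CSVs: the first row holds the headers.
(def raw
  {:products [["name" "price" "tags"]
              ["some-name" "42" "238;239;"]]
   :tags     [["Category ID" "Category Name" "Parent"]
              ["238" "catcat" "catcat's mom"]
              ["239" "catcat's mom" ""]]})

(defn rows->maps
  "Turns [headers & rows] into a vector of maps keyed by header."
  [[headers & rows]]
  (mapv #(zipmap headers %) rows))

(defn index-by
  "Builds a lookup map from (f item) to item."
  [f coll]
  (into {} (map (juxt f identity)) coll))

(def products (rows->maps (:products raw)))
(def tags-by-id (index-by #(get % "Category ID") (rows->maps (:tags raw))))

(defn product-tags
  "Looks up every tag referenced by the product's semicolon-separated id list.
  Ids with no matching tag row are skipped."
  [product]
  (->> (str/split (get product "tags") #";")
       (keep tags-by-id)
       (mapv #(select-keys % ["Category Name" "Parent"]))))
```

With the tags indexed by id, the join becomes a map lookup per id, and the resulting maps can be handed to meander for the final reshaping.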
Then you could
{:products (m/scan {:as ?product ,,,})
 :tags (m/scan {:name (m/pred #(contains ,,,) ,,,})}
Great I was wondering where I should do the cleaning, now I have an answer 😁
Thank you!
So I came up with this:
(m/search data
  {:products (m/scan {"Product Name" ?product-name
                      "Meta Tag Description" ?description
                      "Price" ?price
                      "Image(Main image)" ?media
                      "Length" ?length
                      "Width" ?width
                      "height" ?height
                      "Weight" ?weight
                      "Manufacturer" ?manufacturer
                      "Product Tags" ?product-tags
                      "Categories id" ?categories-id})
   :categories (m/scan {"Category ID" (m/pred #(some #{%} (split ?categories-id ";")))
                        "Category Name" ?category-name
                        "Parent" ?parent})}
  {"name" ?product-name
   "description" ?description
   "price" ?price
   "media" {"data" {"src" ?media}}
   "categories-id" (split ?categories-id ";")
   "category-name" [?category-name]})
And it seems that the second scan emits every match as its own entry, so instead of this:
[{"name" "some-name"
  "description" "some long description"
  "price" 42
  "media" {"data" {"src" ""}}
  "product_tags" [{"name" "catcat" "parent" "catcat's mom"}
                  {"name" "catcat's mom" "parent" ""}
                  ...]}
 ...]
I get this:
[{"name" "some-name"
  "description" "some long description"
  "price" 42
  "media" {"data" {"src" ""}}
  "product_tags" [{"name" "catcat" "parent" "catcat's mom"}]}
 {"name" "some-name"
  "description" "some long description"
  "price" 42
  "media" {"data" {"src" ""}}
  "product_tags" [{"name" "catcat's mom" "parent" ""}]}
 ...]
which actually makes sense as I understand meander better, but for my task I need to aggregate all of the tags under the relevant product's "product_tags".
I figured what I'm looking for is a join + group by
I found a previous answer that suggests doing the group-by outside of meander. Can't wait to see how far you will go with meander; for now I'm happy with the current solution, thank you! P.S. I invested time learning meander as I believe in it as a general declarative solution for the complicated data transformations in our company, so... rooting for ya!
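For reference, the group-by step done outside of meander might look roughly like the sketch below. It assumes the search returns one entry per product/tag match, shaped like the output quoted above; flat-results is hypothetical sample data and merge-product-tags is a made-up helper name, not something from meander:

```clojure
;; Hypothetical flat result: one entry per matching tag, as in the output above.
(def flat-results
  [{"name" "some-name"
    "price" 42
    "product_tags" [{"name" "catcat" "parent" "catcat's mom"}]}
   {"name" "some-name"
    "price" 42
    "product_tags" [{"name" "catcat's mom" "parent" ""}]}])

(defn merge-product-tags
  "Groups entries that are identical apart from \"product_tags\" and
  concatenates their tag vectors into a single entry per product."
  [entries]
  (->> entries
       (group-by #(dissoc % "product_tags"))
       (mapv (fn [[product group]]
               (assoc product "product_tags"
                      (into [] (mapcat #(get % "product_tags")) group))))))

(merge-product-tags flat-results)
```

Tags keep their original order within each product, but group-by returns a map, so the order of the products themselves should not be relied on.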