#specter
2022-08-29
Alexandre Gomes15:08:10

Hi! I'm a Clojure newbie and found Specter after googling for nested/complex data processing in Clojure. I am considering it for my learning journey and (hopefully) for some challenges I'm currently facing, but I have a question about its functionality: does Specter allow flattening JSON data dynamically? By that I mean:
• Unnest each and every nested key without knowing their names or paths;
• Explode each and every array key, also without knowing their names or paths;
• Move all exploded/unnested keys to the top/root level;
• Handle a moderate amount of JSON data at once (~2GB).
The idea is to dynamically transform a deeply nested JSON document into a flat data file (CSV, JSON, Parquet, etc.; anything works for me) without knowing its keys beforehand. Currently, I have a Scala application that does this by recursively using ClassTag/reflective functions to check whether each key is a Struct or an Array and then performing the necessary steps to make it flat. If this kind of task is not possible or not recommended with Specter, let me know if there are other tools/libraries that you find better suited for this particular challenge.

pppaul15:08:15

i've done this with clojure.walk and merge-with

isak16:08:24

Probably possible with specter, but not sure. You may be interested in this, depending on your use case: https://github.com/cloojure/tupelo/blob/master/docs/forest.adoc

pppaul16:08:47

always wanted to get around to learning forest

isak16:08:44

Yea it seems like a great idea. I'm not sure he mentions it on that page, but the way it works is by flattening the tree, which is what the OP was asking about.

Alexandre Gomes17:08:13

Thanks for the suggestions @pppaul and @isak, much appreciated. I will take a good look at these alternatives!

Alexandre Gomes17:08:51

Do you think I would be able to generate data in a tabular format using the tupelo.forest library? After flattening the structure, that is. I ask because I also have JSON datasets with a similar structure, but those need to become tables after they are flattened.

isak17:08:59

I think so, yea. But haven't looked very closely beyond the talk and article.
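
For the tabular part of the question, a minimal sketch in plain Clojure (not tupelo.forest, whose API is not shown in this thread). It assumes the data has already been flattened into a sequence of maps with string keys; `rows->table` is a hypothetical helper name, not a library function.

```clojure
;; Hypothetical helper: turns a seq of flat maps into a header row plus data rows.
;; Assumes every key is a string, so the column names can be sorted.
(defn rows->table [rows]
  (let [cols (vec (sort (distinct (mapcat keys rows))))]
    ;; First row is the header; a key missing from a row becomes nil.
    (cons cols
          (map (fn [row] (mapv #(get row %) cols)) rows))))

(rows->table [{"id" 1 "tags" "a"}
              {"id" 2 "user.name" "Ann"}])
;; => (["id" "tags" "user.name"] [1 "a" nil] [2 nil "Ann"])
```

Each inner vector can then be handed to whatever CSV or Parquet writer is in use.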

Alexandre Gomes17:08:20

Great! Thanks for the reply, @isak

pppaul17:08:11

flattening nested objects is pretty easy, except for how you want to name the keys that are based on 10+ levels of aggregate keys

Alexandre Gomes17:08:44

@pppaul In my Scala application, I take the parent's key name and use it as a prefix for the "child" key. As you can imagine, the names get pretty long, but it works fine for my purpose. I honestly have no idea how to do that in Clojure yet, though 😂
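
The parent-key-as-prefix idea could look roughly like this in Clojure. This is only a sketch for nested maps (arrays are left untouched); `flatten-map` and the `_` separator are illustrative choices, not something from the Scala application or from Specter.

```clojure
;; Hypothetical helper: flattens nested maps, prefixing each child key
;; with its parent's key. Keys are assumed to be keywords or strings.
(defn flatten-map
  ([m] (flatten-map nil m))
  ([prefix m]
   (reduce-kv
    (fn [acc k v]
      (let [k' (if prefix (str prefix "_" (name k)) (name k))]
        (if (map? v)
          ;; Nested map: recurse, carrying the accumulated prefix down.
          (merge acc (flatten-map k' v))
          ;; Leaf value (including vectors): store it under the prefixed name.
          (assoc acc k' v))))
    {}
    m)))

(flatten-map {:id 1 :user {:name "Ann" :address {:city "Oslo"}}})
;; => {"id" 1, "user_name" "Ann", "user_address_city" "Oslo"}
```

Note that an empty nested map contributes no keys in this version.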

pppaul17:08:08

when i did this i made my key names similar to the path args i would pass to get-in for the original json (minus array indices)

pppaul17:08:54

if you do something like that, then you can have a name function that does something smart with that type of data
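
One way to read the last two messages, as a sketch rather than pppaul's actual code: keep each flat key as a get-in style path vector (array indices omitted), explode each array element into its own row, and apply a separate name function at the end. `flat-rows`, `path->name`, and `rename-keys-with` are hypothetical names.

```clojure
(require '[clojure.string :as str])

;; Returns a seq of flat maps whose keys are get-in style paths
;; (array indices omitted). Each vector element yields its own row(s).
(defn flat-rows
  ([x] (flat-rows [] x))
  ([path x]
   (cond
     ;; Map: combine the rows produced by every entry (cross product).
     (map? x)
     (reduce (fn [rows [k v]]
               (for [row rows
                     sub (flat-rows (conj path k) v)]
                 (merge row sub)))
             [{}]
             x)

     ;; Array: explode; an empty array adds nothing but keeps the parent row.
     (sequential? x)
     (if (empty? x)
       [{}]
       (mapcat #(flat-rows path %) x))

     ;; Scalar: a single row holding the value under its path.
     :else [{path x}])))

;; One possible name function: join the path segments with dots.
(defn path->name [path]
  (str/join "." (map name path)))

;; Applies a name function to every path key of a flat row.
(defn rename-keys-with [name-fn row]
  (into {} (map (fn [[path v]] [(name-fn path) v])) row))

(map #(rename-keys-with path->name %)
     (flat-rows {:id 1 :tags ["a" "b"] :user {:name "Ann"}}))
;; => ({"id" 1, "tags" "a", "user.name" "Ann"}
;;     {"id" 1, "tags" "b", "user.name" "Ann"})
```

Because naming is a separate step, the name function can be swapped for something smarter (abbreviating deep paths, for example) without touching the flattening. Sibling arrays multiply rows in this sketch, and it works on data already in memory, so ~2GB inputs would likely need to be processed record by record.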