This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2024-04-17
Channels
- # announcements (8)
- # babashka (27)
- # beginners (60)
- # biff (7)
- # calva (3)
- # cider (10)
- # cljs-dev (69)
- # clojure (18)
- # clojure-europe (12)
- # clojure-hungary (1)
- # clojure-korea (2)
- # clojure-new-zealand (12)
- # clojure-nl (1)
- # clojure-norway (80)
- # clojure-uk (9)
- # clojurescript (55)
- # cursive (69)
- # data-science (16)
- # events (5)
- # helix (11)
- # hyperfiddle (23)
- # jobs (1)
- # lsp (5)
- # malli (14)
- # matrix (1)
- # missionary (2)
- # off-topic (40)
- # portal (31)
- # re-frame (17)
- # reitit (11)
- # releases (11)
- # shadow-cljs (4)
- # squint (4)
- # tools-deps (5)
- # yamlscript (4)
I'm loading a csv where one column, properties, is a json object (well, a string that can be parsed into a json object). In pandas I would use a combination of json.loads and pandas.json_normalize to get each top-level key in the json as a new column. Can I do something similar with tablecloth?
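For reference, a minimal sketch of the pandas approach described above (the data mirrors the example later in the thread; this snippet is an illustration, not part of the original conversation):

```python
# json.loads parses each string in the "properties" column into a dict;
# pandas.json_normalize then expands every top-level key of those dicts
# into its own column.
import json
import pandas as pd

df = pd.DataFrame({
    "x": [1, 2],
    "y": [2, 3],
    "properties": ['{"a": 1, "b": 1}', '{"a": 2, "b": 2}'],
})

expanded = pd.json_normalize(df["properties"].map(json.loads).tolist())
result = pd.concat([df.drop(columns="properties"), expanded], axis=1)
#    x  y  a  b
# 0  1  2  1  1
# 1  2  3  2  2
```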
I've never tried that, but there is an option :parser-fn where you can pass a parser for a specific column. Something like (tc/dataset csv-file {:parser-fn {"properties" your-json-parser}})
Thanks @U1EP3BZ3Q - unfortunately that does not seem to do anything for me - but to clarify, this is what I want: From:
(-> (tc/dataset [{:x 1, :y 2 :properties "{\"a\": 1, \"b\": 1}"}
                 {:x 2, :y 3 :properties "{\"a\": 2, \"b\": 2}"}]))
; | :x | :y | :properties |
; |---:|---:|------------------|
; | 1 | 2 | {"a": 1, "b": 1} |
; | 2 | 3 | {"a": 2, "b": 2} |
;
To:
; | :x | :y | :a | :b |
; |---:|---:|---:|---:|
; | 1 | 2 | 1 | 1 |
; | 2 | 3 | 2 | 2 |
(require '[clojure.data.json :as json])
(def ds-with-json (tc/dataset [{:x 1, :y 2 :properties "{\"a\": 1, \"b\": 1}"}
                               {:x 2, :y 3 :properties "{\"a\": 2, \"b\": 2}"}]))
(->> (:properties ds-with-json)
     (map json/read-str)
     (tc/dataset)
     (tc/append (tc/drop-columns ds-with-json :properties)))
;; => _unnamed [2 4]:
;; | :x | :y | a | b |
;; |---:|---:|--:|--:|
;; | 1 | 2 | 1 | 1 |
;; | 2 | 3 | 2 | 2 |
I think json/read-str can be passed to :parser-fn in the ds creation from the csv file, like I described in my first comment.
(-> (tc/dataset [{:x 1, :y 2 :properties "{\"a\": 1, \"b\": 1}"}
                 {:x 2, :y 3 :properties "{\"a\": 2, \"b\": 2}"}])
    (tc/write-csv! "/tmp/out.csv"))
(-> (tc/dataset "/tmp/out.csv"))
; /tmp/out.csv [2 3]:
;
; | x | y | properties |
; |--:|--:|------------------|
; | 1 | 2 | {"a": 1, "b": 1} |
; | 2 | 3 | {"a": 2, "b": 2} |
;
(-> (tc/dataset "/tmp/out.csv"
                {:parser-fn {"properties" json/read-str}}))
; /tmp/out.csv [2 3]:
;
; | x | y | properties |
; |--:|--:|------------------|
; | 1 | 2 | {"a": 1, "b": 1} |
; | 2 | 3 | {"a": 2, "b": 2} |
If I follow the docstring in tc/dataset I can tweak it into this shape:
(-> (tc/dataset "/tmp/out.csv"
                {:parser-fn {"properties" [:map json/read-str]}}))
; /tmp/out.csv [2 3]:
;
; | x | y | properties |
; |--:|--:|----------------|
; | 1 | 2 | {"a" 1, "b" 1} |
; | 2 | 3 | {"a" 2, "b" 2} |
For completeness, this works well for me:
(let [ds-with-json (tc/dataset "/tmp/out.csv"
                               {:key-fn clojurize
                                :parser-fn {:properties [:map #(json/read-str % :key-fn clojurize)]}})]
  (-> (:properties ds-with-json)
      (tc/dataset)
      (tc/append (tc/drop-columns ds-with-json :properties))))