#sql
2021-04-29
seancorfield 16:04:43

I’m not familiar with Metabase but reducing a plan onto a core.async channel and processing it that way sounds like a reasonable approach to me. Have you seen this @henrikheine https://www.grammarly.com/blog/engineering/building-etl-pipelines-with-clojure-and-transducers/ — it ends up using core.async and pipelines and transducers. A quick Bing search suggests that most of the ETL talks around Clojure rely either on Spark/Hadoop etc or on Datomic.
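
A minimal sketch of what "reducing a plan onto a channel" could look like, assuming `org.clojure/core.async` is on the classpath. The helper name `reduce-onto-chan!` is hypothetical, not part of next.jdbc or core.async. It accepts any reducible, so a plain vector can stand in for the result of `next.jdbc/plan` when trying it at the REPL.

```clojure
(require '[clojure.core.async :as a])

(defn reduce-onto-chan!
  "Eagerly reduces `reducible` (for example the result of next.jdbc/plan),
  putting (row-fn row) onto `ch` with a blocking put. Stops early if `ch`
  has been closed by the consumer, and closes `ch` when done."
  [reducible row-fn ch]
  (reduce (fn [_ row]
            (if (a/>!! ch (row-fn row))
              nil
              (reduced nil))) ; put failed: channel closed, stop reading rows
          nil
          reducible)
  (a/close! ch))

(comment
  ;; Hypothetical usage against a real database; the db spec, table, and
  ;; column names are assumptions for illustration only.
  (require '[next.jdbc :as jdbc])
  (def ds (jdbc/get-datasource {:dbtype "postgresql" :dbname "source_db"}))
  (let [ch (a/chan 100)]
    ;; run the blocking reduction on its own thread
    (a/thread
      (reduce-onto-chan! (jdbc/plan ds ["SELECT id, total FROM orders"])
                         #(select-keys % [:id :total]) ; copy columns out of the row
                         ch))
    ;; consume from ch here, e.g. with a/pipeline or a go-loop
    ))
```

The `row-fn` copies the needed columns into a regular map before the put, since rows handed out by `plan` are only meant to be read during the reduction. The bounded channel buffer gives back-pressure: the reduction blocks when consumers fall behind.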

henrik42 06:04:04

Nice post. Just to clarify: does plan just mean a transducer/reducible, or something special you're referring to?

seancorfield 06:04:23

next.jdbc/plan produces an IReduceInit so it is reducible and mostly transducible.

seancorfield 06:04:11

It allows for streaming very large result sets (eagerly).
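
As a rough illustration of the reducible/transducible point, here is a small sketch that uses only `clojure.core`. The `sample-rows` vector is a stand-in for `(next.jdbc/plan ds ["SELECT ..."])`, and the function name and column names are made up for the example.

```clojure
;; Stand-in for the result of next.jdbc/plan: any reducible of row-like maps.
(def sample-rows
  [{:id 1 :total 10}
   {:id 2 :total 250}
   {:id 3 :total 40}])

(defn sum-large-totals
  "Sums :total over rows whose :total is at least `threshold`,
  in a single eager pass without building an intermediate collection."
  [threshold reducible]
  (transduce (comp (map :total)                  ; read the column during the reduction
                   (filter #(>= % threshold)))
             + 0 reducible))

(sum-large-totals 40 sample-rows)
;; => 290
```

With a real `plan`, the same `transduce` call would pull rows from the result set one at a time, so memory use stays flat however large the result set is.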

henrik42 06:04:09

Got it. Thx

lukasz 16:04:55

Metabase is not going to help you here, it's just a read-only interface for visualizing data stored in various data stores

henrik42 19:04:55

Yeah, I thought I could use it for exploring the database and maybe run some reports in order to find outliers and do sanity checks, comparing the dest db to the source db etc. Any suggestion for an alternative? Something in Clojure or Java would be cool.

lukasz 19:04:37

Oh, for that Metabase is great - we've been using it basically since it launched a few years ago. We use it for all sorts of reports, debugging and big-data-ish stuff for the data stored in BigQuery

👍 4
lukasz 16:04:03

It doesn't do any data processing on its own