2024-12-03 data-science | Clojure Slack Archive


data-science 2024-12-03

ts1503 2024-12-03T11:37:02.943939Z

Hey guys! Is anybody working with tech.ml.dataset and Arrow files? I noticed that if I have an Arrow file with multiple batches, I have to use the stream->dataset-seq function, which gives me a lazy sequence of datasets. What is a common way to deal with that sequence? Should I concatenate all the datasets into a single one for further processing?

genmeblog 2024-12-03T11:45:20.350129Z

If you want to aggregate data, you can reach for https://techascent.github.io/tech.ml.dataset/tech.v3.dataset.reductions.html. The functions defined there are designed to work on a sequence of datasets. Otherwise, concatenate.

ts1503 2024-12-03T11:46:27.438609Z

thanks, will take a look

Harold 2024-12-04T02:22:31.283599Z

Yes, it definitely depends on what you're going to do with the data - generally speaking, reducing over a sequence of datasets is as common as (or perhaps more common than) concatenating. Sequences of datasets are very normal and come up all the time. Being familiar with them is a good idea.
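
The two options discussed in this thread can be sketched roughly as below. This is a minimal illustration, not anyone's actual code from the thread: the two small in-memory datasets stand in for the batches that stream->dataset-seq would return, and the column names :sym and :qty are made up for the example.

```clojure
(require '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.reductions :as ds-reduce])

;; Stand-in for the sequence of datasets read from a multi-batch Arrow file.
;; In practice this would come from something like
;;   (tech.v3.libs.arrow/stream->dataset-seq "some-file.arrow")
;; where the file name is hypothetical. Column names here are assumptions.
(def batches
  [(ds/->dataset {:sym ["a" "b"] :qty [1 2]})
   (ds/->dataset {:sym ["a" "c"] :qty [3 4]})])

;; Option 1: concatenate every batch into a single dataset.
;; Simple, but all of the data ends up in one dataset in memory.
(def combined (apply ds/concat batches))

;; Option 2: aggregate directly over the sequence of datasets,
;; without building the concatenated dataset first.
(def totals
  (ds-reduce/group-by-column-agg
   :sym                               ; column to group by
   {:total (ds-reduce/sum :qty)       ; sum of :qty per group
    :n     (ds-reduce/row-count)}     ; number of rows per group
   batches))

(println (ds/row-count combined))
(println totals)
```

Concatenation is convenient when the combined data fits comfortably in memory and later steps expect a single dataset. The reductions route is the better fit when only grouped aggregates are needed.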

