This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2018-05-10
måning
Morning.
My request to JFrog to mirror Clojars in JCenter repository seems to be done (I am on hols, so will test next week). This mirror should give Clojure developers behind corporate firewalls access to all the Clojure libraries that aren't pushed to Maven Central.
hello!
not bad thanks how bout yourself
what you working on today?
It must be top secret
the grand old duke of york he had 10000 databases he migrated them all to the origin and he migrated them down again 🎶
sorry to crash in as usual with offtopic questions
anybody used sparkling or flambo for spark stuff?
I can't be fucked to learn scala and how to do this shit in azure in the same week, and I can probably win one argument but not both
haha yes, morning
@alex.lynham you probably want @otfrom or @jasonbell to grumble at you
any opinions or grumbling muchly appreciated ¯\_(ツ)_/¯
I'm just a linked data boy in a big data world
@alex.lynham I've used Sparkling a fair bit. Really like it.
If it's just for core Spark then it's perfect; if it's beyond that then you'll have to see if it's been updated, as the streaming wasn't implemented (Flambo had it, and it was usable but not perfect)
> as the streaming wasn't implemented
assume that I haven't used spark before - why would this sway me one way or the other?
@alex.lynham if you only want to use spark for batch queries then you won't mind if streaming isn't supported in your clojure lib... if you want to use it for stream-processing (something like windowing over a conceptual endless stream of data) then you might want spark streaming support
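To make the batch vs streaming distinction concrete, here is a minimal sketch in plain Python. It is not the Spark, Sparkling or Flambo API; the function names and the `(timestamp, value)` event shape are made up for illustration, and it assumes events arrive in timestamp order. A batch query sees the whole dataset at once, while a windowed stream computation emits one result per time window as events arrive.

```python
from typing import Iterable, Iterator, List, Tuple

Event = Tuple[int, int]  # (timestamp_seconds, value): hypothetical event shape


def batch_total(events: List[Event]) -> int:
    """Batch: the full dataset is available, so aggregate it in one pass."""
    return sum(value for _, value in events)


def tumbling_window_totals(
    events: Iterable[Event], window_seconds: int
) -> Iterator[Tuple[int, int]]:
    """Streaming: consume events one at a time and emit (window_start, total)
    each time a window closes. Assumes events are ordered by timestamp."""
    current_start = None
    total = 0
    for ts, value in events:
        start = ts - (ts % window_seconds)  # start of the window this event falls in
        if current_start is None:
            current_start = start
        elif start != current_start:
            yield (current_start, total)  # previous window is complete
            current_start, total = start, 0
        total += value
    if current_start is not None:
        yield (current_start, total)  # flush the last window when the input ends


events = [(0, 1), (3, 2), (10, 5), (12, 1), (25, 4)]
print(batch_total(events))
print(list(tumbling_window_totals(events, window_seconds=10)))
```

With a genuinely endless stream the final flush never happens, which is why a streaming library has to decide for you when a window counts as closed.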
riiiiiiiight okay
well at the moment I think the only data we have will be static or batch anyway
but we will want to support streaming eventually
but, depending on what you are wanting to do, you might also want to look at https://kafka.apache.org/documentation/streams/ for stream processing
so the thing I don't fully understand about spark is why I don't just use kafka and then use clojure to interact with the streaming API
(I get that e.g. databricks or spark python API is more data scientist friendly)
but what's the USP of spark?
spark-batch is a good fit for us for ad-hoc queries against cassandra. spark-streaming might look good to me if i wanted to avoid adding another component
but i'm currently more interested in the streaming-as-a-lib deployment model offered by kafka-streams... spark-streaming doesn't seem very shiny to me atm
> streaming-as-a-lib deployment model
because rather than being standalone it's something you can interact with inside of your app in clj?
no, for straightforward deployments and monitoring. we're currently using onyx, and you run a bunch of onyx peers as processes, and then submit jobs... stuff like configuring your jobs is a bit painful and has to use a different mechanism to other components (like our api)
and because the peers are onyx processes, i can't just run a healthcheck listener in each process, and link that to dc/os health monitoring
@alex.lynham for batch then Sparkling would be fine.
@alex.lynham I've blogged a fair bit about it over the years (yes I just said that) https://dataissexy.wordpress.com//?s=sparkling&search=Go
so my situation is I'm one of maybe three engineers with a data background in engineering (engineering is circa 20 plus contractors). data science is mostly people who are used to using SQL and Power BI, and data engineering (separate function) is almost entirely outsourced to a company that... so far as I can tell isn't the most up-to-date
but it means we're at once having to deal with engineering inexperience and lack of resource as well as needing v user friendly stuff at the other end... which I think is where the spark notion has come from
@jasonbell oh shit, you're that jason bell? oh right yeah I've read your blog, it's really useful
(ducks down behind a wall)
@alex.lynham <<oh shit, you're that jason bell? oh right yeah I've read your blog, it's really useful>> Please can I put that in my ClojureX slides?
On an aside from the Jade question... I know someone that's hoping to put together a beginners kafka study group (most likely a remote thing). Can I voluntell anyone for the role of group mentor? (cough @mccraigmccraig cough @jasonbell cough)
@jasonbell 100%, a few of your blog posts have been really really good for helping me spot dumb shit
haha, my kafka experience is narrow... @minimal is doing lots with kafka atm as well tho

in fact I just realised why my json parser was doing something unexpected - because it's reading one character at a time, so probably somewhere I needed to explicitly call .getBytes
so chalk up another one sir
@yogidevbear getting me in a good mood isn't going to help. I'm up to the eyeballs at the mo.
No worries, figured it was worth asking
Craig has thrown @minimal under the bus for you both anyway
Chris, feel like mentoring a group of people interested in learning kafka?
@yogidevbear that makes the assumption that I know kafka 😬
Well... do you?
@alex.lynham yes messages are byte arrays and need deserialising
@otfrom that will leave the real Taylor Swift feeling rather jaded
yeah I've realised that I'm comping together elements at producer and consumer end and on the one that uses nippy vs json the getBytes call is partialed, and in the JSON one it's not. d'oh
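For anyone following along, a rough sketch of the shape of that bug in plain Python. This is not the actual code from this conversation and no Kafka client is involved; `compose`, `to_bytes`, `from_bytes`, `serialise` and `deserialise` are all hypothetical names. The point is only that the producer pipeline has to end in bytes and the consumer pipeline has to start from bytes, so a composed pipeline that is missing the string-to-bytes step hands the other side the wrong type.

```python
import json
from functools import reduce


def compose(*fns):
    """Right-to-left composition of single-argument functions,
    in the spirit of Clojure's comp."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(fns), x)


def to_bytes(s: str) -> bytes:
    return s.encode("utf-8")  # the step that is easy to leave out


def from_bytes(b: bytes) -> str:
    return b.decode("utf-8")


# Producer side: value -> JSON string -> bytes
serialise = compose(to_bytes, json.dumps)
# Consumer side: bytes -> JSON string -> value
deserialise = compose(json.loads, from_bytes)

message = serialise({"id": 1, "name": "måning"})
print(type(message).__name__)
print(deserialise(message))
```

Leaving `to_bytes` out of `serialise` would produce a `str` instead of `bytes`, and anything downstream that iterates over it would see one character at a time.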
Lol, sorry, I couldn't resist that one