2020-04-05
It seems like the transducer version of filter, when applied with sequence, isn't as lazy as the lazy-seq variant of filter:
;; A pass-through transducer: prints each input it sees, then passes it along unchanged.
(defn printer
  [xf]
  (fn
    ([] (xf))
    ([result] (xf result))
    ([result input]
     (print input "")
     (xf result input))))
(def s
  (sequence
    (comp (filter odd?)
          printer)
    (range 100)))

(def l
  (->> (range 100)
       (filter odd?)
       (map #(do (print % "") %))))
(take 1 s)
;; Prints: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 47 49 51 53 55 57 59 61 63 65
(take 1 l)
;; Prints: 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31
This isn't the case for all transducers, for example:
(def s
  (sequence
    (comp (map inc)
          printer)
    (range 100)))
(take 1 s)
;; Prints: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
It seems like the chunk size is the same:
(defn printer
  [xf]
  (fn
    ([] (xf))
    ([result] (xf result))
    ([result input]
     (print input "")
     (xf result input))))
(def s
  (sequence
    (comp (filter (constantly true))
          printer)
    (range 100)))

(def l
  (->> (range 100)
       (filter (constantly true))
       (map #(do (print % "") %))))
(take 1 s)
;; prints 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
(take 1 l)
;; prints 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
Here, the only difference is filtering with (constantly true) vs odd?
If you think that makes no sense, try:
(defn printer
  [xf]
  (fn
    ([] (xf))
    ([result] (xf result))
    ([result input]
     (print input "")
     (xf result input))))
(def pred #(> % 50))
(def s
  (sequence
    (comp (filter pred)
          printer)
    (range 100)))

(def l
  (->> (range 100)
       (filter pred)
       (map #(do (print % "") %))))
(take 1 s)
;; 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83
(take 1 l)
;; 51 52 53 54 55 56 57 58 59 60 61 62 63
Hmm... actually, this makes me think of something. It seems that with the transducer it grabs 32 result elements, while with the lazy seq it grabs 32 elements from the input coll, maybe?
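One way to check that hypothesis is to count how many input elements each variant actually realizes. This is a sketch, not from the snippets above; xf-reads, lazy-reads, s2 and l2 are names made up here:

(def xf-reads (atom 0))
(def s2
  (sequence
    (comp (map (fn [x] (swap! xf-reads inc) x))  ;; count every input before filtering
          (filter odd?))
    (range 100)))

(def lazy-reads (atom 0))
(def l2
  (->> (range 100)
       (map (fn [x] (swap! lazy-reads inc) x))   ;; count every input before filtering
       (filter odd?)))

(take 1 s2)
@xf-reads   ;; roughly 66 on this reading: a chunk of ~32 *results* needs ~64 inputs
(take 1 l2)
@lazy-reads ;; roughly 32: one chunk of the *input* is realized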
I think the moral of the story is that relying on a specific amount of laziness is asking for trouble. I don't think there are any guarantees, and it may even change from one version of Clojure to another.
Ya, I do know that, and I'm not too worried if 32 becomes 50 or 84, as long as the order of magnitude doesn't change; going from 100 to 1000 would be more problematic.
I think it's like you were saying: when you're doing (filter pred coll), filter itself is chunking on the input, whereas when you use (filter pred) with sequence, then take is doing the chunking and it's not even seeing some elements.
(defn printer
  [xf]
  (fn
    ([] (xf))
    ([result] (xf result))
    ([result input]
     (print input "")
     (xf result input))))
(def pred1 #(zero? (mod % 2)))
(def pred2 #(zero? (mod % 4)))
(def pred3 #(zero? (mod % 8)))
(def s
  (sequence
    (comp (filter pred1)
          (filter pred2)
          (filter pred3)
          printer)
    (vec (range 200))))

(def l
  (->> (vec (range 200))
       (filter pred1)
       (filter pred2)
       (filter pred3)
       (map #(do (print % "") %))))
(take 1 s)
;; 8 16 24 32 40 48 56 64 72 80 88 96 104 112 120 128 136 144 152 160 168 176 184 192
(take 1 l)
;; 0 8 16 24
So in this example, take is looking at 30 numbers and only 4 results. With the transducer, it's still taking 30 numbers, but the filtering happens first, so it's taking 30 filtered numbers.
It's actually one of the key features of transducers that they don't box and unbox on every step. So when a transducer filters, "nothing happens", but it can change the chunking behavior for lazy operations.
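As an aside, when only a few results are needed and the amount of work matters, one option (a sketch, not something from the discussion above) is to put take into the transducer stack, since take terminates the reduction early via reduced instead of relying on lazy-seq chunking:

(transduce (comp (filter odd?) (take 1)) conj [] (range 100))
;; => [1] -- only inputs 0 and 1 are examined, no 32-element chunk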
What are the options for reading EDN data in non-Clojure languages? I thought this was what Transit was for (specifically transit-js), but it seems like it isn't. I guess it's because the EDN format supports things that most languages don't have, like keywords?
Hmm, it's more that Clojure isn't that popular yet, and most other languages haven't really adopted EDN.
There are some to find in the wild, but I don't know of a list. Such as for python: https://github.com/swaroopch/edn_format
@U0K064KQV thanks, looks like this is the official list: https://github.com/edn-format/edn/wiki/Implementations
Even if a language doesn’t have a transit encoder, you can layer one on top of a JSON encoder relatively easily.
It's only a transport protocol for exchanging data between systems, which uses JSON under the hood (and some of its binary variants), I think.
But yeah, it's good if you want to pass data between a Clojure/Script app and apps written in other languages; since it uses JSON-based protocols, most languages will have a fast and good parser for it.
But for, say, storing data in your DB or exchanging documents, I don't think it is as good.
> NOTE: Transit is intended primarily as a wire protocol for transferring data between applications. If storing Transit data durably, readers and writers are expected to use the same version of Transit and you are responsible for migrating/transforming/re-storing that data when and if the transit format changes.
> The design of Transit is focused on program-to-program communication, as opposed to human readability. While it does support an explicit verbose mode for representing Transit elements in JSON (called JSON-Verbose), Transit is not targeted for situations where human readability is paramount.
EDN is more of an alternative to JSON, whereas Transit seems to target the space of MessagePack (which it uses), protobuf, Ion, Avro, Thrift, etc.
But yeah, transit specifically says that it should not be used for persisted data because the spec might change. https://github.com/cognitect/transit-format#implementations
From reading this, it seems like Transit, since it is its own format and can be extended with custom tags, is probably a better option than using a specific language's EDN parser?
If you want to communicate from a Clojure/Script app to an app in a different language, I'd say Transit is better than trying to use EDN yes.
If you want to, say, export some data from a Clojure/Script app so it can be imported into some other app in some other language, EDN might be an option, though the lack of good support in other languages might mean you'd be better off using XML, JSON, CSV, etc.
@U0CLLU3QT EDN can be extended with custom tags as well.
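For illustration, a custom tag can be wired in when reading EDN on the Clojure side; a small sketch where the #my/temp tag and the ->celsius helper are made up:

(require '[clojure.edn :as edn])

(defn ->celsius [x] {:celsius x})

;; :readers maps tag symbols to reader functions
(edn/read-string {:readers {'my/temp ->celsius}} "#my/temp 21.5")
;; => {:celsius 21.5}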
The primary differences are: 1. Transit is layered on top of JSON for easier performant translation to other langs. 2. Transit has compression built-in.
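For reference, a minimal round trip with the Clojure Transit library looks roughly like this (a sketch, assuming com.cognitect/transit-clj is on the classpath; the sample map is arbitrary):

(require '[cognitect.transit :as transit])
(import '(java.io ByteArrayOutputStream ByteArrayInputStream))

;; write Clojure data as transit+json into a byte array
(def out (ByteArrayOutputStream.))
(transit/write (transit/writer out :json) {:kind :demo :xs [1 2 3]})

;; read it back; keywords and vectors survive the trip
(def in (ByteArrayInputStream. (.toByteArray out)))
(transit/read (transit/reader in :json))
;; => {:kind :demo, :xs [1 2 3]}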
https://github.com/greglook/clj-cbor is another option:
> CBOR is a binary encoding with the goal of small code size, compact messages, and extensibility without the need for version negotiation. This makes it a good alternative to https://github.com/edn-format/edn for storing and transmitting Clojure data in a more compact form.
I still think it depends. When you store data durably, binary formats are always at a disadvantage in my opinion, so I'd favour XML, CSV, JSON or EDN personally, unless you really can't deal with the size. Especially for archival.
Because a binary format isn't self-explanatory, you need a deserializer for it, which is harder to write yourself if one doesn't exist, and if you can't tell which binary format you're dealing with, it becomes much harder to figure out what to use to decode it.
Whereas give JSON or EDN to someone, and without any prior knowledge they can probably figure out what they're dealing with and write a parser themselves.
I'm not sure why the Transit readme mentions that, but it does. Either because of its use of MessagePack as a binary format, or, I'm guessing, because it reserves the right to change in backward-incompatible ways in the future, for example switching from MessagePack to CBOR, or making changes to the protocol.
Dunno, but yeah, I guess in practice it hasn't had a backward-breaking change and newer versions can still read old versions, so there's that. Just like many things in Clojure: stable and working, but not explicitly committed.
@U5H74UNSF No. When Rich et al say something isn’t finalized and is subject to change, they mean it. Just because you “feel” like it’s stable, doesn’t make it so.
If you’re willing to either take the outright risk, or if you’re willing to organize version changes (via e.g. mass migrations), or you’re willing to potentially never upgrade: then yes. Use it for durable storage.
To be clear: The tradeoffs here are not very subtle. I just listed them all out, and they’re manageable. But transit is not “very stable,” and using it for durable storage does come with those tradeoffs.
and the warning re durable use was toned down last year https://github.com/cognitect/transit-format/commit/9466eab97ba6a876757d901a9d340abc247b8716
> The Transit format has thus far had only one version (0.8) and has not changed in several years.
Risk is not bad, but IMO not really worth it in this example (when you can presumably use EDN).
I mean, I think Alex’s note just clarifies the risks. It doesn’t remove any risk whatsoever.
I suspect there’s a reason Alex’s commit does not bump it to 1.0. I strongly suspect they have an idea that might cause a breaking change.
I've got a Java object I want to explore with the REPL. I haven't worked that much with Java from Clojure.
Are there any nice functions for REPL explorations I could use?
In Clojure, I'd reach for doc and dir.
Thanks!
@teodorlu there's a section for exactly that in the REPL guide: https://clojure.org/guides/repl/data_visualization_at_the_repl#_dealing_with_mysterious_values_advanced
Thanks! I totally forgot about the REPL guide. I appreciate the qualitative discussion. If I remember correctly, you had a hand in writing that guide?
Yes I did (but wasn't trying to do any self-promotion)
@teodorlu Depending on exactly what you mean by "explore", you might find org.clojure/java.data useful as a library.
Or there's bean built-in (but it does a lot less -- see the comparison in the README https://github.com/clojure/java.data#feature-comparison-to-clojurecorebean )
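For a quick feel of bean (built-in, no extra dependencies; the Date here is just an arbitrary example):

;; bean turns a Java object's getters into a read-only Clojure map
(bean (java.util.Date. 0))
;; => a map of the object's JavaBean properties, e.g. including :time 0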
Thanks! (javadoc my-object) managed to take me to a Google search which found the class. Not sure if that was intended behavior, but it sure worked. Thanks!
(didn't find javadoc at first, was looking just in the clojure.repl namespace)
> exactly what you mean
E.g. I'd like to know that I could have called .getType and .getValue on a PGobject I got.
The combination of javadoc, reflect and bean was perfect -- the former for static documentation and the latter two for dynamic exploration. Thanks!
Full namespaces, for use from an editor that doesn't have clojure.repl and such preloaded:
(def e (return-some-java-obj))
;; Look for static docs, might redirect to Google search for java class
(clojure.java.javadoc/javadoc e)
;; Dynamically explore all the info we've got in a data structure
(clojure.reflect/reflect e)
;; Dynamically explore a "tight" data model of e
(clojure.core/bean e)
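As a small follow-up sketch (my own, reusing e from above): the reflect output is plain data, so it can be narrowed down with the usual sequence functions, e.g. to just the public member names:

(->> (clojure.reflect/reflect e)
     :members
     (filter #(contains? (:flags %) :public))
     (map :name)
     sort
     distinct)
;; => a sorted list of public field/method/constructor names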
I'm trying to understand the details of ref's :max-history and :min-history options. It makes sense that dosync transactions retry on alter for refs whose value has changed in the meantime, so the only use I can think of for these options is on deref, not commit (since it retries given any change). I'm not sure how the STM/MVCC is implemented behind the scenes, but I'm imagining a timestamp-like value recorded at the beginning of the transaction. Because you can't really know (at least I think this is true) which refs will be accessed during the transaction (let's say you have some long computation before derefing the ref), I'm guessing the history queue is there to provide a sliding buffer of ref values in case they're being modified in another thread. If the timestamp doesn't exist within the scope of the queue by the time the deref happens inside a dosync, the transaction retries. I'm guessing that these types of read-only retries increment the history queue by 1, and that's why the :min-history option exists to provision a queue upfront.
Am I on the right track? The docstring for ref is a little terse, and this is the best interpretation I could come up with.
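For what it's worth, the knobs themselves can be inspected directly; a small sketch (the numbers here are arbitrary):

(def r (ref 0 :min-history 5 :max-history 20))

(ref-min-history r)   ;; => 5
(ref-max-history r)   ;; => 20
(ref-history-count r) ;; how many historical values are currently retained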
I must be missing something... is there a way to set the version in the pom.xml generated by clj -Spom?
No. I added options to clj-new that let you control more things when it generates the initial pom.xml file (but that's separate from clj -Spom).
OK, so it's a manual process then. I mainly worry about keeping dependencies up-to-date in large projects, so regenerating the pom and setting the project version prior to publishing the jar is the way to go, I presume.
clj -Spom updates just the dependencies. It doesn't touch the rest of the file.
@cpmcdaniel If you intend to deploy a JAR to Clojars and, especially, if you plan to use http://cljdoc.org for that library, you need quite a bit more in the pom.xml file than clj -Spom creates initially. That's why clj-new creates a full-featured pom.xml file.
(but, yeah, then you need to manually update the version and the tag elements for each new release, and you need to run clj -Spom to update pom.xml if you change the project's dependencies)