clojure 2017-03-11 | Slack Archive

Can you chunk records when writing to a file using transit?

@tbaldridge im mucking with some ruby. i have a lot of records to write to a file and want to chunk the set of items returned from a query, and use the transit writer to write chunks of 5,000.

tbaldridge01:03:56

ah I see, the problem isn't with records, it's with the collection they are in.

tbaldridge01:03:58

From what I know of transit, I have to say no, since the structure of the file and the reader/writer don't support that.

tbaldridge01:03:16

I'm pretty sure you'd have to do it manually via multiple calls to (write ...)

devn01:03:42

@tbaldridge i guess the other thing i was wondering was if the writer kept track between different calls of write

devn01:03:11

like if there were a repeating value, would it know on the next call to write?

tbaldridge01:03:27

No, and that's a debate I've had with some of the designers of transit, (I forget if it was Rich or someone else). The rationale is that each thing written should be self-contained, so that if someone wants to drop one of the messages, or it just gets tossed by some transport protocol, the caches aren't screwed up.

tbaldridge01:03:11

Since the caches work on a rolling numerical count of cached values, both the reader and writer have to be in sync.

devn01:03:28

@tbaldridge hmmm, so it sounds like if i want to chunk, I'll actually need to create a separate file for each chunk instead of reusing the writer

tbaldridge01:03:48

No you can reuse the same writer and file multiple times

devn01:03:00

oh i must have misread what you were saying there

tbaldridge01:03:02

You just need multiple calls to write

devn01:03:39

ahh, i see

devn01:03:30

@tbaldridge i thought you were saying that because of the rolling numerical count of cached values, multiple calls to write on the same file would result in some sort of malformed thing that the reader wouldn't like

tbaldridge01:03:43

No that's fine, and all of this is transparent

tbaldridge01:03:56

The cache will just reset with every call to write.

tbaldridge01:03:07

So it's optimal to do something like this:

tbaldridge01:03:35

(doseq [chunk (partition-all 500 results)]
  (write writer chunk))

devn01:03:18

@tbaldridge i was doing something like that (though on the ruby side)

devn01:03:49

but instead of [[...][...]], it wrote [...][...]

tbaldridge01:03:10

yes, and so on the other end you'll need multiple calls to read.

devn01:03:40

that makes sense, don't know why i didn't think of that 🙂
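A minimal sketch of the write-in-chunks, read-in-chunks round trip discussed above. It assumes the transit-clj library (`com.cognitect/transit-clj`) is on the classpath; `write-chunks!` and `read-chunks` are hypothetical helper names, and an in-memory byte array stands in for the file.

```clojure
;; Sketch only: assumes com.cognitect/transit-clj is available.
(require '[cognitect.transit :as transit])
(import '(java.io ByteArrayInputStream ByteArrayOutputStream))

(defn write-chunks!
  "Writes `records` to `out` as consecutive top-level transit values,
  one value per chunk of at most `n` records."
  [out n records]
  (let [writer (transit/writer out :json)]
    (doseq [chunk (partition-all n records)]
      ;; One call to write per chunk, so the stream holds [...][...], not [[...][...]].
      (transit/write writer (vec chunk)))))

(defn read-chunks
  "Calls transit/read `chunk-count` times on a single reader over `in`."
  [in chunk-count]
  (let [reader (transit/reader in :json)]
    (vec (repeatedly chunk-count #(transit/read reader)))))

;; Round trip: 7 records in chunks of 3 means three writes and three reads.
(def round-tripped
  (let [out (ByteArrayOutputStream.)]
    (write-chunks! out 3 (range 7))
    (read-chunks (ByteArrayInputStream. (.toByteArray out)) 3)))
```

The reader here is told how many chunks to expect; how a given transit implementation signals end of input is left out of the sketch on purpose.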

tbaldridge01:03:54

That's just the nature of the beast, some transit readers may stream (Clojure does I think), but others are pretty much (json->transit (read-json file))

tbaldridge01:03:18

So you have to fully read in a JSON object before you can even start to read in the transit.

devn01:03:33

was hoping to stream the set to a file on S3

devn01:03:50

but multipart upload with chunked writes is option 2

tbaldridge01:03:53

🙂

tbaldridge01:03:23

(not that the format has changed in the past 2 years or so, but you never know)

devn02:03:54

@tbaldridge nod

devn02:03:12

@tbaldridge so, maybe i misunderstand what you mean by multiple reads

devn02:03:32

subsequent calls to read don't seem to return additional chunks

devn02:03:29

p. sure i screwed something up, nevermind

devn02:03:32

maybe i found a bug in transit?

qqq02:03:10

is https://en.wikipedia.org/wiki/The_Art_of_the_Metaobject_Protocol relevant in clojure, or are the two incompatible ?

devn02:03:33

@qqq that book was on rich's list of books that he read when working on clojure IIRC

devn02:03:49

so, yes i think it is

devn02:03:37

@tbaldridge looks like i did find a bug after all

devn02:03:57

i call read multiple times on the file i create with ruby on the clojure side, and it works

devn02:03:07

but on the ruby side i can only read the first chunk

bcbradley08:03:18

@juanmp In OO you cannot execute arbitrary functions on arbitrary data, because some data is private and can only be modified by member functions. A class is basically a rule that says you cannot execute arbitrary functions on specific data; there is a well defined set of functions you can execute on specific data (the member functions). Inheritance allows you to extend that set so that you can execute parent functions on a child type. In clojure you can execute arbitrary functions on arbitrary data, and no data is private. You can prevent the program from executing arbitrary functions on specific data by preventing yourself from writing those invocations. Clojure doesn't assume it is smarter than you are.

andrea.crotti09:03:33

anyone knows how to debug this error?

andrea.crotti09:03:38

remote:        java.lang.RuntimeException: No dispatch macro for: =, compiling:(core.clj:39:1)        
remote:        Exception in thread "main" java.lang.RuntimeException: No dispatch macro for: =, compiling:(core.clj:39:1)

andrea.crotti09:03:53

it happens trying to deploy to Heroku a chestnut project

andrea.crotti09:03:08

and it seems related to leiningen + environ

andrea.crotti09:03:22

and maybe AOT compilation

andrea.crotti09:03:46

but the error itself seems rather obscure to me

andrea.crotti09:03:38

if I don't use (env :port) to get the port number it doesn't do it anymore, but Heroku needs that since the port is set dynamically

andrea.crotti09:03:05

mm alright using (System/getenv "PORT") instead of (env :port) makes it work, would be nice to understand what is going on there though

andrea.crotti09:03:39

argh ok I found it: https://github.com/weavejester/environ/issues/48

andrea.crotti09:03:58

it was just this setting in project.clj messing everything up :hooks [environ.leiningen.hooks]
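A small sketch of the workaround mentioned above, reading the port straight from the process environment with `System/getenv`. The names `parse-port` and `server-port` and the fallback of 3000 are assumptions for illustration, not part of the original project.

```clojure
(defn parse-port
  "Returns `raw` parsed as an integer, or `default` when `raw` is nil
  or not a plain run of digits."
  [raw default]
  (if (and raw (re-matches #"\d+" raw))
    (Integer/parseInt raw)
    default))

(defn server-port
  "Reads PORT from the environment (Heroku sets it dynamically),
  falling back to 3000 for local runs."
  []
  (parse-port (System/getenv "PORT") 3000))
```

Keeping the parsing separate from the environment lookup makes the fallback behaviour easy to check at the REPL.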

manoj937211:03:27

Anyone here ?

bcbradley12:03:18

yes

scknkkrer14:03:03

Guys, I need some help with my programming life. I grew up with C++ and OOP, and then I saw Clojure and fell in love with it. The problem is I can’t really embrace functional thinking, I guess. I can write code, but I feel bad and dirty about my code. I need a mentor for this, anyone?

chillenious14:03:31

Read lots of (Clojure) code from others. I’ve been coding professionally for 19+ years, mostly OO but also several adventures in Clojure-land, and it blows my mind every time how sophisticated many of the Clojure libraries are (and how I probably wouldn’t have been able to write them). @scknkkrer

scknkkrer14:03:42

Ok, I will. Do you suggest me any way ? Like github ?

chillenious14:03:36

Besides books on Clojure, check out some of the Clojure libraries you use and try to figure out what makes them tick.

chillenious14:03:59

I don’t know, maybe others have more pointed advice for you 🙂

scknkkrer14:03:49

And I have a question about pure functions. I know this is fundamental and basic, but I want to clear something up in my head. f(x) -> x * 2 is a nice definition: call it with one numerical argument and it gives you that argument multiplied by two. What if I wanted to call another function in f? Then its return value no longer depends solely on its argument. f(x) -> g(x) * 2; ----------> is this a pure function? I am not sure I can write my functions so that they depend fully on their arguments. Am I wrong?

scknkkrer14:03:50

I am reading a lot by the way. But it's all theoretical, not practical.

donaldball14:03:28

Most people would say that f is pure iff g is also pure, though you’re right to point out that f depends on the state of var-space. In practice, most clojurists tend to the view that vars should be immutable by default.

scknkkrer14:03:25

Would I be cool using pure functions in pure functions?

scknkkrer14:03:11

@chillenious, @donaldball, I am really confused now. 😄

bcbradley14:03:14

why wouldn't its return value depend solely on its argument?

bcbradley14:03:23

f(x) looks like f(x) to everyone

bcbradley14:03:28

nobody knows it uses g(x)

bcbradley14:03:46

either f(x) is pure or it isn't

scknkkrer14:03:56

Think big functions. real-world functions are really bigger than in examples.

bcbradley14:03:05

it's even possible (although difficult) to build a pure function from impure procedures

bcbradley14:03:57

for instance, all the operations your computer defines are stateful. They all change the registers or main memory.

bcbradley14:03:08

and yet clojure and haskell are pure, and are implemented in terms of those procedures

bcbradley14:03:03

if you only use pure operations in the implementation of a function, the function itself is pure

scknkkrer14:03:03

I am concerned about my function’s purity. Can I mark it with pure tag ?

bcbradley14:03:28

if you don't that doesn't necessarily imply that the function isn't pure, (although likelihood is a different matter)

bcbradley14:03:00

the convention in clojure is to mark impure functions or "procedures" with an exclamation point after the name

bcbradley14:03:10

(defn foo! [a b] ...)

bcbradley14:03:37

(foo! 2 3)
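A short sketch of the naming convention just described. `total` and `record-total!` are hypothetical names: the first depends only on its arguments, the second also mutates an atom, so it carries the trailing `!`.

```clojure
(defn total
  "Pure: the result depends only on a and b."
  [a b]
  (+ a b))

(defn record-total!
  "Impure: returns the sum and also appends it to the atom `log`.
  The trailing ! warns callers about the mutation."
  [log a b]
  (let [sum (total a b)]
    (swap! log conj sum)
    sum))
```

The `!` is only a convention; nothing in the compiler checks it.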

scknkkrer14:03:43

Yeah, like that. So, is the second f definition pure ?

bcbradley14:03:27

which one are you referring to?

scknkkrer14:03:41

f(x) -> x * 2 -------> I know this is pure. f(x) -> g(x) * 2; ----------> But, is this a pure function ?

bcbradley14:03:53

is g(x) pure?

scknkkrer14:03:38

does it depend on that ?

bcbradley14:03:18

if g(x) takes any x and returns that same x on the first 2 invocations, but on the third and thereafter returns 0, then it is stateful and impure

bcbradley14:03:35

f(x) would have similar behavior, only it would return 2*x on the first two invocations, then 0
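The stateful g described above can be sketched with an atom that counts invocations. `make-g` and `make-f` are hypothetical helpers; each call to `make-g` creates a fresh counter, so the example does not depend on global state.

```clojure
(defn make-g
  "Returns an impure g: it hands back x on its first two invocations
  and 0 on every invocation after that."
  []
  (let [calls (atom 0)]
    (fn [x]
      (if (<= (swap! calls inc) 2) x 0))))

(defn make-f
  "Builds f(x) -> g(x) * 2 on top of the given g."
  [g]
  (fn [x] (* (g x) 2)))

;; With the stateful g, f inherits the statefulness:
;;   (let [f (make-f (make-g))] (mapv f [5 5 5 5]))
;; With a pure g such as identity, f gives the same answer on every call:
;;   (let [f (make-f identity)] (mapv f [5 5 5 5]))
```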

Prakash14:03:12

You can say that f(x) is pure iff g(x) is pure

scknkkrer14:03:14

thanks a lot.

bcbradley14:03:05

@pcbalodi that is true for f(x) -> g(x) * 2, but isn't true in the general case

bcbradley14:03:41

for instance, there could be a function g(x) that returns 0 on the first three invocations and then x on every invocation after that, so that is impure

bcbradley14:03:18

another function h(x) could be the opposite, returning x on the first three invocations and then zero on every one after that

bcbradley14:03:35

f(x) -> h(x) * g(x) is pure even though h(x) and g(x) are not
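A sketch of this cancellation argument, using hypothetical constructors `make-late-g`, `make-early-h` and `make-product-f`. It assumes g and h are each called exactly once per call to f and by nobody else, which is what keeps their counters in step.

```clojure
(defn make-late-g
  "Impure g: returns 0 on its first three invocations, then x."
  []
  (let [calls (atom 0)]
    (fn [x] (if (<= (swap! calls inc) 3) 0 x))))

(defn make-early-h
  "Impure h: returns x on its first three invocations, then 0."
  []
  (let [calls (atom 0)]
    (fn [x] (if (<= (swap! calls inc) 3) x 0))))

(defn make-product-f
  "f(x) -> h(x) * g(x), with private instances of g and h.
  On every call one of the two factors is 0, so f always returns 0."
  []
  (let [g (make-late-g)
        h (make-early-h)]
    (fn [x] (* (h x) (g x)))))
```

The return value of f never varies, although each call still advances the two hidden counters.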

mobileink20:03:27

bcbradley: i do like this example, altho i would drop "pure" and say f is a function.

bcbradley20:03:12

purity just means that a function will produce a return value corresponding to its arguments; for each set of arguments, there is one return value, and that value will always be returned if those arguments are given.

bcbradley20:03:46

it doesn't imply anything about "side effects" that might occur as a result of the function performing its operation

bcbradley20:03:17

if you want to interpret the side effects a computational implementation of a function might have as part of its return value, you could, and if you do then of course such an implementation couldn't be interpreted as a pure function

bcbradley20:03:45

but that is just a choice; there is nothing that says you have to have that interpretation

bcbradley20:03:38

it's possible to have side effects that just don't matter; they might not affect your program's operation-- for instance, causing a light to blink on and off could be a side effect

mobileink21:03:45

"purity just means that a function will produce a return value corresponding to its arguments; ..."

mobileink21:03:18

see that's the problem with the term. by your definition my implementation of "inc" that launches the missiles would count as "pure". and if "pure" just means "functional semantics", then why bother?

chillenious15:03:37

if the result depends on the number of invocations, that function isn’t pure, right? so then f(x) is pure iff g(x) doesn’t hold state

chillenious15:03:14

if g(x) is pure, f(x) is pure (because its only components are a pure function and a constant)

chillenious15:03:27

dunno, I’m not much of a theorist, but that seems logical to me

bcbradley15:03:32

you can't say a function is impure because its constituents are impure

bcbradley15:03:43

you can say a function is pure if its constituents are pure

Prakash15:03:03

yeah, I agree with bcbradley on this

chillenious15:03:08

but then you can never say a function is pure if there is any other function component to it that isn’t guaranteed to be pure

bcbradley15:03:25

i didn't invent the maths

chillenious15:03:31

yeah, ok, gotcha 🙂

bcbradley15:03:33

it might be unintuitive, but that's how it is

chillenious15:03:45

yeah, I get what you mean

bcbradley15:03:23

i'm trying to think of a way to cancel out the impure parts of opengl to provide a library of pure functions that provides the same utility

bcbradley15:03:37

obviously actually using the above principle is far easier said than done.

paulocuneo15:03:00

if you want a function to be absolutely pure, the computer should be passed as a parameter 😛 and it must produce a new computer as a result

donaldball15:03:14

mmm not sure that I agree; many hold that a pure fn both always returns the same value for the same args and has no observable side effects

Prakash15:03:24

yeah, I think the f(x)-> h(x) * g(x) follows that

donaldball15:03:44

It affects the states of the impure h and g; that’s an observable side effect

donaldball15:03:37

It’s not a super interesting debate tbqh, clojure isn’t overly concerned with purity, much more of a semantic quibble

mobileink18:03:16

fwiw, "impure function" is an oxymoron. since all real-world computations have observable side-effects (at the very least, consumption of energy), there are no genuine functions in any programming language. but operations with no semantic side effects are close enough, so we call them "pure functions". note that an operation that fits the classic defn of function - always gives same result for same input - could nonetheless have semantic effects. for example a fn that always returns 3 could update a global counter each time it is called. to the client that calls it, it looks like a pure function, especially if the client doesn't care about the global counter. But it's not even a function, strictly speaking. this can be very useful, esp. for metaprogramming, support for which is one of Clojure's greatest strengths. lots of Clojure stuff is "impure" in this way, e.g. most anything that starts with "def" will have semantic side effects altering the environment (namespace). what really matters is reliable predictability that approximates genuine functionality, rather than "purity".

bcbradley19:03:46

the whole idea of purity is just a tool, like the concept of immutability is a tool, or the idea of objects is a tool

bcbradley19:03:03

tools are useful at helping you solve specific problems

bcbradley19:03:02

insisting "nothing is pure" and leveraging that point of view to assert that the idea of purity isn't useful isn't very productive

bcbradley19:03:54

that is because obviously something is pure, or it wouldn't be possible to imagine purity

bcbradley19:03:06

in other words, the land of imagination is as valid a place as the land of you and me

bcbradley19:03:27

and that is where all mathematical and syllogistic ideas exist

bcbradley19:03:39

reality has no bearing on the utility of functional purity

mobileink19:03:14

@bcbradley personally i'm not very interested in ideological purity. i just like accurate descriptions of the way things actually work. "purity" is a buzzword that people are free to use as they wish. but it is not a technical term in cs; it's simply not very meaningful. accurate terminology is critically important, esp. for newcomers. "what is a pure function?" and "is this function pure?" are common kinds of question. imho the way to answer them is to focus on computability rather than mathematics.

mobileink20:03:26

i would add that the people behind clojure have very laudably emphasized, in so many words, that Clojure is about thinking. thinking about functional v. imperative v. oo programming is hard, imo.

mobileink20:03:31

is "range" a function? no; functions always terminate. but it always returns the same result, and it has no side effects. is it pure? does "pure" even mean anything here?

bcbradley20:03:15

range is a function, termination is a computational idea

mobileink20:03:01

bcbradley: i wouldn't be so sure about that. afaik, mathematically, functions always return finite values. but i am not a mathematician, and math can be surprising; it would not surprise me to learn that this is not always the case. but then i think we would be talking about 2 different concepts of "function".

bcbradley20:03:48

https://en.wikipedia.org/wiki/Function_(mathematics) there is nothing that says a function's domain or range have to have numbers as elements

bcbradley20:03:05

they could have names, hobbies, structures, people

bcbradley20:03:11

or infinite sets

bcbradley20:03:42

a function is just a collection of mappings from one or more things to a corresponding thing

bcbradley20:03:59

A phone book is a function

mobileink20:03:02

sure, but they always return a value. a fn could return the set Nat, but it could not return an infinity of nats.

mobileink20:03:57

i think you may be missing my point. an infinity is not a "thing".

mobileink20:03:35

the result of any function must be finite, afaik.

mobileink20:03:46

the underlying issue here is induction v. coinduction. the functions we know and love, like +, are inductively defined. "functions" like range cannot be inductively defined. there's a fundamental symmetry.

dpsutton20:03:23

why must "the result" of any function be finite?

dpsutton20:03:09

f(3) = [1,2] is certainly valid and the "result" is an uncountable set of real numbers

bcbradley20:03:19

"an infinity" is most certainly a thing. Here: F(x) => "an infinity". There is a function that takes anything and returns an infinity. If that seems a bit facetious that is because functions really are this general. It's humans and computers that are too limited, not functions.

mobileink21:03:28

busted? help me with this. obviously in a computer there is no such thing as an uncountable set. our procedures can only approximate. but more to the point: your f returns a definite set, which happens to be infinite. it does not return a list of values that keeps going. also you cannot do (first (f 3)).

dpsutton21:03:18

i was just wondering about your aversion to infinite, so i posted a counter example. but it appears i misunderstood which range function you were discussing

dpsutton21:03:43

i also can't tell if we're talking about pure mathematics or particular instances of functions in code

dpsutton21:03:51

it just looked interesting to me

mobileink21:03:58

@U11BV7MTK maybe "determinate result" would be better.

mobileink21:03:55

fwiw this topic is in my experience very complex. also fascinating, also directly relevant to oo v. functional. afaik it's all down to this v. co-this. functions: induction. routines that return infinite lists, random numbers, etc.: co-induction. oop is co-stuff. the symmetry is incredibly beautiful.

mobileink21:03:21

@U11BV7MTK no aversion to infinite here, luv the stuff! it's just not functional.

dpsutton21:03:57

is there some definition that you are using then? i don't understand why a function returning an infinite list makes it not functional

mobileink21:03:08

@bcbradley regarding "F(x) => 'an infinity'". it's obvious that F cannot return an actual infinity, right? if it could, it would not in fact be infinite. the best F can do is return some kind of procedure that lets us crawl thru the infinity.

mobileink21:03:47

@U11BV7MTK ok, let's go back to foundations. what problem did turing set out to solve? in a word, "effective procedure (for computing a function)". procedures that do not terminate are not "effective". they do not calculate the answer to the question, since they never answer. Clojure's "range" never answers since it keeps going, forever. sure you can "take" partial answers, but that's different. sorry, i'm still working on the language to explain this clearly.

mobileink21:03:38

also "range" does not return an infinite list! how could it? computers do not have infinite memories! what it returns is a computation.

mobileink21:03:48

something you can use ad infinitum to get the next element.

mobileink21:03:05

but that is not an infinity.
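The point that `(range)` hands back a computation rather than a finished list can be sketched with a hand-rolled lazy sequence. `naturals-from` is a hypothetical stand-in for `(range)`, and the `log` atom exists only to show which elements have actually been computed.

```clojure
(defn naturals-from
  "Lazy sequence n, n+1, n+2, ... Each element is recorded in the atom
  `log` at the moment it is computed."
  [log n]
  (lazy-seq
    (swap! log conj n)
    (cons n (naturals-from log (inc n)))))

(defn first-naturals
  "Realizes only the first k naturals and returns them with the log
  of what was computed to get there."
  [k]
  (let [log (atom [])
        xs  (vec (take k (naturals-from log 0)))]
    {:taken xs :computed @log}))

;; The built-in behaves the same way from the caller's side:
;;   (take 5 (range))
```

Nothing is computed when the sequence is created; elements appear only as `take` asks for them, so no infinite list ever exists in memory.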

mobileink20:03:37

"termination" is just the computational analog to "finite", no?

dpsutton20:03:32

why is range not a function?

dpsutton20:03:46

and can you say what range "function" you are talking about?

bcbradley21:03:17

I think he means (range)

mobileink21:03:26

bcbradley: yes, i meant (range), or any other "function" that returns a pseudo-infinity.

dpsutton21:03:25

oh ok

dpsutton21:03:43

i was thinking the range of a function

bcbradley21:03:14

he believes the idea of purity doesn't apply to (range) because it never terminates

bcbradley21:03:53

I believe it doesn't matter if it terminates or not because termination is a computational idea; (range) returns all integers, in order. that is a well defined return value.

mobileink21:03:13

bcbradley: no, range does not return all integers. how could it? the very idea is absurd. the idea that it does is a useful fiction, but still a fiction. if you don't think termination is important i have some bugs i'd like you to look at. ;)

bcbradley21:03:16

the fact that no computer can return it is irrelevant.

slipset21:03:28

It might be easier to talk about purity by considering referential transparency.

slipset21:03:40

If f(5) -> 25 and you can substitute any f(5) by the constant 25 in your code, then f is referentially transparent and thus "pure"
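A small sketch of that substitution test. `square` plays the role of f, and `next-id!` is a hypothetical impure counterexample that takes its counter atom as an argument.

```clojure
(defn square
  "Referentially transparent: (square 5) can be replaced by 25 anywhere."
  [x]
  (* x x))

(def with-calls     (+ (square 5) (square 5)))
(def with-constants (+ 25 25))   ; same expression after substitution

(defn next-id!
  "Not referentially transparent: each call returns a different value."
  [counter]
  (swap! counter inc))

(defn sum-of-two-ids
  "Calls next-id! twice on a fresh counter and adds the results (1 + 2).
  Replacing both calls with the first result would give 1 + 1 instead."
  []
  (let [counter (atom 0)]
    (+ (next-id! counter) (next-id! counter))))
```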

slipset21:03:25

There is of course also idempotency, which touches on the same subject.

slipset21:03:17

Further, the composition of a bunch of pure functions yields another pure function, but the composition of a bunch of pure functions and at least one impure function yields an impure function.

mobileink21:03:08

@slipset if i knew what "pure" means, i might agree. but @bcbradley gave an excellent example of a pure fn composed of impure fns.

slipset21:03:11

https://en.m.wikipedia.org/wiki/Pure_function

slipset22:03:52

@mobileink I fail to see how the composition of impure functions can yield a pure function.

mobileink22:03:33

@bcbradley gave a good example. https://clojurians.slack.com/archives/clojure/p1489244285027879

shdzzl22:03:13

What I'm getting from this is: You can compose a function that is referentially transparent from functions that aren't. You can't compose a function that has no semantically observable side effects from functions that do have them.

mobileink22:03:00

or howsabout this: i have a fn, my-inc. it takes a nbr and increments it. but it also has a metadatum, which keeps track of all the times it has been called. functional? pure?

shdzzl22:03:33

Updating the call counter is a semantically observable side effect.

mobileink22:03:42

but only for things that know about it, at the meta-level.

mobileink22:03:46

iow, it's not global, nobody can accidentally get screwed up by it.

mobileink22:03:17

like interns in a namespace, kinda.

keymone22:03:17

why is using (locking …) problematic within go blocks?

Alex Miller (Clojure team)22:03:38

because it can block a go thread for an arbitrary amount of time

Alex Miller (Clojure team)22:03:00

and there are a fixed number of go threads

keymone22:03:16

ok, then what is the best way to do this: (go-loop [] (locking *out* (println (<! channel)))) ?

Alex Miller (Clojure team)22:03:54

why do you want to do that?

keymone22:03:21

i’m just testing stuff in repl and want to print all messages on a channel

Alex Miller (Clojure team)22:03:32

you don’t need to lock *out* to write to it

keymone22:03:56

i get interleaved outputs when i don’t lock

Alex Miller (Clojure team)22:03:17

so build complete strings before you println

Alex Miller (Clojure team)22:03:25

(println (str …))

keymone22:03:39

yeah, it’s the newline characters that are interleaved

Alex Miller (Clojure team)22:03:52

(print (str … "\n"))

keymone22:03:01

ok, i’ll try that

keymone22:03:03

thanks

Alex Miller (Clojure team)22:03:32

or alternately send to a channel (or an agent) and have the channel or agent read and print messages one by one

Alex Miller (Clojure team)22:03:45

in other words, reduce the number of writers to one

keymone22:03:33

right, that’s an option, didn’t do it because then i’d need to change all println calls to put!s or something. (print (str)) seems to work, thanks!

Alex Miller (Clojure team)22:03:06

👍
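A sketch of the "build the complete string first" suggestion. `print-line!` is a hypothetical helper; it joins its arguments and the trailing newline into one string so that the whole line goes to `*out*` in a single `print` call, whereas `println` writes the newline as a separate step.

```clojure
(require '[clojure.string :as str])

(defn print-line!
  "Prints `parts` separated by spaces, followed by a newline,
  as one string in one call to print."
  [& parts]
  (print (str (str/join " " parts) "\n"))
  (flush))

;; Usage from several threads or go blocks:
;;   (print-line! "got" 42)
```

This only narrows the window for interleaving; funnelling all output through a single writer, as suggested above, is the stricter option.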

mobileink22:03:18

that is, the counter is not a global, it's attached to the fn. to get at it you have to really want to. ;)

shdzzl22:03:19

@mobileink The fact that it's metadata doesn't change anything from my perspective. It's a value outside of its arguments, that you are affecting in an observable way by calling the function.
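One way the my-inc example could look, as a sketch. Metadata maps are immutable, so this version assumes the metadata holds an atom under a hypothetical `:calls` key, and the function closes over the same atom.

```clojure
(def my-inc
  (let [calls (atom 0)]
    (with-meta
      (fn [n]
        (swap! calls inc)   ; side effect: bump the hidden call counter
        (inc n))
      {:calls calls})))

(defn call-count
  "Reads the counter out of the function's metadata."
  [f]
  @(:calls (meta f)))

;; (my-inc 1) always returns 2, but (call-count my-inc) changes
;; with every call, which is the observable side effect at issue.
```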

mobileink22:03:13

true. but there's a big diff between metaprogramming and app programming. that's one of the things i love about clojure. app programmers don't need to worry about the meta stuff.

shdzzl22:03:02

If you were limiting the domain of your semantics to exclude metadata, it would be a pure function.

mobileink22:03:25

a lib can use metadata on namespace, vars, etc. to enable amazing things, and the users need never know about it.

mobileink22:03:36

e.g. i have a lib in the works that uses tons of metadata and other metaprogramming stuff, but it's all completely hidden. it all looks like "pure" functional stuff, but it isn't really - just like clojure.

shdzzl22:03:10

True, Clojure lets us do amazing things. But users can still observe those things if they want to, with regards to the discussion about purity and what constitutes an observable side effect. So it could be considered disingenuous to say a function like that is pure. I think the Clojure core team are pretty clear about the fact that large parts of the language aren't pure. There's a distinction between pure and stateful transducers, for example. And core.async.

mobileink22:03:05

@shdzzl heh. nobody can ever say the core Clojure folks are ideologues, thank goodness! fwiw, "purity" is just not in my vocab, wrt programming. i think it does more harm than good. this routine is functional (meaning, it approximates functionality), this one isn't.

shdzzl23:03:41

@mobileink You literally can't do anything with your code except make the processor hot without a touch of impurity. One of the really tricky parts of language/lib design in the functional space is how to introduce side effects into your base assumption of functional purity. Clojure does it really well IMO. Purity is a useful term when considering functions that are run to get their value, but that might be run multiple times or asynchronously. In your example of a function that updates a counter in metadata, if you were to use that to try to update a ref in a dosync, the counter might be updated multiple times even though you only intended to run the function once. So the semantics have to be clear.

qqq23:03:44

where can I read up on how `(symbol) and '(symbol) differ during macro expansion

qqq23:03:54

it seems like ' leaves it as is, whereas ` tries to qualify it or something

shdzzl23:03:12

@qqq http://www.braveclojure.com/writing-macros/ The section on syntax quoting. That's where I learnt about it. The official docs for syntax quoting: https://clojure.org/reference/reader#syntax-quote
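A quick sketch of the difference in question. Plain quote leaves a symbol exactly as written, while syntax quote resolves it to a namespace-qualified symbol at read time. `my-local-name` is a made-up symbol that resolves to nothing, so it is qualified with whatever namespace is current when the form is read.

```clojure
;; Plain quote: the symbol stays unqualified.
(def plain 'map)

;; Syntax quote: a symbol that resolves to a var gets that var's namespace.
(def qualified `map)

;; Syntax quote on an unresolvable symbol: qualified with the current namespace.
(def unresolved `my-local-name)

(defn describe
  "Returns the namespace and name parts of a symbol."
  [sym]
  {:ns (namespace sym) :name (name sym)})
```

Inside a macro this is why a syntax-quoted template refers to `clojure.core/map` even if the caller has shadowed `map` locally.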

seylerius23:03:03

Okay. Is there a right enough way to integrate compojure and cider such that C-c M-j launches the test server and inserts the REPL into the appropriate namespace?

qqq23:03:40

@shdzzl: official docs link cleared everything up; thanks!

qqq23:03:13

I'm aware of read-string and load-string and slurp; but is there read-file or load-file? (I want to keep the metadata of file name / line number)

qqq23:03:25

how do I modify

(defn read-from-file-with-trusted-contents [filename]
  (with-open [r (java.io.PushbackReader.
                 (clojure.java.io/reader filename))]
    (binding [*read-eval* false]
      (read r))))

from ClojureDocs to read until EOF instead of only the first form ?
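One possible shape for this, as a sketch rather than a definitive answer: call `read` repeatedly with an EOF sentinel and stop when the sentinel comes back. `read-all-forms` is a hypothetical name, and wrapping the reader in `clojure.lang.LineNumberingPushbackReader` is an assumption made so that list forms carry `:line` and `:column` metadata.

```clojure
(require '[clojure.java.io :as io])

(defn read-all-forms
  "Reads every top-level form from `filename` until EOF and returns
  them as a vector. The file contents are assumed to be trusted."
  [filename]
  (with-open [r (clojure.lang.LineNumberingPushbackReader.
                  (io/reader filename))]
    (binding [*read-eval* false]
      (let [eof (Object.)]   ; unique sentinel, cannot collide with file content
        (loop [forms []]
          ;; (read stream eof-error? eof-value): return eof instead of throwing.
          (let [form (read r false eof)]
            (if (identical? form eof)
              forms
              (recur (conj forms form)))))))))
```

The loop is eager on purpose: a lazy sequence could escape `with-open` and try to read from a reader that has already been closed.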
