#beginners
2018-07-16
stardiviner00:07:33

I want to write a simple web crawler. How do I send a request with a cookie in Clojure?

lispyclouds05:07:29

@vale This is how I get around the issue:

(def data-source
  (when-not *compile-files*
    (make-datasource options)))
*compile-files* is set when AOT is happening and prevents the eval.

valerauko05:07:05

is that some magic variable or do i need to set it myself?

lispyclouds05:07:44

It's a special variable set in the environment by Clojure. Have a look at this: https://stackoverflow.com/questions/1986961/how-is-the-var-name-naming-convention-used-in-clojure

valerauko05:07:11

cool, thanks!

Keith Harper15:07:49

Is there any way to use https://github.com/weavejester/ns-tracker to trigger a reload of a namespace whenever a configuration file changes? Say I have a “resources” directory with “selectors.edn” in it, is there a way to automatically reload the namespace that reads that config file in whenever it is changed?

Keith Harper15:07:52

Never mind, I found a small library that I was able to use to trigger a reload of the config namespace when a resource file changes: https://github.com/pocket7878/file-tracker

itsalitee18:07:26

Hi everyone, I'd like to know how you are setting up your development environment, particularly if you are using Visual Studio Code. Any input would be greatly appreciated 🙂

bj18:07:44

I think Calva is the way to go with VSC

bj18:07:10

But personally I'm using Cursive now after being on Atom for about a year and a half

uwo19:07:36

I’m trying to pipeline a stream of XML. What I have was working for smaller files, and I was under the impression that I was streaming, but everything locks up when I pass it a very large xml file. Here’s a snippet of my let block

content [(clojure.data.xml/parse (java.io.FileInputStream. filepath))]
from (a/chan 1 (comp (mapcat :content)
                     (filter #(-> % :tag (= :TagOfInterest)))))
_ (a/onto-chan from content)
to (a/chan 10)
_ (a/pipeline 1 to xf from true ex-handler)

uwo19:07:03

it never gets into the pipeline transducer xf

noisesmith20:07:00

I assume you are simply consuming from the to channel in a loop?

noisesmith20:07:07

also, what does xf look like?

uwo20:07:21

xf just does a computation transform: (map (fn [x] (some-transform x)))

uwo20:07:31

when you say consume, do you mean from to? If so I’m just piping like this (also snippet from let block)

res-ch (a/promise-chan)
into-ch (a/into [] to)
_ (a/pipe into-ch res-ch)

hiredman20:07:04

is the xml parsing fully realized before going through the channels?

uwo20:07:49

when it was working on smaller sets I assumed it was, but now that it locks up on larger files I assume I’m missing something

hiredman20:07:05

the doc string for clojure.data.xml/parse says it is lazy

uwo20:07:29

> “Parses the source, which can be an InputStream or Reader, and returns a lazy tree of Element records. Accepts key pairs with XMLInputFactory options, see http://docs.oracle.com/javase/6/docs/api/javax/xml/stream/XMLInputFactory.html and xml-input-factory-props for more information. Defaults coalescing true.”

uwo20:07:35

whoops sorry, yeah, what you said

uwo20:07:45

exactly, so not sure why it’s freezing up now 🤔

hiredman20:07:06

because you are doing io on the core.async threadpool

hiredman20:07:08

I mean, I dunno, I am not sure it should be freezing, but io definitely should not be done on the core.async threadpool, processing xml records which are lazily realized from a file like that is a no go

uwo20:07:22

so I was under the impression that the reading and parsing was not happening on the core.async threadpool. parse returns a lazy sequence, which is then spooled onto the from channel with onto-chan, which should get back pressure because from has a finite buffer, 1 in this case

hiredman20:07:39

there are a lot of faulty assumptions there

hiredman20:07:22

onto-chan doesn't examine the contents of the collection's items, so the result of the xml parse is not traversed and not realized by onto-chan

hiredman20:07:51

onto-chan is also implemented using a go loop, so even if it did, it would be doing it on the core.async thread pool

hiredman20:07:49

and xml is a linearized tree structure, so parsing it lazily can get weird

uwo20:07:30

ahh. thanks @hiredman, that helps

hiredman20:07:18

I am not sure that would cause it to lock up, because the io from the file, while technically blocking, should be guaranteed to make forward progress and eventually complete

uwo20:07:53

hah! i got it working

uwo20:07:34

^ s/it’s/its

uwo20:07:14

whoops. messed up last code block snippet

noisesmith21:07:59

calling mapcat on [x] is slightly odd - would you have multiple items in your real use case?

noisesmith21:07:05

pipeline has a variant, pipeline-blocking, that safely performs ops in a threadpool outside the core.async go block pool, but you would still need to consume the channel from pipeline to drive consumption
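
For reference, here is a minimal sketch of the pipeline-blocking variant mentioned above, with a consumer on the output channel. The names slow-transform and run-pipeline are hypothetical stand-ins and not part of uwo's code; the parallelism of 2 and the buffer size of 10 are arbitrary choices.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical stand-in for a transform that blocks (e.g. one that does I/O).
(defn slow-transform [x]
  (Thread/sleep 10)
  (inc x))

;; Hypothetical helper: feeds `inputs` through the transform and collects the results.
(defn run-pipeline [inputs]
  (let [from (a/to-chan inputs)   ; input channel, closed once inputs are exhausted
        to   (a/chan 10)]         ; output channel
    ;; Run the transducer on 2 dedicated threads instead of the go-block pool.
    (a/pipeline-blocking 2 to (map slow-transform) from)
    ;; Something has to take from `to`; without a consumer the pipeline
    ;; stalls as soon as its buffers fill up.
    (a/<!! (a/into [] to))))

(run-pipeline (range 5))
```

The blocking take at the end is what drives consumption here. In a real program the consumer would more likely be a loop or another channel operation downstream.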

uwo21:07:00

ah, yeah, sorry. that’s left over from when I was exploring via (def children (mapcat :content)), where it’s nicer to use mapcat when you’re calling children repeatedly. totally unnecessary in this context though, you’re right.

uwo21:07:09

in this case there’s currently no io going on in the transducer, xf, that’s being passed to a/pipeline, just a simple projection

noisesmith21:07:42

OK, right, I'm just saying don't expect xf to run against the data if you aren't consuming from the pipeline

noisesmith21:07:26

which you may or may not be doing given what you shared, just thought I'd mention

uwo21:07:28

ah, right you mean from the end of the pipeline? yeah, sorry. super snipped; we’re pulling from to

uwo22:07:03

@hiredman even though I got it working, I am going to take your advice and make a version of onto-chan using thread instead of go

hiredman22:07:22

none of what I have seen fully realizes the xml structure (as far as I can tell), it won't matter how the structure is put on to the channel if some go block down the line traverses down the xml tree to some part that hasn't been realized yet and then does io
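
The thread-based onto-chan that uwo mentions could be sketched roughly as follows. The name onto-chan-thread is hypothetical and this is not uwo's actual code; it only assumes the standard core.async functions thread, >!! and close!.

```clojure
(require '[clojure.core.async :as a])

(defn onto-chan-thread
  "Hypothetical helper: like onto-chan, but walks coll on a dedicated thread,
  so blocking I/O triggered by realizing a lazy seq stays off the go-block pool."
  ([ch coll] (onto-chan-thread ch coll true))
  ([ch coll close?]
   (a/thread
     (loop [xs (seq coll)]
       (if (and xs (a/>!! ch (first xs)))   ; >!! returns false once ch is closed
         (recur (next xs))
         (when close? (a/close! ch)))))))

;; Usage: put a lazy seq onto a small buffered channel and collect it.
(let [ch (a/chan 1)]
  (onto-chan-thread ch (map inc (range 3)))
  (a/<!! (a/into [] ch)))
```

As hiredman points out, this only moves the top-level traversal off the go pool. If the items themselves are lazy subtrees, whichever block later walks into an unrealized part will still do the io there. Newer core.async releases also include onto-chan!!, which is intended for this same blocking case.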