#beginners
2024-05-15
Shawn Northrop 16:05:53

Hi, I have a question around working with lazy seqs. I am using the clojure/data.csv library which talks about laziness and the ability to work with large files because of this. In my code I am reading a csv, using map to update / transform data and eventually writing the rows back into a new csv. In addition, I want to parse each row and add another row to the seq based on the existence of a non-null value. For example my input.csv

id, price, discount
1, 10, 5
2, 20, null
The output.csv would be
id, type, value
1, 'total', 10
1, 'discount', 5
2, 'total', 20
In this example the discount column is parsed and a new row is created with the same id. This is the code I have; it works, but I use reduce, which I do not believe is lazy:
(defn add-discount-rows
  [rows]
  (reduce (fn [acc row]
            (if (contains-and-not-empty? row :discount)
              (conj acc row (create-discount-row row))
              (conj acc row)))
          '()
          rows))
*contains-and-not-empty? is a custom fn that checks for the existence of the column, and ensures it is not null and not an empty string.

A couple of questions:
1. Am I correct in assuming reduce is not lazy?
2. How can I go about making this lazy, or is there a better approach to this?
   a. I do not need these rows to be in order and they can be 2 separate seqs. The only requirement is that all rows are written to the csv.
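On the first question: reduce is eager in Clojure, so it consumes its whole input before returning. A minimal sketch of how to observe that, assuming made-up rows and a hypothetical counter (rows-read and counting-rows are illustration names, not part of the original code):

```clojure
;; Hypothetical instrumentation: count how many input rows get realized.
(def rows-read (atom 0))

(defn counting-rows
  "Lazy seq of n made-up rows; bumps rows-read as each row is realized."
  [n]
  (map (fn [id] (swap! rows-read inc) {:id id})
       (range n)))

;; reduce only returns after walking every row, even though we look at
;; a single element of the result afterwards.
(def reduced
  (reduce (fn [acc row] (conj acc row)) '() (counting-rows 1000)))

(first reduced) ; => {:id 999}, because conj on a list prepends
@rows-read      ; => 1000
```

The same sketch shows a side effect of accumulating into '(): rows come out in reverse order, which is harmless here since ordering is not a requirement.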

Shawn Northrop 16:05:37

Here is a shortened example of code that composes the functionality together:

(defn run
   [input output]
   (with-open [reader (io/reader input)
               writer (io/writer output)]
     (->> (csv/read-csv reader)
          transform-data
          add-discount-rows
          (csv/write-csv writer))))

Jason Bullers 16:05:52

Maybe mapcat would work here? Your mapping function would take a row and emit a sequence of one or more rows that then get concatenated at the end

Shawn Northrop 17:05:04

Cool, I'll check that out

Shawn Northrop 21:05:18

Thanks @Jason Bullers, that seems to have worked out! I updated the code to the following:

(defn add-discount-rows
  [rows]
  (mapcat (fn [row]
            (if (contains-and-not-empty? row :discount)
              (list row (create-discount-row row))
              (list row)))
          rows))
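One way to check that the mapcat version stays lazy is to feed it an infinite seq and take only a few rows. In this sketch, contains-and-not-empty? and create-discount-row are hypothetical stand-ins for the helpers described above, rows are assumed to be maps, and infinite-rows is made-up input:

```clojure
;; Hypothetical stand-in: key is present, value is non-nil and not "".
(defn contains-and-not-empty?
  [row k]
  (let [v (get row k)]
    (and (some? v) (not= "" v))))

;; Hypothetical stand-in: build the extra row for a discount.
(defn create-discount-row
  [row]
  {:id (:id row) :type "discount" :value (:discount row)})

(defn add-discount-rows
  [rows]
  (mapcat (fn [row]
            (if (contains-and-not-empty? row :discount)
              (list row (create-discount-row row))
              (list row)))
          rows))

;; Infinite, made-up input: odd ids carry a discount, even ids do not.
(def infinite-rows
  (map (fn [id] {:id id :price (* 10 id) :discount (when (odd? id) 5)})
       (iterate inc 1)))

;; This returns because only the rows needed for the first three
;; results are realized; a reduce over the same input would never finish.
(def first-three (doall (take 3 (add-discount-rows infinite-rows))))

(map :id first-three) ; => (1 1 2)
```

mapcat may realize a few input rows ahead of what is consumed, but it does not hold the whole file in memory, which is the property that matters for large CSVs.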

👍 1
Jason Bullers 18:05:22

I was reading some stuff about Babashka and saw mention that it's normal to use require in scripts to pull in useful namespaces. I've also heard in "standard" Clojure, you'd only really use the :require directive of the ns macro to do the same. Is there a reason to not use the require function in my Clojure ns? Can bad things happen?

Alex Miller (Clojure team) 18:05:16

nothing bad will happen

hiredman 18:05:21

it is more dynamic and less visible to tools that do static analysis

Alex Miller (Clojure team) 18:05:31

the benefit of putting it in ns is lots of tools use that to understand what you depend on

☝️ 1
Samuel Ludwig 18:05:59

For these reasons you'll most frequently see require in quick one-off, single-file scripts, and :require in more robust babashka projects

👍 1
Jason Bullers 18:05:00

I see. So it's about discoverability, but semantically they're the same. I can certainly imagine the headache of requires strewn around a large source file

hiredman 18:05:42

require is also slightly different between different variants of clojure

hiredman 18:05:12

on jvm clojure require is just a normal function you can call anywhere (often leading to weird results)
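A small sketch of what "call it anywhere" can look like on JVM Clojure, and why it gets awkward. join-with-commas is a hypothetical name. The namespace is loaded at call time, so the function looks the var up with resolve instead of using an alias, because an alias created inside the body would not exist yet when the body is compiled:

```clojure
(defn join-with-commas
  "Loads clojure.string on first use instead of declaring it in ns."
  [xs]
  ;; A plain function call: nothing in the ns form reveals this dependency.
  (require 'clojure.string)
  ;; Look the var up at runtime, after the namespace has been loaded.
  ((resolve 'clojure.string/join) "," xs))

(join-with-commas [1 2 3]) ; => "1,2,3"
```

A tool that only reads the ns form has no way to see that this code depends on clojure.string.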

hiredman 18:05:52

I am not sure if clojurescript even has require? I think it only has the :require directive in ns forms

hiredman 18:05:58

I am not sure if babashka's require is a function, or it might be a macro or a special form

Alex Miller (Clojure team) 18:05:24

the ns :require does literally turn into the same call to require eventually (in clojure)

👍 1
borkdude 19:05:46

require is a function in bb

didibus 23:05:18

I believe clj-kondo understands calls to require, though, so at least the most common static analyzer does. @borkdude to confirm?

borkdude 09:05:38

Yes, this will have no problems:

(ns scratch)

(require '[clojure.string :as str])

(str/join "," [1 2 3])
As long as you use it on the top level and use quoted vectors, it will be treated the same way as within the ns form

jpmonettas 12:05:53

in a way the ns macro is sugar for (do (in-ns ....) (require ...))

(macroexpand
 '(ns foo
    (:require [clojure.string :as str])))

(do
  (clojure.core/in-ns 'foo)
  (clojure.core/with-loading-context
    (clojure.core/refer 'clojure.core)
    (clojure.core/require '[clojure.string :as str]))
  ...)
but as everybody said, it is important sugar, since a bunch of tools (like tools.namespace) parse the unexpanded form

borkdude 12:05:21

In ClojureScript the above doesn't hold

jpmonettas 12:05:48

oh yeah, I was referring to Clojure only

stopa 20:05:31

Hey team, I am using ManagementFactory to report some metrics periodically. I am getting warnings like this:

WARNING: Illegal reflective access by clojure.lang.InjectedInvoker/0x00000008000ab840 (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorImpl.getName()
WARNING: Illegal reflective access by clojure.lang.Reflector (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorImpl.getName()
WARNING: Illegal reflective access by clojure.lang.InjectedInvoker/0x00000008000ab840 (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorImpl.getCollectionCount()
WARNING: Illegal reflective access by clojure.lang.Reflector (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorImpl.getCollectionCount()
WARNING: Illegal reflective access by clojure.lang.InjectedInvoker/0x00000008000ab840 (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorImpl.getName()
WARNING: Illegal reflective access by clojure.lang.Reflector (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorImpl.getName()
WARNING: Illegal reflective access by clojure.lang.InjectedInvoker/0x00000008000ab840 (file:/Users/stopa/.m2/repository/org/clojure/clojure/1.11.1/clojure-1.11.1.jar) to method sun.management.GarbageCollectorI
This is the code that is causing it:
  (let [memory (ManagementFactory/getMemoryMXBean)
        gcs (ManagementFactory/getGarbageCollectorMXBeans)
        thread (ManagementFactory/getThreadMXBean)
        solo-executor (clojure.lang.Agent/soloExecutor)
        pooled-executor (clojure.lang.Agent/pooledExecutor)
        metrics (flatten
                 [{:path "gauges.calculated_at_ms"
                   :value (System/currentTimeMillis)}
                  {:path "jvm.memory.total.init"
                   :value (+ (-> memory .getHeapMemoryUsage .getInit)
                             (-> memory .getNonHeapMemoryUsage .getInit))}
                  {:path "jvm.memory.total.used"
                   :value (+ (-> memory .getHeapMemoryUsage .getUsed)
                             (-> memory .getNonHeapMemoryUsage .getUsed))}
                  {:path "jvm.memory.total.max"
                   :value (+ (-> memory .getHeapMemoryUsage .getMax)
                             (-> memory .getNonHeapMemoryUsage .getMax))}
                  {:path "jvm.memory.total.committed"
                   :value (+ (-> memory .getHeapMemoryUsage .getCommitted)
                             (-> memory .getNonHeapMemoryUsage .getCommitted))}
                  {:path "jvm.memory.heap.init"
                   :value (-> memory .getHeapMemoryUsage .getInit)}
                  {:path "jvm.memory.heap.used"
                   :value (-> memory .getHeapMemoryUsage .getUsed)}
                  {:path "jvm.memory.heap.max"
                   :value (-> memory .getHeapMemoryUsage .getMax)}
                  {:path "jvm.memory.heap.committed"
                   :value (-> memory .getHeapMemoryUsage .getCommitted)}
                  {:path "jvm.memory.non-heap.init"
                   :value (-> memory .getNonHeapMemoryUsage .getInit)}
                  {:path "jvm.memory.non-heap.used"
                   :value (-> memory .getNonHeapMemoryUsage .getUsed)}
                  {:path "jvm.memory.non-heap.max"
                   :value (-> memory .getNonHeapMemoryUsage .getMax)}
                  {:path "jvm.memory.non-heap.committed"
                   :value (-> memory .getNonHeapMemoryUsage .getCommitted)}
                  (for [gc gcs]
                    [{:path (str "jvm.gc." (-> gc .getName str/lower-case) ".count")
                      :value (-> gc .getCollectionCount)}
                     {:path (str "jvm.gc." (-> gc .getName str/lower-case) ".time")
                      :value (-> gc .getCollectionTime)}])
                  {:path "jvm.thread.count"
                   :value (-> thread .getThreadCount)}
                  {:path "jvm.thread.daemon.count"
                   :value (-> thread .getDaemonThreadCount)}
                  (for [thread-state (Thread$State/values)]
                    {:path (str "jvm.thread." (-> thread-state str str/lower-case) ".count")
                     :value (count
                             (filter #(and % (= thread-state (.getThreadState %)))
                                     (.getThreadInfo thread
                                                     (-> thread .getAllThreadIds))))})
                  (for [[executor description]
                        [[solo-executor "agent-pool.send-off"]
                         [pooled-executor "agent-pool.send"]]]
                    [{:path (str "jvm." description ".queue-depth")
                      :value (-> executor .getQueue .size)}
                     {:path (str "jvm." description ".active")
                      :value (.getActiveCount executor)}
                     {:path (str "jvm." description ".tasks")
                      :value (.getTaskCount executor)}
                     {:path (str "jvm." description ".completed-tasks")
                      :value (.getCompletedTaskCount executor)}
                     {:path (str "jvm." description ".size")
                      :value (.getPoolSize executor)}
                     {:path (str "jvm." description ".core-size")
                      :value (.getCorePoolSize executor)}
                     {:path (str "jvm." description ".largest-size")
                      :value (.getLargestPoolSize executor)}
                     {:path (str "jvm." description ".maximum-size")
                      :value (.getMaximumPoolSize executor)}])])]
    (into {} (map (juxt :path :value) metrics)))
I originally got this from https://github.com/PrecursorApp/precursor/blob/master/src/pc/gauges.clj . Looking around, I see that this may be because Java has changed its internal API since the code was first written. Is there a resource I can look at to fix the warning?

stopa 20:05:07

Bam

(for [^GarbageCollectorMXBean gc gcs]
  [{:path (str "jvm.gc." (-> gc .getName str/lower-case) ".count")
    :value (-> gc .getCollectionCount)}
   {:path (str "jvm.gc." (-> gc .getName str/lower-case) ".time")
    :value (-> gc .getCollectionTime)}])
Did the trick. Thanks @U0NCTKEV8!
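The hint helps because the compiler can then emit a direct call through the public GarbageCollectorMXBean interface instead of reflecting on the runtime class (sun.management.GarbageCollectorImpl in the warnings above). A minimal self-contained sketch of the same idea follows; gc-metrics is a hypothetical name, and the import and require are assumptions since the original ns form is not shown. Evaluating (set! *warn-on-reflection* true) at the REPL is one way to find other unhinted calls.

```clojure
(import '(java.lang.management ManagementFactory GarbageCollectorMXBean))
(require '[clojure.string :as str])

(defn gc-metrics
  "Map of metric path -> value for each garbage collector bean."
  []
  (into {}
        ;; The ^GarbageCollectorMXBean hint resolves .getName and friends
        ;; at compile time, so no reflective lookup happens at runtime.
        (mapcat (fn [^GarbageCollectorMXBean gc]
                  (let [gc-name (str/lower-case (.getName gc))]
                    [[(str "jvm.gc." gc-name ".count") (.getCollectionCount gc)]
                     [(str "jvm.gc." gc-name ".time") (.getCollectionTime gc)]])))
        (ManagementFactory/getGarbageCollectorMXBeans)))

(gc-metrics)
```

The keys depend on which collectors the running JVM uses, so no fixed output is shown here.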

emccue 22:05:34

those will be hard errors in 16+

emccue 22:05:07

> The first thing to note here is that this is a warning. Java 9 through all current releases will permit the call to be made and the code will continue to work.

This line should probably be updated