#clojure
2020-11-03
fadrian03:11:36

I'm trying to transform a collection of data items into XML. Each item represents one instance of one of about 500 different classes, which will be read into a Java program via JAXB (what can I say, this system was designed in the early 2000s). I have the .xsd files for these classes. Is there any way to easily import these .xsd files into Clojure, turning the types described therein into corresponding records? I also need to do the reverse: reading the XML data objects into Clojure. Again, is there an easy way to do this? Alternatively, is there a simple JAXB wrapper that I can use for these operations?

Ed19:11:23

I have had some success with clojure.data.zip in the past (https://github.com/clojure/data.zip) but it seems there's this which looks newer https://github.com/clojure/data.xml and may be more what you're looking for? Writing a custom JAXB handler probably isn't much work though ...
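
For reference, a minimal sketch of a round trip with clojure.data.xml, assuming the library is on the classpath. The element names (`item`, `name`) and the sample document are made up for illustration; nothing here reads .xsd files or generates records.

```clojure
(require '[clojure.data.xml :as xml])

;; Parse an XML string into nested element maps with :tag, :attrs and :content.
;; The document below is a hypothetical example, not one of the real classes.
(def parsed
  (xml/parse-str "<item id=\"1\"><name>widget</name></item>"))

;; Pull the text of the first <name> child out of the parsed tree.
(def item-name
  (->> (:content parsed)
       (filter #(= :name (:tag %)))
       first
       :content
       first))

;; Build an element from Clojure data and serialize it back to a string.
(def emitted
  (xml/emit-str (xml/element :item {:id "2"}
                             (xml/element :name {} "gadget"))))
```

This only covers plain XML in and out; mapping 500 schema types onto records would still need something on top, such as the generated JAXB classes mentioned in the next message.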

Ed19:11:30

equally though, you should be able to generate the java classes, and use them directly

Ed19:11:37

what I'd recommend you do probably depends on the transformations you want to make 😉

fadrian20:11:27

Thank you all. I'm considering the suggestions you've all given. They've been helpful.

suren05:11:45

Hi guys, I wrote a Clojure package for composing SQL statements. I have been using it for my internal project and I am happy with it, mainly because it has a lower learning curve and I get to write mostly SQL statements. Check it out here: https://github.com/ludbek/sql-compose The package has not been published yet due to the following artifact issue during deploy: https://stackoverflow.com/questions/64654868/deploy-clojure-packages-to-clojar Cheers

phronmophobic05:11:36

did you run lein jar first?

suren06:11:47

Nope, I didn't. I did this time and I still get the same error.

phronmophobic06:11:14

are you using up to date versions of lein and clojure?

suren06:11:24

$ lein --version
Leiningen 2.9.4 on Java 10.0.1 Java HotSpot(TM) 64-Bit Server VM
Clojure version is 1.10.1

phronmophobic06:11:56

i'm not actually sure what the issue is. from the error message it seems like it's having trouble finding files to deploy. is your project.clj file shareable?

suren02:11:11

The issue has been fixed. Not sure what went wrong with lein. Instead of using lein I used mvn to deploy the package. The blog post below was helpful. https://oli.me.uk/clojure-projects-from-scratch/

Jovannie Landero19:11:15

what channel should i use for questions regarding generating csv files?

noisesmith19:11:07

here should be fine, are you using clojure.data.csv?

Jovannie Landero19:11:14

yes. i'm trying to generate a csv with a large amount of data. my function works fine with smaller datasets but gets stuck when dealing with a large dataset. was wondering if anyone ran into this issue using clojure.data.csv

ghadi19:11:35

Paste your csv writing code @jovannie.landero396

Jovannie Landero19:11:49

(defn maps->csv
  "Converts list of hash-maps to CSV."
  [file data]
  (let [filename (str "resources/csvs/" file "-" (time/instant) ".csv")
        headers (->> data
                     first
                     keys
                     (map name)
                     (map str/capitalize))
        row-data (map vals data)]
    (with-open [writer (io/writer filename)]
      (csv/write-csv writer (cons headers row-data)))))

andy.fingerhut20:11:47

I believe such code will cause the entire sequence row-data to be realized in memory all at once, with none of its beginning elements becoming GC'able garbage, because of the row-data reference to the head of the list. If all of that data is large compared to your JVM's max heap size, that will cause it to go slower and slower, GC'ing more frequently, until eventually it could cause an OutOfMemory exception.

andy.fingerhut20:11:31

I also believe there are small variations of your code that should avoid "holding onto the head", e.g. do not declare row-data at all, and use the expression (map vals data) instead of the one place where row-data occurs now. Note that if data is a lazy sequence and its head is held onto outside of the function maps->csv, then it could also end up consuming lots of non-GC'able memory.

andy.fingerhut20:11:59

(non GC'able until some later time, when your code no longer keeps a reference to the beginning of such a sequence)
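
The variation described above might look roughly like the sketch below. It assumes clojure.data.csv is on the classpath, and it takes the output path as a parameter (`path` is a hypothetical stand-in for the filename the original function builds itself). Whether dropping the local actually changes memory behavior depends on how the compiler clears locals, so treat this as something to measure rather than a guaranteed fix.

```clojure
(require '[clojure.data.csv :as csv]
         '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn maps->csv
  "Writes a seq of hash-maps to the CSV file at `path`.
  `path` is a hypothetical parameter used to keep the sketch self-contained."
  [path data]
  (let [headers (->> (first data)
                     keys
                     (map name)
                     (map str/capitalize))]
    (with-open [writer (io/writer path)]
      ;; (map vals data) is built inline, so no extra local is bound
      ;; to the head of the row sequence while it is being written.
      (csv/write-csv writer (cons headers (map vals data))))))
```

As in the original, the header row comes from the keys of the first map, so every map is assumed to have the same keys in the same order.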

noisesmith20:11:30

I thought locals clearing was smart about escape

Jovannie Landero20:11:01

hm yep data is a lazy sequence as well and i believe its head is held onto outside of the function but just to make sure we're on the same page.. what do you mean by its head is held onto outside of the function?

noisesmith20:11:01

eg. this runs forever but doesn't increase heap usage

user=> (let [r (range)] (run! identity r))

noisesmith20:11:39

@jovannie.landero396 if there's something outside maps->csv that holds data, that means the realized values can't be garbage collected

noisesmith20:11:43

so it's a potential heap bomb
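
A small sketch of the difference, using only clojure.core. The names `consume` and `rows` are hypothetical, and the sequence is kept small so it is safe to run at a REPL.

```clojure
(defn consume
  "Walks the whole sequence and returns how many elements it saw."
  [xs]
  (reduce (fn [n _] (inc n)) 0 xs))

;; Nothing else refers to this lazy seq, so elements that have already
;; been consumed are eligible for garbage collection as the walk proceeds.
(def inline-count (consume (map inc (range 100000))))

;; Here the var `rows` points at the first cell of the seq. After the walk,
;; every realized element is still reachable through `rows`.
(def rows (map inc (range 100000)))
(def held-count (consume rows))
```

With a sequence much larger than the heap, the second form is the one that risks an OutOfMemory error, because `rows` keeps the whole realized sequence alive until the var is rebound or discarded.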

noisesmith20:11:32

@jovannie.landero396 anyway, this should be easy to test, with eg. visualvm or yourkit attached to your process, look for heap usage, which methods are being called etc.

Jovannie Landero20:11:12

so the amount of memory being used is 54.7 MiB without running it through that function