#data-science
2023-03-11
Edward Hughes17:03:39

Has anyone used tech.ml.dataset for reading Arrow files? I'm helping a friend with a project, but we're having trouble getting the dependencies set up as described in the docs for libs.arrow. The REPL complains that it can't find the FramedLZ4CompressorInputStream class from lz4, and we're having trouble installing liblz4 on Windows.

Ash18:03:24

It might not help but in my deps.edn I have: org.apache.arrow/arrow-vector {:mvn/version "6.0.0"} org.lz4/lz4-java {:mvn/version "1.8.0"} and can then successfully use, for example: tech.v3.libs.arrow/stream->dataset
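
For reference, the two coordinates Ash lists would sit in a deps.edn roughly as follows. This is only a sketch: the two library entries and their versions come from the message above, the surrounding :deps map is the usual deps.edn layout, and the tech.ml.dataset entry is left out because no version is given in the thread.

```clojure
;; deps.edn (sketch; coordinates and versions as quoted in the message)
;; The tech.ml.dataset dependency would go alongside these entries.
{:deps {org.apache.arrow/arrow-vector {:mvn/version "6.0.0"}
        org.lz4/lz4-java              {:mvn/version "1.8.0"}}}
```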

Edward Hughes18:03:11

We put those in our deps, but loading the namespace that contains stream->dataset still throws the class-not-found error.

Ash19:03:27

Hmm, can’t really help - sorry. Although, I wonder if that library needs JNI(?). In my jvm-opts I use: "--add-modules" "jdk.incubator.foreign" "--enable-native-access=ALL-UNNAMED"
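
One way the JVM options Ash mentions could be wired into a deps.edn is through an alias. This is a sketch under assumptions: the three option strings are taken from the message above, the alias name :arrow is hypothetical, and it presumes a JDK that still ships the jdk.incubator.foreign module.

```clojure
;; deps.edn fragment (sketch)
{:aliases
 {:arrow ; hypothetical alias name, pick any
  {:jvm-opts ["--add-modules" "jdk.incubator.foreign"
              "--enable-native-access=ALL-UNNAMED"]}}}
```

With an alias like this, starting the REPL as clj -M:arrow would apply the options. Whether they have any bearing on the class-not-found error is not established in the thread.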

chrisn00:03:57

It definitely needs lz4.DLL

👍 2