#clojars
2023-05-08
phronmophobic 23:05:43

I'm hitting a 413 Request Entity Too Large error. Obviously, the problem is that my jar is too big (31M). However, I'm not sure it's possible to make the jar much smaller. Maybe worse, I'm not sure what I'm trying to do is a good idea to begin with.

A little background: I've been writing Clojure wrappers for useful C libraries like clang, glfw, graphviz, libretro, ffmpeg, etc. So far, I've just told users that it's up to them to figure out how to obtain a compatible shared library (usually via a package manager along with some extra dev configuration). This adds a lot of friction for trying out these wrappers, and the problem is even worse if a library builds on top of a wrapper, since it also has to include instructions for getting the right shared library.

I've been thinking about how to approach the problem of distributing native binaries without writing a package manager from scratch. What I came up with is to use https://github.com/conda/conda to do all the hard work and write a few utilities for making the results available via clojars. Distributing native libraries via jars seems to be common practice (eg. sqlite, rocksdb, glfw, javacpp, etc). The only thing I'm doing differently is offloading the work of building native libraries to conda. I've written an initial implementation and it seems to work well. For the libraries I've tried, there's only one that makes a jar that's too big (llvm, which is 31M).

Does this approach sound plausible? Is this a dumb idea for some reason I'm not thinking of? Maybe a reasonable idea, but not a good fit for clojars?

Mark Wardle 01:05:11

I can’t help on your main question sorry, but have you seen that lmdbjava is now using zig to build the required native libraries ready to be bundled into the jar? So I think this is another example of the bundling binaries into jar files. So a) bundling seems pretty routine and b) could zig simplify your builds? I am not a zig expert but it made it very simple. The only other consideration is that lmdbjava bundled a few standard binaries but I had a case where I deployed on FreeBSD so was quite happy to be able to simply install lmdb using the OS package manager and point the Java library to the native library. Having that escape override was useful.

phronmophobic 01:05:53

> So I think this is another example of the bundling binaries into jar files.
Yep, bundling binaries into jar files is common (eg. sqlite, lmdb, glfw, everything under javacpp, and many others). However, the process is usually pretty opaque and bespoke. One of the goals is to move away from a "mini" package manager that only does just enough to build native dependencies for a single library.

> could zig simplify your builds?
I'm not that familiar with zig. Do you have any links that explain how that might work?

> Having that escape override was useful.
💯 Yea, these libs would be completely optional and anyone would easily be able to ignore them and use their preferred package manager or local build from source.
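For illustration only, here is a minimal sketch of one piece of the bundling pattern discussed above: choosing which bundled binary to use based on the running platform. The `native/<os>-<arch>/` layout, the class name, and the method names are all assumptions made up for this example; they are not the scheme used by sqlite, lmdbjava, glfw, javacpp, or any conda-based tooling.

```java
import java.util.Locale;

public class NativeResourcePath {
    // Map the JVM's os.name value to a short platform tag (hypothetical naming).
    public static String osTag(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Check mac/darwin first: "darwin" also contains the substring "win".
        if (os.contains("mac") || os.contains("darwin")) return "macos";
        if (os.contains("win")) return "windows";
        if (os.contains("linux")) return "linux";
        return "unknown";
    }

    // Collapse common aliases for the same CPU architecture.
    public static String archTag(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "x86_64";
        if (arch.equals("aarch64") || arch.equals("arm64")) return "aarch64";
        return arch;
    }

    // Conventional shared-library file name for each platform tag.
    public static String fileName(String os, String libName) {
        switch (os) {
            case "macos": return "lib" + libName + ".dylib";
            case "windows": return libName + ".dll";
            default: return "lib" + libName + ".so";
        }
    }

    // Classpath location of the bundled binary under the assumed jar layout.
    public static String resourcePath(String osName, String osArch, String libName) {
        String os = osTag(osName);
        return "native/" + os + "-" + archTag(osArch) + "/" + fileName(os, libName);
    }

    public static void main(String[] args) {
        System.out.println(resourcePath(
            System.getProperty("os.name"), System.getProperty("os.arch"), "lmdb"));
    }
}
```

A loader built on this would typically copy that resource out of the jar to a temporary file and pass the file's absolute path to `System.load`.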

Mark Wardle 08:05:21

I'm still failing to answer the fundamental question in the OP, but see https://andrewkelley.me/post/zig-cc-powerful-drop-in-replacement-gcc-clang.html and https://github.com/lmdbjava/lmdbjava/blob/master/cross-compile.sh for an example of how the lmdbjava project has switched.

🙏 2
Mark Wardle 10:05:49

And I'm sure you're aware of this.... but I wasn't until it bit me recently when I deployed to Amazon Linux.... glibc compatibility is another consideration for native builds.... which can come up even if one's CI pipeline has not changed, simply as a result of the provider (e.g. GitHub) updating their build systems... and suddenly your jar files stop working on older Linuxes when before they worked fine!

tcrawley 10:05:39

I can't speak to if using conda is a good way to distribute native binaries (I don't have any experience with it), but can say that we have raised the upload limit in the past for projects that include native code (opencv specifically), and I'm not opposed to doing that for cljonda.

phronmophobic 16:05:29

@Mark Wardle, thanks for the links. They're very helpful. It seems like zig has a very reasonable approach to dealing with libc. One of the other problems I'm wrestling with is dealing with dependencies. It doesn't look like lmdb has many or any dependencies (which is ideal). Some projects like ffmpeg and graphviz have many dependencies which also have dependencies. I'll have to think on this a bit more. Any pointers, tips, or insights would be appreciated!

Mark Wardle 17:05:50

Interesting. I saw https://github.com/andrewrk/ffmpeg which tracks upstream but replaces the build with zig. When I did C 30 years ago there wasn’t anything with packages… and Go when it started didn’t have proper package management, but I remember vendoring code, either by copying or just with git submodules.. the latter is fine as long as it's read-only but hairy if not. My recent experience with lmdb doesn’t help you because, as you say, it has no deps of its own. I think I’d be tempted to create a repo per native lib and make it work to build as you want, with whatever works for that lib, and then depend on that. Having used lmdbjava for some time, I’ve been watching alternatives such as clong and the new Java FFI stuff with interest, so keen to see what you do! I fear I’ve hijacked this thread so you might need to repost your original question if @tcrawley can’t sort a case by case increase in size.

👍 1
phronmophobic 05:05:34

> I fear I’ve hijacked this thread
Not at all! I think it's actually addressing the central problem, which is "does this approach make any sense?"

I took a look at the commit history for this fork and it seems like converting it to zig is quite a bit of effort. Another goal is to find a strategy that is somewhat scalable (ie. very little incremental effort to access a new C library). My experimental version of using conda to create dependencies for ffmpeg seems to work on Mac OSX and at least some version of Ubuntu, but failed for Debian (glibc issues). In theory, it might also work for Windows. I also did some reading on https://docs.conan.io/2/, which seems interesting.

For now, I think I'm going to give some time for the ideas to marinate. Maybe it makes sense only to offer binaries for Mac OSX and eventually Windows. Maybe there's some magic approach with zig or ziglike that makes sense. Maybe there's another tool that does the trick. As an aside, I was investigating nix previously and got some advice from their community chat that nix is probably not a good fit for this use case.

👍 1
Mark Wardle 14:05:45

Interesting. You might simply solve the glibc issues for Debian by provisioning a build pipeline using a well-defined and slightly older Linux distribution... or other consideration is to ask are people who use java/clojure libraries that leverage native libraries generally the kind of folk quite happy to compile or install those native libs themselves? Is it really an ambition to make native libs as easy to use as a JVM lib? What if your wrappers simply looked in the "appropriate" places for native libs and let people install for themselves? But then... Windows.... (!) And is the same approach applicable to all native libs, or could different approaches based on the specifics of the lib at hand be needed... e.g. lmdb looks pretty easy via zig vs. ffmpeg is more work and might need a git submodule of it and its dependencies manually managed, similar to how Linux distros do for upstream projects. Anyway, definitely not an expert on this so I can't think of any other suggestions at this point! Good luck!

phronmophobic 16:05:06

> or other consideration is to ask are people who use java/clojure libraries that leverage native libraries generally the kind of folk quite happy to compile or install those native libs themselves?
I don't think there's any correlation. I think there are lots of devs who would love to use the functionality found in libclang, graphviz, glfw, and ffmpeg who aren't interested in how they're compiled/procured.

> Is it really an ambition to make native libs as easy to use as a JVM lib?
It was before I learned more about the problem!

> What if your wrappers simply looked in the "appropriate" places for native libs and let people install for themselves?
That's what the wrappers currently do. However, there are some libraries that aren't even available via popular package managers (eg. libclang and libretro).
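As a rough sketch of the "look in the appropriate places, with an escape override" idea from this thread (not the code the wrappers actually use): an explicit override path wins if given, otherwise a list of directories is searched in order. The class name, the `example.native.path` property, and the directory list are all hypothetical and chosen only for this example.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.Optional;

public class NativeLibLocator {
    // Return the library file to load: the explicit override if one is given
    // (and exists), otherwise the first match found in the search directories.
    public static Optional<Path> locate(String fileName, String override, List<Path> searchDirs) {
        if (override != null && !override.isEmpty()) {
            Path p = Paths.get(override);
            // An explicit override never falls back silently to the search path.
            return Files.isRegularFile(p) ? Optional.of(p) : Optional.empty();
        }
        for (Path dir : searchDirs) {
            Path candidate = dir.resolve(fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // System.mapLibraryName turns "clang" into the platform's file name,
        // e.g. libclang.so on Linux.
        String fileName = System.mapLibraryName("clang");
        // Hypothetical system property acting as the user's escape hatch.
        String override = System.getProperty("example.native.path");
        // Illustrative directories only; real locations vary by OS and package manager.
        List<Path> dirs = List.of(
            Paths.get("/usr/lib"),
            Paths.get("/usr/local/lib"),
            Paths.get("/opt/homebrew/lib"));
        Optional<Path> found = locate(fileName, override, dirs);
        System.out.println(found.map(Path::toString).orElse("not found: " + fileName));
    }
}
```

Running it with `-Dexample.native.path=/some/path/libclang.so` would exercise the override branch; a resolved path could then be handed to `System.load`.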