I've got a little side project of developing my own e-reader software for e-ink devices. Nearly all those devices (kobos, remarkables, kindles, etc.) are 32-bit arm (armv7l) hardware. Memory and storage are really limited, and the cpu is slow. I've got my own 32-bit arm jdk build. Unfortunately there's no 32-bit arm panama ffi/ffm backend, so I have to use the fallback libffi backend. I wrote a bare-bones membrane backend for java2d that renders to a pixel buffer that I can blit onto the e-ink device. Then I did the same with skia (after going down the rabbit hole of getting a minimal skia build for armv7l). The Skija bindings were too heavyweight, so I've been rolling my own minimal clj -> skia native bridge. The surprising thing is that the java2d rendering is much faster than skia. I can get sub-100ms renders with java2d but nothing below 350ms with skia.
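For readers unfamiliar with the approach, here is a minimal sketch of rendering with java2d into an off-screen pixel buffer and timing it. It is not the code from the project: the class name `OffscreenRender`, the 8-bit grayscale format, the dimensions, and the scene are all assumptions for illustration, and the device-specific blit step is left out.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;

// Hypothetical example class; not part of membrane or the project discussed here.
class OffscreenRender {
    // Draws a simple scene into an 8-bit grayscale image and returns the
    // backing byte array (one byte per pixel, row-major).
    static byte[] render(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = img.createGraphics();
        try {
            g.setColor(Color.WHITE);
            g.fillRect(0, 0, width, height);   // clear to white
            g.setColor(Color.BLACK);
            g.fillRect(10, 10, 50, 20);        // a black box standing in for real content
        } finally {
            g.dispose();
        }
        // The raster of a TYPE_BYTE_GRAY image is backed by a DataBufferByte.
        return ((DataBufferByte) img.getRaster().getDataBuffer()).getData();
    }

    public static void main(String[] args) {
        System.setProperty("java.awt.headless", "true"); // no display on the device
        long start = System.nanoTime();
        byte[] pixels = render(800, 600);                // assumed screen size
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(pixels.length + " bytes rendered in " + elapsedMs + " ms");
    }
}
```

Everything here stays on the JVM side, so no FFI call happens until the finished buffer is handed to the device.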
That sounds like a really cool project. I would be curious to see a flamegraph of the profile to see where all the time is being spent. I can imagine if ffi calls are slow, then making a bunch of ffi calls to draw might be slow as well. However, it's hard to guess.
For comparison, do you know what render times are like using whatever the recommended methods are?
For membrane, draws have been pretty fast and I haven't prioritized spending more time making it faster. There's still plenty of low hanging fruit for optimizations in an environment where drawing is slow.
Intuitively, I don't feel like using libffi for ffi calls would necessarily be slow. There might be some way to help make the ffi calls faster.
> For comparison, do you know what render times are like using whatever the recommended methods are?
What do you mean by this? Which recommended methods?
> I would be curious to see a flamegraph of the profile to see where all the time is being spent.
I'll create one. The java2d one will be most interesting as it'll all happen jvm side.
> I don't feel like using libffi for ffi calls would necessarily be slow. There might be some way to help make the ffi calls faster.
Well, there's certainly some reason that all the other architectures got a specialized backend for panama. https://bugs.openjdk.org/browse/JDK-8371909
> Add support for the Foreign Function & Memory (FFM) API on 32-bit ARM (arm32). FFM is already supported on all other platforms, but arm32 remains unsupported. This work requires defining the ARM32 platform ABI and adding the necessary backend support to enable native interoperation on this architecture.
There's also a good chance that I'm just doing it all wrong xD
my first naive skia impl did a lot of ffi, for every draw command, but skia has a "replay" mechanism where you convert your draw commands to data, batch them and send them over the ffi boundary and do the draw all at once on the native side. that increased render speed significantly but it's still slower than java2d
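The batching idea can be sketched as follows: encode draw commands as plain data in one buffer, then cross the native boundary once per frame instead of once per command. This is an illustration only. The opcodes, the record layout, and the `DrawBatch` class are made up and do not match Skia's own recording format, and `replay` is a pure-Java stand-in for whatever the native side would do.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical command encoder; the opcodes and layout are assumptions.
class DrawBatch {
    static final int OP_FILL_RECT = 1;
    static final int OP_LINE = 2;

    private final ByteBuffer buf;

    DrawBatch(int capacityBytes) {
        // A direct buffer in native byte order can be read by native code without copying.
        buf = ByteBuffer.allocateDirect(capacityBytes).order(ByteOrder.nativeOrder());
    }

    // Each record is a 4-byte opcode followed by four floats (20 bytes),
    // which keeps every field 4-byte aligned.
    DrawBatch fillRect(float x, float y, float w, float h) {
        buf.putInt(OP_FILL_RECT).putFloat(x).putFloat(y).putFloat(w).putFloat(h);
        return this;
    }

    DrawBatch line(float x0, float y0, float x1, float y1) {
        buf.putInt(OP_LINE).putFloat(x0).putFloat(y0).putFloat(x1).putFloat(y1);
        return this;
    }

    // Returns a read-ready view of the encoded commands.
    ByteBuffer finish() {
        ByteBuffer out = buf.duplicate().order(buf.order()); // duplicate() resets byte order
        out.flip();
        return out;
    }

    // Stand-in for the native side: decodes the whole batch in one pass.
    static List<String> replay(ByteBuffer data) {
        List<String> log = new ArrayList<>();
        while (data.hasRemaining()) {
            int op = data.getInt();
            float a = data.getFloat(), b = data.getFloat(), c = data.getFloat(), d = data.getFloat();
            switch (op) {
                case OP_FILL_RECT: log.add("fillRect " + a + " " + b + " " + c + " " + d); break;
                case OP_LINE:      log.add("line " + a + " " + b + " " + c + " " + d); break;
                default: throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
        return log;
    }

    public static void main(String[] args) {
        ByteBuffer frame = new DrawBatch(1024).fillRect(0, 0, 100, 50).line(0, 0, 100, 50).finish();
        System.out.println(frame.remaining() + " bytes, one boundary crossing");
        replay(frame).forEach(System.out::println);
    }
}
```

With this shape, the per-call FFI overhead is paid once per frame, so any remaining gap to java2d would have to come from somewhere else, which is what a profile should show.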
> What do you mean by this? Which recommended methods?
Whatever example code they might have for rendering.
oh, like the device manufacturer?
yea
I know some architectures got specialized backends, but I don't think libffi performance is particularly bad.
not sure if you can get https://github.com/clojure-goes-fast/clj-async-profiler to work on the device, but it also shows time spent in native code.
ah hahah, there's no recommendations, this is all homebrew/device hacking stuff. for kindles you have to jailbreak them, for the kobo/remarkables you just need to flip a hidden little config in the sd card to enable ssh access. but officially they are not open for 3rd party development
i'll try to get clj-async-profiler working, I don't see why it wouldn't? just might be really slow.
you don't happen to have a low-level Paragraph element implementation for membrane do you? Something analogous to <p> in html, that flows multiline text with wrapping?
that's what membrane.skia.paragraph is https://phronmophobic.github.io/membrane/styled-text/index.html
it's skia only though
oh, how did I miss that..ah it's in the skia folder that's why
The links aren't very organized, so it's probably easy to miss.
I'd like to have a unified API that works for skia and java2d, but I haven't yet looked at what java2d offers and how much work that would be.
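One possible shape for a backend-neutral paragraph, sketched under assumptions: a greedy word-wrap routine that only needs a text-width function from each backend. This is not membrane's API and not how `membrane.skia.paragraph` works. `ParagraphWrap` and the fixed-width measurer are hypothetical, and real paragraph layout (hyphenation, bidi, mixed styles) is out of scope. On the java2d side, `java.awt.FontMetrics.stringWidth` could plausibly serve as the measurer, and `java.awt.font.LineBreakMeasurer` is the richer built-in option.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical helper; each rendering backend would supply its own measurer.
class ParagraphWrap {
    // Greedy wrap: keep adding words to the current line while it still fits.
    // A single word wider than maxWidth is placed on its own line unbroken.
    static List<String> wrap(String text, int maxWidth, ToIntFunction<String> measure) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) continue; // happens only for blank input
            String candidate = current.length() == 0 ? word : current + " " + word;
            if (current.length() > 0 && measure.applyAsInt(candidate) > maxWidth) {
                lines.add(current.toString()); // current line is full, start a new one
                current.setLength(0);
                current.append(word);
            } else {
                current.setLength(0);
                current.append(candidate);
            }
        }
        if (current.length() > 0) lines.add(current.toString());
        return lines;
    }

    public static void main(String[] args) {
        // Fake measurer for the demo: every character is 10 units wide.
        ToIntFunction<String> fixedWidth = s -> s.length() * 10;
        for (String line : wrap("the quick brown fox jumps over the lazy dog", 120, fixedWidth)) {
            System.out.println(line);
        }
    }
}
```

Keeping measurement behind a function means the wrapping logic itself never touches skia or java2d, at the cost of one measurement call per word.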
> i'll try to get clj-async-profiler working, I don't see why it wouldn't? just might be really slow.
clj-async-profiler relies on async-profiler, which has a native dependency, but maybe it's not too hard to build for your platform. https://github.com/async-profiler/async-profiler
Yea I started this thinking that surely java2d would be slower than skia. I only started my impl with java2d because it was straightforward to implement that just to prove i understood how membrane works.. because getting a skia build working and wired up was a whole thing. So far I seem to be wrong, but I'm sure there's more to learn here.
yea, finding out where all the time is being spent should give some hints.
I've used clj-async-profiler a lot before on other projects.. I don't know why I didn't think to use it here 🤦‍♂️ I've spent a lot of time instrumenting the code with timers and printlns
btw I love how minimal membrane is 🙂 It would be interesting (but perhaps a waste of time if I'm the only audience) to split out the current repo into little micro-libs. Currently I am vendoring a small handful of membrane source files in tree, just because shipping around all that code and deps (even if it isn't compiled) is painfully slow
I’m away from keyboard, but it doesn’t seem like the membrane source should be that large. My guess is the dependencies? Maybe the cljs stuff? If you have a sense of what the slow part is, I can think about ways to mitigate the problem.
Looking at the dependencies, it seems like the total size is only around a few dozen megabytes. Looking at git ls-files in the membrane repo shows only about a dozen megabytes. That doesn't seem unreasonable, but I guess it depends on how you're shipping stuff around. Do you have a sense of what the slow part is for you?