#xtdb
2020-11-08
udit 14:11:03

Heya folks! I was wondering if there’s a way to handle file uploads in crux like rails’ active-storage - store the actual file to a file system (local / s3 / whatever), and store the URI for that in crux’s document store. Is there an existing lib that handles this with crux?

refset 14:11:07

Hey 🙂 not seen anything in the wild yet, but it's definitely a good idea. You could probably re-use the document store protocol (+ implementations), or at least borrow from it heavily

🙏 3
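
A minimal sketch of the kind of thing being suggested, a small file-store protocol with a local-disk implementation (the protocol and all names here are made up for illustration; this is not Crux's actual document-store protocol, just a similar shape):

```
(ns example.file-store
  (:require [clojure.java.io :as io]))

;; Hypothetical protocol: put bytes somewhere, get them back by key.
(defprotocol FileStore
  (put-file [store k in] "Store the contents of InputStream `in` under key `k`; return a URI string.")
  (get-file [store k] "Return an InputStream for the file stored under key `k`."))

;; Local-filesystem implementation; an S3-backed record could satisfy the
;; same protocol, which is what makes the doc-store approach appealing.
(defrecord LocalFileStore [dir]
  FileStore
  (put-file [_ k in]
    (let [f (io/file dir (str k))]
      (io/copy in f)
      (str (.toURI f))))
  (get-file [_ k]
    (io/input-stream (io/file dir (str k)))))
```

The URI returned by put-file would then be the only thing stored in the Crux document itself.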
udit 15:11:52

Sweet. So this doesn’t sound outlandish at all, right? Thanks for the pointer. Let me go through the doc-store code 🙂

👌 3
malcolmsparks 16:11:39

It's not outlandish at all. You may want to upload a file, store its content in a docstore and save the identity/URL in Crux. You could consider a Content Addressable Store to provide this. Personally I've been working on something very similar for streaming multipart uploads into Crux but it isn't ready for public use yet.
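
A rough sketch of the content-addressable idea: key the stored bytes by their SHA-256 and record only that hash in the Crux document. The function and attribute names below are invented for illustration; crux.api/submit-tx and :crux.tx/put are the real Crux transaction API:

```
(ns example.cas
  (:require [clojure.java.io :as io]
            [crux.api :as crux])
  (:import (java.security MessageDigest)))

(defn sha-256-hex [^bytes content]
  (->> (.digest (MessageDigest/getInstance "SHA-256") content)
       (map #(format "%02x" (bit-and % 0xff)))
       (apply str)))

(defn cas-put!
  "Write `content` into `dir` under its own SHA-256 and return that hash."
  [dir ^bytes content]
  (let [k (sha-256-hex content)]
    (io/copy content (io/file dir k))
    k))

(defn attach!
  "Store the bytes content-addressably and record only the hash in Crux."
  [node dir eid ^bytes content]
  (let [k (cas-put! dir content)]
    (crux/submit-tx node [[:crux.tx/put {:crux.db/id eid
                                         :attachment/sha-256 k}]])))
```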

udit 06:11:09

This sounds interesting. Are there plans to make your work public? If nothing else, it would be nice to read it :)

Steven Deobald 11:11:03

I'm trying to reason my way through how much of ActiveStorage is worth borrowing, ideologically. I think a lot (all?) of their motivation to decouple the actual domain models from "attachments" by removing the need to identify an attachment within the domain model itself stems from the fact that it uglies up schema migrations. I keep mulling it over, but I can't see why we'd bother... there's no real harm in tracking the id(s) of the attachment(s) directly in Crux documents. That also means we don't really need both models from ActiveStorage: Attachment and Blob become one thing for us, I think, though we still need one model to track original filename, file size, and other metadata. (I wouldn't want to barf all that into every Crux document that had an attachment.) A lot of the other abstractions from ActiveStorage are valuable once we've unified those two things:
• separate the public URL from the internal URL
• purging (though I don't see much need for purge_later)
• convenience fns like attached? and url_for
• permanent URLs vs. short-lived URLs
• maybe the direct upload JavaScript from AS could be used verbatim, at some point?
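
For illustration only, a unified attachment document plus a domain document referencing it could look something like the transaction below (all the :attachment/* and :invoice/* attribute names are hypothetical, not from any released library):

```
;; One document per attachment carries the blob metadata;
;; the domain document just references it by id.
[[:crux.tx/put
  {:crux.db/id              :attachment/invoice-123-scan
   :attachment/filename     "invoice-123.pdf"
   :attachment/content-type "application/pdf"
   :attachment/byte-size    48213
   :attachment/checksum     "<sha-256 of the file contents>"
   :attachment/service      :local-disk
   :attachment/key          "<storage key or URI>"}]

 [:crux.tx/put
  {:crux.db/id          :invoice-123
   :invoice/attachments #{:attachment/invoice-123-scan}}]]
```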

Steven Deobald 11:11:08

I'd be curious to hear both of your thoughts on any of these things, keeping in mind that we'd like to release this ActiveStorage equivalent as a library unto itself. In particular, if there's a reason to include the extra layer of indirection from the get-go, we might as well head down that route rather than trying to patch it up later. I just can't think of one.

Steven Deobald 11:11:12

@U050CTFRT Just to clarify: were you thinking of an existing general-purpose CAS? Or just a hash of contents (or equivalent) that we build ourselves? So far I'd only considered the latter.

malcolmsparks 10:11:15

But please be warned: I've only tested with small uploads; when I tried with a large file (a movie) it hung, so I haven't quite got it working, but you can see what's involved. I probably haven't quite got the RxJava incantations correct.

malcolmsparks 10:11:54

The point of the code is to stream an upload to disk (though any streamable sink could be used, such as S3) while also hashing the upload in parallel, so that you don't have to re-read the file after storing it just to generate the hash.
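
As a much simpler, single-pass illustration of the same idea (plain Clojure over java.security, not the RxJava implementation described above), a DigestInputStream can update the hash while the bytes are copied to the sink, so nothing is re-read:

```
(ns example.stream-and-hash
  (:require [clojure.java.io :as io])
  (:import (java.security DigestInputStream MessageDigest)))

(defn store-and-hash!
  "Copy InputStream `in` to `out-file` in a single pass, returning the
  SHA-256 hex of everything that was copied -- no second read needed."
  [in out-file]
  (let [digest (MessageDigest/getInstance "SHA-256")]
    (with-open [din (DigestInputStream. in digest)]
      (io/copy din out-file))
    (->> (.digest digest)
         (map #(format "%02x" (bit-and % 0xff)))
         (apply str))))
```

The returned hash can then double as the content-addressable key discussed earlier.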

malcolmsparks 10:11:16

It's non-blocking with back-pressure