New Proletarian user here as well. Thanks for the great library! I have some recurring jobs I want to run in a cluster so that exactly one node executes the job at any scheduled time. My current working hypothesis is to use a) Chime for scheduling, b) Proletarian for at-least-once + retries, c) a custom table with timestamp (from Chime) + job-type + locking when enqueuing to get at-most-once behaviour. Does this sound sane? Have I missed some Proletarian functionality/pattern that supports clustered at-most-once out of the box? In case there's no existing way to do this, would it be wildly out of scope for Proletarian to support at-most-once behaviour by accepting an additional key when enqueuing a job? (I could try my hand at creating a PR.)
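A minimal sketch of idea (c), assuming a uniqueness constraint on (job type, scheduled timestamp) is enough to pick a single winner among the nodes. It is written in Python with an in-memory SQLite database purely so it can be run standalone; the table names `scheduled_runs` and `job_queue` and the function `enqueue_once` are hypothetical, and `job_queue` is only a stand-in for the real enqueue call, not Proletarian's schema. On Postgres the equivalent statement would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    # One row per (job_type, scheduled_at) slot; the primary key is the dedup key.
    conn.execute(
        "CREATE TABLE scheduled_runs ("
        " job_type TEXT NOT NULL,"
        " scheduled_at TEXT NOT NULL,"
        " PRIMARY KEY (job_type, scheduled_at))"
    )
    # Stand-in for the job queue (NOT Proletarian's actual table).
    conn.execute("CREATE TABLE job_queue (job_type TEXT, scheduled_at TEXT)")
    return conn


def enqueue_once(conn, job_type, scheduled_at):
    """Claim the slot and enqueue the job in a single transaction.

    Returns True if this caller enqueued the job, False if the slot
    had already been claimed by an earlier call.
    """
    with conn:  # commits on success, rolls back if an exception escapes
        cur = conn.execute(
            "INSERT OR IGNORE INTO scheduled_runs (job_type, scheduled_at)"
            " VALUES (?, ?)",
            (job_type, scheduled_at),
        )
        if cur.rowcount == 0:
            return False  # slot already claimed; do not enqueue again
        conn.execute(
            "INSERT INTO job_queue (job_type, scheduled_at) VALUES (?, ?)",
            (job_type, scheduled_at),
        )
        return True


if __name__ == "__main__":
    conn = make_db()
    slot = "2024-01-01T00:00:00Z"
    # Three "nodes" fire the same schedule tick; only the first enqueues.
    print([enqueue_once(conn, "nightly-report", slot) for _ in range(3)])
```

Because the claim and the enqueue share one transaction, a failure between the two rolls both back, so the slot is not left claimed without a job behind it.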
Yes, idempotency checks are the key thing here, as every node in the cluster will try to schedule the same job and it should only be executed once per scheduled time. I guess the exact requirements for idempotency vary so much between use cases that it probably makes sense to roll your own implementation. Thanks for taking the time to address this!
Sounds sensible overall. Do note that job/enqueue! does support :process-at and :process-in options, which allow future scheduling of jobs. If your scheduling requirements are more complex this may not work for you, but I’ve found that it works fine for my use case, allowing me to drop Chime for scheduling and lose that dependency. At-most-once will be more tricky I reckon. As it stands, proletarian will run your job within the db transaction that pulls the job from the queue.
This ensures that a job will be executed at-most-once in a cluster - assuming that all nodes in a cluster are pointing to the same Postgres DB. This means that, across all systems, proletarian should be almost always at-most-once - except where a) whatever work you’re doing completes but does not report back correctly (in which case you are screwed either way), or b) the proletarian db becomes unresponsive after you execute your job and before the update to the proletarian job record is committed. Tough problem to solve, but super useful if you had a solution!
Hi @santeri.korri, thanks for using Proletarian! Like @denis.mccarthy.kerry says, Proletarian is "at-most-once" in the happy path - when no exceptions occur during processing of the job. If there is a failure it will get processed again, however. This is to provide the at-least-once guarantee that we absolutely want.
Out of the box Proletarian can do "[...] exactly one node executes the job at any scheduled time". This is the bread and butter of Proletarian. But we have to consider failure and retries, too. A job might have done all there is to do (call a third party API, for example), yet fail just before Proletarian can commit its finished state to the database - then it will be retried (if configured to do so), and it's on you, the developer, to ensure that the effect of the job, when run again, is the same as if it was only run once. This is idempotency. (You might tolerate, in your business domain, that the effect happens twice or more. In that case you don't have to implement your own idempotency checks.)
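One way such an idempotency check could look, as a rough sketch rather than anything Proletarian provides: the handler records an idempotency key in the same transaction as its work and skips the work if the key is already present. Python with in-memory SQLite is used only to keep the example runnable on its own; the table `completed_effects`, the function `handle_job`, and the `effect` callback are all hypothetical names.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical idempotency table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE completed_effects (idempotency_key TEXT PRIMARY KEY)"
    )
    return conn


def handle_job(conn, idempotency_key, effect):
    """Run `effect` at most once per key, even when the job is retried.

    Returns True if the effect ran in this call, False if an earlier
    attempt had already completed it.
    """
    with conn:  # commits on success, rolls back if `effect` raises
        cur = conn.execute(
            "INSERT OR IGNORE INTO completed_effects (idempotency_key)"
            " VALUES (?)",
            (idempotency_key,),
        )
        if cur.rowcount == 0:
            return False  # a previous attempt finished; skip the effect
        effect()  # if this raises, the key insert is rolled back too
        return True


if __name__ == "__main__":
    conn = make_db()
    sent = []
    # First attempt performs the effect; the retry is a no-op.
    print(handle_job(conn, "order-42-confirmation", lambda: sent.append("mail")))
    print(handle_job(conn, "order-42-confirmation", lambda: sent.append("mail")))
    print(sent)
```

This only makes the effect exactly-once when the effect lives in the same database transaction. For an external call (the third party API case above), a crash between the call and the commit still leaves the key unrecorded, so the usual complement is to pass the same key to the external service, if it supports idempotency keys.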