sci

doojin 2026-04-01T11:43:36.141049Z

It seems to me that running one SCI script at a time in each node.js worker thread is a better way to enforce a timeout than the node.js vm module. The vm module launches a new thread when you specify a timeout. https://nodejs.org/api/vm.html says: "Using the timeout or breakOnSigint options will result in new event loops and corresponding threads being started, which have a non-zero performance overhead." This means it is more expensive than maintaining a node.js worker thread pool where each thread runs one SCI script at a time. I can enforce a timeout by terminating a worker thread. It seems worker.terminate() is not deprecated, unlike the JVM's Thread.stop(). A worker thread can enforce its own memory consumption limit. Although an out-of-memory error can take down the entire process, it can also take down only a worker thread. To limit the blast radius, you can limit the number of threads in each process. node.js worker threads are a better substrate for untrusted SCI scripts. In my mind, I would have a dynamically resizing process pool where each process has a few threads, and each thread processes one SCI script at a time.
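
A minimal sketch of the timeout-by-termination idea above, in JavaScript. The source string is plain JavaScript standing in for "evaluate one SCI script"; `runWithTimeout`, its parameters, and the result shape are hypothetical names chosen for illustration, and the 32 MB heap cap is only an example value. It relies on `Worker`, `worker.terminate()`, and the `resourceLimits` option from the node.js `worker_threads` module.

```javascript
const { Worker } = require("node:worker_threads");

// Run `source` in a fresh worker thread and terminate the worker if it is
// still running after `timeoutMs`. `source` is plain JavaScript here; a real
// setup would load SCI inside the worker (assumption, not shown).
function runWithTimeout(source, timeoutMs, maxHeapMb = 32) {
  return new Promise((resolve) => {
    const worker = new Worker(source, {
      eval: true, // treat `source` as code instead of a file path
      resourceLimits: { maxOldGenerationSizeMb: maxHeapMb }, // per-worker heap cap
    });
    let settled = false;
    let timer = null;
    // Resolve exactly once, whichever event arrives first.
    const finish = (result) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      resolve(result);
    };
    timer = setTimeout(() => {
      finish({ status: "timeout" });
      worker.terminate(); // stops the thread even inside a CPU-bound loop
    }, timeoutMs);
    worker.once("message", (value) => finish({ status: "ok", value }));
    worker.once("error", (error) => finish({ status: "error", error }));
    worker.once("exit", (code) => finish({ status: "exit", code }));
  });
}
```

A pool would reuse workers between scripts and replace any worker it had to terminate; this sketch starts one worker per script only to keep the example short.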

2026-04-01T12:35:54.415479Z

I don't know much about killing threads in node.js, but depending on which functions/macros you want to expose, you can do the sandboxing in Sci. I got pretty far with this. I think there is one last hurdle: recursion of functions. I think this could be fixed when this issue is resolved: https://github.com/babashka/sci/issues/1002

doojin 2026-04-01T12:36:55.159659Z

I'm talking about limiting execution time and memory consumption outside SCI API sandboxing.

doojin 2026-04-01T12:37:08.986479Z

SCI itself can't enforce timeout or memory limit.

doojin 2026-04-01T12:37:31.230109Z

You can run SCI scripts in isolated-vm or a worker thread.

doojin 2026-04-01T12:38:58.345409Z

worker.terminate() can reliably kill an SCI script stuck in a CPU loop.

doojin 2026-04-01T12:42:26.413969Z

I don't think I can realistically find a way to limit execution time and memory consumption inside SCI.

2026-04-01T12:43:24.412339Z

If you control loops and recursion you can control execution time, no?

doojin 2026-04-01T12:43:39.975709Z

If I want to allow recursion?

2026-04-01T12:45:11.958049Z

So https://github.com/babashka/sci/issues/1002 is about blocking the underlying fn*, but you would implement your own version of fn that has a recursion counter. So you would allow it up to x times or x seconds.

doojin 2026-04-01T12:45:43.973469Z

There is also letfn which can lead to recursion.

2026-04-01T12:46:36.535849Z

Yeah, I think you can do the same as with fn, but I haven't tried it yet. The problem is that fn* cannot be denied at the moment.

doojin 2026-04-01T12:47:42.346099Z

Are you sure that you can actually limit execution steps or execution time?

doojin 2026-04-01T12:47:46.877669Z

What about memory?

doojin 2026-04-01T12:49:09.583099Z

I can see that you are trying to use JVM.

2026-04-01T12:52:12.640009Z

Yeah, I am not sure about memory, but you can override everything with Sci, so you could measure the length of vectors or the reading of certain strings. So I'm not saying you get it for free, but I think it is possible. For recursion, you have to wrap all generated fns with an internal counter. And every so often you can check the elapsed time, for instance.
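
To make the "wrap every generated fn with a counter and check the elapsed time every so often" idea concrete, here is a minimal sketch in JavaScript. It is not SCI's API and not the code from the linked gist; `makeBudget`, `guard`, and the option names are hypothetical, and it only covers function calls, not loops or memory.

```javascript
// Create a shared budget for one script run. Every guarded function call
// increments one counter; the clock is consulted only every `checkEvery`
// calls so the check stays cheap.
function makeBudget({ maxCalls, maxMillis, checkEvery = 100 }) {
  const startedAt = Date.now();
  let calls = 0;
  // Wrap `fn` so that each invocation is charged against the budget.
  return function guard(fn) {
    return function guarded(...args) {
      calls += 1;
      if (calls > maxCalls) {
        throw new Error("call budget exceeded");
      }
      if (calls % checkEvery === 0 && Date.now() - startedAt > maxMillis) {
        throw new Error("time budget exceeded");
      }
      return fn.apply(this, args);
    };
  };
}

// Usage: recursion is allowed, but only within the shared budget.
const guard = makeBudget({ maxCalls: 50, maxMillis: 1000 });
const countdown = guard((n) => (n === 0 ? "done" : countdown(n - 1)));
```

With this budget, `countdown(10)` completes, while a deeper call such as `countdown(1000)` throws once the shared counter passes 50. The sketch only works if user code cannot reach an unguarded way to create functions, which is the fn* concern raised in this thread.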

2026-04-01T12:52:59.578879Z

Maybe killing a thread is easier if possible

doojin 2026-04-01T12:54:28.926919Z

Stopping an asynchronous SCI script in node.js is not straightforward. I guess SCI scripts are going to be synchronous on JVM anyway.

2026-04-01T12:59:07.050489Z

Here is an example of how you can control fn execution: https://gist.github.com/jeroenvandijk/6586198b17a67a3863349f1f90cc6846#file-anti_recursion-clj-L44-L62

2026-04-01T13:00:10.025939Z

But as I said before, there are currently still backdoors via fn*, letfn* and some others.

doojin 2026-04-01T13:01:19.056019Z

fn* has to be blocked? I am new to sci. I don't really know sci. I thought everything could be whitelisted.

2026-04-01T13:03:01.766029Z

Yeah, it is an implementation detail of Sci, so a user should normally not call it directly. But it gives access to function generation and thus also recursion.

doojin 2026-04-01T13:04:20.643009Z

Can't you override fn* directly?

2026-04-01T13:04:46.440879Z

I thought this issue described it, but it is not complete: https://github.com/babashka/sci/issues/1002 I have to check again what exactly the problem was, one moment.

doojin 2026-04-01T13:09:04.448749Z

I think the better way to solve the issue is to enforce strict whitelist...?

doojin 2026-04-01T13:09:35.176159Z

It could add a strict mode to avoid breaking existing codebases?

2026-04-01T13:12:09.869579Z

So you have to replicate fn** (https://github.com/babashka/sci/blob/1f3a8cef69f9cc7c4a577084a948a2e749b0682e/src/sci/impl/fns.cljc#L200) but with the counter/timer built in. Currently this fn** depends on fn* (https://github.com/babashka/sci/blob/1f3a8cef69f9cc7c4a577084a948a2e749b0682e/src/sci/impl/fns.cljc#L205) so without fn* you can't generate functions with the current fn macro. I'm not sure what would be a way around it

doojin 2026-04-01T13:12:11.545009Z

If fetching from database takes a long time, you might still need timeout.

2026-04-01T13:14:49.353759Z

> I think the better way to solve the issue is to enforce strict whitelist...?
Yeah, if you can limit the functions and macros used, that could be enough. But I'm guessing fn is used even in simple code.

doojin 2026-04-01T13:16:20.415549Z

Why not just use a runtime environment that enforces timeout and memory limit?

2026-04-01T13:17:32.792739Z

You mean the node.js option? Yeah, I guess that is more convenient.

doojin 2026-04-01T13:18:33.407449Z

I think a node.js worker thread can reliably enforce a timeout, but in the case of an out-of-memory error, the entire process can easily crash.

2026-04-01T13:20:36.970019Z

Hmm I see, maybe another option is shelling out to a babashka process or something?

2026-04-01T13:21:25.372139Z

(a babashka process that runs sci, so not exposing bash etc)

doojin 2026-04-01T13:22:06.195509Z

https://github.com/nodejs/node/issues/34823 makes me think I should perhaps run one SCI script at a time in each node.js process in a dynamically resizing process pool.

doojin 2026-04-01T13:22:31.974379Z

I don't think babashka can run SCI scripts?

2026-04-01T13:22:40.601419Z

It does

doojin 2026-04-01T13:23:11.517879Z

How?

2026-04-01T13:23:16.143779Z

bb -e '(sci.core/eval-string ":hello")'

doojin 2026-04-01T13:23:29.811849Z

The babashka doc didn't include sci..

doojin 2026-04-01T13:24:11.099639Z

Anyway, if babashka comes with SCI, then you can maintain a dynamically resizing pool of babashka processes. Each babashka process would run one SCI script at a time...

doojin 2026-04-01T13:25:27.227909Z

You can use babashka, node.js, or whatever if you run one SCI script at a time in each process inside a process pool.

2026-04-01T13:26:18.755879Z

Yeah, I don't know your use case of course, but I guess that would work for some use cases.

doojin 2026-04-01T13:26:57.411629Z

My supposed use case is server-side hiccup/HTML rendering from user SCI scripts. Basically, a more flexible version of shopify liquid template language.

doojin 2026-04-01T13:28:02.095119Z

You can enforce timeout by killing a process. Memory limit is enforced by the process itself or the OS.

2026-04-01T13:28:20.192939Z

Do you still need letfn and loop for that?

doojin 2026-04-01T13:28:38.196639Z

I don't know... Iterating over products may need a loop of some kind.

2026-04-01T13:28:53.267289Z

Yeah, maybe for?

2026-04-01T13:28:57.593669Z

But no real recursion?

doojin 2026-04-01T13:28:58.360279Z

Yes?

doojin 2026-04-01T13:29:16.573949Z

But, I want people to define functions for composable hiccup templates...

doojin 2026-04-01T13:29:27.828589Z

Once I introduce functions, recursion becomes possible.

2026-04-01T13:30:01.026059Z

Yeah, but you could block it. For templating, it sounds like you normally wouldn't need it.

2026-04-01T13:30:36.835289Z

If you block recursion the problem becomes easier

doojin 2026-04-01T13:30:41.112929Z

A function with parameters is only natural for flexible hiccup templating.

doojin 2026-04-01T13:31:13.756999Z

I don't know how else I would do that...

2026-04-01T13:31:46.961699Z

Can you give an example of what you mean?

doojin 2026-04-01T13:31:54.165949Z

Also, hiccup templates can potentially recurse... if I allow composite hiccup tags?

doojin 2026-04-01T13:33:20.913189Z

(defn abc-section
  [name]
  [:article "..." (more-function ...)])

doojin 2026-04-01T13:35:02.800629Z

As I wrote above, I'm new to the Clojure ecosystem. I still have to learn almost everything from scratch.

doojin 2026-04-01T13:35:27.091059Z

All I have for now is some vague intuition.

doojin 2026-04-01T13:36:24.064669Z

If you find any good way to prevent recursion, let me know. I'd very much like to stay on one JVM process if possible.

2026-04-01T13:37:44.337719Z

Yeah, recursion is tricky. There are many ways to do it in normal Clojure, here are some: https://gist.github.com/jeroenvandijk/27ad5a6ddafb53742970053be90a13c2 I looked into this a while ago, and from what I remember, that fn* issue is the blocker to controlling it. But maybe there is another way; I have to revisit it.

2026-04-01T13:41:25.786579Z

For now, I think if rendering via sci in babashka is performant enough for you, that is probably the easiest option.

doojin 2026-04-01T13:43:03.410949Z

The babashka documentation doesn't include sci namespace, but if it actually has SCI, I can use babashka, node.js threads, or node.js processes for limiting time and memory.

doojin 2026-04-01T13:53:12.899139Z

But, it seems babashka is somewhat restricted in what it can do. node.js threads and node.js processes might be a better option for a richer environment.

👍 1
doojin 2026-04-01T14:23:59.699749Z

If you only need to modify fn, then you should probably ask for a strict whitelist mode. That way, you can inject a modified fn into SCI without allowing fn* in SCI scripts.

doojin 2026-04-01T14:25:07.934199Z

The problem is that sci doesn't differentiate between source-level forms and expanded forms.

doojin 2026-04-01T14:38:42.710689Z

It's fixable, but it hasn't been fixed. That means I'm going to use babashka processes, node.js threads, or node.js processes. Or, you are going to have to pay him a lot if you want it to be fixed quickly. He's available for commercial service. Or, you are going to have to fix it yourself.

2026-04-01T15:06:11.692289Z

Sure 🙂

2026-04-01T15:08:04.111389Z

If you find something new, you can create an issue or comment on the issue I mentioned

doojin 2026-04-01T15:14:13.704079Z

I still think it's a better use of your time to find a suitable runtime environment for imposing a time limit and a memory limit on SCI scripts. In my case, as I wrote above, the simplest options are babashka processes, node.js threads, and node.js processes. JVM processes are available, but they consume more memory and are slower to boot up.

doojin 2026-04-01T15:15:18.299549Z

Even with the fn trick, imposing a memory limit and a time limit can be a hard problem.

doojin 2026-04-01T16:27:30.893719Z

According to my research, the Ruby Puma web server uses a multi-process, multi-threaded architecture and runs entirely "synchronous" code, unlike node.js, which runs asynchronous code in one thread. Each process in a Puma process pool has 3 to 5 threads. Each thread in a Puma process handles one request at a time synchronously. This obviously requires a lot more RAM than just having multiple threads in one process, but it will limit the blast radius of a malicious/problematic SCI script to 3 to 5 SCI jobs. The additional hardware you have to purchase costs a lot less than the extra developer time required for a more complex architecture.

doojin 2026-04-01T16:29:55.474289Z

Asynchronous code is difficult to control.

doojin 2026-04-01T16:31:17.532249Z

If I apply the Puma architecture to node.js SCI server-side renderers, node.js would still consume less memory than Ruby Puma.

doojin 2026-04-01T16:32:44.324729Z

Developer time is a lot more expensive than extra RAM. Time is money.

2026-04-01T16:36:00.521839Z

Yeah I guess it depends on the problem, but if you can solve it with extra RAM that's great!

doojin 2026-04-01T16:38:28.981779Z

What problem are you trying to solve that requires limiting execution steps and minimal RAM usage?

2026-04-01T16:41:01.847089Z

I haven't been working on it lately, but my intention was to have a low-latency, high-performance web environment. Having the sandbox in Sci directly makes it a lot leaner and more efficient. But if you have a specific use case in mind with specific requirements, maybe that is not necessary at all!

doojin 2026-04-01T16:42:19.698679Z

I want users to write server-side HTML/hiccup rendering scripts in SCI.

doojin 2026-04-01T16:42:47.135489Z

A more flexible version of shopify liquid template language.

👍 1
doojin 2026-04-01T16:46:36.435509Z

I think the need to limit time and memory against untrusted sandboxes leads to puma-style architecture for simplicity.

2026-04-01T16:48:35.503019Z

Maybe, but it is hard to compare typical Ruby performance with Clojure. With shared memory, very good concurrency primitives, etc., Clojure has a lot of benefits

doojin 2026-04-01T16:49:12.220379Z

Puma probably won't kill my business.

2026-04-01T16:49:16.953159Z

That you can potentially benefit from with Sci. But until there is a proper sandbox, you might still have to go the Puma route.

doojin 2026-04-01T16:50:32.455089Z

At this point, I need to produce something quickly. If my business grows, then I can probably pay borkdude to implement a robust limit on execution steps or execution time or something.

2026-04-01T16:51:24.587309Z

Yeah of course, choose the most pragmatic solution

doojin 2026-04-01T16:51:25.295319Z

Time is money.

doojin 2026-04-02T00:32:49.530539Z

My current iteration of the idea is to run one "synchronous" SCI script at a time in one node.js process in a process pool and limit the heap space of each process to 32MB. The timeout is enforced by killing the process. This is easier to reason about and still consumes less memory than Ruby Puma workers. I'm just going to have to purchase simplicity with more hardware.
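
A minimal sketch of that plan in JavaScript, assuming the heap cap is applied with node's `--max-old-space-size` flag and the timeout with the `timeout` option of `child_process.spawnSync`. The function name `runInProcess` and its result shape are hypothetical, and the source string is plain JavaScript standing in for a script that would load SCI and evaluate user code.

```javascript
const { spawnSync } = require("node:child_process");

// Run `source` in a separate node.js process with a capped heap, and kill
// the process if it has not finished after `timeoutMs`.
function runInProcess(source, timeoutMs, maxHeapMb = 32) {
  const result = spawnSync(
    process.execPath, // the same node binary that runs this script
    [`--max-old-space-size=${maxHeapMb}`, "-e", source],
    { timeout: timeoutMs, killSignal: "SIGKILL", encoding: "utf8" }
  );
  // spawnSync reports an exceeded timeout as an error with code ETIMEDOUT.
  if (result.error && result.error.code === "ETIMEDOUT") {
    return { status: "timeout" };
  }
  // A non-zero exit code or a fatal signal (for example an out-of-memory
  // abort) is reported as a failure of this one script only.
  if (result.status !== 0) {
    return { status: "failed", code: result.status, signal: result.signal };
  }
  return { status: "ok", stdout: result.stdout };
}
```

`spawnSync` blocks the calling thread and starts a new process for every script, so it only illustrates the kill-on-timeout and heap-cap mechanics. A real pool would presumably keep long-lived child processes, talk to them asynchronously, and replace any process it had to kill.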

Karol Wójcik 2026-04-01T18:55:28.193399Z

Is it possible to dump SCI env state to some blob and then recreate env with all defs/defn from it?

2026-04-01T19:08:21.633759Z

Maybe you can serialize the state somehow? https://github.com/babashka/sci?tab=readme-ov-file#state

doojin 2026-04-01T19:51:47.289789Z

Or, do you want to fork the state?

borkdude 2026-04-01T19:57:46.080289Z

@karol.wojcik what's the use case?

Karol Wójcik 2026-04-01T20:02:43.859559Z

I will need both options. I’m creating an RLM with embedded SCI as an interpreter. I’m querying an env with some query which spans multiple iterations, and each iteration creates N vars. This env is a full context and I need to persist it, fork it if necessary, etc. Imagine a Claude Code, but with vars instead of messages.

borkdude 2026-04-01T20:33:13.548929Z

fork is possible

borkdude 2026-04-01T20:33:49.648049Z

Persisting to disk is not possible unless you use host technology for it, or perhaps this works: https://blog.redplanetlabs.com/2020/01/06/serializing-and-deserializing-clojure-fns-with-nippy/

doojin 2026-04-02T01:24:30.898139Z

Can you "manually" persist plain data? If you don't try to persist functions, things become easier. If everything you care to persist is EDN, you can persist it.

doojin 2026-04-02T01:34:09.624399Z

Are you trying to create a local agent or a shared backend? Just curious.

Karol Wójcik 2026-04-02T04:16:40.919169Z

Local one. TUI/web, something like Hermes, but on SCI for us Clojurists :)

doojin 2026-04-02T04:17:13.298779Z

Then, you don't need the heavy restrictions that I'm considering for my shared backend.