off-topic 2021-06-12 | Slack Archive

sova-soars-the-sora 17:06:57

is there a clojure-based analytics for webpages? I am thinking it could be pretty simple: have a collection of accesses via ip address and page and also have a .js file that periodically polls to see who is currently live connected... it could probably all fit into a middleware (?)
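
A minimal sketch of the middleware idea above, assuming a Ring-style handler (a function from a request map to a response map) where the client address is under `:remote-addr` and the path under `:uri`. The names `accesses`, `wrap-analytics`, and `live-visitors` are hypothetical, and the in-memory atom only stands in for real storage.

```clojure
;; Hypothetical in-memory store; a real deployment would persist this somewhere.
(def accesses (atom []))

(defn wrap-analytics
  "Ring-style middleware: records ip, page and timestamp for every request,
  then delegates to the wrapped handler unchanged."
  [handler]
  (fn [request]
    (swap! accesses conj {:ip   (:remote-addr request)
                          :page (:uri request)
                          :at   (System/currentTimeMillis)})
    (handler request)))

(defn live-visitors
  "Distinct ips seen within the last window-ms milliseconds. A polling .js
  snippet hitting any endpoint would keep a visitor inside this window."
  [window-ms]
  (let [cutoff (- (System/currentTimeMillis) window-ms)]
    (->> @accesses
         (filter #(>= (:at %) cutoff))
         (map :ip)
         set)))

;; Usage with a stub handler (assumed, for illustration only):
(def app (wrap-analytics (fn [_] {:status 200 :body "ok"})))
(app {:remote-addr "10.0.0.1" :uri "/index.html"})
```

Calling `(live-visitors 60000)` would then give the set of addresses seen in the last minute.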

vemv 19:06:17

I'm searching for the informal "Clojure FAQ" compiled from the mailing list archives. I think it was a gist? Or at least on github

seancorfield 19:06:13

Do you mean this one https://clojure.org/guides/faq or the “Rich Hickey said…” https://gist.github.com/reborg/dc8b0c96c397a56668905e2767fd697f ?

👍 6

raspasov 09:06:50

I feel like https://gist.github.com/reborg/dc8b0c96c397a56668905e2767fd697f#file-rich-already-answered-that-md should be pinned in #clojure . It’s really good!

seancorfield 19:06:48

@vemv always takes me a while to find that second one — I should bookmark it!

vemv 19:06:01

the latter, thanks much!

vemv 19:06:27

I had it starred but stupid me kept using google instead of gist search

seancorfield 19:06:54

(I also had it starred but not with a name that I would have searched for it by!)

vemv 20:06:13

As a more interesting follow-up question, has anyone followed https://gist.github.com/reborg/dc8b0c96c397a56668905e2767fd697f#should-i-use-a-namespace-to-bundle-functions-to-model-a-domain-object-like-a-customer-or-product deeply? IME, having written parentheses for a living for a number of years, nobody really implements this open, generic codebase where functions process open maps without a predictable schema. Even when using Datomic (which fosters a schemaless design), developers tend to expect, and ensure, fairly rigid modelling. e.g. I have a Product, a User, etc, and defns that precisely handle products and users. I don't have a defn that accepts an "anything" and smartly processes all the keys that may be present (with multimethods or such?)

seancorfield 20:06:48

Datomic isn’t really “schemaless” though, is it? You have to define all your attributes before you can insert data. Whereas Crux and some of the others (Asami?) do not.

borkdude 20:06:28

true, but an entity can have any number of attributes, not a fixed set like in static typing

borkdude 20:06:03

a bit like with spec: you describe the attributes, but not the entities as a whole (well, spec 2 let's say) as a closed world thing with a capitalized name

👍 3

seancorfield 20:06:09

As someone who has maintained clojure.java.jdbc (and now next.jdbc) for many years, I’d say that level of DB interaction can be open and generic to a degree. You can’t just pour “any old attribute” into a SQL DB but you mostly don’t need any schema/type support for basic stuff, i.e., hash maps with “arbitrary” keys whose values are number, string, Boolean, date/time types.

seancorfield 20:06:36

(an approach we’ve adopted there to some degree is writing a loose Spec for a table’s row and extracting the keys from that Spec and using that to select-keys on write — the interaction of optional vs nilable is still kind of a pain, but you’re at the boundary of two systems so…)
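
One way the approach just described might look, as a sketch only: it assumes the row Spec is a plain `s/keys` with `:req-un`/`:opt-un` vectors (no nested `or`/`and` forms), and the spec names, `spec-keys`, and the sample row are all hypothetical rather than taken from any real codebase.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical loose spec for a "product" table row.
(s/def :product/id int?)
(s/def :product/name string?)
(s/def :product/price (s/nilable number?))
(s/def ::product-row
  (s/keys :req-un [:product/id :product/name]
          :opt-un [:product/price]))

(defn spec-keys
  "Extracts the unqualified keys named by a simple s/keys spec by reading
  its form. Assumes :req-un/:opt-un contain only keywords."
  [spec]
  (let [[_ & {:keys [req-un opt-un]}] (s/form spec)]
    (mapv (comp keyword name) (concat req-un opt-un))))

;; Keep only the columns the spec knows about before writing the row.
(def row {:id 1 :name "Widget" :session-token "abc"})
(def writable (select-keys row (spec-keys ::product-row)))
```

Here `writable` keeps `:id` and `:name` and drops `:session-token`; the absent optional `:price` is simply left out, which is where the optional vs nilable question starts to matter.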

seancorfield 20:06:32

I would say that we do have a lot of code that just deals with the keys it cares about and passes everything else back and forth untouched — which you could argue is “open, generic”.
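
A tiny illustration of that style, with a hypothetical `mark-shipped` function and made-up keys: it touches only `:status` and hands every other key back untouched.

```clojure
(defn mark-shipped
  "Updates only the key this function cares about; any other keys on the
  map, known or unknown, pass through unchanged."
  [order]
  (assoc order :status :shipped))

(def shipped
  (mark-shipped {:id 7 :status :paid :gift-note "hi" :vendor/ref "X-1"}))
```

The function never needs to know that `:gift-note` or `:vendor/ref` exist, so callers can add keys without changing it.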

👍 3

respatialized 20:06:14

This is only loosely related to your question @vemv, but I've wondered about this kind of "open" vs "closed" question myself in the context of deployments and systems as well. The "traditional" way to do deployments is very artifact-centric - compile the jar, throw it on to a VM or Docker container and then perform some kind of controlled restart of the system in order to make changes to the code. The idea is that the code to execute is "closed" - this machine doesn't execute any code except what I tell it to. However, one could imagine an alternate scenario where you just literally build your system out of a bunch of REPLs. VMs running continuously with a channel open to a trusted source hidden from the public internet (for the sake of argument I'm handwaving away security concerns/PLOP/etc) to receive new forms to evaluate. When you're ready to release your code, just redefine it. You may still need some kind of mount/`component` based system in order to ensure graceful termination of stateful resources when new forms come in on the wire, but I don't think there's anything in principle that prevents this kind of approach. Beyond the ease-of-deployment aspect, you also get the benefit of a running REPL that can execute whatever code other things interacting with the system need to. Again, you may need to take care with sensitive resources like databases and such. However, the idea of a system that can be recomposed at runtime into combinations of components that weren't even necessarily foreseen at compile time is extremely intriguing to me. It seems like it would make the idea of an "API" more or less obsolete, because instead of making calls to a "closed" collection of defined endpoints, you can be "open" and just... execute some code, which is sent on the wire... as data. 
A runtime-recomposable system like this could save tons of design and architectural work by just giving its users a simple collection of primitive capabilities of the production system and letting users compose them, Unix style, in a more expressive language than Bash. If you want to forbid some dangerous functions like shelling out, opening an untrusted connection, or whatever, you might build a detection mechanism that filters dangerous stuff out (and maybe even terminates the connection that the suspicious stuff came in from) and make sure to log every function sent in and called by these production REPLs. Not saying it wouldn't take work to ensure the security/stability of such a system, but in a trusted environment it could be extremely powerful. Plus, is it really that much worse than shipping around a Docker image (https://www.docker.com/) with every app, just waiting for an attacker to shell out to? Imagining scenarios like this is where I get kinda envious of Erlang-based systems, which seem leagues ahead of anything else when it comes to controlled redeploys and maintaining the liveness of the system. Joe Armstrong described what he called a "universal server" (https://joearms.github.io/published/2013-11-21-My-favorite-erlang-program.html), which I think inspired a lot of this. Fred Hebert has a post (https://ferd.ca/a-pipeline-made-of-airbags.html) on doing this kind of REPL based deployment and how Kubernetes falls way short of where Erlang has been for decades.

vemv 01:06:16

Thanks for sharing :) https://github.com/puppetlabs/trapperkeeper is kinda like that - it's like Component, but it's also a runtime-configurable system by which devops people can enable/disable components, endpoints, etc at will. Hot-code-reloading Clojure code in production is possible with a "Reloaded"-like workflow. However, starting from a certain scale (say 50KLOC), compiling so much code live will be just as slow as launching a new precompiled .jar in the first place. And the .jar method has rollback...

mauricio.szabo 00:06:25

Wow.... I couldn't disagree more 😄. It's close to impossible to guarantee the safety/stability of such a system. It's also impossible to build a detection mechanism because an attacker could redefine the detection. Also, docker doesn't ship an entire Unix system - it ships a userspace, a filesystem/network chroot where if someone shells out, he'll just see an app - and nothing more, and he's in a sandbox where he can't do much damage. Sure, he can see the DB/APIs that that service has access to, but nothing more

mauricio.szabo 00:06:30

About Erlang, I've been talking to people that do Erlang in production for a while. Most people don't use "network transparency" nor "hot redeploys". The first one is because if the network is unreliable (most networks are) this doesn't work that well. The second one is because it's hard to make an Erlang app hot-redeployable. As for Clojure, we can do a hot-deploy with clojure.tools.namespace and a little work, without needing a REPL in production....

respatialized 00:06:15

I'm well aware there may be serious practical concerns with what I'm proposing here. But I'm still not fully convinced that every app needs to obey the best practices of public-facing software. If this is back of house software, on a point-to-point VPN, without a public endpoint, do we need to follow the same model? The folks behind Tailscale seem to think a different type of application can be built in this context. https://tailscale.com/blog/remembering-the-lan/ "We can have the LAN-like experience of the 90's back again, and we can add the best parts of the 21st century internet. A safe small space of people we trust, where we can program away from the prying eyes of the multi-billion-person internet. Where the outright villainous will be kept at bay by good identity services and good crypto."

respatialized 00:06:32

I also wonder what would happen if a dedicated effort to build a secure app took what I'm proposing as a starting point and progressively locked it down, removing potentially problematic forms like sh, def, let, etc etc. Would you eventually end up somewhere like the conventional system of an app with a predefined API and auth layer governed by PLOP? Or could you land somewhere else, with a different interaction model, before you get there that still has a degree of stability and security?

mauricio.szabo 01:06:49

The second approach seems impossible to me. You have to remove sh and def, for example. alter-var-root and others are out too, so you just lost the ability to hot-swap code. But then, you have to be aware of DDoS attacks, so you have to remove everything that can lead to a program never terminating: infinite ranges, and recursion. The end result seems to me to be a DSL that resembles Clojure and probably isn't even Turing complete. Now you have an incredibly generic DSL, without recursion or definition of vars, that's insanely awkward to program... I don't see the benefit, to be honest.

mauricio.szabo 01:06:07

As for the first model - I would invest in faster development cycles, to be honest. I don't see much benefit in being able to send arbitrary commands over the wire. It's also hard to find performance problems, inconsistencies (should the API be able to access that specific data without considering this other one, for example, or is the API accessing outdated data/schemas that were already migrated to newer versions), or even the simple fact that now dead code is impossible to find (been there - worked on a Ruby codebase where reports exposed templates that could be code-interpolated). I know Clojure has a reputation for not breaking old code, but I don't really see this being approachable in medium to big codebases. I, for one, saw a company almost lose a huge sum of money because they didn't delete old code, and an old app called that old API, making invalid purchases multiple times....
