This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-05-31
Channels
- # announcements (6)
- # babashka (40)
- # beginners (6)
- # calva (1)
- # cider (1)
- # clerk (43)
- # clj-kondo (3)
- # clojure (93)
- # clojure-denver (8)
- # clojure-europe (52)
- # clojure-norway (20)
- # clojure-sweden (7)
- # community-development (5)
- # datascript (15)
- # datomic (30)
- # emacs (24)
- # events (15)
- # fulcro (23)
- # graalvm (12)
- # gratitude (1)
- # helix (4)
- # honeysql (4)
- # hoplon (39)
- # hyperfiddle (7)
- # introduce-yourself (1)
- # jobs (1)
- # jobs-discuss (26)
- # lambdaisland (3)
- # lsp (6)
- # matcher-combinators (2)
- # matrix (5)
- # meander (39)
- # nrepl (4)
- # nyc (1)
- # off-topic (5)
- # portal (73)
- # practicalli (1)
- # re-frame (2)
- # reitit (22)
- # releases (1)
- # remote-jobs (4)
- # shadow-cljs (5)
- # sql (17)
- # testing (1)
- # tools-deps (15)
Hi all, I am planning to implement integration tests for our services. Is it possible to do it with an H2 in-memory DB? Is there any library available in Clojure to support it, or any example projects that would be helpful?
To connect to a SQL database you need some way to create a connection (or connection pool) and send queries. hikari-cp and next.jdbc work well together; then you just specify a connection spec for H2. You'll also need to bring in H2 as a dependency. I would recommend using the database you're actually working with for integration testing, in case you're relying on some feature specific to it.
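For the H2 route, a minimal sketch, assuming the com.h2database/h2 driver and next.jdbc are on the classpath (table and data below are purely illustrative):

```clojure
(require '[next.jdbc :as jdbc])

(def ds (jdbc/get-datasource {:dbtype "h2:mem" :dbname "testdb"}))

;; A named in-memory H2 database lives only as long as at least one
;; connection to it is open, so hold a single connection for the duration
;; of the test instead of letting each execute! open and close its own.
(with-open [conn (jdbc/get-connection ds)]
  (jdbc/execute! conn ["create table users (id int primary key, name varchar)"])
  (jdbc/execute! conn ["insert into users values (1, 'alice')"])
  (jdbc/execute! conn ["select name from users where id = 1"]))
```

Alternatively, appending ;DB_CLOSE_DELAY=-1 to the JDBC URL keeps the in-memory database alive after the last connection closes.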
Today we are saving values to our real DB using fixtures and a transaction (the transaction has a checkpoint and rolls back whatever happened within the test transaction). But going forward this approach looks risky. Do you mean something like embedded Postgres, which is started when the application starts and is used only for the integration tests?
There's also the #C1Q164V29 channel, some of your questions might have been answered there
That's great, thank you ❤️
Here’s the embedded Postgres thing: https://github.com/zonkyio/embedded-postgres As far as I understand it’s really embedded in-process, not an external thing that’s started before the tests and shut down after. Unfortunately I couldn’t find such a thing for MySQL.
(ns test-utils
  (:require [db.core :as db]
            [migratus.core :as migratus]
            [next.jdbc :as jdbc])
  (:import (java.util.concurrent.atomic AtomicLong)
           (org.testcontainers.containers PostgreSQLContainer)
           (org.testcontainers.containers.wait.strategy Wait)
           (org.testcontainers.utility DockerImageName)))

(set! *warn-on-reflection* true)

;; One Postgres container for the whole test run, started lazily.
(def ^:private container-delay
  (delay
    (let [container (PostgreSQLContainer.
                     (-> (DockerImageName/parse "postgres")
                         (.withTag "15")))]
      ;; Set the wait strategy before starting the container.
      (.waitingFor container (Wait/forListeningPort))
      (.start container)
      container)))

(def datasource
  (delay
    (let [^PostgreSQLContainer container @container-delay]
      (jdbc/get-datasource
       {:jdbcUrl (str "jdbc:postgresql://"
                      (.getHost container)
                      ":"
                      (.getMappedPort container PostgreSQLContainer/POSTGRESQL_PORT)
                      "/"
                      (.getDatabaseName container)
                      "?user="
                      (.getUsername container)
                      "&password="
                      (.getPassword container))}))))

;; Run migrations once, against the container's default database, which
;; then serves as the template for the per-test clones below.
(def ^:private migrations-delay
  (delay
    (migratus/migrate {:store :database
                       :migration-dir "resources/migrations/"
                       :db {:datasource @datasource}})))

(def ^:private test-counter (AtomicLong.))

(defn with-test-db-info
  [cb]
  @migrations-delay
  (let [test-db-name (str "test_" (.getAndIncrement ^AtomicLong test-counter))
        ^PostgreSQLContainer container @container-delay]
    ;; Clone the migrated template database for this test.
    (jdbc/execute!
     @datasource
     [(format "create database %s template %s;"
              test-db-name
              (.getDatabaseName container))])
    (try
      (cb {:dbtype "postgresql"
           :dbname test-db-name
           :user (.getUsername container)
           :password (.getPassword container)
           :host (.getHost container)
           :port (.getMappedPort container PostgreSQLContainer/POSTGRESQL_PORT)})
      (finally
        (jdbc/execute! @datasource
                       [(format "drop database %s;" test-db-name)])))))

(defn with-test-db
  [cb]
  (with-test-db-info
    (fn [db-info]
      (let [db (db/start-db {:jdbcUrl (str "jdbc:postgresql://"
                                           (:host db-info)
                                           ":"
                                           (:port db-info)
                                           "/"
                                           (:dbname db-info)
                                           "?user="
                                           (:user db-info)
                                           "&password="
                                           (:password db-info))})]
        (try
          (cb db)
          (finally (db/stop-db db)))))))
Create a Postgres DB with Testcontainers, make a "template database", and re-clone that template on every test.
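Those helpers can be wired into clojure.test fixtures; a hypothetical sketch (the test namespace, *db* var, and assertion are illustrative, not from the original project):

```clojure
(ns my-project.core-test
  (:require [clojure.test :refer [deftest is use-fixtures]]
            [test-utils :as tu]))

(def ^:dynamic *db* nil)

;; Each test body runs against a fresh clone of the migrated template DB,
;; which with-test-db drops again afterwards.
(defn db-fixture [f]
  (tu/with-test-db
    (fn [db]
      (binding [*db* db]
        (f)))))

(use-fixtures :each db-fixture)

(deftest sees-a-fresh-database
  (is (some? *db*)))
```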
In answer to OP's question about h2: h2 works fine and no additional library is needed. But "integration tests" with a database other than "the real one" can be troublesome where their SQL dialects or features differ.
When we interop with Java directly, are we essentially "leaving" FP land? For example, if I want to use Spring directly instead of a Clojure server library, am I in for a world of hurt with mutable objects leaking into my Clojure code? How do I handle that?
Well, eh? Kinda depends. java.time is an API whose data structures are all immutable, so that's definitely in the clear, but Clojure code is generally not "pure FP". If you pull in a Java library to read from SQS or something, then you will be performing an essential side effect and there will be some mutable objects involved, maybe some in return values. But if you are using a Java library for a unique capability, generally you can box that up.
Some Java libraries are better than others. Even if they’re basically all OOP and suffer from OOP-related illnesses, they don’t necessarily mutate objects everywhere
Ring is a super example of keeping interop out "at the edges"; none of the mutable stuff leaks into your Clojure. But you do not need to use wrapper libraries. To enjoy the FP benefits, you need only take care to keep side-effects out "at the edges". As for Spring, it's difficult to imagine...
I don't mean that Clojure is always pure. What I mean is that Clojure is often touted as having seamless interop with Java, but I would imagine doing what Ring does (keeping mutability at the edges of your wrapper) would be extremely difficult, making Java interop kind of a non-starter if you're trying to stay 90% functional.
A question about chunking. I'm talking to an API which returns a stream of responses, slowly. I can consume them much faster than they are produced. I line-seq that stream and send them on. I thought I'd do some transformation on the responses before passing them on, for which I thought eduction would do a good job. However, it gathers 32 of them (or whatever it is), then machine-guns them at me in rapid succession, after which it gathers 32 more, and so on. I don't want this involuntary caching in this case; how can I get rid of it? I'm doing maps, filters, and custom transducers. I'd like to keep it as a transducer stack if I can.
Maybe you can make a stack first with comp:
(def tx (comp (filter ...) (map ...)))
and then apply it to the lines with sequence:
(sequence tx (line-seq reader))
sequence exhibits the same behaviour: it chunks 32 of them, not realizing that the source is a slow API, then releases them all at once.
transduce actually offers a way around this. The nice stream representation is gone, but that's OK. I basically just want to forward them to another channel anyway.
eduction itself doesn't chunk. But it's an Iterable, and Clojure turns iterables into chunked seqs when you iterate over them.
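A small self-contained sketch of that difference: reducing over the eduction directly (e.g. with run! or transduce) hands each transformed element to the consumer as soon as its input arrives, while seq-based consumption (sequence, or iterating the eduction as a seq) buffers in chunks of 32.

```clojure
(def xf (comp (filter odd?) (map inc)))

;; run! reduces over the eduction directly, so there is no 32-element
;; chunk buffering: each input is transformed and forwarded immediately.
(run! println (eduction xf (range 5)))
;; prints 2 then 4
```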
Not a Clojure question per se, but has anyone found a Java regex library that cannot be trivially exploited to DoS your service? I'm talking about something like an NFA without capture groups or backtracking, something that just tells me whether something matches the regex or not.
DFA based solutions can be exploited to produce massive memory hogs.
No clue how complete it is and whether there are any other problems but: https://algs4.cs.princeton.edu/54regexp/
Ah, and this: https://github.com/google/re2j
Actually, I'm a bit surprised to find that many. But I'll stop at 3 links. :) https://www.brics.dk/automaton/
Yeah I have evaluated this last one in the past and I don’t remember exactly what the issue was that I had with it
I’ll look at it again thanks
I'd look at the Google one more closely. There's a series of in-depth articles on RegEx and NFA and I believe RE2 was the outcome, which was eventually ported to Java.
The first article in the series, if you're curious: https://swtch.com/~rsc/regexp/regexp1.html
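A minimal interop sketch of that library, assuming the com.google.re2j artifact is on the classpath; RE2/J guarantees linear-time matching, so even a nested-quantifier pattern that would blow up java.util.regex stays fast:

```clojure
(import '(com.google.re2j Pattern))

;; A nested quantifier like (?:a+)+ causes catastrophic backtracking in
;; java.util.regex on a non-matching input; RE2J matches in linear time.
(let [p (Pattern/compile "(?:a+)+b")
      s (str (apply str (repeat 10000 "a")) "c")]
  (.matches (.matcher p s)))
;; => false, returned quickly
```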
ah http://brics.dk has non-standard syntax
since I use user entered regexes this is a no-go
there's also this : https://github.com/fhur/regie ... which lets you specify the regex as a data structure, so you'd be able to write a spec to validate that it had no capture groups in? It says "do not use in production" but I think it's an interesting approach. You might be able to use instaparse to parse the subset of regular expressions that you want, and then spec/malli to validate that you have no capture groups (for example)? Then you'd be able to use the normal java regex engine, but ensure that the regex is "safe"?
No capture groups does not mean that it won't be susceptible to DoS: https://clojurians.slack.com/archives/C03S1KBA2/p1649688396263389
sure - but that was an example from the question ... I guess I was just looking for a way to validate the input to ensure safety rather than changing the processing model. Just suggesting another approach 😉
It might make sense to use https://github.com/lambdaisland/regal to construct regexes, and only allow a subset of its language as acceptable? e.g. a simple Malli validation
How would you determine whether the regex will be slow or not?
The RegEx that I linked to above has nothing but [] and ranges, and yet it is possible to create a DoS attack with it.
Probably a few rules of thumb would be useful, e.g. "less than 10 elements" I wouldn't expect it to be unbreakable, just less obnoxious than a naive approach :)
Yeah, I would just use a proper alternative implementation. I don't see any reason to stick to the built-in RegEx class when it isn't sufficient.
As a demonstration, try running this:
(let [s (str (apply str (repeat 10000 "a")) "b")]
(time (do (re-matches #"(?:a+)+" s)
nil)))
Note how there are no capturing groups and the RegEx is just 7 characters long.
On my machine it takes 600+ ms to complete. If you add an extra 0 to the length of the string, it completes in over 70 seconds.
One might argue that outlawing even non-capturing groups ought to be enough. But I wouldn't bet on it: there might very well be some variation with [] or {}, or maybe even without them, that has horrendous performance. Validating the string to be regexed against also seems a good idea :) It all really depends on one's intent though. For a common business case of "let the user provide some regexes", I see myself building a sane subset of regex out of Regal. When combined with Malli, one could pick a subset of features to be used (or combinations thereof). Maybe they could be really dumbed down to a very limited flavor of regex. Anyway, determining the absolute validity of some or other approach is off-topic.
Not sure if this is the correct channel, but I’m interested in implementing an improved version of ns-tracker using clojure.tools.namespace.*. I tried to use clojure.tools.namespace.repl/refresh et al. but ran into an issue WRT *ns* and *e not being properly bound on the thread. My approach requires starting a separate thread to check for changes to files in the background and automatically call the repl/refresh function. However, there are issues with this, as I’m not actually running in the REPL (in the case of lein run) or I’m not on the REPL thread (in the case of lein repl). My best guess is to start with the repl/refresh code but modify it to account for running in a separate thread. Has anyone attempted this, or does anyone know of any libraries similar to ns-tracker that use clojure.tools.namespace.*?
open to changes in tools.namespace if you can help clarify what the problem is
+1 to a nicer problem statement
anyway I think I get what you're trying to accomplish. clojure.core/bound-fn might help?
perhaps that is all I need. I was not aware of bound-fn. I will check that out, and if I still have issues, will try to clarify into something more succinct. Thanks for your help.
bound-fn will only convey bindings from the thread you start with, but if you are not starting from a REPL thread, things like *ns* won't be bound to start with
I think I understand. The thread that calls repl/refresh will start from the main REPL thread, which should satisfy this constraint, correct?
I attempted to use bound-fn within a thread that is started from my REPL session. Doing so resolved the errors that were occurring due to *ns* and *e not being properly bound. (clojure.tools.namespace.repl/refresh) executes successfully and appears to refresh all the relevant namespaces. However, I ran into a different problem, which I will try to explain via a code snippet.
Note that an aliasing error occurs when trying to reload my-project.core under the same alias it was originally loaded under. In addition, the new function, func2, is not loaded under the c namespace alias. If I reload my-project.core under a different alias, func2 is correctly loaded.
It is entirely possible I’m misunderstanding what repl/refresh is supposed to do, as this is the first time I have attempted to use it. However, my expectation here is that, with the use of bound-fn to call repl/refresh within the thread spawned from the REPL thread, my-project.core should get refreshed and its corresponding alias, c, should reflect the new function func2.
I wouldn't recommend (require :reload) when using t.n.; I guess it can escape t.n.'s own tracking.
For best results with (refresh), set the refresh-dirs explicitly.
Code like (a/thread ((bound-fn [] (repl/refresh)))), if it's persistent (`def`ed somewhere), should be in a dir that is outside the refresh-dirs. Or alternatively, use disable-unload! and disable-reload!
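A minimal sketch of that advice, assuming clojure.tools.namespace is on the classpath:

```clojure
(require '[clojure.tools.namespace.repl :as repl])

;; Restrict refresh to the project's own source/test dirs, so code holding
;; the watcher thread (e.g. in a dev/ dir outside these) is never unloaded
;; out from under it.
(repl/set-refresh-dirs "src" "test")

;; Alternatively, opt the current namespace out of unload/reload entirely:
(repl/disable-unload!)
(repl/disable-reload!)
```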
> It is entirely possible I’m misunderstanding what repl/refresh is supposed to do, as this is the first time I have attempted to use it.
refresh reloads your namespaces, similarly to (require :reload), but in a managed fashion. With it, you don't have to (require :reload) namespaces individually, keep track of their dependents, etc.
btw, I guess that (a/thread ((bound-fn [] (repl/refresh)))) was an experiment, otherwise it doesn't make much sense: you want to invoke it when files change and you want a code reload. Invoking it once, in a thread, isn't as useful, and can create a race condition.
thread from core.async already conveys bindings, as does future, and bound-fn captures the bindings where it is created, so wrapping it in thread like that relies on the fact that thread already conveys bindings
An example of bound-fn doing something would be like (let [f (bound-fn [] ...)] (.start (Thread. f)))
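A self-contained sketch of that distinction (the var names are illustrative): bound-fn captures the dynamic bindings in effect where it is created, so they survive into a raw java.lang.Thread, which, unlike future or core.async's thread, does not convey bindings by itself.

```clojure
(def ^:dynamic *greeting* "hello")

(def result (promise))

(binding [*greeting* "bonjour"]
  ;; bound-fn snapshots the current bindings; the raw Thread would
  ;; otherwise see only the root value of *greeting*.
  (let [f (bound-fn [] (deliver result *greeting*))]
    (.start (Thread. ^Runnable f))))

(deref result 1000 :timeout)
;; => "bonjour"
```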
@U45T93RA6 yes, it was an experiment. I have a larger code block which connects repl/refresh to Java's WatchService to invoke in a separate thread when files change.
@U0NCTKEV8 I did not know that about core.async/thread. I will change my code appropriately. It seems at this point I have narrowed things down to a much simpler example of something I would expect to work but isn't working. Essentially, new functions (distinct from a new implementation of an existing function) do not seem to be getting loaded after a refresh. Here is an example:
I would expect func2 to be present in the my-project.core namespace (aliased to c) after repl/refresh is called, assuming func2 was added to my_project/core.clj after it was initially required above. Unfortunately, it does not seem to be working, at least for me.
Maybe you have a user.clj file that got refreshed as well? That would erase the aliases.
Otherwise the snippet looks good to me.
You can always try to reproduce it with a minimal project, over iTerm (no cider-nrepl). Also, please follow the advice around refresh-dirs above.
In my project, I am setting refresh-dirs. I will try to put together a working example. My user.clj file calls repl/disable-reload! and the alias is still present (other functions that exist, like func1 in the example I gave, are still callable).
(c/func1) or (c/func2)
func1 works, func2 does not exist.
I just reproduced it with a minimal example, brand new project.
In what context? Exactly in the REPL like that? Is the call to func2 actually buried in another function?
The call is in the REPL as above.
The namespace is being removed and recreated from disk, but the user namespace has reloading disabled, and its c alias is retaining a reference to the old version of the namespace
I see.
So namespace aliases that we want present after a refresh need to be persisted to disk somewhere, correct? user.clj seems like a prime candidate for that role. Either we will end up with a stale reference (if reloading is disabled) or the alias will no longer exist.
(BTW, thank you to everyone for all the help with this. It is much appreciated)
No, refresh just mutates things in the environment, so if you have code that is opted out of refresh and it has static references, they will become stale; but if you do things dynamically, it will re-resolve things every time and you won't have stale references.
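One way to "do things dynamically" is requiring-resolve (Clojure 1.10+), which looks the var up at call time; a sketch using the hypothetical my-project.core/func2 from the discussion above:

```clojure
;; Code that lives outside the refresh-dirs can call into refreshed code
;; like this: the var is resolved on every call, so after a refresh this
;; always sees the freshly reloaded namespace rather than a var object
;; captured at load time.
(defn call-latest-func2 []
  ((requiring-resolve 'my-project.core/func2)))
```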
I have one-off code like this in my Cloud Run app:
(clj-http/head "")
such requests (to Clojars or directly to the IP of some other server, no DNS involved) are taking 11 seconds each.
Does anything come to mind as to what might be causing such slowness?
It might be GCP Cloud Run's intricacies (they recommend setting up NAT for certain cases), but if there's something simpler that I can do at the JVM level, so much the better :)
FWIW, I tried running curl --head two times.
The first time it took 2 seconds.
The second time: 30 seconds and counting.
Could it be Clojars itself then?
The other 'probe' request is to an unrelated server. In the meantime (since my last message) I've added some extra logging, so I'm certain beyond any doubt that the requests are completing with HTTP 200. They're just slow (6-8 seconds in this batch).
Alright, I'm getting there... https://cloud.google.com/run/docs/tips/general#writing_effective_services Cloud Run is request-oriented; if you spawn a future that is detached from your req/response cycle, performance might degrade dramatically, and that Docker instance might even be killed off. I hadn't run across that document before; it will be handy.