#clojure
2023-05-31
Ajithkumar04:05:20

Hi all, I am planning to implement integration tests for our services. Is it possible to do it with an H2 in-memory DB? Is there a library available in Clojure to support it, or any example projects that would be helpful?

Ben Sless04:05:19

To connect to a SQL database you need some way to create a connection (or connection pool) and send queries over it. hikari-cp and next.jdbc work well together; then you just specify a connection spec for H2. You'll also need to bring in H2 as a dependency. I would recommend using the database you're actually working with for integration testing, in case you're relying on some feature specific to it.
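A minimal sketch of the H2 route, assuming next.jdbc and the com.h2database/h2 artifact are on the classpath (the table and column names are purely illustrative):

```clojure
(require '[next.jdbc :as jdbc])

;; :dbtype "h2:mem" gives an in-memory database that lives
;; for the duration of the JVM process
(def ds (jdbc/get-datasource {:dbtype "h2:mem" :dbname "testdb"}))

(jdbc/execute! ds ["create table users (id int primary key, name varchar(32))"])
(jdbc/execute! ds ["insert into users values (1, 'Ada')"])
(jdbc/execute! ds ["select * from users"])
```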

Ben Sless04:05:41

you can use something like embedded postgres, for example

Ajithkumar04:05:22

Today we save values to our real DB using fixtures and a transaction (the transaction has a checkpoint and rolls back whatever happened within the test transaction). But going forward this approach looks risky. Do you mean something like embedded Postgres, which is started with the application and used only for the integration tests?
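The rollback-per-test pattern described here can be sketched with next.jdbc, which supports a :rollback-only flag on transactions (the ds datasource and the dynamic *tx* var are assumptions for illustration):

```clojure
(require '[next.jdbc :as jdbc]
         '[clojure.test :refer [use-fixtures]])

(def ^:dynamic *tx* nil)

(defn rollback-fixture
  "Runs each test inside a transaction that is always rolled back,
  so nothing the test writes survives in the real database."
  [f]
  (jdbc/with-transaction [tx ds {:rollback-only true}]
    (binding [*tx* tx]
      (f))))

(use-fixtures :each rollback-fixture)
```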

Ben Sless04:05:29

There's also the #C1Q164V29 channel, some of your questions might have been answered there

Ajithkumar04:05:11

That's great, thank you ❤️

jumar04:05:22

Here’s the embedded postgres thing: https://github.com/zonkyio/embedded-postgres As far as I understand it’s really embedded in-process, not an external thing that’s started before the tests and shut down after. Unfortunately I couldn’t find such a thing for MySQL.

:thanks2: 2
emccue05:05:18

here is some code

emccue05:05:58

(ns test-utils
  (:require [db.core :as db]
            [migratus.core :as migratus]
            [next.jdbc :as jdbc])
  (:import (java.util.concurrent.atomic AtomicLong)
           (org.testcontainers.containers PostgreSQLContainer)
           (org.testcontainers.containers.wait.strategy Wait)
           (org.testcontainers.utility DockerImageName)))

(set! *warn-on-reflection* true)

(def ^:private container-delay
  (delay
    (let [container (PostgreSQLContainer.
                     (-> (DockerImageName/parse "postgres")
                         (.withTag "15")))]
      ;; set the wait strategy before starting, then return the container
      (.waitingFor container (Wait/forListeningPort))
      (.start container)
      container)))

(def datasource
  (delay
    (let [^PostgreSQLContainer container @container-delay]
      (jdbc/get-datasource
       {:dbtype   "postgresql"
        :jdbcUrl  (str "jdbc:postgresql://"
                       (.getHost container)
                       ":"
                       (.getMappedPort container PostgreSQLContainer/POSTGRESQL_PORT)
                       "/"
                       (.getDatabaseName container)
                       "?user="
                       (.getUsername container)
                       "&password="
                       (.getPassword container))}))))

(def ^:private migrations-delay
  (delay
    (migratus/migrate {:store         :database
                       :migration-dir "resources/migrations/"
                       :db            {:datasource @datasource}})))

(def ^:private test-counter (AtomicLong.))

(defn with-test-db-info
  [cb]
  @migrations-delay
  (let [test-db-name (str "test_" (.getAndIncrement ^AtomicLong test-counter))
        ^PostgreSQLContainer container @container-delay]
    ;; clone the migrated template database for this test run
    (jdbc/execute!
     @datasource
     [(format "create database %s template %s;"
              test-db-name
              (.getDatabaseName container))])

    (try
      (cb {:dbtype   "postgresql"
           :dbname   test-db-name
           :user     (.getUsername container)
           :password (.getPassword container)
           :host     (.getHost container)
           :port     (.getMappedPort container PostgreSQLContainer/POSTGRESQL_PORT)})
      (finally
        (jdbc/execute! @datasource
                       [(format "drop database %s;" test-db-name)])))))

(defn with-test-db
  [cb]
  (with-test-db-info
    (fn [db-info]
      (let [db (db/start-db {:jdbcUrl (str "jdbc:postgresql://"
                                           (:host db-info)
                                           ":"
                                           (:port db-info)
                                           "/"
                                           (:dbname db-info)
                                           "?user="
                                           (:user db-info)
                                           "&password="
                                           (:password db-info))})]
        (try
          (cb db)
          (finally (db/stop-db db)))))))

emccue05:05:23

create a postgres db with testcontainers, make a "template database". re-clone that template on every test
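A hedged sketch of how the helpers above might be used from clojure.test (the test name, table, and queries are assumptions; it presumes the migrations create a users table):

```clojure
(require '[clojure.test :refer [deftest is]]
         '[next.jdbc :as jdbc])

(deftest user-roundtrip
  (with-test-db-info
    (fn [db-info]
      ;; each test gets a fresh clone of the template database,
      ;; dropped again in the finally block of with-test-db-info
      (let [ds (jdbc/get-datasource db-info)]
        (jdbc/execute! ds ["insert into users (name) values ('Ada')"])
        (is (= 1 (count (jdbc/execute! ds ["select * from users"]))))))))
```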

phill09:05:11

In answer to OP's question about h2: h2 works fine and no additional library is needed. But "integration tests" with a database other than "the real one" can be troublesome where their SQL dialects or features differ.

Dallas Surewood04:05:31

When we interop with Java directly, are we essentially "leaving" FP land? For example, if I want to use Spring directly instead of a Clojure server library, am I in for a world of hurt with mutable objects leaking into my Clojure code? How do I handle that?

hiredman04:05:42

There is no purity

emccue05:05:36

Well, eh? Kinda depends. java.time is an API with all-immutable data structures, so that's definitely in the clear, but Clojure code is generally not "pure FP". If you pull in a Java library to read from SQS or something, you will be performing an essential side effect and there will be some mutable objects involved, maybe some in return values. But if you are using a Java library for a unique capability, you can generally box that up.

simongray07:05:24

Some Java libraries are better than others. Even if they’re basically all OOP and suffer from OOP-related illnesses, they don’t necessarily mutate objects everywhere

phill10:05:54

Ring is a super example of keeping interop out "at the edges"; none of the mutable stuff leaks into your Clojure. But you do not need to use wrapper libraries. To enjoy the FP benefits, you need only take care to keep side-effects out "at the edges". As for Spring, it's difficult to imagine...

Dallas Surewood13:05:02

I don't mean that Clojure is always pure. What I mean is that Clojure is often touted as having seamless interop with Java, but I would imagine doing what Ring does (keeping mutability at the edges of your wrapper) would be extremely difficult, making Java interop kind of a non-starter if you're trying to stay 90% functional.

henrik08:05:03

A question about chunking. I’m talking to an API, which returns a Stream of responses, slowly. I can consume them much faster than they are produced. I line-seq that stream, and send them on. I thought I’d do some transformation on the responses before passing them on, for which I thought eduction would do a good job. However, it gathers 32 of them (or whatever it is) then machine guns them at me in rapid succession, after which it gathers 32 more and so on. I don’t want this involuntary caching in this case, how can I get rid of it?

igrishaev08:05:59

Did you try map ? Something like

(->> reader
     line-seq
     (map process-resp))

henrik08:05:37

I’m doing maps, filters, and custom transducers. I’d like to keep it as a transducer stack if I can.

igrishaev08:05:51

Maybe you can make a stack first with comp:

(def tx (comp (filter ...) (map ...)))

and then apply it to the lines with sequence:

(sequence tx (line-seq reader))

henrik08:05:33

sequence exhibits the same behaviour: it chunks 32 of them, not realizing that the source is a slow API, then releases them at once.

igrishaev08:05:36

but I don't know if it takes the items one by one or by 32

henrik08:05:04

Yeah, a bit of a conundrum

henrik08:05:29

As we’re talking, I realize I haven’t tried transduce.

henrik08:05:00

transduce actually offers a way around this. The nice stream representation is gone, but that’s OK. I basically just want to forward them to another channel anyway.

henrik08:05:34

Thanks for provoking me to think outside of the box!

igrishaev08:05:18

thanks for sharing transduce , I completely forgot about it

p-himik09:05:45

eduction itself doesn't chunk. But it's an iterable, and Clojure turns iterables into chunked seqs when you iterate over them.
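One way around it, per the thread: reduce/transduce the reducible source directly instead of seq-ing it, so each element is handled as soon as the transducer produces it. A minimal sketch (the transducer stack and reducing function are illustrative stand-ins for the real pipeline):

```clojure
(require '[clojure.string :as str])

(def xf (comp (remove str/blank?)
              (map str/upper-case)))

;; transduce pulls items one at a time from the source;
;; no 32-element seq chunking is introduced along the way
(transduce xf
           (fn
             ([acc] acc)
             ([acc line]
              ;; forward each line immediately, e.g. onto a channel
              (conj acc line)))
           []
           ["a" "" "b"])
;; => ["A" "B"]
```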

roklenarcic09:05:29

Not a Clojure question per se, but has anyone found a Java regex library that cannot be trivially exploited to DoS your service? I am talking about something like an NFA without capture groups or backtracking, something that just tells me whether something matches the regex or not.

roklenarcic09:05:22

DFA based solutions can be exploited to produce massive memory hogs.

p-himik09:05:10

No clue how complete it is and whether there are any other problems but: https://algs4.cs.princeton.edu/54regexp/

p-himik10:05:15

Actually, I'm a bit surprised to find that many. But I'll stop at 3 links. :) https://www.brics.dk/automaton/

roklenarcic10:05:26

Yeah, I evaluated that last one in the past, and I don’t remember exactly what the issue I had with it was

roklenarcic10:05:36

I’ll look at it again thanks

p-himik10:05:22

I'd look at the Google one more closely. There's a series of in-depth articles on RegEx and NFA and I believe RE2 was the outcome, which was eventually ported to Java.

p-himik10:05:39

The first article in the series, if you're curious: https://swtch.com/~rsc/regexp/regexp1.html

roklenarcic10:05:14

ah http://brics.dk has non-standard syntax

roklenarcic10:05:26

since I use user entered regexes this is a no-go

roklenarcic10:05:41

I’ll look at google one, thank you

👍 2
Ed11:05:46

there's also this : https://github.com/fhur/regie ... which lets you specify the regex as a data structure, so you'd be able to write a spec to validate that it had no capture groups in? It says "do not use in production" but I think it's an interesting approach. You might be able to use instaparse to parse the subset of regular expressions that you want, and then spec/malli to validate that you have no capture groups (for example)? Then you'd be able to use the normal java regex engine, but ensure that the regex is "safe"?

p-himik12:05:29

No capture groups does not mean that it won't be susceptible to DoS: https://clojurians.slack.com/archives/C03S1KBA2/p1649688396263389

Ed12:05:52

sure - but that was an example from the question ... I guess I was just looking for a way to validate the input to ensure safety rather than changing the processing model. Just suggesting another approach 😉

vemv19:05:31

It might make sense to use https://github.com/lambdaisland/regal to construct regexes, and only allow a subset of its language as acceptable? e.g. a simple Malli validation

☝️ 2
p-himik20:05:09

How would you determine whether the regex will be slow or not? The RegEx that I linked to above has nothing but [] and ranges - and yet it is possible to create a DoS attack with it.

vemv20:05:51

Probably a few rules of thumb would be useful, e.g. "less than 10 elements" I wouldn't expect it to be unbreakable, just less obnoxious than a naive approach :)

p-himik20:05:47

Yeah, I would just use a proper alternative implementation. I don't see any reason to stick to the built-in RegEx class when it isn't sufficient.

p-himik20:05:02

As a demonstration, try running this:

(let [s (str (apply str (repeat 10000 "a")) "b")]
  (time (do (re-matches #"(?:a+)+" s)
            nil)))

Note how there are no capturing groups and the RegEx is just 7 characters long. On my machine it takes 600+ ms to complete. If you add an extra 0 to the length of the string, it completes in over 70 seconds. One might argue that outlawing even non-capturing groups ought to be enough. But I wouldn't bet on it - there might very well be some variation with [] or {} or maybe even without that has horrendous performance.
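For comparison, a sketch of the same check through RE2/J (assuming the com.google.re2j/re2j artifact is on the classpath); RE2's engine is linear-time and does not backtrack, so the same pattern and input complete quickly:

```clojure
(import '(com.google.re2j Pattern))

(let [s (str (apply str (repeat 10000 "a")) "b")
      p (Pattern/compile "(?:a+)+")]
  ;; whole-input match, analogous to java.util.regex Matcher.matches
  (time (.matches (.matcher p s))))
;; => false
```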

vemv21:05:10

Validating the string to be regexed against also seems a good idea :) It all really depends on one's intent, though. For a common business case like "let the user provide some regexes", I see myself building a sane subset of regex out of Regal. When combined with Malli, one could pick a subset of features to be used (or combinations thereof). Maybe they could be really dumbed down to a very limited flavor of regex. Anyway, determining the absolute validity of one or another approach is off-topic.

mafcocinco16:05:48

Not sure if this is the correct channel, but I’m interested in implementing an improved version of ns-tracker using clojure.tools.namespace.*. I tried to use clojure.tools.namespace.repl/refresh et al. but ran into an issue WRT *ns* and *e not being properly bound on the thread. My approach requires starting a separate thread to check for changes to files in the background and automatically call the repl/refresh function. However, there are issues with this as I’m not actually running in the REPL (in the case of lein run) or I’m not in the REPL thread (in the case of lein repl). My best guess is to start with the repl/refresh code but modify it to account for running in a separate thread. Has anyone attempted this, or does anyone know of any libraries similar to ns-tracker that use clojure.tools.namespace.*?

Alex Miller (Clojure team)16:05:00

open to changes in tools.namespace if you can help clarify what the problem is

vemv16:05:21

+1 to a nicer problem statement. Anyway, I think I get what you're trying to accomplish. clojure.core/bound-fn might help?

hiredman17:05:42

or just using binding to setup the vars that need a binding
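A sketch of the binding approach, for a thread not started from a REPL (which vars need binding, and the 'user namespace, are assumptions; refresh also needs refresh-dirs configured as discussed later in the thread):

```clojure
(require '[clojure.tools.namespace.repl :as repl])

(.start
 (Thread.
  (fn []
    ;; establish the REPL vars that refresh expects on this thread
    (binding [*ns* (the-ns 'user)
              *e   nil]
      (repl/refresh)))))
```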

mafcocinco18:05:18

perhaps that is all I need. I was not aware of bound-fn. I will check that out and if I still have issues, will try to clarify into something more succinct. Thanks for your help.

hiredman18:05:14

bound-fn will only communicate bindings from the thread you start with, but if you are not starting from a repl thread things like *ns* won't be bound to start with

mafcocinco18:05:38

I think I understand. The thread that calls repl/refresh will start from the main REPL thread, which should satisfy this constraint, correct?

mafcocinco14:06:58

I attempted to use bound-fn within a thread that is started from my REPL session. Doing so resolved the errors that were occurring due to *ns* and *e not being properly bound. (clojure.tools.namespace.repl/refresh) executes successfully and appears to refresh all the relevant namespaces. However, I ran into a different problem which I will try to explain via a code snippet

mafcocinco14:06:38

Note that an aliasing error occurs when trying to reload my-project.core under the same alias as it was originally loaded under. In addition, the new function, func2 , is not loaded under the c namespace alias. If I reload my-project.core under a different alias, func2 is correctly loaded.

mafcocinco14:06:01

It is entirely possible I’m misunderstanding what repl/refresh is supposed to do as this is the first time I have attempted to use it. However, my expectation here is that, with the use of bound-fn to call repl/refresh within the thread that is spawned from the REPL thread, my-project.core should get refreshed and its corresponding alias, c, should reflect the new function func2.

vemv14:06:04

I wouldn't recommend (require :reload) when using t.n. - I guess it can escape t.n.'s own tracking. For best results with (refresh), set the refresh-dirs explicitly. Code like (a/thread ((bound-fn [] (repl/refresh)))), if it's persistent (`def`ed somewhere), should be in a dir that is outside the refresh-dirs. Or alternatively, use disable-unload! and disable-reload!

vemv14:06:00

> It is entirely possible I’m misunderstanding what repl/refresh is supposed to do as this is the first time I have attempted to use it.
refresh reloads your namespaces, similarly to (require :reload), in a managed fashion. With it, you don't have to (require :reload) namespaces individually, keep track of their dependents, etc.

vemv14:06:00

btw, I guess that

(a/thread ((bound-fn [] (repl/refresh))))

was an experiment; otherwise it doesn't make much sense: you want to invoke it when files change and you want a code reload. Invoking it once, in a thread, isn't as useful, and can create a race condition

hiredman14:06:35

thread from core.async already conveys bindings, as does future, and bound-fn captures the bindings where it is created; so wrapping it in thread like that relies on the fact that thread already conveys bindings

hiredman15:06:08

An example of bound-fn doing something would be like (let [f (bound-fn [] ...)] (.start (Thread. f)))

mafcocinco16:06:44

@U45T93RA6 yes, it was an experiment. I have a larger code block which connects repl/refresh to Java WatchService to invoke in a separate thread when files change. @U0NCTKEV8 I did not know that about core.async/thread . I will change my code appropriately. It seems at this point I have narrowed things down to a much simpler example of something I would expect to work but isn’t. Essentially, new functions (distinct from a new implementation of an existing function) do not seem to be getting loaded after a refresh. Here is an example:

mafcocinco16:06:47

I would expect func2 to be present in the my-project.core namespace (aliased to c) after repl/refresh is called, assuming func2 was added to the my_project/core.clj after it was initially required above. Unfortunately, it does not seem to be working, at least for me.

vemv16:06:22

Maybe you have a user.clj file that got refreshed as well? That would erase the aliases. Otherwise the snippet looks good to me. You can always try to reproduce it with a minimal project, over iTerm (no cider-nrepl). Also, please follow the advice around refresh-dirs above

mafcocinco16:06:40

In my project, I am setting refresh-dirs. I will try to put together a working example. My user.clj file calls repl/disable-reload! and the alias is still present (other functions that exist, like func1 in the example I gave, are still callable).

hiredman16:06:17

How are func1 and func2 actually getting called?

mafcocinco16:06:36

(c/func1) or (c/func2)

mafcocinco16:06:44

func1 works, func2 does not exist.

mafcocinco16:06:59

I just reproduced it with a minimal example, brand new project.

hiredman16:06:34

In what context? Exactly in the repl like that? Is the call to func2 actually buried in a another function?

mafcocinco16:06:54

The call is in the REPL as above.

hiredman16:06:29

The namespace is being removed and recreated from disk, but the user namespace has reloading disabled, and its c alias is retaining a reference to the old version of the namespace

mafcocinco16:06:11

so namespace aliases that we want present after a refresh need to be persisted to disk somewhere, correct? user.clj seems like a prime candidate for that role.

mafcocinco16:06:39

either we will end up with a stale reference (if reloading is disabled) or the alias will no longer exist.

mafcocinco16:06:20

(BTW, thank you to everyone for all the help with this. It is much appreciated)

hiredman16:06:01

No, refresh just mutates things in the environment, so if you have code that is opted out of refresh and it has static references, they will become stale; but if you do things dynamically, it will re-resolve things every time and you won't have stale references
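For example, code opted out of refresh can look the var up at call time instead of holding a static reference (the namespace and function names mirror the hypothetical example earlier in the thread):

```clojure
;; goes stale after refresh: captures a reference to the old var
(def func2-static my-project.core/func2)

;; survives refresh: re-resolves the var on every call
(defn call-func2 []
  ((requiring-resolve 'my-project.core/func2)))
```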

👍 4
vemv19:05:30

I have one-off code like this in my Cloud Run app:

(clj-http/head "")
such requests (to Clojars, or directly to the IP of some other server with no DNS involved) are taking 11 seconds each. Does anything come to mind as to what might be causing such slowness? It might be Cloud Run's intricacies (they recommend setting up NAT for certain cases), but if there's something simpler that I can do at the JVM level, so much the better :)

hiredman20:05:03

are they completing successfully or timing out after 11 seconds?

vemv20:05:08

completing, otherwise they would have thrown an ex

p-himik20:05:20

FWIW, I tried running curl --head two times. The first time it took 2 seconds. The second time - 30 seconds and counting. Could it be Clojars itself then?

vemv20:05:53

The other 'probe' request is to an unrelated server. In the meantime (since my last message) I've added some extra logging, so I'm certain beyond any doubt that the requests are completing with HTTP 200. They're just slow (6-8 seconds in this batch)

vemv20:05:21

Alright, I'm getting there... https://cloud.google.com/run/docs/tips/general#writing_effective_services Cloud Run is request-oriented; if you spawn a future that is detached from your req/response cycle, performance might degrade dramatically, and that Docker instance might even be killed off. I hadn't run across that document before; it will be handy.

jumar05:06:52

You can try tcpdump or similar to analyze the traffic?

jumar05:06:47

With dig you should be able to see if DNS resolution takes a long time
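A quick in-process variant of the same check, timing just the DNS lookup from the JVM (the hostname is illustrative):

```clojure
;; if this alone takes seconds, the slowness is in name resolution,
;; not in the HTTP request itself
(time (java.net.InetAddress/getByName "clojars.org"))
```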