#clojure
2018-07-06
orestis15:07:14

I want to make a composable query system — starting from a root query that enforces core business logic, add extra filtering, sorting, pagination (limit/skip), etc. I’m pondering whether to make the result of this query something Seqable/Counted/etc. — so consumers of the API would be able to call the normal count/map/filter/reduce operations rather than custom functions for those things. I think the way to do it is to use a record and implement the relevant interfaces, right? Also, does this seem like a good idea, or should I force consumers of the API to explicitly “realize” the various results, making this ultimately side-effectful operation explicit?

orestis15:07:14

Ideally also I’d like to be able to cache the results and reuse them on later invocations — but discard them if the query itself changes.

emccue16:07:20

what is core business logic

emccue16:07:50

both in respect to what you are trying to do and in general

emccue16:07:06

I see a lot of fuzzy clouds that say “business logic” in books

emccue16:07:18

I think you would be fine just returning a LazySeq

emccue16:07:02

You probably don't have to implement the protocols yourself in that case

orestis16:07:29

I’ve made a thread in Clojureverse here: https://clojureverse.org/t/lazy-sequences-deferred-results-composable-queries-oh-my-please-help/2427 — it’s hard to capture nuance on Slack…

orestis16:07:48

I had this idea literally 10 minutes before leaving work and I know it’s going to torment me all weekend 🙂

noisesmith16:07:48

I'd caution against using laziness for deferred results if timing of realization or the scope of some resource (e.g. a db connection) is in play. It's relatively common to use laziness to implement streaming in Clojure code, but it's a very common source of errors.

noisesmith16:07:33

also it's idiomatic to avoid creating new data types until there's some feature of a data type you actually need

noisesmith16:07:56

so yeah, use hash-maps, vectors, lazy-seqs etc. until you can prove they are not sufficient

emccue16:07:00

to my knowledge the java mongodb driver's results are all Iterables already

emccue16:07:25

the most you should need to do is wrap those in lazy-seqs/futures

noisesmith16:07:42

why would a future help?

noisesmith16:07:59

also the way to get a lazy-seq from an iterable is iterator-seq
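
A quick sketch of that: `iterator-seq` adapts any `java.util.Iterator` into a (chunked) lazy seq, so the ordinary seq functions work on it. Here a plain `ArrayList` stands in for a driver's result `Iterable`:

```clojure
;; iterator-seq adapts a java.util.Iterator into a lazy seq.
;; An ArrayList stands in for a driver's Iterable result here.
(def results (java.util.ArrayList. [1 2 3]))

(def xs (iterator-seq (.iterator results)))

(count xs)   ;; => 3
(map inc xs) ;; => (2 3 4)
```

Note that the seq holds onto the iterator, so any resource backing it (e.g. a DB cursor) must stay open until the seq is fully realized — which is exactly the caution raised above.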

emccue16:07:14

it wouldn't, but there is an async version of the mongo driver, so it might be needed

noisesmith16:07:30

@orestis from your link: “Consumers of the API can just call seq if they want to execute the query.” — what does that mean?

orestis16:07:00

Perhaps I meant vec instead?

orestis16:07:47

The point was to force execution of the query, so you get back a plain non-lazy collection.

noisesmith16:07:56

OK - so the idea is that it's something that acts like a lazy-seq and executes some communication with a remote server when realized? I was confused because seq doesn't force lazy results.

emccue16:07:14

I know dynamo gives back a PaginatedQueryList

noisesmith16:07:28

also that's not how existing laziness works - forcing it still gives you a lazy thing back, it's just been realized already

emccue16:07:06

Rather than targeting full laziness on each item with lazy-seq, maybe realizing it in chunks is best

orestis16:07:01

Perhaps I should narrow the context — this is for a relatively simple web API, where you never want to stream data, and the number of items you get back is kept relatively small, e.g. in the hundreds.

noisesmith16:07:03

In my experience laziness to delay interactions with a stateful API is a source of many problems. Especially when mixed with chunking.

emccue16:07:56

Maybe you should ignore laziness then

emccue16:07:03

I can count to 100, so whatever

orestis16:07:23

So the laziness part is not specifically needed — just that you can still modify the query but also execute it — and if you’ve executed it once, you don’t have to execute it again.

noisesmith16:07:12

do you actually want a caching layer?

orestis16:07:30

Perhaps I’m still influenced by mutability — e.g. I realize now that this is how Django’s QuerySets work — you create a QuerySet, that once executed caches the results. But in Clojure, executing my query would just give me a collection back.

noisesmith16:07:40

by "modify" do you mean make a new slightly different immutable object, or actually mutating an object representing a query?

orestis16:07:54

And then I could package the original query with the collection back together if I need to modify the query and execute again.

emccue16:07:15

wait a tick

noisesmith16:07:15

I think an actual caching layer would make this all a lot simpler

orestis16:07:23

I’d like to keep everything immutable — I don’t see a need for mutating anything.

emccue16:07:24

just memoize it

noisesmith16:07:49

the problem with memoizing is it can balloon memory - with an actual cache you can control that

emccue16:07:09

some equivalent of Python's lru_cache then

noisesmith16:07:09

for sure, use memoize, if you are OK with every query you ever send staying in memory until reboot

noisesmith16:07:37

Clojure's core.cache is good - it plays nicely with immutable things and lets you decide how to implement the actual mutable storage part

orestis16:07:54

The more I think about it, the more I realize laziness is completely irrelevant to what I actually wanted to ask.

noisesmith16:07:59

and it includes multiple caching algorithms, including LRU

orestis16:07:34

In the sense that I can’t think of where I would find it useful, and I was just influenced by the inherent laziness of lazy-seqs etc.

noisesmith16:07:36

@orestis yeah, if you squint at it laziness can look like a cache, but when you need a cache, I'd say use an actual cache :D

orestis16:07:21

I’ll have a look at those, thanks!

emccue16:07:37

<dependencies>
  <dependency>
    <groupId>org.clojure</groupId>
    <artifactId>core.cache</artifactId>
    <version>0.7.1</version>
  </dependency>
</dependencies>

emccue16:07:44

in the deps for memoize

noisesmith16:07:31

right, most of what it adds to core.cache is implicitly deciding to use an atom

noisesmith16:07:37

where core.cache lets you use anything

orestis16:07:46

What about the idea of implementing the Seqable/Counted interfaces so that the query gets executed when seq/vec/count is called — in contrast to calling explicit functions like (q/execute query) and (q/count query)?

noisesmith16:07:05

once again, seq doesn't force lazy things

orestis16:07:45

Ah, but it wouldn’t be a lazy thing — it would be my record, that when asked to return a Seqable thing, it would just execute the query and give back the results of that.
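
A minimal sketch of that record idea — `run-query!` here is a hypothetical stand-in for the real database call, not an existing function:

```clojure
;; run-query! is a hypothetical stand-in for the real DB call.
(defn run-query! [query-map]
  [{:id 1} {:id 2}])

;; A query value that executes itself when treated as a collection.
(deftype Query [query-map]
  clojure.lang.Seqable
  (seq [_] (seq (run-query! query-map)))
  clojure.lang.Counted
  (count [_] (count (run-query! query-map))))

(seq (->Query {:collection :widgets}))   ;; runs the query
(count (->Query {:collection :widgets})) ;; runs it again
```

As written, every seq or count call re-executes the query — exactly the follow-up question that comes next.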

noisesmith16:07:48

well, incidentally it forces 1 chunk because it checks the head, but...

emccue16:07:18

honestly, even though q/execute takes you outside the core functions, it is easier to read than vec

emccue16:07:36

It's pretty clear what it does

noisesmith16:07:48

@orestis does it do the same query again when you ask for a seq again?

orestis16:07:02

@noisesmith Good question — I’d like to say no, it caches the results, but in all honesty, I can’t see a use case where that was ever going to be needed — so perhaps, the answer is yes, when you call seq, the query is executed again.

noisesmith16:07:22

also how do you manage the cache? who owns it?

orestis16:07:23

It’s all backed by a mutable database, so I can’t have more guarantees there.

emccue16:07:10

at work our database has a time based lru cache for some "hot" operations

dpsutton16:07:33

why is this better than just a function that gets the results and then a plain collection of the results? why combine them in this way?

emccue16:07:39

so we invalidate if too much time has passed or if we have too much in the cache

orestis16:07:41

It would be nice to have a per-request cache to ensure at least consistent results. It might even be needed with Mongo’s lack of joins (this is a legacy thing so I can’t switch databases just yet).

orestis16:07:24

@dpsutton This is the question I’m asking 🙂 Perhaps I’m overthinking things, Clojure gives you enough tools that you start to think “how would it look if I applied protocols here”.

orestis16:07:40

The whole idea comes from trying to figure out what something like (find-widgets) should return. Does it return a list of widgets, hitting the database? Or does it return something that can be further filtered before hitting the DB? And what would the API look like for these use cases?

orestis16:07:37

There are some core rules relating to permissions and other things (e.g. filter things which are soft-deleted), but there are other rules that are mostly presentation related and must be added on top of the core business rules.

noisesmith16:07:18

when I used mongo with clojure we had a function that put all queries through a ttl cache, due to the domain logic we knew that caching in an N minute timespan was always OK

noisesmith16:07:42

the ttl cache was done with core.cache and an atom as the in-memory store

noisesmith16:07:38

the same in-memory cache was used for all queries (simple enough since atoms keep things concurrency safe, and we had low enough throughput of fresh queries that mutating the atom wasn't a bottleneck)
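
That setup can be sketched with core.cache roughly as follows — `run-query!` and the 5-minute TTL are placeholder assumptions, not the actual code:

```clojure
(require '[clojure.core.cache :as cache])

;; Hypothetical stand-in for the real Mongo query.
(defn run-query! [q] [:rows-for q])

;; One TTL cache shared by all queries; entries expire after 5 minutes.
;; The atom supplies the concurrency-safe mutable storage.
(def query-cache (atom (cache/ttl-cache-factory {} :ttl (* 5 60 1000))))

(defn cached-query [q]
  ;; through-cache runs the query on a miss and stores the result;
  ;; lookup then reads the entry back out of the updated cache.
  (-> (swap! query-cache cache/through-cache q run-query!)
      (cache/lookup q)))
```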

emccue16:07:39

list of widgets, hitting the database

emccue16:07:06

if you need to filter more, just put that in your query

emccue16:07:22

¯\_(ツ)_/¯

noisesmith16:07:00

Clojure lets you make powerful assumptions about what you can do with immutable data. Unless you can enforce the same semantics at the DB level (e.g. Datomic), don't try to blur the distinction between data operations in Clojure and query logic.

orestis16:07:15

Hehe, that’s an option — but then you end up with something like (find-widgets db user filter sort paginate) which kinda looks off to me.

noisesmith16:07:45

because it will lead to bad bugs when devs assume "data is sane and works in functionally / mathematically meaningful ways" and your query layer can't back that up

orestis16:07:23

We will have one of those find-widgets for every entity in our system, and we will need to compose them to do joins, so if you want to eventually change the signature of these functions you’re in a lot of pain.

dpsutton16:07:21

(find-widgets db user query-map) and then you just build an interpreter for query-map. A "signature change" is just a change to your query interpreter

orestis17:07:40

@noisesmith Right, I agree with that — when I realized laziness is not worth it and is irrelevant to my API design 🙂

orestis17:07:30

@dpsutton and I guess my mini-DSL could just be a few functions with meaningful names that construct that query-map?

dpsutton17:07:26

that's what i would do. it's just data so no macros involved, transform maps and vectors into the final query that mongo will interpret

orestis17:07:16

One thing that I might need to do is to switch from a plain query to an aggregation pipeline, according to the specific query. You can do some joins with a pipeline that I could leverage.

orestis17:07:29

Perhaps the correct thing to do is use find-widgets-query vs find-widgets, then just have an execute-query. Then the intent of the functions is clear.
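
A sketch of how those pieces could fit together — every name here is hypothetical, and execute-query is left as a stub where the query-map would be interpreted into an actual Mongo call:

```clojure
;; A query is just data; small named functions refine it.
(defn find-widgets-query [user]
  {:collection :widgets
   :filter     {:deleted false :owner (:id user)}})

(defn with-filter [q extra]
  (update q :filter merge extra))

(defn paginate [q limit skip]
  (assoc q :limit limit :skip skip))

(defn execute-query [db q]
  ;; interpret q into a find or an aggregation pipeline here
  )

;; Compose with ordinary threading; execution stays explicit.
(-> (find-widgets-query {:id 42})
    (with-filter {:color "red"})
    (paginate 20 0))
;; => {:collection :widgets
;;     :filter {:deleted false :owner 42 :color "red"}
;;     :limit 20 :skip 0}
```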

orestis17:07:25

Thanks for all the input everyone, I need to go but I will update my Clojureverse topic if you’re interested in following up there. Much appreciated @noisesmith @emccue @dpsutton

orestis17:07:39

(I knew I shouldn’t have started this on a Friday evening 🙂 )

otwieracz19:07:10

Do you have any suggestions on how I can implement a --verbose switch for clojure.tools.logging? That is, enabling log/debug output to the console.

Roy Truelove19:07:46

What’s the underlying logging engine that you’re using? (eg log4j, logback etc)

Roy Truelove19:07:50

(and, does it have to be --verbose or can it be any command line param?)

otwieracz19:07:58

How can I know what the underlying engine is?

otwieracz19:07:12

I've just included clojure.tools.logging and started using it.

otwieracz18:07:26

So yeah, it's SLF4J (sorry for the delay!)

otwieracz19:07:08

I am probably going to ask it on Clojureverse.

otwieracz19:07:14

To keep this info somewhere.
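
One possible sketch, assuming Logback is the SLF4J backend — the setLevel interop below is Logback-specific and won't work against other backends:

```clojure
;; Raise the root logger to DEBUG at runtime when --verbose is passed.
;; Assumes Logback (ch.qos.logback) is on the classpath as the SLF4J
;; backend; the returned logger is then a Logback Logger with setLevel.
(defn set-verbose! []
  (let [root (org.slf4j.LoggerFactory/getLogger
              org.slf4j.Logger/ROOT_LOGGER_NAME)]
    (.setLevel root ch.qos.logback.classic.Level/DEBUG)))
```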

ag22:07:06

how do I use fnil with java functions, I forgot.. need something like (fnil .intValueExact)

noisesmith22:07:44

java methods are not functions, to use fnil you need a function

noisesmith22:07:58

(fnil #(.intValueExact %) n)

ag22:07:28

wrong number of args passed to: core/fnil

noisesmith22:07:37

after my edit?

ag22:07:01

what's n here?

ag22:07:25

oh... sorry... being stupid

ag22:07:06

(fnil #(.intValueExact %) 0M) this is what I'm looking for

noisesmith22:07:09

(ins)user=> (def n-fnil (fnil #(.intValueExact %) (biginteger 0)))
#'user/n-fnil
(cmd)user=> (n-fnil nil)
0
(ins)user=> (n-fnil (biginteger 88888888888888))
ArithmeticException BigInteger out of int range  java.math.BigInteger.intValueExact (BigInteger.java:4550)
(cmd)user=> (n-fnil (biginteger 8888888))
8888888

colindresj22:07:06

If I have a map, say {:a 1 :b 2 :c 3 :d 4}, and I have a set #{:a :b}, what’s the best way to get the first key that exists in both the map and the set?

lilactown22:07:12

does the ordering matter?

colindresj22:07:17

Let’s say no

noisesmith22:07:57

(first (set/intersection s (set (keys m))))

noisesmith22:07:20

or maybe (key (first (select-keys m s)))

noisesmith22:07:30

but the second one could NPE on the key call

noisesmith22:07:05

(first (keys (select-keys m s))) - no more npe

noisesmith22:07:31

but "first" isn't really a meaningful thing for sets and maps

noisesmith22:07:56

(beyond the fact that first implicitly calls seq which gets you a thing that acts like a list that is...)

dpsutton22:07:33

(select-keys {:a 1 :b 2 :c 3} #{:a :b}) you can just select keys to see the "intersection"

dpsutton22:07:03

whoops. should read faster 🙂

noisesmith22:07:23

often, I type faster than I think

noisesmith22:07:42

not to mention faster than a reasonable person reads for comprehension

colindresj22:07:59

Thanks for the options

dpsutton22:07:05

@colindresj what are you trying to achieve? often times when you're asking for the first of a set or map you need a simple change one stack frame up

colindresj22:07:03

I need a function that will take any arbitrary map and return the first matching key from a set of keys

dpsutton22:07:17

ah well there ya go then lol

dpsutton22:07:44

sometimes an XY problem is really an XY problem

colindresj22:07:49

Actually doesn’t need to be a set, could be any iseq, I just like that it’s represented visually in a set

colindresj22:07:04

Yeah this is pretty straightforward, albeit kind of uncommon

noisesmith22:07:10

well, select-keys means you no longer get the first match

colindresj22:07:51

First in my case is just first to be encountered, not necessarily in a specific order

noisesmith22:07:59

if the input might be a seq, and the order matters

(first (filter #(contains? m %) keylist))

colindresj22:07:18

Yeah above is what I had originally, just wanted to see other options

noisesmith22:07:00

if nil/false are never the val, (first (filter m keylist)) also works
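
Collecting the options from this thread into one sketch:

```clojure
(def m {:a 1 :b 2 :c 3 :d 4})

;; order-insensitive: any key present in both the map and the set
(first (keys (select-keys m #{:a :b})))

;; order-sensitive: first key in the list that m contains
(first (filter #(contains? m %) [:x :b :a])) ;; => :b

;; shortest form, safe only when vals are never nil/false
(first (filter m [:x :b :a]))                ;; => :b
```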

didibus23:07:13

Anyway to have a deftest that takes argument?

didibus23:07:30

I'd like to prepare some inputs for my tests in test-ns-hook, and pass them to the tests that need them.

didibus23:07:44

Other than dynamic bindings?

dpsutton23:07:36

can you not def them?

didibus23:07:35

Not in this case, well, preferably not.

didibus23:07:44

I feel the answer is no, I guess I'll just use dynamic vars
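
A minimal sketch of the dynamic-var route. Note that clojure.test ignores fixtures when test-ns-hook is defined, so the hook itself has to establish the bindings:

```clojure
(require '[clojure.test :refer [deftest is]])

;; The prepared input lives in a dynamic var.
(def ^:dynamic *test-input* nil)

(deftest uses-prepared-input
  (is (= "alice" (:user *test-input*))))

;; test-ns-hook replaces clojure.test's normal test discovery,
;; so it binds the inputs and calls the tests explicitly.
(defn test-ns-hook []
  (binding [*test-input* {:user "alice"}] ;; expensive setup goes here
    (uses-prepared-input)))
```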