datalevin

2024-10-19T14:50:27.959319Z

So I’m trying to understand functions, queries and caching a bit better. I have two questions.

*Question 1* I’m trying to filter for messages that have content longer than 200 characters:

(d/q '[:where
       [?u :message/content ?content]
       [(> (count ?content) 200)]
       :find (count ?content).]
  @*conn*)
  
=> nil
This doesn’t return anything. Why? I’m assuming you can’t nest functions? Separating out the functions still doesn’t return anything, though:
(d/q '[:where
       [?u :message/content ?content]
       [(count ?content) ?n]
       [(> ?n 200)]
       :find (count ?content).]
  @*conn*)
=> nil
If I implement my own greater than function this works:
(defn my> [a b]
  (> a b))

(d/q '[:where
       [?u :message/content ?content]
       [(count ?content) ?n]
       [(app.scratch/my> ?n 200)]
       :find (count ?content).]
  @*conn*)
=> 5
Is this because > is special for range queries?

*Question 2* If I implement a function for checking spam based on length like below:
(defn spam? [x] (> (count x) 200))

(d/q '[:where
       [?u :message/content ?content]
       [(app.scratch/spam? ?content)]
       :find (count ?content).]
  @*conn*)
=> 5
I get the expected result. However, if I change the function and run the query again:
(defn spam? [x] (> (count x) 200000))

(d/q '[:where
       [?u :message/content ?content]
       [(app.scratch/spam? ?content)]
       :find (count ?content).]
  @*conn*)
=> 5
The above should return nil (as I have no content of that length). So some caching is happening. My question is: what are the conditions for this sort of caching (and where’s the best place to look in the source code)? Is there a way to tell Datalevin that the function has changed?

2024-10-20T13:47:40.167109Z

Both issues created. I’ll see if I can create PRs for them in the next few days: https://github.com/juji-io/datalevin/issues/287 https://github.com/juji-io/datalevin/issues/288

🙏 1
Huahai 2024-10-19T15:34:46.755619Z

Question 1, the current built-in > does not resolve its arguments. Remember, we cannot call eval, so we need to walk the arguments to resolve them.
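
To illustrate what "walking the arguments" without eval can look like, here is a minimal sketch in plain Clojure. It is not Datalevin's code: `built-ins` and `resolve-form` are hypothetical names, and the lookup table only stands in for whatever function resolution the query engine actually performs.

```clojure
;; Hypothetical lookup table standing in for a query engine's built-ins.
(def built-ins {'> > 'count count '+ +})

(defn resolve-form
  "Computes the value of `form` against `bindings` (a map from ?-symbols
  to values) by walking it recursively, so `eval` is never called."
  [bindings form]
  (cond
    ;; Nested call such as (count ?content): resolve the arguments first.
    (seq? form)
    (let [[f & args] form
          func (or (built-ins f)
                   (throw (ex-info "Unknown function" {:fn f})))]
      (apply func (map #(resolve-form bindings %) args)))

    ;; Logic variable: look it up in the current bindings.
    (and (symbol? form) (contains? bindings form))
    (get bindings form)

    ;; Anything else is treated as a constant.
    :else form))

(resolve-form {'?content "hello"} '(> (count ?content) 3)) ; => true
```

A predicate that compares its arguments as given, without this kind of recursive step, would see the unevaluated list `(count ?content)` rather than a number, which is one way a nested call can fail to match anything.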

Huahai 2024-10-19T15:35:50.969449Z

Question 2, the caching key is based on the parsed-q data structure. Looks like it should be based on the structure after the references inside are resolved.

Huahai 2024-10-19T15:36:51.964159Z

There's no condition for caching: all query results are cached until a transaction happens, which clears the whole cache.
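
The stale result from Question 2 can be reproduced outside Datalevin with a toy cache. The sketch below is an assumption-laden illustration, not the code in query.clj: `cached-count`, `by-name` and `by-fn` are hypothetical names. It compares a cache keyed on the unresolved reference (the var) with one keyed on the function the var currently holds.

```clojure
;; Sample data; "spam" here just means "long message".
(def messages ["hi" "hello there" "a much longer message"])

(defn spam? [x] (> (count x) 5))

(defn cached-count
  "Counts the items in `data` matching the predicate held by `pred-var`.
  Results are memoized in the atom `cache` under the key `(key-fn pred-var)`."
  [cache key-fn pred-var data]
  (let [k (key-fn pred-var)]
    (if (contains? @cache k)
      (get @cache k)                                  ; cache hit
      (let [result (count (filter @pred-var data))]   ; cache miss: compute
        (swap! cache assoc k result)
        result))))

(def by-name (atom {}))   ; keyed on the var itself (the unresolved reference)
(def by-fn   (atom {}))   ; keyed on the function the var currently holds

(cached-count by-name identity #'spam? messages) ; => 2
(cached-count by-fn   deref    #'spam? messages) ; => 2

;; Redefine the predicate, as in Question 2.
(defn spam? [x] (> (count x) 1000))

(cached-count by-name identity #'spam? messages) ; => 2 (stale)
(cached-count by-fn   deref    #'spam? messages) ; => 0
```

Keying on the resolved function notices the redefinition, but it means resolving references before every cache lookup, which is the kind of cost that has to be weighed against the convenience at the REPL.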

Huahai 2024-10-19T15:37:02.368709Z

the code is in query.clj

Huahai 2024-10-19T15:37:51.536759Z

q-result function does the caching

👀 1
Huahai 2024-10-19T15:39:36.680089Z

could you file issues for these two?

👍 1
2024-10-23T13:47:15.143459Z

PR for 288, but as mentioned this might not be something that is worth fixing (performance implications). https://github.com/juji-io/datalevin/pull/289

Huahai 2024-10-24T04:57:48.166689Z

Merged

Huahai 2024-10-24T04:57:55.360459Z

Thanks!