xtdb

Nik 2025-11-04T03:02:45.053499Z

Hi guys, I'm using xtdb v1. Is the following intentional? My understanding was that if I use a single variable for :find, the result is (effectively) a set of elements. But if I use :order-by, I get duplicates in the result:

(biff/q db '{:find ?p
             :order-by [[?p :desc]]
             :in [?user]
             :where [[?tx :tx/payee ?p]
                     [?tx :tx/created-by ?user]]}
           "nikhil")
;; => ("Zomato"
;;     "Zomato"
;;     "Lucky Chan"
;;     "Fruit Vendor"
;;     "Devanshee"
;;     "Cinepolis"
;;     "Cinepolis"
;;     "Anand")
But if I don't use :order-by, then, as expected, I get a de-duplicated list:
(biff/q db '{:find ?p
             :in [?user]
             :where [[?tx :tx/payee ?p]
                     [?tx :tx/created-by ?user]]}
           "nikhil")
;; => ("Fruit Vendor" "Zomato" "Devanshee" "Lucky Chan" "Cinepolis" "Anand")

2025-11-04T15:20:45.046959Z

yeah I think I remember having a discussion about :order-by turning off de-duplication and that it's an intended/known behavior. Don't remember the motivation though. Also note that biff/q is adding a (map first ...); if you call xt/q directly (and wrap the :find value in a vector) then the first example will give you an actual set.
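
To illustrate the point about result shapes, here is a minimal sketch in plain Clojure. It makes no XTDB or Biff calls; `unordered-result` and `ordered-result` are hypothetical stand-ins for what a query might return (a set of tuples without :order-by, a vector of tuples with it), and the `(map first ...)` step mirrors the unwrapping described above.

```clojure
;; Hypothetical stand-ins for query results; no XTDB calls are made here.
(def unordered-result #{["Zomato"] ["Cinepolis"] ["Anand"]})           ; set of tuples
(def ordered-result   [["Zomato"] ["Zomato"] ["Cinepolis"] ["Anand"]]) ; vector of tuples

;; Unwrapping the tuples with (map first ...) yields a lazy seq in both cases.
(def unordered-values (map first unordered-result))
(def ordered-values   (map first ordered-result))

(set? unordered-result)        ; true
(set? unordered-values)        ; false, it is a seq after unwrapping
(count ordered-values)         ; 4, duplicates are kept

;; On sorted values duplicates are adjacent, so clojure.core/dedupe removes them.
(vec (dedupe ordered-values))  ; ["Zomato" "Cinepolis" "Anand"]
```

If duplicates are not wanted in an ordered result, removing adjacent duplicates after the query is one option, since sorting places equal values next to each other.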

🙌🏼 1
refset 2025-11-04T21:26:07.064309Z

Hey @nikwarke it's definitely the intended behaviour - per this PR: https://github.com/xtdb/xtdb/pull/975/files#diff-6c6576892a723f0244c94ef6f4ce3132aa1d8e6280f07a59e2d90c7580eef6a4L2998-R2999

I believe the main reasoning is that deduping over large result sets isn't totally "free". There may be more to it though. I had a quick dig through the internal Slack archives but it seems the decision was discussed live on a video call and all I have is this summary:

> conclusion about bag semantics:
> - q without order-by, limit, offset returns a set (no dupes)
> - q with order-by, limit, or offset returns an ordered vector (dupes)
> - open-q returns a lazy seq (dupes)

James might remember more 🙂 (seeing as he also implemented the prior PR which temporarily https://github.com/xtdb/xtdb/pull/662 here)

refset 2025-11-04T21:27:17.104039Z

> I remember having a discussion about :order-by turning off de-duplication

yes, your discussion (on Zulip) is what kicked off the change in the PR above 😅

🙌 1
🤣 1
jarohen 2025-11-04T22:02:34.000449Z

I recall the decision not to introduce a breaking change here, but not much more than that I'm afraid. In those days we also tried to keep compatibility with Datomic, so that likely influenced the call - whereas with open-q (being lazy) we couldn't dedupe the results without maintaining the whole set in memory
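
The memory cost of deduplicating a lazy result can be sketched in plain Clojure. This is only an illustration of the general technique, not XTDB's implementation; `lazy-distinct` is a hypothetical helper that behaves much like the built-in `clojure.core/distinct`, and it shows that a `seen` set has to grow alongside the stream.

```clojure
(defn lazy-distinct
  "Lazily yields each element of coll the first time it appears.
   The `seen` set gains an entry for every new distinct element,
   so memory use grows with the number of distinct results."
  ([coll] (lazy-distinct coll #{}))
  ([coll seen]
   (lazy-seq
     (when-let [s (seq coll)]
       (let [x (first s)]
         (if (contains? seen x)
           ;; Already emitted: skip it, `seen` is unchanged.
           (lazy-distinct (rest s) seen)
           ;; First occurrence: emit it and remember it.
           (cons x (lazy-distinct (rest s) (conj seen x)))))))))

;; Works on an unbounded stream, but only by tracking everything seen so far.
(take 3 (lazy-distinct (cycle ["Zomato" "Zomato" "Cinepolis" "Anand"])))
;; => ("Zomato" "Cinepolis" "Anand")
```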

🙏 1
Nik 2025-11-05T23:00:12.746899Z

Thanks everyone! I hope this helps someone who has the same question in the future 😄

🙏 2