pathom

Hendrik 2023-06-06T06:18:25.089869Z

Datomic Cloud does not support excision, so it cannot be used to store personal data that must be erasable (GDPR). One solution is to store the personal data in a KV store and store only the key in Datomic. Pathom could do the job of replacing the key with the value in data queries. I can think of 2 ways to accomplish this:

1. Have a resolver with input :my-attribute-private and output :my-attribute. Pros: simple to implement (could be autogenerated with Fulcro RAD) and simple to understand what is going on. Cons: batch reads may not be possible?
2. A plugin which walks the generated query, looks up all private attribute keys, batch reads them from the KV store, and replaces them with the values. Pros: only one read to the KV store is needed. Cons: the plugin hides what is going on and makes the implementation more complex.

What would be the preferred solution in Pathom to achieve this? And specific to solution 1: would it be possible to declare resolvers in a way where only one batch read is done? I am thinking of something like this:

; some alias
  {::pc/input  #{:user/email-private}
   ::pc/output [:kv-key]}
  {::pc/input  #{:user/name-private}
   ::pc/output [:kv-key]}
  {::pc/input  #{:kv-key}
   ::pc/output [:user/email]}
  {::pc/input  #{:kv-key}
   ::pc/output [:user/name]}

(def some-resolver
  {::pc/sym     `some-resolver
   ::pc/input   #{:kv-key}
   ::pc/output  [:kv-value]
   ::pc/batch?  true
   ::pc/resolve (fn [env input] ...)})
And would this work if the keys are on different edges in the query, like [{:me [:user/name]} {:friends [:user/email]}]?

wilkerlucio 2023-06-07T18:00:55.012719Z

you can't use a shared intermediate attribute, because inside an entity every attribute is realized at most once, so you can't have 2 different values for the same key (`:kv-key` in this case)
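
A rough way to picture this constraint, using plain Clojure maps rather than Pathom internals: an entity behaves like a map, so two aliases that both write `:kv-key` cannot both keep their value. The names `from-email-alias` and `from-name-alias` are made up for this illustration, and `merge` is only an analogy for how attributes accumulate on an entity.

```clojure
;; Simplified analogy with plain maps (this is NOT how Pathom merges data
;; internally, it only shows that one key can hold one value).
(def from-email-alias {:kv-key [:user/id 1 :user/email-private]})
(def from-name-alias  {:kv-key [:user/id 1 :user/name-private]})

;; Both aliases target the same attribute on the same entity.
(def entity (merge {:user/id 1} from-email-alias from-name-alias))

;; Only one :kv-key survives, so one of the two lookups is lost.
(:kv-key entity)
```

This is why the answer further down keys the batch on `:user/id` and on the requested attributes instead of on a shared intermediate attribute.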

wilkerlucio 2023-06-07T18:03:31.152369Z

there are 2 ways we can get the batch to work across different attributes: 1. using dynamic resolvers 2. using a single resolver for all attributes and narrowing down which attributes to read by plan inspection - this I think can work for your case, I'll get you an example

wilkerlucio 2023-06-07T18:05:35.653129Z

(I may take some hours to give the example, can write it after I finish work things today :))

wilkerlucio 2023-06-08T01:37:06.308749Z

here is an example. I'm assuming that you store the private data in your kv-store using the user ID as part of the key. Following this idea, this is how you can make a batch that works across both multiple attributes and multiple entities. In the example you will see a request for 2 attributes of 2 entities (4 keys to read from the kv-store), and it makes a single call to the resolver to realize all of them at once:

(ns com.wsscode.pathom3.batch-shared-read-kv
  (:require [com.wsscode.misc.coll :as coll]
            [com.wsscode.pathom3.connect.indexes :as pci]
            [com.wsscode.pathom3.connect.operation :as pco]
            [com.wsscode.pathom3.connect.planner :as pcp]
            [com.wsscode.pathom3.interface.eql :as p.eql]))

(def my-kv-store
  {[:user/id 1 :user/name-private]  "Secret Name"
   [:user/id 1 :user/email-private] ""
   [:user/id 2 :user/name-private]  "I'm a secret"
   [:user/id 2 :user/email-private] ""})

;; assuming you will have some efficient way to pull multiple keys at once from your
;; kv store, this is where the impl for it lives
(defn batch-attributes-read [kv-store keys]
  (select-keys kv-store keys))

(pco/defresolver kv-items [env inputs]
  {::pco/input  [:user/id]
   ::pco/output [:user/name-private
                 :user/email-private]
   ::pco/batch? true}
  (let [ids             (map :user/id inputs)
        requested-attrs (->> env ::pcp/node ::pcp/expects keys)
        kvs-keys        (for [id   ids
                              attr requested-attrs]
                          [:user/id id attr])
        kv-data         (batch-attributes-read my-kv-store kvs-keys)]
    ; check what you read from the kv store
    (tap> kv-data)

    ;; unpack keys
    (mapv
      (fn [id]
        (into
          {}
          (map
            (fn [attr]
              (coll/make-map-entry attr (get kv-data [:user/id id attr]))))
          requested-attrs))
      ids)))

(def env
  (-> {}
      (pci/register
        [kv-items])))

(comment
  (p.eql/process env {:users [{:user/id 1}
                              {:user/id 2}]}
                 [{:users
                   [:user/name-private
                    :user/email-private]}]))

wilkerlucio 2023-06-08T01:47:42.552129Z

note that you should add every possible key to the ::pco/output there, but it will only fetch the keys the user requests (this is what ::pcp/expects does: it narrows the output down to what the engine expects this resolver to return according to the plan dependencies)
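
To make the effect of that narrowing concrete, here is a small sketch in plain Clojure that mirrors the key construction inside `kv-items`. The helper name `kv-keys-for` is hypothetical, and the two `expects-*` maps are an assumed shape for `::pcp/expects` (a map whose keys are the expected attributes); the real value comes from the planner node at run time.

```clojure
;; Hypothetical helper mirroring the `for` comprehension in kv-items:
;; one KV key per (user id, expected attribute) pair.
(defn kv-keys-for
  [ids expects]
  (vec (for [id   ids
             attr (keys expects)]
         [:user/id id attr])))

;; Assumed shape of ::pcp/expects when the query asks only for the name...
(def expects-name-only {:user/name-private {}})

;; ...and when it asks for both private attributes.
(def expects-both {:user/name-private  {}
                   :user/email-private {}})

;; Two users, one expected attribute: 2 keys go to the KV store.
(kv-keys-for [1 2] expects-name-only)

;; Two users, two expected attributes: 4 keys, still a single batch read.
(kv-keys-for [1 2] expects-both)
```

Under these assumptions, the number of keys read grows with what the query asks for, not with everything declared in `::pco/output`.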

Hendrik 2023-06-08T04:46:19.233559Z

Ah nice. ::pcp/expects does the trick. Thank you so much for your advice :)

wilkerlucio 2023-06-06T16:19:09.601429Z

hello Hendrik, I think this is a great use case for Pathom. My suggestion is just to keep things explicit: for each private attribute, make a corresponding resolver that reads the value from the kv-store. If you have to do it a lot, make a helper function that generates the resolver, so you can end up with things like:

(private-attr :user/name)
which will generate the resolver outputting :user/name-private. About batch, you shouldn't have to worry, Pathom will track dependencies and use batch where it can (you don't need to batch your own private-attr resolvers)
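
A minimal sketch of what such a helper could look like in Pathom 3. Everything here is an assumption made for illustration: the namespace, the in-memory `kv-store` and its sample values, the choice of `:user/id` as the resolver input, and the `[:user/id id attr]` key layout (borrowed from the `kv-items` example earlier in this thread). It batches per attribute only, not across attributes.

```clojure
(ns example.private-attr
  (:require [com.wsscode.pathom3.connect.indexes :as pci]
            [com.wsscode.pathom3.connect.operation :as pco]
            [com.wsscode.pathom3.interface.eql :as p.eql]))

;; Hypothetical in-memory stand-in for the real KV store (sample data).
(def kv-store
  {[:user/id 1 :user/name-private]  "Secret Name"
   [:user/id 1 :user/email-private] "secret@example.com"})

(defn private-attr
  "Hypothetical helper: given :user/name, builds a batch resolver that takes
  :user/id and outputs :user/name-private, read from the KV store."
  [attr]
  (let [private (keyword (namespace attr) (str (name attr) "-private"))]
    (pco/resolver
      ;; unique operation name derived from the attribute
      (symbol "example.private-attr"
              (str (namespace private) "--" (name private)))
      {::pco/input  [:user/id]
       ::pco/output [private]
       ::pco/batch? true}
      ;; batch resolvers receive a sequence of inputs and return one map
      ;; per input, in the same order
      (fn [_env inputs]
        (mapv (fn [{:user/keys [id]}]
                {private (get kv-store [:user/id id private])})
              inputs)))))

(def env
  (pci/register [(private-attr :user/name)
                 (private-attr :user/email)]))

(comment
  (p.eql/process env {:user/id 1} [:user/name-private]))
```

With this shape each generated resolver makes its own KV call, so a query for both the name and the email results in two reads.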

Hendrik 2023-06-06T20:38:40.461679Z

Thanks for your response. I think I will go with the explicit way and probably combine it with Fulcro RAD to autogenerate the resolvers. I have 2 more questions. Given these resolvers:

  {::pc/input  #{:user/id}
   ::pc/output [:user/email-private]}

  {::pc/input  #{:user/name-private}
   ::pc/output [:user/email]}
and this query [{[:user/id 42] [:user/name]}]: I do not need the nested form [{[:user/id 42] [{:user/name-private [:user/name]}]}], because Pathom is able to connect the dots, correct? And about batching: so I do not need to declare ::pc/batch? true? But how would Pathom batch? Ideally there should be one call to the KV store per query, containing all the lookups from that query.

wilkerlucio 2023-06-06T20:41:10.515759Z

ah, sorry, I was assuming batch only for Datomic. If you'd like to batch the kv reads, you should set batch? true on the resolver. This can batch per attribute, but if you want to batch across multiple private attributes, then we need to design things a bit differently

Hendrik 2023-06-07T04:54:56.398319Z

Would this solution with an intermediate alias work?

;add aliases to map from all private attributes to an intermediate key
{::pc/input  #{:user/email-private}
 ::pc/output [:kv-key]}
{::pc/input  #{:user/name-private}
 ::pc/output [:kv-key]}

;look up all kv-keys
{::pc/input  #{:kv-key}
 ::pc/batch? true
 ::pc/output [:kv-value]}

;add aliases from the kv-value to all attributes
{::pc/input  #{:kv-value}
 ::pc/output [:user/name]}
{::pc/input  #{:kv-value}
 ::pc/output [:user/email]}
The only downside I see with this: I can produce a chain like :user/email-private => :kv-key => :kv-value => :user/name, so I can write wrong queries which will happily run and return results.