#datalog
2022-10-12
Ben Sless04:10:48

Anyone have an example of correctly ingesting a tools analyzer AST to a datalog DB?

respatialized15:10:35

Take a look at #C03FF6W62A3 - I’m not sure whether the project uses tools.analyzer but it shares scope and goals with what you describe

lilactown16:10:50

I just did this with clj-kondo's analysis data

lilactown16:10:16

it was pretty straightforward

lilactown16:10:27

for var usages, I used the filename + line + col for their ID
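
A minimal sketch of that ID scheme, assuming the usage maps carry `:filename`, `:row` and `:col` keys (clj-kondo's analysis output reports the line as `:row`); `usage-id` and the sample map are hypothetical names for illustration, not taken from the gist:

```clojure
;; Hypothetical helper: build a stable, human-readable ID for one var usage
;; from its source position. Assumes :filename, :row and :col are present.
(defn usage-id
  [{:keys [filename row col]}]
  (str filename ":" row ":" col))

;; Made-up sample entry shaped like a var-usage map.
(def example-usage
  {:filename "src/app/core.clj" :row 12 :col 3 :name 'map})

(usage-id example-usage)
;; => "src/app/core.clj:12:3"
```

Two usages can only collide if they sit at the same position in the same file, so the string works as a unique identity value.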

Ben Sless16:10:50

you didn't go through tools.analyzer.ast/ast->eav?

lilactown16:10:16

since I wasn't using tools.analyzer directly i didn't think of that

lilactown16:10:41

i'm not sure if tools.analyzer has the same data format as clj-kondo, but here's the basis of what I did https://gist.github.com/lilactown/71faeea4cf908746e54d453ca8b57ea6

Ben Sless18:10:59

Success!

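;; Assumed requires (not part of the original message); `d` is taken to be DataScript.
(require '[clojure.string :as str]
         '[clojure.tools.analyzer.ast :as ast]
         '[clojure.tools.analyzer.ast.query :as ast.query]
         '[clojure.tools.analyzer.jvm :as ana.jvm]
         '[clojure.tools.analyzer.passes.index-vector-nodes :refer [index-vector-nodes]]
         '[datascript.core :as d])

;; Keys shared by every node; these are left unqualified.
(def common
  #{:op :form :env :raw-forms :top-level :tag :o-tag :ignore-tag :loops})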

(defn qualify
  [n k]
  (keyword (name n) (name k)))

(def core
  (into
   []
   (map #(ast/prewalk
          (ast/postwalk % (fn [{:keys [op] :as ast}]
                            (->> ast
                                 (reduce-kv
                                  (fn [m k v]
                                    (cond
                                      (identical? :children k) (assoc! m k (mapv (partial qualify op) v))
                                      (common k) m
                                      :else (assoc! (dissoc! m k) (qualify op k) v)))
                                  (transient ast))
                                 persistent!)))
          index-vector-nodes))
   (ana.jvm/analyze-ns 'clojure.core)))

(ast/ast->eav (first core))

(def counter (atom 1000))
(def cache  (atom {}))
(def rcache (atom {}))

(defn replace-with-eid
  [e]
  (if-let [e' (get @cache e)]
    e'
    (let [e' (swap! counter inc)]
      (swap! cache assoc e e')
      (swap! rcache assoc e' e)
      e')))

(defn prepare-datoms
  [datoms]
  (->> datoms
       (into []
             (map (fn [[e a v]]
                    (let [v (if (nil? v) ::nil v)
                          e' (if (map? e)
                               (replace-with-eid e)
                               e)]
                      [e' a v]))))
       (into []
             (mapcat (fn [[e a v]]
                       (let [v (or (get @cache v) v)
                             v (if (nil? v) ::nil v)]
                         (if (and (vector? v) (not= :form a))
                           (map (partial d/datom e a) v)
                           [(d/datom e a v)])))))))

(def raw-datoms (->> core ast.query/db))
(def datoms (->> raw-datoms prepare-datoms))

(def schema
  (->> raw-datoms
       (into {:children {:db/cardinality :db.cardinality/many}}
             (comp
              (filter (comp #{:children} second))
              (map peek)
              cat
              (distinct)
              (map (fn [k]
                     (let [n (name k)]
                       [k (cond-> {:db/valueType :db.type/ref}
                            (and (not= "class" n) (str/ends-with? n "s"))
                            (assoc :db/cardinality :db.cardinality/many))])))))))

(def conn (d/create-conn schema))

(d/transact! conn [[:db/add 999 :db/ident ::nil]])

(doseq [datom datoms]
  (try
    (d/transact! conn [datom])
    (catch Exception e
      (println datom)
      #_(println e)
      (throw e))))

Ben Sless18:10:14

Now let's see if I can avoid qualifying it

Ben Sless18:10:43

Nope, there are cases where meta is a value and where it's an AST node 😠
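
To illustrate why the qualification helps, here is a small sketch with made-up nodes: one where `:meta` holds a plain map and one where it holds a child AST node. The node shapes and the `qualify-meta` helper are assumptions for illustration only; `qualify` is the function from the snippet above.

```clojure
(defn qualify
  [n k]
  (keyword (name n) (name k)))

;; Hypothetical node whose :meta is a plain value (a metadata map).
(def value-meta-node
  {:op :var :meta {:doc "plain metadata map"}})

;; Hypothetical node whose :meta is itself an AST node.
(def node-meta-node
  {:op :with-meta :meta {:op :map :keys [] :vals []}})

;; Hypothetical helper: move :meta under an op-specific attribute name.
(defn qualify-meta
  [{:keys [op] :as node}]
  {(qualify op :meta) (:meta node)})

(qualify-meta value-meta-node)
;; => {:var/meta {:doc "plain metadata map"}}

(qualify-meta node-meta-node)
;; => {:with-meta/meta {:op :map, :keys [], :vals []}}
```

With separate attribute names, the schema can mark only the node-valued attribute as `:db.type/ref`, which a single shared `:meta` attribute could not express.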