#beginners
2022-09-03
Hayden M 08:09:50

Hello! I'm learning Clojure and I've been stuck on something for a day now. Before I hit the books again to learn some new concepts that might help, I was wondering how others would approach the problem below? What I'm trying to do is import a bank statement (csv file), e.g.

Date,Unique Id,Tran Type,Cheque Number,Payee,Memo,Amount
2022/01/18,2022011101,A/P,,"Sharesies NZ","A/P AM623847",-10.00
2022/05/20,2022012301,D/D,,"WGTN CITY COUNCIL","D/D WCC RATES 107655 19 MADEUPLANE",-139.47
And then match each bank statement line against a set of "category matching rules" that I've created
{:category "Rates"
  :match-on [{"Payee" "WGTN CITY COUNCIL"}]
  :exclude false
  :type :expense}

 {:category "Investing"
  :match-on [{"Payee" "Sharesies NZ"}]
  :exclude false
  :type :expense}
And I want to produce an output something like this, with the ultimate aim of storing this in a database and serving it up on a webpage.
[{:month "May"
  :year "2022"
  :category "Rates"
  :total-expenses 139.47
  :total-income 0.00
  :transactions [{"Date" "2022/05/20",
                  "Unique Id" "2022012301",
                  "Tran Type" "D/D",
                  "Cheque Number" "",
                  "Payee" "WGTN CITY COUNCIL",
                  "Memo" "D/D WCC RATES 107655 19 MADEUPLANE",
                  "Amount" "-139.47"}]}
 {:month "January"
  :year "2022"
  :category "Investing"
  :total-expenses 10.00
  :total-income 0.00
  :transactions [{"Date" "2022/01/18",
                  "Unique Id" "2022011101",
                  "Tran Type" "A/P",
                  "Cheque Number" "",
                  "Payee" "Sharesies NZ",
                  "Memo" "A/P AM623847",
                  "Amount" "-10.00"}]}]
This would be easy for me to do imperatively, but I'm trying to learn how this should look in a language like clojure. What I've tried so far is many variations of things like this where I try to isolate an individual bank statement line, and then iterate over all the category matching rules trying to find a match for the bank statement line, and then feed that into filter to return only the category matching rules that apply to that bank statement line.
(defn rule-matches-bank-line? [bank-line]
  (fn [mapping]
    (let [match-on-rules (:match-on mapping)]
      (->> (reduce #(keys %2) #{} match-on-rules)
           (map #(= (%1 %2) (bank-line %2)) match-on-rules)))))

(defn categorise-bank-line [mappings bank-line]
  (filter (rule-matches-bank-line? bank-line) mappings))
That particular piece of code returned all of the category mappings, regardless of matching or not 😢 Any help hugely appreciated and any general feedback too (like are there better data structures for this and anything else like that). I'm loving the language so far, but have a long way to go!

Hayden M 09:09:09

Ok I figured out why it returned every category mapping - it's because map returned a sequence and inside was true or false. So calling first on it made it work. But it still feels ugly...

(defn rule-matches-bank-line? [bank-line]
  (fn [mapping]
    (let [match-on-rules (:match-on mapping)]
      (->> (reduce #(keys %2) #{} match-on-rules)
           (map #(= (%1 %2) (bank-line %2)) match-on-rules)
           (first)))))

(defn categorise-bank-line [mappings bank-line]
  (filter (rule-matches-bank-line? bank-line) mappings))
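
A small REPL sketch of the behaviour described above, using made-up predicates rather than the real rules: `filter` keeps an item whenever the predicate returns anything other than `nil` or `false`, and a non-empty sequence such as `(false)` counts as truthy. The names `returns-seq` and `returns-first` are hypothetical, chosen only for this illustration.

```clojure
;; Predicate that returns a one-element sequence, like the map call did.
(defn returns-seq [_] (list false))

;; Same predicate, but unwrapped with first, like the second version.
(defn returns-first [_] (first (list false)))

;; The sequence (false) is truthy, so every item is kept.
(filter returns-seq [:rule-a :rule-b])    ;; => (:rule-a :rule-b)

;; first exposes the actual boolean, so nothing is kept.
(filter returns-first [:rule-a :rule-b])  ;; => ()

;; every? inspects all results instead of only the first one.
(every? true? (list true false))          ;; => false
```

Note that `first` only looks at the first comparison; if a rule ever carries more than one check, something like `every?` or `some` states the intent more directly.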

rolt 09:09:46

the problem is not fully specified: match-on is an array of maps, so are you supposed to match if at least one map satisfies the line?

rolt 09:09:38

like [{"Payee" "Sharesies NZ"} {"Payee" "WGTN CITY COUNCIL"}] would match both lines?

rolt 09:09:06

also, can the map contain multiple props? in that case i assume you need to match every value. if i go with those assumptions:

(defn match-props? [bank-line props]
  (= (select-keys bank-line (keys props)) props))

(defn match-rule? [bank-line rule]
  (some (fn [props] (match-props? bank-line props)) (:match-on rule)))
(untested, parentheses are probably wrong)
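
A quick REPL check of the two functions above, as a sketch. The bank lines are the two sample rows from the question written as string-keyed maps (trimmed to a few columns), and `strict-rule` is a made-up rule that only illustrates the "match every value" assumption.

```clojure
(defn match-props? [bank-line props]
  (= (select-keys bank-line (keys props)) props))

(defn match-rule? [bank-line rule]
  (some (fn [props] (match-props? bank-line props)) (:match-on rule)))

(def rates-line  {"Payee" "WGTN CITY COUNCIL" "Tran Type" "D/D" "Amount" "-139.47"})
(def shares-line {"Payee" "Sharesies NZ" "Tran Type" "A/P" "Amount" "-10.00"})

(def rates-rule {:category "Rates" :match-on [{"Payee" "WGTN CITY COUNCIL"}]})

;; Hypothetical rule: both entries of the map must match the line.
(def strict-rule {:category "Rates (hypothetical)"
                  :match-on [{"Payee" "WGTN CITY COUNCIL" "Tran Type" "A/P"}]})

(match-rule? rates-line rates-rule)   ;; => true
(match-rule? shares-line rates-rule)  ;; => nil (some returns nil when nothing matches)
(match-rule? rates-line strict-rule)  ;; => nil, because "Tran Type" differs

;; Same shape as the original categorise-bank-line, using match-rule?.
(defn categorise-bank-line [rules bank-line]
  (filter #(match-rule? bank-line %) rules))
```

Because `some` returns `nil` rather than `false` on no match, the result works as a predicate for `filter` without any extra unwrapping.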

rolt 09:09:18

try to divide the problem and name things if you're stuck like that, it's easier to debug simple functions in the repl

rolt 09:09:51

note that if your maps always have a single entry, it's easier to represent them with a pair
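
One way the pair suggestion could look, as a sketch only: each `:match-on` entry becomes a `[key value]` vector instead of a one-entry map. The rules are the two from the question; the helper names `match-pair?` and `categories-for` are hypothetical.

```clojure
;; Each :match-on entry is a [key value] pair instead of a one-entry map.
(def rules
  [{:category "Rates"
    :match-on [["Payee" "WGTN CITY COUNCIL"]]
    :exclude false
    :type :expense}
   {:category "Investing"
    :match-on [["Payee" "Sharesies NZ"]]
    :exclude false
    :type :expense}])

;; Destructure the pair and compare it with the value on the bank line.
(defn match-pair? [bank-line [k v]]
  (= (get bank-line k) v))

;; A rule matches when at least one of its pairs matches.
(defn match-rule? [bank-line rule]
  (some #(match-pair? bank-line %) (:match-on rule)))

(defn categories-for [rules bank-line]
  (map :category (filter #(match-rule? bank-line %) rules)))

(categories-for rules {"Payee" "Sharesies NZ" "Amount" "-10.00"})
;; => ("Investing")
```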

Magnus 11:09:52

So just for fun, I was thinking that this is a nice pattern-matching exercise. The matcher is a macro though, so no dynamically created rules. But it's more flexible with the rules than any simple lookup can be... So if we accept the cost of rewriting the rules, you can get something like this:

(ns hayden-match
  (:require [clojure.data.csv :as csv]
            [clojure.java.io :as io]
            [clojure.string :as str]
            [clojure.core.match :refer [match]]))


(defn csv-data->maps [csv-data]
  (map zipmap
       (->> (first csv-data) ;; First row is the header
            (map keyword) ;; Drop if you want string keys instead
            repeat)
       (rest csv-data)))

(def data
  (with-open [reader (io/reader "src/haydendata.csv")]
    (-> (doall (csv/read-csv reader))
        csv-data->maps)))

(defn rate-category [m]
  (assoc m :category :Rates :type :expense))
(defn invest-category [m]
  (assoc m :category :Investing :type :expense))
(defn test-category [m]
  (assoc m :category :Testing :type :goblin))

(defn rule-matcher [entry]
  (match [entry]
    [{:Payee "WGTN CITY COUNCIL"}] (rate-category entry)
    [{:Payee "Sharesies NZ"}] (invest-category entry)
    [{:Payee "WGTN CITY COUNCIL"}] (invest-category entry)
    [{:Payee "Top bidder"}] (test-category entry)
    [{:Payee "Microinc"}] (test-category entry)
    :else (assoc entry ::failure "failed to parse..")))


(defn get-month 
  "seriously bad date getter. will bite! not house trained!"
  [date]
  (-> (str/split date #"/")
      second
      Integer/parseInt))

(defn totals [grpd-data]
  (reduce (fn [m [[month category] transactions]] ;; needs additional group by year
            (conj m
                  {:month month
                   :category category
                   :total-expenses (->> (filter #(= :expense (:type %)) transactions)
                                        (map :Amount)
                                        (map #(Float/parseFloat %))
                                        (reduce +))
                   :total-income (->> (filter #(= :income (:type %)) transactions)
                                      (map :Amount)
                                      (map #(Float/parseFloat %))
                                      (reduce +))
                   :transactions transactions}))
          []
          grpd-data))


(defn parse-transactions [data]
  (-> (group-by (juxt #(get-month (:Date %))
                      #(:category %)) (map rule-matcher data)) ;; also group by here!
      totals))

(comment 
  (parse-transactions data)
  ;; => [{:month 1,
  ;;      :category :Investing,
  ;;      :total-expenses -10.0,
  ;;      :total-income 0,
  ;;      :transactions
  ;;      [{:category :Investing,
  ;;        :type :expense,
  ;;        :Payee "Sharesies NZ",
  ;;        :Memo "A/P AM623847",
  ;;        :Cheque Number "",
  ;;        :Date "2022/01/18",
  ;;        :Unique Id "2022011101",
  ;;        :Amount "-10.00",
  ;;        :Tran Type "A/P"}]}
  ;;     {:month 5,
  ;;      :category :Rates,
  ;;      :total-expenses -139.47,
  ;;      :total-income 0,
  ;;      :transactions
  ;;      [{:category :Rates,
  ;;        :type :expense,
  ;;        :Payee "WGTN CITY COUNCIL",
  ;;        :Memo "D/D WCC RATES 107655 19 MADEUPLANE",
  ;;        :Cheque Number "",
  ;;        :Date "2022/05/20",
  ;;        :Unique Id "2022012301",
  ;;        :Amount "-139.47",
  ;;        :Tran Type "D/D"}]}]

  
  )
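
The code comments above note that the grouping still needs the year. A minimal sketch of one way to do that, assuming dates always look like "yyyy/MM/dd": build the group key from year, month, and category. The helper names `year-and-month` and `group-key` are hypothetical, and the third sample entry is invented purely to show two years being kept apart.

```clojure
(require '[clojure.string :as str])

(defn year-and-month
  "Splits a \"yyyy/MM/dd\" string into a [year month] vector of integers."
  [date]
  (let [[year month] (str/split date #"/")]
    [(Integer/parseInt year) (Integer/parseInt month)]))

(defn group-key
  "Builds a [year month category] key for an already categorised entry."
  [entry]
  (conj (year-and-month (:Date entry)) (:category entry)))

(def categorised
  [{:Date "2022/01/18" :category :Investing :Amount "-10.00"}
   {:Date "2022/05/20" :category :Rates :Amount "-139.47"}
   ;; Invented entry: same month and category, different year.
   {:Date "2023/05/20" :category :Rates :Amount "-150.00"}])

(keys (group-by group-key categorised))
;; => ([2022 1 :Investing] [2022 5 :Rates] [2023 5 :Rates])
```

With a key of this shape, the destructuring in `totals` would need to change from `[[month category] transactions]` to `[[year month category] transactions]`.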

Hayden M 18:09:02

Thank you both for the insights! Appreciate it. I'll rewrite what I've done both ways and see what I can learn from it

Steph Crown 12:09:01

Hi everyone. I use VSCode to write my code. What VSCode extension would you recommend to prettify my Clojure scripts on save?

pez 12:09:57

Hello! Calva can do that for you. And then some. https://calva.io/ We help each other in #calva

sheluchin 15:09:10

str/split-lines doesn't return trailing newlines. What can I use if I want trailing newlines? "a\n\nb\n\n" => ["a" "" "" "b" "" ""]
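
One possible approach, sketched here rather than taken from the thread: `clojure.string/split` accepts an optional limit, and a negative limit keeps trailing empty strings. The wrapper name `split-lines-keep-trailing` is hypothetical, and the result has one element per segment between newlines, which may not be exactly the shape written in the question.

```clojure
(require '[clojure.string :as str])

(defn split-lines-keep-trailing
  "Like str/split-lines, but a limit of -1 keeps trailing empty strings."
  [s]
  (str/split s #"\r?\n" -1))

(str/split-lines "a\n\nb\n\n")            ;; => ["a" "" "b"]
(split-lines-keep-trailing "a\n\nb\n\n")  ;; => ["a" "" "b" "" ""]
```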