#meander
2020-07-31
chucklehead04:07:27

hello all, I suspect I shouldn't be using scan here, but not sure what the right operator would be. The goal is to pull out multiple nested values/attrs from the sequence in the :content of the trackInformation element. When I uncomment the latitude element below, the search stops matching, but works if I e.g. comment out speed/altitude and only scan for latitude.

(-> ( "./corpus/example.xml")
    (clojure.data.xml/parse :skip-whitespace true)
    (m/search
     (m/$ {:tag ::fdm/fltdMessage
           :attrs {:msgType "trackInformation"
                   :flightRef ?flight-ref
                   :acid ?aircraft-id
                   :airline ?airline
                   :depArpt ?departure-airport
                   :arrArpt ?arrival-airport
                   :sourceTimeStamp ?source-ts}
           :content ({:tag ::fdm/trackInformation
                      :content (m/scan
                                (m/$ {:tag ::nxcm/speed
                                      :content (?speed)})
                                (m/$ {:tag ::nxce/simpleAltitude
                                      :content (?altitude)})
                                #_(m/$ {:tag ::nxce/latitudeDMS
                                        :attrs ?latitude-dms}))})})
     {:message/source-ts (read-instant-timestamp ?source-ts)
      :flight/ref (Integer/parseInt ?flight-ref)
      :flight/aircraft-id ?aircraft-id
      :track/speed (Integer/parseInt ?speed)
      :track/altitude ?altitude}))
A trimmed down sample of the actual data is https://gist.github.com/casselc/09b1ed46a86b500cabd2e14ada1a3719.

Jimmy Miller15:07:49

So I played with this a bit but couldn't get it working directly. The reason scan doesn't work here is that it expects the patterns you give it to match consecutive elements of the sequence.

(m/search [1 2 3 4 5 6]
  (m/scan 3 4 ?x)
  ?x)

;; =>

5

(m/search [1 2 3 4 5 6]
  (m/separated 3 4 ?x)
  ?x)

;; => 

5 6
But separated doesn't work either. Maybe something weird is happening with the nested $? I will try a bit later.

chucklehead12:08:10

thanks, I think I understand scan a little better now. Played around a bit more and ended up with:

chucklehead12:08:10

(defn xml->tracks
  [xml]
  (m/search
   xml
   {:tag ::tx/tfmDataService
    :content ({:tag ::tx/fltdOutput
               :content (m/scan
                         {:tag ::fdm/fltdMessage
                          :attrs {:msgType "trackInformation"
                                  :flightRef ?flight-ref
                                  :acid ?aircraft-id
                                  :airline ?airline
                                  :depArpt ?departure-airport
                                  :arrArpt ?arrival-airport
                                  :sourceTimeStamp ?message-ts}
                          :content ({:tag ::fdm/trackInformation
                                     :content (m/scan
                                               {:tag ::nxcm/qualifiedAircraftId}
                                               {:tag ::nxcm/speed
                                                :content (?speed)}
                                               {:tag ::nxcm/reportedAltitude
                                                :content ({:tag ::nxce/assignedAltitude
                                                           :content ((m/or
                                                                      {:tag ::nxce/simpleAltitude
                                                                       :content (?altitude)}
                                                                      {:tag ::nxce/blockedAltitude
                                                                       :attrs {:min ?altitude}}
                                                                      {:tag ::nxce/visualFlightRules
                                                                       :attrs {:altitude ?altitude}}
                                                                      {:tag ::nxce/altitudeFixAltitude
                                                                       :attrs {:preFixAltitude ?altitude}}))})}
                                               {:tag ::nxcm/position
                                                :content (m/scan
                                                          {:tag ::nxce/latitude
                                                           :content ({:tag ::nxce/latitudeDMS
                                                                      :attrs ?latitude-dms})}
                                                          {:tag ::nxce/longitude
                                                           :content ({:tag ::nxce/longitudeDMS
                                                                      :attrs ?longitude-dms})})}
                                               {:tag ::nxcm/timeAtPosition
                                                :content (?track-ts)}
                                               .
                                               (m/or
                                                {:tag ::nxcm/ncsmTrackData
                                                 :content (m/seqable
                                                           .
                                                           (m/or
                                                            {:tag ::nxcm/nextEvent
                                                             :attrs {:latitudeDecimal ?next-lat
                                                                     :longitudeDecimal ?next-long}}
                                                            (m/let [?next-lat nil ?next-long nil]))
                                                           ..1)}
                                                {:tag ::nxcm/ncsmRouteData
                                                 :content (m/scan
                                                           {:tag ::nxcm/nextPosition
                                                            :attrs {:latitudeDecimal ?next-lat
                                                                    :longitudeDecimal ?next-long}})}))})})})}
   (merge {:flight/ref (Long/parseLong ?flight-ref)
           :flight/aircraft-id ?aircraft-id
           :flight/airline ?airline
           :flight/arrival-airport ?arrival-airport
           :flight/departure-airport ?departure-airport
           :track/speed (Integer/parseInt ?speed)}
          (when (and ?next-lat ?next-long)
            {:track/next-latitude (Double/parseDouble ?next-lat)
             :track/next-longitude (Double/parseDouble ?next-long)}))))

chucklehead12:08:10

Not especially happy with the last bit, but haven't been able to figure out a better way that doesn't generate 'extra' entries. What I'm trying to express is that the element after timeAtPosition will be either a ncsmTrackData or ncsmRouteData tag, but if it's a ncsmTrackData it may or may not contain a nextEvent with a lat/long.

noprompt18:08:11

@chucklehead I’m gonna take a swing at this. 🙂

chucklehead18:08:23

If you haven't already, I'd really appreciate a review of where I've gotten since then, to see if it makes sense or if there's a better/more efficient/more performant way to go about it. I just updated the gist at https://gist.github.com/casselc/09b1ed46a86b500cabd2e14ada1a3719 with a copy/paste from my ns where I've been playing around with the data.

chucklehead18:08:41

thanks for taking a look

noprompt18:08:45

Do you have a deps.edn to go along with this?

noprompt18:08:30

I’m pulling together the deps myself.

chucklehead18:08:21

{:paths ["src" "resources"]
 :deps {org.clojure/clojure {:mvn/version "1.10.1"}
        org.clojure/data.xml {:mvn/version "0.2.0-alpha6"}
        org.clojure/core.async {:mvn/version "1.3.610"}
        environ {:mvn/version "1.2.0"}
        com.solacesystems/sol-jcsmp {:mvn/version "10.9.0"}
        meander/epsilon {:mvn/version "0.0.480"}
        datascript {:mvn/version "1.0.0"}
        clojure.java-time {:mvn/version "0.3.2"}
        }
 }

noprompt19:08:45

This is actually pretty incredible, I must admit.

noprompt19:08:07

I think this is the first time I’ve seen someone use defsyntax this way.

chucklehead19:08:49

that can't be good

noprompt19:08:08

No it’s great actually.

noprompt19:08:24

I’m just now realizing that I need to expose something for expanding a pattern conveniently.

chucklehead19:08:51

I initially built it up as one giant find, basically just trying to transliterate the patterns from the xml until it all sort of worked, and then wanted some way to break up the individual pieces

noprompt19:08:10

That makes perfect sense.

chucklehead19:08:15

eventually I'll want to do some other message types and certain elements are reused

noprompt19:08:38

Yah, and that’s a totally valid use case. I’m glad you’re messing with this.

chucklehead19:08:41

I actually have good xsd for all of this so there's probably some way to meander that into exactly what I want

chucklehead19:08:37

I'm fairly new to clojure altogether and have just been wrangling around with this data as a way to experiment/learn with something concrete

noprompt19:08:55

Nice. Actually, I think you’d be the second person to join the channel in as many weeks who is both new to Clojure and to Meander. I couldn’t ask for better perspectives. 🙂

chucklehead19:08:01

so far meander's "describe the shape you have and the shape you want" approach is my favorite clojure thing I've found

noprompt19:08:11

I’m really happy to hear that.

noprompt19:08:21

This example is pretty interesting. I’m going to toy around with it a little bit. I think it’s exposed some opportunities for improvement too.

chucklehead19:08:33

what I'd like to make is something where I could essentially write the matching/extracting patterns in a less verbose/hiccup-style syntax and expand that into the actual xml pattern prior to matching

chucklehead19:08:08

but wasn't quite sure how to get there from here and started working on something else

noprompt19:08:12

Do you have a sketch of what’s in your mind?

chucklehead20:08:44

I guess I'd like to be able to write a pattern something like this to match that data:

[::fdm/fltdMessage {:msgType "trackInformation"
                    :flightRef (m/app Long/parseLong ?flight-ref)}
 [::fdm/trackInformation ; implicit separated(?) of content
  :start-of-seq ;made-up notation to anchor start of seq
  [::nxcm/qualifiedAircraftId]
  [::nxcm/speed (m/app Integer/parseInt ?speed)]
  [::nxcm/position
   [::nxcm/latitude
    [::nxcm/latitudeDMS (m/app dms->decimal ?latitude)]]]
  (m/or
   [::nxcm/ncsmTrackData ?some-data]
   [::nxcm/ncsmRouteData ?some-data])
  :? ; made-up notation for 0 or 1 of the preceding element. Implicit nil for bound vars?
  :end-of-seq]]

chucklehead20:08:34

that conflates a few things I've been bumping into, but hopefully gets the idea across

chucklehead20:08:35

I struggled (I still struggle, but I used to, too) quite a bit with really grokking the semantics of the sequence operators/notation and optionality. With next lat/long for instance, trying to express that if it was present it would be in one of two mutually exclusive elements, but might not be there at all.

chucklehead20:08:03

Eventually I scrolled far enough in the cookbook to see the optional value stuff and felt less alone at least

noprompt22:08:15

(m/rewrite xml-edn
  {:tag ?tag, :attrs ?attrs, :content (m/seqable (m/cata !xs) ...)}
  [?tag ?attrs . !xs ...]

  ?x
  ?x)
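
For readers who have not used cata or memory variables yet, the rewrite above can be read as roughly the following plain-Clojure recursion. This is only a sketch for comparison: `xml->hiccup` is a hypothetical name, and it assumes element nodes are maps (or records) with `:tag`, `:attrs` and `:content` keys, as clojure.data.xml produces. Anything that is not an element, such as a text string, is returned unchanged, which mirrors the `?x ?x` fallback clause.

```clojure
;; Hypothetical plain-Clojure counterpart of the rewrite above (no meander needed).
(defn xml->hiccup
  [node]
  (if (and (map? node) (:tag node))
    ;; Element: [tag attrs child1 child2 ...], converting children recursively.
    (into [(:tag node) (:attrs node)]
          (map xml->hiccup (:content node)))
    ;; Text or anything else: leave as-is.
    node))

(xml->hiccup {:tag :a
              :attrs {:x "1"}
              :content [{:tag :b :attrs {} :content ["hi"]}]})
;; => [:a {:x "1"} [:b {} "hi"]]
```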

chucklehead19:08:34

Thanks, and sorry it took so long to get back; I hadn't had a chance to work on this project. Hopefully I'm not being obtuse about this. I was hoping to use the hiccup syntax as a sort of DSL to write the matching/capturing patterns. I was thinking I'd wrap the hiccup in some helper function/macro and use that wrapped form directly in the match pattern position for find/search/rewrite/etc., and the helper would transform my hiccup-syntax pattern into its map representation while preserving any logic variables, memory variables, or other operators in the pattern. I.e., I'd write something like:

(m/find example
          (from-hiccup-pattern
           [:message {:type "X"}
            [:detail {:field ?y}
             [:first (m/app Float/parseFloat ?content)]
             (m/$ [:second-nested {:q ?q}])]])
          {:y-val ?y
           :q-val ?q
           :parsed-content ?content})
and it would essentially expand to:
(m/find example
        {:tag :message
         :attrs {:type "X"}
         :content (m/seqable {:tag :detail
                              :attrs {:field ?y}
                              :content (m/seqable
                                        {:tag :first
                                         :content (m/seqable
                                                   (m/app Float/parseFloat ?content))}
                                        (m/$ {:tag :second-nested
                                              :attrs {:q ?q}}))})}
        {:y-val ?y
         :q-val ?q
         :parsed-content ?content})
I can see how I could use your rewrite to transform my input to a hiccup format that I could then match against (possibly as another step within the same rewrite?). In my head I'd envisioned going about it the other way, so that the hiccup-to-XML pattern expansion would happen at compile time rather than XML-to-hiccup at parse time. I haven't used cata or memory variables yet, and I'm not really all that familiar with macros/quoting, so I suspect I'm overlooking a trivial way to do what I want, or possibly the distinction I'm worried about isn't accurate or doesn't matter with the way meander works.
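
One way to picture the expansion described here is as an ordinary function over quoted forms. The sketch below is only an illustration under stated assumptions: `hiccup->pattern` is a hypothetical name, it treats every vector as a hiccup element (so it would also rewrite vectors that were meant as literal vector patterns), and it emits the bare symbol `m/seqable`, which only resolves if the calling namespace aliases meander.epsilon as `m`. Wiring it into the pattern position, presumably through a macro or an `m/defsyntax` definition, is not shown and has not been tested here.

```clojure
;; Hypothetical helper: turns a hiccup-style pattern form into the
;; {:tag :attrs :content} map pattern form, working on plain data.
(defn hiccup->pattern
  [form]
  (cond
    ;; [tag attrs? & children] becomes a map pattern.
    (vector? form)
    (let [[tag & more] form
          attrs?       (map? (first more))
          attrs        (when attrs? (first more))
          children     (if attrs? (rest more) more)]
      (cond-> {:tag tag}
        attrs          (assoc :attrs attrs)
        (seq children) (assoc :content
                              (list* 'm/seqable
                                     (map hiccup->pattern children)))))

    ;; Operator forms such as (m/$ ...) or (m/app ...): keep the operator,
    ;; expand any hiccup found in its arguments.
    (seq? form)
    (cons (first form) (map hiccup->pattern (rest form)))

    ;; Logic variables, literals, and everything else pass through untouched.
    :else form))

(hiccup->pattern
 '[:message {:type "X"}
   [:detail {:field ?y}
    [:first (m/app Float/parseFloat ?content)]
    (m/$ [:second-nested {:q ?q}])]])
```

Applied to the quoted hiccup pattern from the message above, this returns the same nested map form as the hand-written expansion, with the logic variables and the `m/app` and `m/$` operators preserved.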