etaoin

adham 2023-11-08T11:44:54.269519Z

Hey, I'm using etaoin to do some scraping. Is there any way to get the text of the tooltip shown for an active element? For example, in the attached media I want "Tuesday, November 7, 2023 at 7:19 PM". I tried OCR, but the screenshot quality from etaoin is too low and I haven't found a way to zoom the view. Any help with either approach?

lread 2023-11-08T12:24:52.972389Z

Hi @adham.rasoul do you have an example public web page? What web browser are you using?

adham 2023-11-08T12:31:45.655499Z

Yes, this https://www.facebook.com/Cognitect/posts/pfbid0oWQuewGELmYMLiVELC6e93iyfZZcwHYT3xVSeLwRX7j24WNPi9VfGkJ4HjgpbtMUl shows it. You can trigger it by hovering over "July 17, 2018" with the mouse, or by repeatedly invoking

(e/fill-active driver
               (k/chord k/tab))
until you get there

lread 2023-11-08T12:45:57.526159Z

Oh, I don't have a Facebook account and have no plans to ever have one. Would you happen to have another example?

adham 2023-11-08T12:46:35.166889Z

This shouldn't require a Facebook account. I can open it with Etaoin and clear the login prompt. Let me supply a minimal example.

lread 2023-11-08T12:48:06.398879Z

Oh ya, ok, I see

lread 2023-11-08T12:50:27.115599Z

I'm still sipping my first coffee of the day, will take a peek sometime soon.

adham 2023-11-08T12:51:13.517159Z

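(require '[etaoin.api :as e]
         '[etaoin.keys :as k])

(def driver (e/firefox))
(e/go driver "")
;; Remove the login prompt
(e/click driver
         [{:css ".xc9qbxq > i:nth-child(1)"}])
;; Tab to the date. dotimes runs eagerly; repeatedly returns a lazy seq
;; that only sends the keys if something realizes it.
(dotimes [_ 8]
  (e/fill-active driver
                 (k/chord k/tab)))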

adham 2023-11-08T12:51:46.824169Z

@lee I'm sipping my second coffee so all good, thanks!

adham 2023-11-08T13:27:46.224209Z

Slack has something similar: when I inspect the element, I find a data-ts attribute with a Unix timestamp as its value, but I am unable to find anything similar in the FB one. They also both show an event button in the Firefox Inspector.

lread 2023-11-08T13:51:52.101289Z

@adham.rasoul, it is a bit tricky because the DOM is manipulated on hover. I fiddled with the browser inspector and saw that the parent element dynamically gets an aria-describedby attribute on link hover. This, in turn, seems to point to an element that has the full date text. Here's what I tacked onto the end of your example above:

(let [parent-el (e/query driver :active "..")
      describe-elem-id (e/get-element-attr-el driver parent-el "aria-describedby")]
  (e/get-element-text driver {:id describe-elem-id}))
;; => "Tuesday, July 17, 2018 at 10:47 AM"
Does that work for you?

lread 2023-11-08T13:54:48.679519Z

To note: this seems to be returning the full date in the local time zone.
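If the string needs to become a timestamp, a minimal sketch of parsing it with java.time could look like the following. The pattern is an assumption inferred from the two sample strings in this thread, the names tooltip-format, parse-tooltip and tooltip->instant are hypothetical, and the zone has to be supplied by the caller because the tooltip text itself carries none.

```clojure
(import '(java.time LocalDateTime ZoneId)
        '(java.time.format DateTimeFormatter)
        '(java.util Locale))

;; Assumed format, e.g. "Tuesday, July 17, 2018 at 10:47 AM".
;; Locale/ENGLISH pins the day and month names regardless of the JVM default locale.
(def ^DateTimeFormatter tooltip-format
  (DateTimeFormatter/ofPattern "EEEE, MMMM d, yyyy 'at' h:mm a" Locale/ENGLISH))

(defn parse-tooltip
  "Parses the tooltip text into a LocalDateTime (no zone attached)."
  ^LocalDateTime [^String s]
  (LocalDateTime/parse s tooltip-format))

(defn tooltip->instant
  "Interprets the tooltip text in zone-id (supplied by the caller) and returns an Instant."
  [^String s ^String zone-id]
  (-> (parse-tooltip s)
      (.atZone (ZoneId/of zone-id))
      (.toInstant)))

(str (parse-tooltip "Tuesday, July 17, 2018 at 10:47 AM"))
;; => "2018-07-17T10:47"
```

Passing the zone of the machine that ran the browser to tooltip->instant would then give a zone-independent value, assuming the tooltip really is rendered in that local zone.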

adham 2023-11-08T14:03:01.323379Z

Yes, this works for me! I didn't think to monitor the DOM as I hover over it. I'll try to learn to do this and add it to my toolbelt. Thank you for your time and help. No worries about the local time zone, that can be fixed after parsing. The big thing to learn here is how to work with a dynamic DOM. I have one question: what does the syntax ".." mean here? Is it a sort of catch-all for classes?

lread 2023-11-08T14:04:54.180789Z

It means parent. So (e/query driver :active "..") asks for the parent element of the current active element.

adham 2023-11-08T14:08:46.792149Z

I see. I did some googling and this looks like it comes from XPath syntax (https://stackoverflow.com/questions/28237694/xpath-get-parent-node-from-child-node), am I correct?

lread 2023-11-08T14:09:45.603049Z

Yep, the .. is an XPath thing.
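To see the ".." step outside a browser, here is a small sketch using the JDK's built-in XPath engine on a made-up XML fragment. The markup and the helper names parse-xml and select-one are hypothetical and only stand in for the real page; the point is that ".." evaluated relative to a node selects that node's parent.

```clojure
(import '(java.io ByteArrayInputStream)
        '(javax.xml.parsers DocumentBuilder DocumentBuilderFactory)
        '(javax.xml.xpath XPath XPathConstants XPathFactory)
        '(org.w3c.dom Document Element))

(defn parse-xml
  "Parses an XML string into a DOM Document."
  ^Document [^String s]
  (let [^DocumentBuilder builder (.newDocumentBuilder (DocumentBuilderFactory/newInstance))]
    (.parse builder (ByteArrayInputStream. (.getBytes s "UTF-8")))))

(defn select-one
  "Evaluates the XPath expr relative to context and returns the first matching node."
  ^Element [^String expr context]
  (let [^XPath xpath (.newXPath (XPathFactory/newInstance))]
    (.evaluate xpath expr context XPathConstants/NODE)))

;; Made-up fragment: a link nested inside a parent span.
(def doc (parse-xml "<span id=\"parent\"><a id=\"link\">July 17, 2018</a></span>"))

(let [link   (select-one "//a" doc)   ; find the link anywhere in the document
      parent (select-one ".." link)]  ; ".." steps up to its parent element
  (.getAttribute parent "id"))
;; => "parent"
```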

lread 2023-11-08T14:11:43.297249Z

Relevant etaoin docs are https://cljdoc.org/d/etaoin/etaoin/1.0.40/doc/user-guide#_simple_queries_xpath_css, which point to these docs on XPath syntax: https://www.w3schools.com/xml/xpath_syntax.asp.

adham 2023-11-08T14:13:13.784999Z

Understood. I need to sit down with these details soon since I'll be working more with Etaoin. My approach at the moment is to Ctrl-F through the (amazingly well written) guide. Thank you again, this clears up all my questions.

lread 2023-11-08T14:13:54.903039Z

Glad to be of help! Drop by anytime if you get stuck again.