This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-02-17
Hello! After more than 3 years of work, I am proud to announce the release of Skyscraper 0.3.0, a scraping framework that helps you build structured dumps of whole websites.
Home: https://github.com/nathell/skyscraper/
Major improvements in 0.3.0:
• Skyscraper has been rewritten from scratch to be asynchronous and multithreaded, based on core.async.
• Skyscraper now supports saving the scrape results to a SQLite database.
• In addition to the classic `scrape` function that returns a lazy sequence of nodes, there is an alternative, non-lazy, imperative interface (`scrape!`) that treats producing new results as side effects.
• reaver (using JSoup) is now available as an optional underlying HTML parsing engine, as an alternative to Enlive.
See NEWS.md for a complete list.
I’m particularly happy about the database capabilities of this release – for a glimpse of what they can do, see https://cljdoc.org/d/skyscraper/skyscraper/0.3.0/doc/database-integration.
Happy scraping! 🏛️