This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-07-03
New release of https://github.com/simongray/datalinguist (0.2.171), the Clojure wrapper for Stanford CoreNLP. This release...
• bumps the CoreNLP dependencies to 4.4.0 (the latest)
• adds support for Tregex (grammatical constituency tree pattern matching)
• adds support for TokensRegex (token-based pattern matching)
• removes the ML contribution by Carsten Behring at his own request
Great work, thanks for doing this!
Having had no previous experience with NLP libraries, I was wondering why I couldn’t get your examples to work. Then I realized that I had to download CoreNLP first from https://stanfordnlp.github.io/CoreNLP/ and add stanford-corenlp-4.4.0/* to the classpath. Everything worked fine after that.
Is this what you are supposed to do? It wasn’t mentioned in the readme, so I was wondering if I did something wrong here or if it is more obvious to people who have already worked with CoreNLP.
Hm… CoreNLP should be already added to the classpath if you’re using this as a library. However, you do need to download a language model to get most of the examples to work. This is mentioned in the README: https://github.com/simongray/datalinguist#language-models
This is strange; maybe something with my setup wasn’t right, then. I have a minimal deps.edn that looks like this:
{:paths ["stanford-corenlp-4.4.0/*"]
 :deps {edu.stanford.nlp/stanford-corenlp$models-english {:mvn/version "4.4.0"}
        dk.simongray/datalinguist {:mvn/version "0.2.171"}}}
So I also had the language model already in there, but I had to add the path to get it to work. In my source file, I required [dk.simongray.datalinguist :refer :all] and recreated the example in your readme.
Here you go: https://github.com/simongray/datalinguist-example This works for me using the Clojure CLI in IntelliJ/Cursive.
Thanks for the example! You used edu.stanford.nlp/stanford-corenlp$models instead of specifying the language model like I did before, and it seems like this was the problem. I wasn’t aware that I also need the models library without any language suffix, but now it works fine.
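For reference, a minimal deps.edn sketch of the setup this exchange arrives at. It assumes the unsuffixed models artifact is the one the readme example needs and that the English-specific artifact is kept alongside it. The linked example project may declare its dependencies differently. The local stanford-corenlp-4.4.0/* path entry is left out on the assumption that it is no longer needed once the models come from Maven.

```clojure
;; Sketch only: versions are the ones mentioned in this thread.
{:deps {dk.simongray/datalinguist {:mvn/version "0.2.171"}
        ;; Models artifact without a language suffix (the piece that was missing above).
        edu.stanford.nlp/stanford-corenlp$models {:mvn/version "4.4.0"}
        ;; English-specific models, kept from the original deps.edn (assumed optional).
        edu.stanford.nlp/stanford-corenlp$models-english {:mvn/version "4.4.0"}}}
```

The `$models` suffix is the Clojure CLI notation for a Maven classifier, so both entries resolve to classifier jars of the same stanford-corenlp artifact.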
It's also confusing to me 😁 Basically, most annotators need some data to work, and in the case of English this data is not in a single place. Part of the motivation of this library is to make CoreNLP more accessible, since using it from Java is even more confusing IMO and requires some significant boilerplate.
If I could, I would just include all of the official language models in datalinguist, but that constitutes a multi-gigabyte dependency.
Yeah, I wouldn’t even know where to start when using the library from Java. Maybe a little note in the “Language models” section of your readme would clarify that you also need the package without the suffix? Otherwise I think it is really easy to set up.
New release of https://github.com/ivarref/double-trouble (0.1.96), a library to handle re-tried Datomic transactions and similar situations. This release adds:
• A set-and-change function, :dt/sac, that cancels a transaction if a value does not change.
• A just-increment-it function, :dt/jii, that increments the value of an attribute.