#data-science
2023-08-14
adham11:08:50

Hey all, I'm using tablecloth to do some manipulation for an API endpoint. I have the following column selected:

:_unnamed [8 1]:

|       :value |
|--------------|
| 111078990000 |
|  78649032000 |
|  69773400000 |
|  67950225000 |
|  77160392640 |
|  93741110400 |
|  81171112500 |
| 117252500000 |

How can I convert this into a vector? I might have missed this in the docs; I searched for "convert", "change back", and so on. The output I want is [111078990000 78649032000 ... 117252500000]. I found this solution from reading TML: (tech.v3.datatype/->array (revenue-values :value)). Is this the correct approach?

genmeblog13:08:47

Each column is a sequence, so call (vec (ds :value)) where ds is your dataset var.

genmeblog13:08:29

or just (ds :value) if you need a sequence.
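
A minimal sketch of the suggestion above, assuming tablecloth is on the classpath. The dataset `ds` and the names `value-column` and `values` are hypothetical; the three numbers are the first rows from the question.

```clojure
(require '[tablecloth.api :as tc])

;; Hypothetical dataset with a single :value column
;; (first three values from the question).
(def ds (tc/dataset {:value [111078990000 78649032000 69773400000]}))

;; A dataset can be called with a column name; the column it returns
;; behaves like a sequence.
(def value-column (ds :value))

;; vec copies the column into an ordinary persistent vector.
(def values (vec value-column))
```

Use `(ds :value)` directly when a sequence is enough, and wrap it in `vec` only when a plain Clojure vector is required (for example, before serializing an API response).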

adham14:08:44

Thanks, that works wonderfully

adham13:08:12

I'll add those functions to my tools to inspect and understand TML and Tablecloth. I also need to read the docs more carefully. Thank you again for your time.

Sam Ritchie18:08:01

is anyone here familiar with the kixi.stats implementation?

Sam Ritchie18:08:10

I’m looking for details on the Gamma distribution sampling function…

henrygarner18:08:39

Hey Sam, how is it manifesting?

Sam Ritchie18:08:12

@henrygarner woohoo, glad I caught you!!

Sam Ritchie18:08:26

I caught it from the linter showing me that the cond-> call would ALWAYS take the inc branch;

Sam Ritchie18:08:57

that led me to finding the paper describing the implementation, so I could confirm that we should in fact only increment when alpha (or k, depending on how you like your names) is < 1
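
A rough sketch of the conditional increment described here, not the kixi.stats source. The function name `adjusted-shape` is hypothetical. It also shows, as a general `cond->` illustration, how a test that is always truthy makes the `inc` branch unconditional; whether the original code looked like this is an assumption.

```clojure
;; Hypothetical helper for illustration; not taken from kixi.stats.
(defn adjusted-shape
  "Returns alpha + 1 when alpha < 1, otherwise alpha unchanged."
  [alpha]
  (cond-> alpha
    (< alpha 1) inc))

(adjusted-shape 0.5) ;; => 1.5
(adjusted-shape 2.0) ;; => 2.0

;; cond-> does not thread the value into its test expressions, so a test
;; that is a truthy constant applies the step every time.
(cond-> 2.0
  :always inc) ;; => 3.0
```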

Sam Ritchie18:08:17

anyway, with this change things should match the paper, and there is a nice speedup too

Sam Ritchie18:08:52

added benchmark results

Sam Ritchie19:08:55

@henrygarner another Q… I just put up another PR to run tests via GitHub Actions. Are you by any chance open to:
• a PR to convert the cljs tests to shadow-cljs?
• a PR to convert the build from leiningen to deps.edn?
The advantage of the latter is that folks could depend on kixi.stats as a git dependency between releases.
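
For context, a git dependency in a downstream project's deps.edn might look roughly like the fragment below. This is a sketch that assumes the repository contains a deps.edn at the chosen commit. The library symbol `kixi/stats` is an assumption, and the SHA is a placeholder to replace with a real commit.

```clojure
;; deps.edn of a hypothetical downstream project
{:deps
 {kixi/stats {:git/url "https://github.com/MastodonC/kixi.stats"
              ;; Placeholder: replace with the full SHA of the commit you want.
              :git/sha "<full-commit-sha>"}}}
```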

henrygarner19:08:14

@U017QJZ9M7W brilliant, thanks! No objection from me at all

Sam Ritchie19:08:54

@henrygarner working on a PR now to get clj-kondo linting passing

Sam Ritchie19:08:40

@henrygarner innnnteresting, lots of failures on the CI box that aren’t present for me locally…

henrygarner19:08:20

Ah yes, several tests in the distribution namespace are probabilistic and will sometimes fail in the presence of outliers. Not ideal for CI.

Sam Ritchie19:08:48

it looks like something about the CI machine pushes us outside the prescribed deltas

Sam Ritchie20:08:33

let’s see if I can relax it a touch

Sam Ritchie20:08:36

or if that is a bandaid

Sam Ritchie20:08:18

okay, works with slightly relaxed bounds

Sam Ritchie20:08:15

this one is good to go, I noted some spots where it felt like there miiiight be a bug from some unused variable etc

Sam Ritchie21:08:18

@henrygarner this PR converts the build to deps.edn and shadow-cljs: <https://github.com/MastodonC/kixi.stats/pull/47>. I based it off of the kondo PR, so if you merge that one first I’ll rebase this off of the squashed merge commit and we can go from there

Sam Ritchie16:08:41

hey @henrygarner, just checking in here to see if this looks good