2024-11-25 data-science | Clojure Slack Archive

data-science 2024-11-25

Daniel Slutsky 2024-11-25T10:38:36.430749Z

https://scicloj.github.io/tableplot/ is getting close to Beta stage. While it still needs better documentation and a few missing features, we are close to deciding that the API is relatively stable. If anybody wishes to try it out and provide some feedback, that would be of great help. The main current question is: Does the API make sense, or should we change anything about it, such as function or parameter names, or certain behaviors?

👍 1

🎉 2

Daniel Slutsky 2024-12-06T14:57:04.084479Z

> Is there a reference page that lists the main functions and supported keywords?

@smith.adriane here is an initial attempt to do that: https://scicloj.github.io/tableplot/tableplot_book.plotly_reference

👍 2

phronmophobic 2024-11-25T14:42:38.629169Z

> Keep developing mainly the Plotly.js-based API.

Does that imply that the plotly API is preferred over the other API?

phronmophobic 2024-11-25T14:43:42.718659Z

Is there a reference page that lists the main functions and supported keywords?

Daniel Slutsky 2024-11-25T14:47:50.735819Z

> Does that imply that the plotly API is preferred over the other API?

Yes. Maybe I should state that. It seems easier to extend for our needs.

Daniel Slutsky 2024-11-25T14:48:09.121549Z

> Is there a reference page that lists the main functions and supported keywords?

No 😳 Soon!

Daniel Slutsky 2024-11-25T14:48:12.987829Z

Thanks for looking into it.

phronmophobic 2024-11-25T14:51:36.762749Z

If the plotly API is preferred, then I would rearrange the docs to focus on that API. I might even move most of the docs for the other API to its own section at the end.

👍 1

2024-11-25T17:21:50.457999Z

That is a big shift in the library for me (moving away from Vega-Lite/Hanami). I hadn't quite realised that from the outside.

Daniel Slutsky 2024-11-25T17:56:26.744129Z

Thanks. We are not moving away from Vega and classical Hanami, but might not extend them much in the near future. The additional functionality we are hoping for (coordinates, facets, etc.) will be easier to implement in Plotly. In any case, I'd be glad to discuss if you think we should and could extend the Vega-Lite/Hanami part further. Learning from your perspective will help a lot in prioritizing.

Daniel Slutsky 2024-11-25T19:00:52.040299Z

@otfrom does it seem to make sense? Both are supported and will be supported, but one of them seems easier to extend.

Gent Krasniqi 2024-11-25T12:57:19.230059Z

For someone that hasn't tried the API of either tech.ml.dataset or tablecloth, does someone know of an overview, or ideally a side-by-side comparison, of how they differ? (Basically something that 'sells' the added value of tablecloth)

genmeblog 2024-11-25T13:55:19.583739Z

tablecloth is a wrapper on top of tech.ml.dataset. The main goal was to tidy up the API (which was needed in the very early days of TMD) and add some missing features like pivoting and operations on grouped datasets. Generally, the main idea was taken from the R dplyr and tidyr packages.

genmeblog 2024-11-25T13:57:19.569749Z

tablecloth should work well for midsize datasets (say, millions of rows), while tech.ml.dataset lets you work with bigger data, supports many more sources, and has some highly optimized paths, for example grouping and aggregating in one step.

genmeblog 2024-11-25T13:59:52.152119Z

So, no simple answer to your question 🙂 You can use both libraries; the entities are the same.

Gent Krasniqi 2024-11-25T14:59:34.784219Z

Thanks. I guess what I wanted to get at was: for those midsize datasets, how much more convenient is the tablecloth API... if it's just "once in a while it will spare you one line of code or one additional transformation", to me it wouldn't be worth it for one additional place to look at documentation. (and possibly to trace bugs of the abstraction on top of TMD)

Gent Krasniqi 2024-11-25T15:00:20.972919Z

I guess I will start experimenting with TMD directly and reconsider if I find anything cumbersome.

Nick McAvoy 2024-11-25T23:30:47.469329Z

I'm just getting into this myself, but for what it's worth, I just picked up tablecloth for my problem, under the assumption it's easier/higher-level. I've found it to be nice to work with.

Nick McAvoy 2024-11-25T23:31:55.302439Z

So, if you get going with TMD, and find it also-nice, then I guess you can just keep going. If you find it ponderous though (and I really have no idea), you could take a look at tablecloth.
