#datahike
2022-05-16
Bhougland 16:05:08

Hello everyone,

Bhougland 16:05:00

I am thinking about using Dolt (https://docs.dolthub.com/introduction/what-is-dolt) to keep track of master data file changes over the lifetime of a project. I like that it uses Git commands and allows comments for each change. However, I would love to use Clojure and a Datalog-style database instead. Would Datahike be a good fit for this use case? If not, is there anything in the Clojure ecosystem that is similar? I don't need all the functionality of Dolt, just the ability to recognize updates to tables, attach a comment giving the reason for each change, and possibly roll back if needed.

respatialized 19:05:38

As someone who dreams of taking a sabbatical to build a system like this, I feel compelled to issue a disclaimer. It's certainly possible to do, and there's a lot about the Clojure Datalog DBs that makes them better candidates than other databases. But even if Dolt does more than you need, you have to ask whether maintaining a Clojure/Datalog-based data versioning system yourself is worth the effort compared with simply solving the problem you need Dolt to solve. I found myself in the same situation with DVC, a comparable tool. As much as I might have wanted to replace it with something cool like Datahike, at the end of the day I couldn't justify the maintenance burden it would impose on me.

Bhougland 21:05:23

Trust me, I completely understand where you are coming from, but I am trying to break away from the Python ecosystem (funnily enough, I am also looking at DVC). I just thought this would be a good project for learning many parts of Clojure, as I am new to the community. My workplace does not expect its consultants to create something like this; I would just like to build it in my spare time. I guess I am just wondering whether this is the right tool for the job if I decide to attempt it.

respatialized 22:05:30

Here's a project that you might be interested in taking a look at (a different part of the data/ML lifecycle): https://github.com/jcpsantiago/bulgogi

kkuehne 10:05:54

@UT2EHQN7Q you can use Datahike's history feature to track changes in your data model over time. But we don't have things like branch and merge.
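
To make the idea of history-based change tracking concrete, here is a minimal Python sketch of an append-only change log with a comment on every change, point-in-time views, and rollback. It does not use Datahike or reproduce its API; `Change`, `VersionedTable`, and all method names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Change:
    """One immutable entry in the log (hypothetical structure)."""
    tx: int          # monotonically increasing transaction number
    key: str         # identifier of the row being changed
    value: Any       # new value; None marks a retraction (deletion)
    comment: str     # reason for the change


class VersionedTable:
    """Append-only table: changes are added, never edited or removed."""

    def __init__(self) -> None:
        self._log: List[Change] = []
        self._tx = 0

    def put(self, key: str, value: Any, comment: str) -> int:
        """Record a new value (or None to retract) and return its tx number."""
        self._tx += 1
        self._log.append(Change(self._tx, key, value, comment))
        return self._tx

    def as_of(self, tx: Optional[int] = None) -> Dict[str, Any]:
        """Replay the log up to and including tx (or to the end if tx is None)."""
        state: Dict[str, Any] = {}
        for change in self._log:
            if tx is not None and change.tx > tx:
                break
            if change.value is None:
                state.pop(change.key, None)
            else:
                state[change.key] = change.value
        return state

    def history(self, key: str) -> List[Tuple[int, Any, str]]:
        """Every recorded change for one key, oldest first."""
        return [(c.tx, c.value, c.comment) for c in self._log if c.key == key]

    def rollback(self, tx: int, comment: str) -> None:
        """Restore the state as of tx by appending compensating changes.

        Earlier entries stay in the log, so the rollback itself is auditable.
        """
        target = self.as_of(tx)
        current = self.as_of()
        for key in sorted(set(current) | set(target)):
            if current.get(key) != target.get(key):
                self.put(key, target.get(key), comment)
```

With this sketch, `table.put("sku-1", 12, "price correction")` records an update together with its reason, `table.as_of(tx)` shows the table as it stood at an earlier transaction, and `table.rollback(tx, "reason")` restores that state without erasing the intervening history. Branching and merging are deliberately left out.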

Bhougland 12:05:24

Does Datahike have Redundancy Elimination?

whilo 06:05:48

The data structures we use are persistent and can therefore share structure, but whether you actually share structure in practice depends on how you write to them. It is definitely possible to implement branching, and we have prototyped merging two Datahike databases here: https://github.com/replikativ/datahike/blob/336-native-image-cli/doc/cli.md#merging but we got sidetracked by implementing faster write operations to our indices first.

grounded_sage 00:05:15

What do we have to do to get the native-image support on main? @U1C36HC6N @UB95JRKM3

Bhougland 16:05:49

By the way, this implementation is on a system I don't control, and revision tracking isn't something that is native to that software.

🙌 1