#datascript
2016-12-09
mikeb04:12:28

With datascript, how do you handle larger data sets? For instance, there are ~5 million records in a particular table which I'd like to search, far too many to load into memory at once. How are others dealing with larger data sets and general application structure?

Niki04:12:38

How do you get such dataset into the browser in the first place?

mikeb04:12:37

That's the question, and of course I could never load it into the browser... How are others dealing with larger sets of data that won't fit in browser memory?

Niki04:12:26

By only loading what can fit there, I presume

Niki04:12:56

It's probably heavily dependent on the use case

Niki04:12:50

If you need to search a really large dataset, you have to do that on a server, I guess. In a real database

Niki04:12:11

That can stream data from disks
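A rough sketch of that split, with hypothetical names throughout: a real database does the search server-side, and the browser only ever transacts the matching slice into datascript:

```clojure
(require '[datascript.core :as d])

;; hypothetical client-side cache: hold only the rows the server
;; returned for the current search, never the full 5M-row table
(def conn (d/create-conn {:record/id {:db/unique :db.unique/identity}}))

(defn show-results! [rows]
  ;; `rows` is the (hypothetical) server response: a seq of entity maps
  ;; like {:record/id 17, :record/title "..."}, capped at a page size
  (d/transact! conn rows))
```

With :record/id declared :db.unique/identity, re-fetching a record that is already cached upserts it instead of duplicating it.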

mikeb04:12:49

I love the idea of using datascript, just trying to make it work across larger sets is the only problem.

mikeb04:12:56

@tonsky thanks for the help!

Niki04:12:30

You can't do that, at least not yet

Niki04:12:41

But Datomic might be a good fit for you

mikeb04:12:52

Here's a thought: run datascript on top of nodejs server-side, then stream results back to the browser. Just a thought. As you say, datomic is probably the best answer in this case.

Niki04:12:25

No, datascript can run on JVM just fine
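For reference, the same datascript code runs unchanged on the JVM; a minimal sketch:

```clojure
;; plain JVM Clojure: datascript.core exposes the same API as in ClojureScript
(require '[datascript.core :as d])

(def conn (d/create-conn {}))
(d/transact! conn [{:db/id -1 :user/name "mikeb"}])
(d/q '[:find ?n :where [_ :user/name ?n]] @conn)
;; => #{["mikeb"]}
```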

Niki04:12:49

The problem being it can't operate on data that is not loaded in memory

Niki04:12:29

Whole database must be loaded into memory first

Niki04:12:02

While Datomic e.g. can load database segments on demand

Niki04:12:19

You don't touch it == it won't load it

Niki04:12:13

So it can grow really large, but small queries will still run relatively quickly and in constrained memory
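Roughly what that looks like from a Datomic peer; the connection URI and attribute name below are made up for illustration:

```clojure
(require '[datomic.api :as d])

;; hypothetical transactor URI and database name
(def conn (d/connect "datomic:dev://localhost:4334/big-db"))
(def db (d/db conn))

;; the peer fetches only the index segments this query actually touches;
;; untouched parts of the database never leave storage
(d/q '[:find (count ?e)
       :where [?e :record/id]]
     db)
```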

misha15:12:19

@tonsky is it coincidence or an intent that {:db/cardinality :db.cardinality/many} values are returned as [], not as #{}?
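A small repro of the behavior in question, as observed at the time (and, per the reply below, not something to rely on):

```clojure
(require '[datascript.core :as d])

(def conn (d/create-conn {:aliases {:db/cardinality :db.cardinality/many}}))
(d/transact! conn [{:db/id -1 :aliases ["misha" "Michael"]}])

(d/pull @conn [:aliases] 1)
;; => {:aliases ["misha" "Michael"]}, a vector even though the attr is set-valued
```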

Niki16:12:11

I'm not sure

Niki16:12:07

Is that a problem?

Niki16:12:19

I would not rely on that fact btw

Niki16:12:36

It definitely might change

leov16:12:39

now that I've dived into dat* a bit more, I have two more questions

leov16:12:53

one of them is: why does a d/q query return relations as sets?

leov16:12:02

relations are multisets

leov16:12:08

they /can/ contain duplicates
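A quick illustration of that set behavior in datascript, plus the usual `:with` workaround; the :age attribute is made up, and the exact collection type of the `:with` result may vary:

```clojure
(require '[datascript.core :as d])

(def conn (d/create-conn {}))
(d/transact! conn [{:db/id -1 :age 30}
                   {:db/id -2 :age 30}])

(d/q '[:find ?age :where [_ :age ?age]] @conn)
;; => #{[30]}, the two source rows collapse because the result is a set

;; :with keeps the underlying bag, so duplicates survive, e.g. ([30] [30])
(d/q '[:find ?age :with ?e :where [?e :age ?age]] @conn)
```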

leov17:12:08

second question (I still need more input from redoing the same in datomic) is this: am I correct that in datomic db.fn/addEntity and db.fn/retractEntity are responsible for all the entity upsertion logic?
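For what it's worth, :db.fn/retractEntity is a built-in transaction function in both Datomic and datascript, and map-form upserts key off :db/unique :db.unique/identity; whether an addEntity counterpart exists is not confirmed here. A sketch in datascript:

```clojure
(require '[datascript.core :as d])

(def conn (d/create-conn {:user/email {:db/unique :db.unique/identity}}))

;; map-form transactions upsert on the unique attribute:
;; the second transact updates the same entity rather than creating a new one
(d/transact! conn [{:user/email "ann@example.com" :user/name "Ann"}])
(d/transact! conn [{:user/email "ann@example.com" :user/name "Anna"}])

;; retracting a whole entity is a built-in transaction function
(d/transact! conn [[:db.fn/retractEntity [:user/email "ann@example.com"]]])
```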

leov17:12:21

this is not a question, though.

leov17:12:52

..went reimplementing insertion logic with datomic to compare datascript and datomic..

leov17:12:05

I KNOW PULL-FU!

leov17:12:35

also I wonder if tx-report can contain things like [1 2 3 4 :db/retract] [1 2 3 4 :db/assert], or if it auto-collapses such things
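For reference, datascript's tx-report carries datoms tagged with a boolean `added` flag rather than :db/assert / :db/retract markers; a sketch (exact tx ids will differ):

```clojure
(require '[datascript.core :as d])

(def conn (d/create-conn {}))
(d/transact! conn [{:db/id -1 :name "a"}])

;; replacing a cardinality-one value yields a retraction plus an assertion;
;; each datom is [e a v tx added], with `added` a boolean
(:tx-data (d/transact! conn [[:db/add 1 :name "b"]]))
;; => [#datascript/Datom [1 :name "a" 536870914 false]
;;     #datascript/Datom [1 :name "b" 536870914 true]]
```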

leov17:12:57

also too bad datomic is closed-source :(((((((

misha21:12:06

Semantically, a :db.cardinality/many attribute's value is a set; vectors imply order. Every time I look at a pulled entity somewhere in the state, I have to double-check whether a looking-sorted :db.cardinality/many attr is actually sorted or just sorted by coincidence. It also feels weird to (into #{}) a :db.cardinality/many attr every time I need it as an actual set. Of course, in js we need all the performance we can get, so returning #{} instead of [] and sacrificing performance just for purity's sake would be silly.
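One way to keep [] for performance but still get set semantics where it matters: normalize the known cardinality-many attrs right after the pull; the attribute names here are hypothetical:

```clojure
;; post-process a pulled entity so known cardinality-many attrs become sets
(def many-attrs #{:user/aliases :user/tags})

(defn normalize-many [entity]
  (reduce (fn [e attr]
            (cond-> e (contains? e attr) (update attr set)))
          entity
          many-attrs))

(normalize-many {:user/name "misha" :user/aliases ["m" "misha"]})
;; => {:user/name "misha", :user/aliases #{"m" "misha"}}
```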