#datomic
2015-12-10
currentoor01:12:49

I've got a Cassandra cluster in a data center set up for Datomic (using the provided CQL scripts) and a locally running transactor connected to it.

Launching with Java options -server -Xms1g -Xmx1g -XX:+UseG1GC -XX:MaxGCPauseMillis=50
Starting datomic:cass://<IP-ADDRESS>:9042/datomic.datomic/<DB-NAME>?user=iccassandra&password=369cbbab59f6715bfde80cce13cde7cc&ssl= ...
System started datomic:cass://<IP-ADDRESS>:9042/datomic.datomic/<DB-NAME>?user=iccassandra&password=369cbbab59f6715bfde80cce13cde7cc&ssl=
But when I try to launch the web console with bin/console -p 8080 staging datomic:cass://<IP-ADDRESS>:9042/?user=<username>&password=<password>&ssl=false I get this error in the browser
Cannot support TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers trying to connect to datomic:cass://<ip address>:9042/?user=<username>&password=<password>&ssl=false, make sure transactor is running

currentoor01:12:18

Has anyone seen this error before?

currentoor02:12:08

If I try to connect programmatically, it says Caused by: java.lang.IllegalArgumentException: Cannot support TLS_RSA_WITH_AES_256_CBC_SHA with currently installed providers.

currentoor02:12:17

in a stack trace.

paxan02:12:35

@domkm: if you're trying to throw informative exceptions from a transaction function, just use clojure.core/ex-info

domkm02:12:31

@paxan: ex-info doesn't differentiate between state and argument exceptions but those Datomic exception classes do.

paxan02:12:54

Fair point @domkm. I've settled on just using ex-info in our txn functions based on the recommendation from one of the datomic people.
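
(For reference, a minimal sketch of what that looks like, assuming a hypothetical :account/debit transaction function and :account/balance attribute; just to illustrate throwing ex-info from a txn function, not an authoritative implementation:)

(require '[datomic.api :as d])

(def debit-fn
  (d/function
    '{:lang   "clojure"
      :params [db account-id amount]
      :code   (let [balance (or (:account/balance (datomic.api/entity db account-id)) 0)]
                (when (> amount balance)
                  ;; ex-info carries a data map the peer can inspect via ex-data
                  (throw (ex-info "Insufficient funds"
                                  {:account account-id :balance balance :amount amount})))
                [[:db/add account-id :account/balance (- balance amount)]])}))

;; install it, then invoke it in a transaction:
;; @(d/transact conn [{:db/id (d/tempid :db.part/user) :db/ident :account/debit :db/fn debit-fn}])
;; @(d/transact conn [[:account/debit account-eid 50M]])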

roelof08:12:41

Can Datomic be a good solution for an e-commerce app made in Clojure? Or an accounting app made in Clojure?

robert-stuttaford08:12:49

hell yes, and hell yes

robert-stuttaford08:12:11

both require a full audit trail to be sound. datomic excels at that

roelof08:12:17

okay, then I have to search for a good tutorial or book to learn Datomic and how to write the queries

robert-stuttaford08:12:34

see the 3 pinned items in here

robert-stuttaford08:12:39

there’s no book yet

robert-stuttaford08:12:51

but there are great videos from the folks who make Datomic

roelof08:12:42

sorry for a newbie question. How can I find the pinned items?

robert-stuttaford08:12:03

open the People list and then click pinned items just above all the people in the list

roelof08:12:43

thanks, learned another thing about slack

robert-stuttaford08:12:44

👍 also a bunch of recipes in the clojure cookbook here: https://github.com/clojure-cookbook/clojure-cookbook/tree/master/06_databases, 6-10 through 6-15

ping08:12:46

Q: we are evaluating Datomic for a social network app, and we are storing large bodies of text.

ping08:12:02

I have read that Datomic is not suitable for storing a lot of text.

ping08:12:30

is this still true as of the latest release?

robert-stuttaford09:12:43

it is. you can use a blob store (DynamoDB or similar) to store the actual text bodies, and store the keys in Datomic, and so benefit from all Datomic’s capabilities until the point where you need to retrieve this text
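
(A minimal sketch of that pattern, with a hypothetical :post/body-key attribute; the key format for the blob store is made up:)

(require '[datomic.api :as d])

(def schema
  [{:db/id                 (d/tempid :db.part/db)
    :db/ident              :post/body-key
    :db/valueType          :db.type/string
    :db/cardinality        :db.cardinality/one
    :db/doc                "Key of the post body in the external blob store"
    :db.install/_attribute :db.part/db}])

;; @(d/transact conn schema)
;; @(d/transact conn [{:db/id (d/tempid :db.part/user)
;;                     :post/body-key "posts/2015/12/10/hello-world.txt"}])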

robert-stuttaford09:12:36

actually, clarification is needed. Datomic’s performance is pressured when it is given large strings, but it’s totally capable of dealing with many small strings

robert-stuttaford09:12:52

which case describes your problem?

joseph09:12:48

@robert-stuttaford: Hi Robert, these days I am working on importing 20 million records from MySQL into Datomic. I've applied the advice about pipelining and batching, but I keep getting this error: java.util.concurrent.ExecutionException: clojure.lang.ExceptionInfo: :db.error/transactor-unavailable Transactor not available {:db/error :db.error/transactor-unavailable}

robert-stuttaford09:12:22

@joseph, for import jobs, you need to tweak your transactor memory settings. what storage are you using?

joseph09:12:37

I did think it was a memory problem, because the log shows an OutOfMemory error

robert-stuttaford09:12:37

your import is crushing the transactor 🙂

joseph09:12:19

memory-index-threshold=32m
memory-index-max=4g
object-cache-max=2g

joseph09:12:33

I gave 8 GB to the transactor

robert-stuttaford09:12:49

what storage are you using? dynamo or something else?

robert-stuttaford09:12:41

the other issue is that threshold. it’s going to index to storage whenever it reaches that threshold. so you might want to increase that quite a bit

joseph09:12:41

no, just the dev storage, we are still experimenting

robert-stuttaford09:12:13

ok. try increasing that threshold to 128 or even 256mb
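
(So, sketching joseph's settings with that change; the exact value is a guess you'd tune against your own import:)

memory-index-threshold=256m
memory-index-max=4g
object-cache-max=2g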

robert-stuttaford09:12:37

you should also give it time between batches to catch up with itself

joseph09:12:22

ok, and we have around 120 variables, and each variable has around 100,000 datoms. Should I do the request-index after importing each variable's data?

robert-stuttaford09:12:45

that’s a wise idea

robert-stuttaford09:12:05

import one variable’s datoms, request index, wait for it to go back to sleep

robert-stuttaford09:12:12

are you batching the transactions for those 120,000?

joseph09:12:40

no, batching around 200-400 datoms

robert-stuttaford09:12:41

so 2000 datoms at a time, say

robert-stuttaford09:12:11

ok. if the values are small, then you can go higher than that
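
(A rough sketch of that kind of batched import, assuming tx-data is a seq of transaction data for one variable; each future is dereferenced so the peer doesn't run too far ahead of the transactor:)

(require '[datomic.api :as d])

(defn import-variable!
  "Transacts tx-data in batches of roughly batch-size datoms."
  [conn tx-data batch-size]
  (doseq [batch (partition-all batch-size tx-data)]
    @(d/transact-async conn (vec batch))))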

joseph09:12:04

when you were saying "wait for it to go back to sleep", do you mean I should wait for the return of request-index?

robert-stuttaford09:12:16

you can also tag the transaction itself if you want to keep track of which source values you’re transacting - e.g. [:db/add (d/tempid :db.part/tx) :source-range “variable-A__4001-6000”]

robert-stuttaford09:12:34

wait for its CPU usage to die down

robert-stuttaford09:12:44

i’ve never used request-index, reading docs

robert-stuttaford09:12:28

yeah request-index returns immediately. you can wait for the deref of http://docs.datomic.com/clojure/#datomic.api/sync-index to return

robert-stuttaford09:12:19

(do (d/request-index conn) @(d/sync-index conn (d/basis-t (d/db conn))))

robert-stuttaford09:12:36

something like that
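
(Putting that together with the batching above, roughly; variables and tx-data-for-variable are hypothetical names for the import's own data:)

(doseq [v variables]
  (import-variable! conn (tx-data-for-variable v) 2000) ; batched transact-async, as above
  (d/request-index conn)                                ; ask the transactor to start an index job now
  @(d/sync-index conn (d/basis-t (d/db conn))))         ; block until indexing has caught up to that basis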

robert-stuttaford09:12:56

let me know how it goes, i am curious to learn from your use-case 🙂

joseph09:12:26

no problem 😀

joseph09:12:56

testing now, but I am a little unclear about the reason to increase the index threshold instead of decreasing it

robert-stuttaford09:12:33

well, if you’re manually controlling when you index, then increasing the threshold is only needed to prevent it from indexing before you’re ready

robert-stuttaford09:12:58

it might start indexing before you’re done transacting all the datoms for a variable

joseph09:12:11

hmm...the transactor crashed after the first variable's sync-index...

joseph09:12:42

ok, I tried again and it's going well now, but it's obviously slower than before...

robert-stuttaford09:12:58

i prefer slow and correct to fast and incorrect 🙂

robert-stuttaford09:12:17

getting to fast and correct is a matter of tuning, which might not be worth the time investment if you achieve your goal before you get there

robert-stuttaford09:12:34

i say that as someone who’s been there XD

joseph09:12:25

yes, of course correctness is most important. I used to batch around 150 datoms; that also works, at almost the same speed as now...

robert-stuttaford09:12:30

i recommend you reach out to @michaeldrogalis in #C051WKSP3, i think they might be working on some sort of SQL->Datomic ETL tool. your case might just be a great test case for them if that happens to be true

joseph09:12:20

ah....nice, thanks a lot

ping09:12:32

@robert-stuttaford: I’m refactoring a legacy publishing system, and in the midst of experimenting with Datomic for it, hence the large-text requirement.

robert-stuttaford09:12:50

so it is large text blobs?

ping09:12:18

I’m not familiar with text blobs tbh, we were using old Mongo

ping09:12:33

and never really worried too much about text size.

robert-stuttaford09:12:00

how big is your biggest string?

robert-stuttaford09:12:28

10s of kb? 100s of kb? 1s of mb?

ping09:12:06

average around 20kb

robert-stuttaford09:12:56

oh you can stick that in Datomic no problem

ping09:12:06

really? 😄 yay

ping09:12:26

but why the warning about the 1k string limit?

ping09:12:35

maybe those are from an old version

robert-stuttaford09:12:43

we’re using strings of that size and it’s totally fine

robert-stuttaford09:12:59

how many records are you talking?

ping10:12:09

at what size should I be concerned?

ping10:12:43

in the context of Mongo documents, not a lot, around 800k docs so far

robert-stuttaford10:12:41

ok. so, the pressure that large strings put on the system is that indexing takes longer, and fewer datoms are stored per index segment, which means more segments have to be retrieved from storage when satisfying a query

robert-stuttaford10:12:14

@bkamphaus and @luke can both comment with more detail than that

ping10:12:27

that makes sense.

robert-stuttaford10:12:30

so it’s not like it’s a boolean GOOD or BAD; it’s a slow degradation as your size and volume increases

robert-stuttaford10:12:38

personally i think you might consider spending a day writing a migration, putting a whole bunch of data in, and writing some queries, and see how it all feels.

ping10:12:51

yeah that’s what I am thinking too

ping10:12:14

I have never heard of ppl trying to build something like http://medium.com backed by Datomic

robert-stuttaford10:12:40

my biased opinion is that the benefits Datomic will bring you will far outweigh any perf costs you might pay. and, there are ways to deal with it if you do find that the perf pressure is too great (put strings in KV store, store keys in Datomic)

ping10:12:45

then again, I have not really come across datomic being positioned as some kind of all-purpose backend storage

robert-stuttaford10:12:00

we use it for everything 🙂

ping10:12:47

interesting. My plan B is to keep storing large text in Mongo docs, with Datomic referring to them by key/id

ping10:12:19

but of course, that would mean two backend storages to maintain and possibly n+1 queries

ping10:12:26

to get the text data.

ping10:12:02

Curious, are you using DynamoDB alone or a mix of DynamoDB and other DBs?

robert-stuttaford10:12:14

we use DDB as a Datomic backend only

robert-stuttaford10:12:36

no other storages or direct-use dbs, aside from Redis as a post-query cache for some hot pages

robert-stuttaford10:12:50

we used memcached as a 2nd tier cache for Datomic as well

ping10:12:06

got it, thanks for your input. very helpful. 🙂

robert-stuttaford10:12:26

100%, happy to assist

joseph11:12:09

@robert-stuttaford: I am a little bit confused about adding more peers. I ran into one situation where I needed to read around 1 million datoms' values from Datomic, and the query failed every time; I figured that's because the result was too big and ran out of memory. So I am wondering if more peers would help with that?

robert-stuttaford11:12:13

the result set of a Datalog query has to be able to fit into memory

robert-stuttaford11:12:44

the datoms under consideration do not have to, but if they don’t, you’ll have cache churn as it cycles index segments in and GC cleans up

robert-stuttaford11:12:13

you can, however, lazily walk over the datoms yourself, building up some sort of a result

robert-stuttaford11:12:41

this talk has stuff about doing that: http://www.infoq.com/presentations/datomic-use-case

robert-stuttaford11:12:15

this way you can lazily walk your million datoms, performing functional transformations (filtering, mapping, reducing, etc), and either arrive at an end result, which still has to fit into ram, or do some sort of processing and commit results to some sort of I/O so that your results no longer need to fit into ram
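
(For example, roughly; :post/word-count is a hypothetical attribute, and the point is just that d/datoms lets you reduce over an index lazily instead of materialising a huge query result:)

(require '[datomic.api :as d])

(let [db (d/db conn)]                        ; the db value won't change underneath us
  (->> (d/datoms db :aevt :post/word-count)  ; lazily walk every datom of one attribute
       (map :v)
       (reduce + 0)))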

robert-stuttaford11:12:21

i hope all that makes some sense 🙂

joseph11:12:22

yes, that's also what I am thinking of. The limited RAM is one problem, but these days I've read some info about coordination among peers, and got confused about whether more peers can help...

robert-stuttaford12:12:56

the fact that you can hold on to a database value indefinitely solves the timing problem

robert-stuttaford12:12:14

doesn’t matter how long the query phase takes, you don’t have to worry about the database changing on you

robert-stuttaford12:12:45

this allows you to perform all the work on a single peer, in a lazy-sequence fashion, perhaps parallelising some of the work along the way

robert-stuttaford12:12:05

i point you at http://onyxplatform.org again, as it’s built for precisely this sort of work coordination

joseph12:12:29

thanks, that's very helpful

joseph12:12:16

btw, the experiment failed for the same reason, and the strange thing is there is neither an error nor a warning in the log

robert-stuttaford12:12:33

transactor unavailable?

joseph12:12:45

here is my logback.xml config file:

joseph12:12:50

<logger name="datomic.transaction" level="DEBUG"/>

<!-- uncomment to log transactions (peer side) -->
<logger name="datomic.peer" level="DEBUG"/>

<!-- uncomment to log the transactor log -->
<logger name="datomic.log" level="DEBUG"/>

<!-- uncomment to log peer connection to transactor -->
<logger name="datomic.connector" level="DEBUG"/>

<!-- uncomment to log storage gc -->
<logger name="datomic.garbage" level="DEBUG"/>

<!-- uncomment to log indexing jobs -->
<logger name="datomic.index" level="DEBUG"/>

robert-stuttaford12:12:57

check the transactor's logs - anything in there?

robert-stuttaford12:12:22

i would uncomment the last one indexing jobs

robert-stuttaford12:12:43

i have to go. good luck 🙂

joseph12:12:57

ok, thanks again