#datomic
2018-08-02
dominicm07:08:36

I'm not using ions, but I'm just curious (because I'm also using CodeDeploy). What role does AWS Step Functions take in the process?

henrik07:08:26

I have no idea of the actual answer, but would like to invite you to #ions-aws. We set it up because there’s a lot to figure out with AWS/Ions, and we’re kind of saturating the #datomic channel. 🙂

henrik10:08:53

When creating an entity where some parts may or may not be nil, what’s good practice? Do I transact regardless and store nil, or should I weed out the nils beforehand?

Alex Miller (Clojure team)10:08:46

I believe it’s invalid to transact a nil value for an attribute

Alex Miller (Clojure team)10:08:15

Ideally in Clojure it’s best to just avoid having nil values in an entity in the first place

henrik10:08:50

Yep, it blew up ^_^ I’m considering writing a protocol that just rips out anything that’s nil and wrap all transactions in it. Can you think of any reason why this would be a bad idea?

henrik11:08:11

Actually, blows up on UUIDs.

Alex Miller (Clojure team)13:08:11

I promise you that you’ll be happier 6 months from now if you spend the time to avoid making nil attribute values in the first place

Alex Miller (Clojure team)13:08:44

then blowing up on nils is a feature, not a problem

henrik13:08:03

I’m not sure I get that choice. I’m converting wildly varying XML blobs from scholarly publishers and sticking them in Datomic. Sometimes stuff is missing. Sometimes VITAL stuff is missing. I only get the choice of checking attribute by attribute, or the entire thing at the same time.

Alex Miller (Clojure team)13:08:27

if stuff is missing, just don’t make an attribute?

Alex Miller (Clojure team)13:08:33

you may need to rework some ingest code, but I promise you from experience this is a path that will result in less code and less pain in the long run

henrik13:08:54

Yeah, maybe my pipeline is wonky. So I do:
1. XML -> EDN (generic)
2. EDN -> EDN (for Datomic)
3. Transact.
So step two can look like this, for example:

(defn prepare-data [{:keys [external-ids issn website title description]}]
  (let [online-issn (:online issn)
        print-issn (:print issn)]
    [{:journal/id (java.util.UUID/randomUUID)
      :journal/external-ids (mapv prepare-external-id external-ids)
      :journal/issn-print {:identity/issn print-issn}
      :journal/issn-online {:identity/issn online-issn}
      :journal/website {:internet/URL website}
      :journal/title title
      :journal/description description}]))

henrik13:08:33

Essentially, take a bunch of stuff and stick them in a template.

henrik13:08:19

So, is it preferable to wrap each attribute in a conditional?

henrik13:08:20

Or a list of conditional assocs?

Alex Miller (Clojure team)14:08:02

cond-> tends to be very helpful in stuff like this

Alex Miller (Clojure team)14:08:27

(cond-> {:journal/id (java.util.UUID/randomUUID)}    ;; etc, the part that's always there
  ;; check each optional thing and assoc if needed
  print-issn (assoc :journal/issn-print {:identity/issn print-issn})
  online-issn (assoc :journal/issn-online {:identity/issn online-issn}))

Alex Miller (Clojure team)14:08:04

which you can read as:
- start with an init map
- if print-issn exists, assoc it into the map (otherwise pass along)
- if online-issn exists, assoc it into the map (otherwise pass along)

Alex Miller (Clojure team)14:08:25

the starting object threads into the first arg of assoc

Alex Miller (Clojure team)14:08:12

once you’ve seen this form a couple times, it becomes very easy to read
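For reference, here is a rough sketch of how the whole prepare-data template from earlier might look with cond->. It is only an illustration: prepare-external-id is a hypothetical stand-in (the original helper was never shown), and the :identity/external-id attribute it uses is made up.

```clojure
;; Hypothetical stand-in for the helper that was not shown in the chat.
(defn prepare-external-id [external-id]
  {:identity/external-id external-id})

(defn prepare-data
  [{:keys [external-ids issn website title description]}]
  (let [online-issn (:online issn)
        print-issn  (:print issn)]
    ;; Start with the part that is always present, then assoc each
    ;; optional attribute only when its value is truthy.
    [(cond-> {:journal/id (java.util.UUID/randomUUID)}
       (seq external-ids) (assoc :journal/external-ids
                                 (mapv prepare-external-id external-ids))
       print-issn         (assoc :journal/issn-print {:identity/issn print-issn})
       online-issn        (assoc :journal/issn-online {:identity/issn online-issn})
       website            (assoc :journal/website {:internet/URL website})
       title              (assoc :journal/title title)
       description        (assoc :journal/description description))]))
```

One thing to keep in mind: cond-> tests for truthiness, so a false value would be skipped as well. If boolean attributes are involved, a test like (some? x) is probably what you want instead.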

Joe Lane14:08:20

I've also been using this before transacting data: (into {} (remove #(nil? (second %)) some-map))

Joe Lane14:08:55

could probably be converted to work with nested data
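A minimal sketch of what a nested version might look like, using clojure.walk/postwalk. The name remove-nils is just an illustration, and this is not something anyone in the channel posted.

```clojure
(require '[clojure.walk :as walk])

(defn remove-nils
  "Walks data bottom-up and drops every map entry whose value is nil.
  Only map entries are touched; nils inside vectors or lists are kept."
  [data]
  (walk/postwalk
    (fn [x]
      (if (map? x)
        (into {} (remove (fn [[_ v]] (nil? v))) x)
        x))
    data))

(remove-nils {:a nil :b {:c nil :d 1} :e [{:f nil}]})
;; => {:b {:d 1}, :e [{}]}
```

Note that this leaves empty maps behind where every value was nil, so a second cleanup pass may still be needed before transacting.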

Alex Miller (Clojure team)14:08:29

as I said above, I promise you the better long-term strategy across your app is to avoid ever creating or passing around nil attributes in the first place

Alex Miller (Clojure team)14:08:06

it requires a little more up-front care but Clojure and Datomic both come from an aesthetic where that is preferred

Joe Lane14:08:16

Agreed about just not making nils. I made the mistake initially and have regretted it in record time. Luckily I may still have time to fix it.

henrik14:08:12

I would love to, but to do that I would have to make my case with those third parties 🙂

henrik14:08:54

Why is cond-> preferable to just wiping out the nils in one go? Is it more obvious what’s going on?

Alex Miller (Clojure team)14:08:17

inevitably you need to handle nested nils too - then you’re doing recursive walks and modification of your data

Alex Miller (Clojure team)14:08:46

instead of making something, then removing parts of it, it is better to just make what you want in the first place

Alex Miller (Clojure team)14:08:03

if you’re taking data from external sources, then that’s not an option of course

henrik14:08:48

Thank you for the advice @U064X3EF3 🙂

Mark Addleman14:08:09

i have a datalog query against datomic cloud that takes in a large number of entity ids as a parameter. the :in clause has a binding like [?entid ...] and each iteration produces a result that is independent of the other iterations (there’s no join across the entity ids in the in clause - i’m pretty sure that’s not possible anyway). so, from a performance standpoint, which is better: executing as a single query, or breaking the entity ids up into batches and executing the queries in parallel (this risks overloading datomic, getting back :cognitect.anomalies/busy and having to retry)?

Mark Addleman14:08:44

i’ve been trying different query strategies but datomic’s awesome caching makes getting consistent timings across runs pretty difficult 🙂

Mark Addleman14:08:13

if the datomic query planner doesn’t already do this, it would be awesome if it could detect this situation and execute the query in parallel automatically 🙂

favila16:08:47

Don't quote me on this but I am pretty sure clauses of a query are evaluated in parallel already

favila16:08:37

unless you can get parts of the query running on different machines I don't think there's any advantage to splitting

favila16:08:35

make sure the parameter you destructure as [?entid ...] is a true vector

favila16:08:51

not a set or seq or something that doesn't allow direct index access

Mark Addleman16:08:22

> evaluated in parallel
that is in line with my early exploration. increasing the parallelism on the client side does not seem to improve overall query performance.

Mark Addleman16:08:54

> make sure the parameter you destructure as [?entid ...] is a true vector
interesting. why does direct index access matter? i would have thought simply being a seq would be sufficient

favila16:08:32

under the hood is the clojure.core.reducers/CollFold protocol, which has an efficient parallel implementation for vectors but not for seqs

favila16:08:04

the parallelism in query is most likely implemented with reducers
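To illustrate the vector vs seq point in plain Clojure (this says nothing about how Datomic actually implements queries, which is only a guess above): clojure.core.reducers/fold can split a vector and combine the pieces in parallel, while for a seq it falls back to an ordinary sequential reduce. The result is the same either way; only the execution strategy differs. The ids collection here is made-up sample data.

```clojure
(require '[clojure.core.reducers :as r])

;; Made-up sample data; in practice this would be your entity ids.
(def ids-seq (map inc (range 100000)))   ; lazy seq: no direct index access
(def ids-vec (vec ids-seq))              ; vec gives a true vector

;; Foldable in parallel: vectors have a parallel CollFold implementation.
(def sum-from-vec (r/fold + (r/map inc ids-vec)))

;; Same answer, but fold degrades to a sequential reduce for a seq.
(def sum-from-seq (r/fold + (r/map inc ids-seq)))

(= sum-from-vec sum-from-seq)
```

So if the collection you pass for [?entid ...] comes out of map, filter or a set, wrapping it in vec before handing it to the query is a cheap precaution.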

👍 4
currentoor17:08:57

has anyone here set up datomic with AWS RDS postgres? i’ve been struggling for two days on this issue 😅 AWS doesn’t give you a postgres user, so I modified the setup SQL scripts to do this

CREATE DATABASE datomic
 WITH OWNER = currentoor
      TEMPLATE template0
      ENCODING = 'UTF8'
      -- TABLESPACE = pg_default
      LC_COLLATE = 'en_US.UTF-8'
      LC_CTYPE = 'en_US.UTF-8'
      CONNECTION LIMIT = -1;

CREATE TABLE datomic_kvs
(
 id text NOT NULL,
 rev integer,
 map text,
 val bytea,
 CONSTRAINT pk_id PRIMARY KEY (id )
)
WITH (
 OIDS=FALSE
);
ALTER TABLE datomic_kvs
 OWNER TO currentoor;
GRANT ALL ON TABLE datomic_kvs TO currentoor;
GRANT ALL ON TABLE datomic_kvs TO public;

CREATE ROLE datomic LOGIN PASSWORD 'datomic';
Just replaced the postgres owner with my user and commented out the tablespace line (creating datomic defaults to pg_default anyway). All this appears to work, but trying to spin up a transactor locally connected to this postgres instance results in
Launching with Java options -server -Xms1g -Xmx1g -XX:+UseG1GC -XX:MaxGCPauseMillis=50
Starting datomic:sql://<DB-NAME>?jdbc:, you may need to change the user and password parameters to work with your jdbc driver ...
System started datomic:sql://<DB-NAME>?jdbc:, you may need to change the user and password parameters to work with your jdbc driver
Critical failure, cannot continue: Lifecycle thread failed
java.util.concurrent.ExecutionException: org.postgresql.util.PSQLException: ERROR: relation "datomic_kvs" does not exist

currentoor18:08:52

in the psql console i see this

mydatabase=> \l
                                     List of databases
    Name    |   Owner    | Encoding |   Collate   |    Ctype    |     Access privileges     
------------+------------+----------+-------------+-------------+---------------------------
 datomic    | currentoor | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 mydatabase | currentoor | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 rdsadmin   | rdsadmin   | UTF8     | en_US.UTF-8 | en_US.UTF-8 | rdsadmin=CTc/rdsadmin
 template0  | rdsadmin   | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/rdsadmin              +
            |            |          |             |             | rdsadmin=CTc/rdsadmin
 template1  | currentoor | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/currentoor            +
            |            |          |             |             | currentoor=CTc/currentoor
(5 rows)

mydatabase=> \d
             List of relations
 Schema |    Name     | Type  |   Owner    
--------+-------------+-------+------------
 public | datomic_kvs | table | currentoor
(1 row)

currentoor18:08:04

so datomic_kvs definitely exists

currentoor18:08:52

oh looks like i might have figured out my mistake

currentoor18:08:43

i was creating the relation datomic_kvs in mydatabase, i needed to make it in the datomic database

ghadi19:08:52

We're having trouble starting another Datomic Cloud cluster -- we have an older one, but the new one in US-East-2 is running into CF resource failures

ghadi19:08:17

Is there a better place to ask questions? I have CF failure screenshots.

Joe Lane20:08:04

@ghadi Silly question, do you have more than 5? I ran into an issue where I wasn’t able to have more than 5 at a time (never resolved it, just spun down some clusters)

eoliphant21:08:42

are cross db queries working in cloud?

kenny22:08:59

How do you get the basis-t in Datomic cloud?

Joe R. Smith22:08:47

@kenny look up :t in your db val

kenny22:08:19

@solussd Thank you.

👍 4