#datomic
2017-06-01
deg09:06:48

I'm having trouble writing a seemingly simple query... I have a bunch of tuples:

[<id1> :category :x]
[<id1> :value 100]
[<id2> :category :y]
[<id2> :value 200]
[<id3> :category :x]
[<id3> :value 300]

I want to count how many ids I have of each category. So I want to get back something like {:x 2 :y 1}. How do I write this query?

robert-stuttaford09:06:50

[:find ?category (count ?categorised) :where [?categorised :category ?category]]
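
For the sample tuples above, a full invocation might look like the following (a sketch using the Datomic peer API; db stands for a database value obtained from a connection):

(require '[datomic.api :as d])

;; group by ?category and count the entities in each group
(d/q '[:find ?category (count ?categorised)
       :where [?categorised :category ?category]]
     db)
;; => #{[:x 2] [:y 1]}; (into {} ...) turns that into {:x 2 :y 1}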

deg09:06:07

Perfect, thanks.

deg09:06:41

Can I do this on multiple attributes simultaneously? That is, a query that would return {:category {:x 2 :y 1} :color {:blue 3 :red 5} :shape {:square 2 :ellipse 6}}? Or is that better done in multiple queries?

misha10:06:36

@deg:

;; assuming DataScript here, given the ds/q alias and the @conn deref
(require '[datascript.core :as ds])

(ds/q
  '[:find ?a ?v (count ?e)
    :in $ [?a ...]
    :where [?e ?a ?v]]
  @conn [:exchange/to-unit :exchange/from-unit])
=>
([:exchange/from-unit 11 36]
 [:exchange/to-unit 6 3]
 [:exchange/to-unit 11 40]
 [:exchange/from-unit 35 1]
 [:exchange/to-unit 4 1]
 [:exchange/from-unit 101 1]
 [:exchange/from-unit 9 4]
 [:exchange/from-unit 105 1]
 [:exchange/from-unit 4 9]
 [:exchange/to-unit 35 12]
 [:exchange/from-unit 31 2]
 [:exchange/from-unit 57 1]
 [:exchange/from-unit 103 1]
 [:exchange/to-unit 72 1]
 [:exchange/to-unit 24 8]
 [:exchange/to-unit 57 6]
 [:exchange/from-unit 24 1])

misha10:06:04

in this example, ?v is an entity id (instead of a value like :x)

misha10:06:34

next, just reduce over the results
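
For example, a minimal fold of those [attr value count] tuples into the nested-map shape asked about above (results stands for the sequence returned by the query):

(reduce (fn [acc [a v n]]
          (assoc-in acc [a v] n))
        {}
        results)
;; => {:exchange/from-unit {11 36, 35 1, ...}
;;     :exchange/to-unit   {6 3, 11 40, ...}}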

robert-stuttaford09:06:22

why not give it a try? if you don’t get it right, multiple queries is fine. i find i prefer multiple smaller, focused queries

deg09:06:12

Yup. I just gave it a few tries, but got back various cross products. I think I'll stick to the multiple smaller queries. Thanks.

robert-stuttaford09:06:19

yep, that’s one of the reasons why i prefer what i do - i can reason about them, heh

pedroteixeira15:06:43

hello, for an "on premise" setup, I was considering postgresql as storage for datomic.. but then just read this post https://blog.clubhouse.io/using-datomic-heres-how-to-move-from-postgresql-to-dynamodb-local-79ad32dae97f about moving from postgres. Are there other opinions/docs about recommendations for an on premise / small (few users, one host) datomic setup? latest/stable postgresql versions would probably be ok, right? or is some other storage more "native/optimized" to datomic's implementation? it would be useful to know current statistics on "on premise" choices, if those could be shared somehow.. perhaps by a poll?
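
(For reference, pointing a peer at PostgreSQL-backed Datomic uses the documented datomic:sql:// URI form; the database name, host, and credentials below are placeholders:)

(require '[datomic.api :as d])

;; placeholder db name, host, and credentials
(d/connect "datomic:sql://my-db?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=datomic")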

devth19:06:16

is it sufficient to back up the underlying SQL storage for production, or should the datomic backup-db from-db-uri to-backup-uri feature be used?

favila19:06:52

@devth Backing up the storage backs up everything (i.e. all datomic databases in the table and all segments, including possible garbage segments) and cannot be moved to a different kind of storage. backup-db backs up one datomic db at a time, does not back up unused nodes (if not an incremental backup), and can be restored to arbitrary storages

favila19:06:41

so they have different abilities and tradeoffs. Whether storage-only backup is sufficient for you depends on what you need
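
(For reference, backup-db is a bin/datomic CLI command; the URIs below are placeholders, and a backup taken this way can later be restored into a different storage with restore-db:)

# back up one database to a local directory; the backup URI can also be s3://bucket/prefix
bin/datomic backup-db \
  "datomic:sql://my-db?jdbc:postgresql://localhost:5432/datomic?user=datomic&password=datomic" \
  "file:/var/backups/my-db"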

devth19:06:28

basic guarantees about not losing data are really all i need at this point. we use Cloud SQL, which has an Automatic Backups feature. ideally i'd enable this and not have to worry about doing backup-db.

favila19:06:52

Ah, we use cloud sql too. we also run backup-db periodically

devth19:06:07

just in case? or do you have specific use cases?

favila19:06:29

isolating dbs or moving to different storages

favila19:06:00

we have a problem with CloudSQL where we can't seem to reclaim mysql space

favila19:06:40

my dim memories of mysql admin were that innodb never gave up filespace, even if it was optimized and compacted

devth19:06:57

ah. have you guys considered Cloud SQL Postgres (currently beta)?

favila19:06:16

sometimes when we do a big import, or delete a bunch of large datomic dbs, we will drop the table and restore-db everything

favila19:06:35

no, we need a BAA (HIPAA Business Associate Agreement)

favila19:06:49

only mysql is covered

devth19:06:31

ah, ok. cool, good to know. i think i'll start with Automated Backups, then later maybe set up backup-db if and when I need it

devth19:06:39

do you use Minio for an S3-compatible interface in GCP?

favila19:06:09

I didn't think it was possible to intercept the hostname construction done by backup/restore

devth19:06:30

oh, i'm not sure as I haven't tried it yet

favila19:06:37

we would really like to just back up to google cloud storage directly. We've asked for that feature for years

favila19:06:45

they have an s3-compatible interface, too

devth19:06:45

would make sense 🙂

devth19:06:50

GCP cloud storage does?

devth19:06:04

nice. i did not know that.

favila19:06:04

but you need control of the hostname

favila19:06:28

i.e. s3:// urls only take buckets

favila19:06:41

I guess with some /etc/hosts trickery it might work

favila19:06:15

anyway, we used to use gcsfuse to mount the gcs bucket, and then read/write from that

favila19:06:31

but we had some zero-length segments once that caused silent db corruption, so we stopped

favila19:06:50

now we mount a drive onto an instance

favila19:06:54

like animals

devth19:06:59

I don't see a feature request for Google Cloud Storage backups on http://receptive.io

devth19:06:02

adding it

devth19:06:59

Backup to Google Cloud Storage

devth19:06:28

and put most of my available priority on it

devth19:06:55

there's also an existing feature request: Allow use of S3 "compatible" storage alternatives

favila19:06:17

ah, I misread the one I thought was for cloud storage

favila19:06:30

it's for cloud datastore

favila19:06:50

which would be nice but is not essential. cloudsql works fine

jdkealy21:06:39

if i have multiple instances listening to the same transaction queue, will the message be duplicated across instances, or does reading it take it off the queue?
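
(For context, a minimal sketch of consuming the queue returned by d/tx-report-queue, assuming an existing peer connection conn; each peer process receives its own local queue of transaction reports, so taking a report on one peer does not remove it from another peer's queue:)

(require '[datomic.api :as d])
(import 'java.util.concurrent.BlockingQueue)

(def ^BlockingQueue tx-queue (d/tx-report-queue conn))

(future
  (loop []
    ;; .take blocks until the next transaction report arrives
    (let [report (.take tx-queue)]
      ;; :tx-data holds the datoms asserted/retracted by that transaction
      (println "tx with" (count (:tx-data report)) "datoms")
      (recur))))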