#portkey
2018-05-03
viesti05:05:20

so we'd have something similar to -rest-json-call for the query protocol?

viesti05:05:39

if not the same one, then a parameterized version

baptiste-from-paris08:05:36

{"name" #{},
 "http" #{"method" "requestUri"},
 "input" #{"shape"},
 "output" #{"resultWrapper" "shape"},
 "errors" #{}}

baptiste-from-paris08:05:11

and here are the keys that I found for the query protocol

baptiste-from-paris08:05:22

errors contains some shapes representing errors

baptiste-from-paris08:05:57

so really simplified compared to rest-json

baptiste-from-paris08:05:52

so my goal is to translate each shape (`map`, `list`, atomic ones) into query-string or form-params
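
As a rough sketch of that idea (illustrative names only, not portkey's actual API): dispatch on the shape's declared "type" from the service description and emit the params for each kind =>

(defmulti shape->params (fn [shape _k _v] (get shape "type")))

;; atomic shapes map straight to a single param
(defmethod shape->params "string" [_ k v] {k v})

;; list shapes become numbered members: k.member.1, k.member.2, ...
(defmethod shape->params "list" [_ k vs]
  (into {} (map-indexed (fn [i v] [(str k ".member." (inc i)) v])) vs))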

viesti09:05:38

if I have time, I could take a look at grabbing docs from separate files, maybe on some evening

baptiste-from-paris09:05:48

could you also check that everything is fine for the LATEST thing?

viesti11:05:30

depends if our little one goes to sleep easily 🙂

baptiste-from-paris12:05:03

Any time in the next few weeks; it's open source, no deadline 🙂

cgrand12:05:33

>>> Takeaways
• The future is granular, interactive and massively parallel.
• Many applications can benefit from this "Laptop Extension" model.
• Better platforms need to be built to support "bursty" massively-parallel jobs.

viesti12:05:11

nodding on the last bullet

viesti12:05:35

it's too easy to just throw users at a database and then wonder why things don't perform well

cgrand12:05:24

Could you tell me more?

viesti15:05:46

one story: loading a single large file can take really long if it isn't parallelized. The lesson is to split the file by the number of slices (vCPUs) on the cluster. In one case, loading a 20 GB dump file (4.3 GB gzip-compressed) went from 1 h to 7 minutes.
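
As a concrete sketch of the splitting trick (assumes Redshift's COPY command and clojure.java.jdbc; bucket, role, and table names are invented): pre-split the dump into one gzipped part per slice under a common S3 prefix, and a single COPY loads all parts in parallel =>

(require '[clojure.java.jdbc :as jdbc])

;; e.g. 24 parts (12 nodes x 2 slices) named part-0001.gz ... part-0024.gz;
;; COPY matches the key prefix and loads roughly one file per slice at once
(defn load-dump! [db]
  (jdbc/execute! db
                 ["COPY staging_table
                   FROM 's3://my-bucket/dump/part-'
                   IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-copy'
                   GZIP DELIMITER '|'"]))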

viesti15:05:45

Tables can be distributed by a column (distkey), and queries with joins on the distkey column execute in parallel on each node. Joins that don't match the distkey can redistribute large amounts of data during the query, which leads to headaches.
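
And a sketch of the distkey point (same assumed jdbc setup; tables and columns invented): when both sides of a join share the distribution key, each node joins only its local rows =>

(defn distkey-demo [db]
  ;; rows with the same user_id land on the same slice for both tables,
  ;; so the join runs node-locally; joining on a non-distkey column would
  ;; redistribute rows across the network at query time
  (jdbc/execute! db ["CREATE TABLE users (user_id bigint, name varchar(64)) DISTKEY(user_id)"])
  (jdbc/execute! db ["CREATE TABLE events (user_id bigint, payload varchar(256)) DISTKEY(user_id)"])
  (jdbc/query db ["SELECT u.name, count(*) AS hits
                   FROM events e JOIN users u ON u.user_id = e.user_id
                   GROUP BY u.name"]))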

viesti15:05:58

Similar things to watch out for when using Spark, but I haven't used a real distributed cluster. Would like to see a Spark cluster used with powderkeg 🙂

baptiste-from-paris12:05:26

@cgrand is it a conference you attended?

cgrand12:05:58

nope, some slides I came across

baptiste-from-paris12:05:28

are you working on distributed computing?

baptiste-from-paris12:05:53

I really don't understand how you can split an encoding job across 5,000 threads

viesti13:05:08

on my current project, we have this 12-node Redshift cluster

viesti13:05:31

I have a feeling that I'm slowly morphing into a dbadmin

baptiste-from-paris14:05:07

I've heard very good feedback about ClojuTRE

baptiste-from-paris14:05:16

I have to go this year

viesti15:05:55

I think 2014 was my first ClojuTRE; it has been a really good event every time 🙂

baptiste-from-paris15:05:02

The attribute stuff is so uncool

baptiste-from-paris15:05:39

question: should we create different `(spec/conformer)`s based on the protocol to format the request, or use only specs globally and rework the input?
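
A minimal sketch of the first option, a per-protocol conformer (the spec name and data here are invented) =>

(require '[clojure.spec.alpha :as s])

;; serializes a list shape the way the query protocol expects;
;; rest-json would get a different conformer for the same shape
(s/def ::action-name
  (s/conformer
   (fn [names]
     (into {}
           (map-indexed (fn [i n] [(str "ActionName.member." (inc i)) n]))
           names))))

(s/conform ::action-name ["Publish" "GetTopicAttributes"])
;; => {"ActionName.member.1" "Publish", "ActionName.member.2" "GetTopicAttributes"}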

baptiste-from-paris15:05:57

[{:topic-arn "2w",
  :label "h",
  :awsaccount-id ["8L" "" "" "" "ff"],
  :action-name ["1" "m" "n" "ne" "" "GR" "" "56" "6"]}
 {"TopicArn" "2w",
  "Label" "h",
  "AWSAccountId" ["8L" "" "" "" "ff"],
  "ActionName" ["1" "m" "n" "ne" "" "GR" "" "56" "6"]}]

baptiste-from-paris15:05:51

this exercise on `:portkey.aws.sns.-2010-03-31/add-permission-input` returns vectors for the list "type"

baptiste-from-paris15:05:05

If I read the doc correctly, this translates to something like this =>

baptiste-from-paris15:05:25


        ?TopicArn=arn%3Aaws%3Asns%3Aus-east-1%3A123456789012%3AMy-Test
        &ActionName.member.1=Publish
        &ActionName.member.2=GetTopicAttributes
        &Label=NewPermission
        &AWSAccountId.member.1=987654321000
        &AWSAccountId.member.2=876543210000
        &Action=AddPermission
        &SignatureVersion=2
        &SignatureMethod=HmacSHA256
        &Timestamp=2010-03-31T12%3A00%3A00.000Z
        &AWSAccessKeyId=(AWS Access Key ID)
        &Signature=k%2FAU%2FKp13pjndwJ7rr1sZszy6MZMlOhRBCHx1ZaZFiw%3D

baptiste-from-paris15:05:45

with `ActionName.member.1=Publish&ActionName.member.2=GetTopicAttributes` as the serialized list values

baptiste-from-paris15:05:01

kind of the same pattern for maps
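
For reference, a map shape flattens to numbered key/value entries, so a hypothetical Attributes map {"color" "red", "size" "large"} (name and values made up) would serialize as =>

        ?Attributes.entry.1.key=color
        &Attributes.entry.1.value=red
        &Attributes.entry.2.key=size
        &Attributes.entry.2.value=large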

baptiste-from-paris15:05:42

so structure, map and list are the 3 compound types

baptiste-from-paris15:05:00

but we can exclude structure as it’s a global wrapper

viesti19:05:19

I might be off, but spec-tools has a way of changing a conformer: https://github.com/metosin/spec-tools/blob/master/README.md#spec-coercion

baptiste-from-paris19:05:11

I wrote something like this =>

baptiste-from-paris19:05:18

(defn conformed-input->query-protocol-input
  "Given a conformed input, transform the input to be compliant with
  the `query` protocol of AWS: lists become `k.member.N` params and
  maps become `k.entry.N.key`/`k.entry.N.value` params, N starting at 1."
  [input]
  (into {}
        (for [[k v] input]
          (cond
            (string? v) [k v]
            ;; list shape: k.member.1, k.member.2, ...
            (vector? v) (into {}
                              (map-indexed (fn [i v'] [(str k ".member." (inc i)) v']))
                              v)
            ;; map shape: k.entry.1.key, k.entry.1.value, ...
            (map? v) (into {}
                           (comp (map-indexed
                                  (fn [i [k' v']]
                                    [[(str k ".entry." (inc i) ".key") k']
                                     [(str k ".entry." (inc i) ".value") v']]))
                                 cat)
                           v)
            ;; other atoms (numbers, booleans, ...) serialize as strings
            :else [k (str v)]))))
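
For example, applied to the AddPermission target map from earlier (trimmed) =>

(conformed-input->query-protocol-input
 {"TopicArn" "2w"
  "Label" "h"
  "AWSAccountId" ["8L" "ff"]})
;; => {"TopicArn" "2w", "Label" "h",
;;     "AWSAccountId.member.1" "8L", "AWSAccountId.member.2" "ff"}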

baptiste-from-paris19:05:52

I am not sure it’s the best way but it works

cgrand19:05:28

My plan was/is to decouple spec and transformation. (Wip in my old wip branch)

baptiste-from-paris19:05:42

you mean not using spec conformers?

baptiste-from-paris19:05:18

what we know is that description files are the same for the 5 protocols

baptiste-from-paris19:05:24

so the specs work the same way across protocols

baptiste-from-paris19:05:57

but output/conforming has small differences

baptiste-from-paris19:05:24

by the way, same goes for the output specs

baptiste-from-paris19:05:17

does that ring a bell @cgrand =>

baptiste-from-paris19:05:26

(concat (map #(str "ser-" %) inputs)
        (map #(str "req<-" %) input-roots)
        (map #(str "deser-" %) outputs)
        (map #(str "resp->" %) output-roots))

cgrand05:05:37

Yes I categorize shapes according to their usage (one shape may have several usages) and I create one spec for each usage.

baptiste-from-paris19:05:10

the shape-seq makes me really nervous

cgrand05:05:49

Starting from a shape, it returns the shapes it depends on.

baptiste-from-paris19:05:18

@cgrand when you have the time, it would be awesome if you could explain to me the overall architecture you were aiming for

cgrand05:05:30

Hours of commits exegesis can save you minutes of pair programming.

baptiste-from-paris19:05:50

from there I can try to draw some stuff and then go back to your code

cgrand05:05:30

Less conforming indeed.
