#onyx
2017-11-08
asolovyov08:11:07

OMG it was stuck because out-chan is an atom! Why did it not throw an exception there?! 😞

lucasbradstreet08:11:55

Hmm, it definitely should have thrown an exception. Weird

lmergen08:11:25

i've been noticing as well that some plugin exceptions get 'eaten' by onyx sometimes without providing output

lmergen08:11:46

never really was able to figure the details out

lmergen08:11:58

i'll do a deeper dive next time i encounter such a situation

lucasbradstreet08:11:12

I’ll try to reproduce the out chan issue tomorrow. If I can reproduce it, it should be easy to fix

asolovyov08:11:00

I'll check if it's an Onyx or a core.async trouble

asolovyov08:11:30

yeah seems like an Onyx bug

asolovyov08:11:45

if I catch and print exception it propagates

asolovyov08:11:07

I mean it occurs 🙂

lmergen10:11:16

yes, this is what i noticed as well

asolovyov14:11:50

okay I'm almost done with onyx-http! Retry works, happy case works, the only thing which is not working yet is that exception from async-exception-fn is not propagated. This is because write-batch is not called and I'm not sure how to force it.

asolovyov14:11:29

last in-flight-writes value I see is 1, so it obviously is not finished yet

asolovyov14:11:11

completed? is called constantly though

michaeldrogalis15:11:47

I'll check it out -- that's a nasty bug. 😕

michaeldrogalis17:11:15

@lmergen @asolovyov At what stage of the lifecycle is the exception being squashed? I'm having trouble reproducing it

lmergen17:11:48

it's been a long time since i worked on onyx-sql

lmergen17:11:55

but i believe it was during write-batch

michaeldrogalis17:11:16

Hmm. I'll try with a different plugin.

lmergen17:11:28

i know that i do change logging config a bit, perhaps it’s related to that?

lmergen17:11:57

i’ll make sure to better inspect it next time i encounter it

Forrest17:11:55

Hi there, I’m new to Onyx and am running into an issue when trying to add S3 checkpointing to my job. I configured my peer config with the s3.storage option and all the subsequent configuration options (auth-type, bucket, region, etc) and it looks like my Onyx job can talk to the bucket I specified, but I am getting a 403 - Access Denied response. Has anyone encountered that before?

Forrest17:11:04

For context, the job is running in an AWS EC2 instance with an appropriate role (S3 full access) and the S3 bucket has granted permissions to that role to perform all S3 operations.

michaeldrogalis17:11:33

Okay, gonna keep trying to track it down anyway. These bugs are really annoying when they crop up

lucasbradstreet17:11:50

@forrest.thomas Hmm. It sounds like you still don’t have the right permissions set up, but I’m not sure what could be wrong.

Forrest17:11:45

i thought so as well, but when I use the AWS CLI from that EC2 I can read/write to the bucket

Forrest17:11:40

normally, I would look at the object ACL to see what was happening, but I don’t think I can do that in this case since Onyx is the creator of the checkpoint and uploads it from in-memory (as far as i can tell)

michaeldrogalis17:11:09

Is Onyx running in a container? Maybe it's not seeing the same AWS keys as your CLI process?

Forrest17:11:52

it’s running on the host

Forrest17:11:31

I also have an encryption policy set on the bucket that requires aes256. I set that configuration option as well. is it possible that is getting lost somehow?
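
For readers following along, the configuration being described might look roughly like the fragment below. This is a sketch only: the key names and values are written as recalled from the Onyx peer-config reference and should be checked against the cheat sheet for your Onyx version, and the bucket name and region are hypothetical placeholders.

```clojure
;; Sketch of the checkpoint-related part of a peer config.
;; ASSUMPTION: key names and values follow the Onyx peer-config reference
;; as recalled; verify them against the docs for your version.
(def peer-config
  {:onyx.peer/storage               :s3
   :onyx.peer/storage.s3.auth-type  :provider-chain          ; use the instance role
   :onyx.peer/storage.s3.bucket     "my-checkpoint-bucket"   ; hypothetical name
   :onyx.peer/storage.s3.region     "us-east-1"              ; hypothetical region
   :onyx.peer/storage.s3.encryption :aes256})                ; the option discussed here
```

If the bucket policy rejects unencrypted uploads, a 403 on checkpoint writes is what you would expect to see whenever the encryption setting does not reach the upload request.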

lmergen17:11:44

what do you get if you try to access the file using the aws cli on the host ?

lmergen17:11:17

(or rather, try to touch a file)

michaeldrogalis17:11:19

I doubt it, but you may want to try using a fresh bucket with no encryption just to check it out

Forrest17:11:22

i can access the bucket and all of its contents using the AWS CLI

lucasbradstreet17:11:29

I think it’s encryption’s fault

lucasbradstreet17:11:49

I see the config option but I don’t think we’ve actually implemented it (looking at the code now)

michaeldrogalis17:11:54

I stand corrected 😛

lucasbradstreet18:11:03

If you can test whether it works without encryption, I’m happy to add it today.

lucasbradstreet18:11:09

That was an oversight.

Forrest18:11:10

sure, let me try

Forrest18:11:09

works fine now

lucasbradstreet18:11:09

All of the code is already there, we just don’t pass the setting through.

Forrest18:11:11

that's the one

lucasbradstreet18:11:20

OK cool, I’ll have a fix shortly.

Forrest18:11:31

that's awesome! thank you SO much!

Forrest18:11:05

I forgot to mention. I’m using Onyx 10, will this fix be for that version or just 11+?

lucasbradstreet18:11:11

I’d like to make it against 0.12, which we should be releasing today, but you should be able to upgrade to 0.12 pretty seamlessly.

lucasbradstreet18:11:39

There are some breaking changes in the last two versions, but it’s not too bad.

Forrest18:11:12

OK, well, I guess I needed to upgrade eventually. 😛

Forrest18:11:17

thanks again!

lucasbradstreet18:11:27

No worries. I’ll let you know when there’s something to try.

Forrest18:11:34

:thumbsup:

Travis19:11:32

Just wondering if anyone has any methods/best practices for doing blue/green or just general job upgrading methods

asolovyov20:11:01

@lucasbradstreet I'm afraid I need more of your help to finish onyx-http 🙂 I have no idea why it doesn't call write-batches after some time even though completed? returns false...

asolovyov20:11:22

I'm going to leave right now but any pointers are really welcome, I would be happy to finish it tomorrow 🙂

fellows21:11:06

We're updating to Onyx 0.12.0 in anticipation of the S3 encryption fix. I've made the appropriate fixes for the breaking changes (we've been running 0.10.0), and we're testing against 0.12.0-alpha4. It appears that some window triggers are firing twice (I'm seeing two copies of the same state getting emitted), which is breaking a lot of our tests. When I try 0.11.0.1 it all works fine. Seems like a bug in 12?

lucasbradstreet22:11:19

What trigger type are you using?

lucasbradstreet22:11:13

This could be due to the new support for watermarks causing a new event-type “:watermark” which is not being handled correctly by whatever trigger is being used.

fellows22:11:39

That one is a segment trigger

lucasbradstreet22:11:56

And 0.11 is ok. Hmm, I’ll check it out shortly. Thanks

fellows22:11:24

Ok, great. Thanks!

lucasbradstreet23:11:38

@fellows could you please give 0.12.0-20171108.231118-16 a go?

fellows23:11:16

Looks like that has the same problem.

lucasbradstreet23:11:05

Are you using :trigger/fire-all-extents? true?

lucasbradstreet23:11:32

Also, could you check from the trigger firing whether it’s firing on :job-completed?

lucasbradstreet23:11:55

Check the event-type of the state-event. That will fire when the job completes, which is probably happening in your tests.

lucasbradstreet23:11:07

If you could also check the :extents key in the state-event from your sync or emit, that would help.

fellows23:11:19

Definitely not using fire-all-extents?.

lucasbradstreet23:11:52

Did you happen to drop a :trigger/refinement :discarding from the trigger, but you didn’t add :trigger/post-evictor in?

fellows23:11:58

Is the state-event available inside the event in the trigger/emit function?

fellows23:11:21

No, we dropped an accumulating refinement.

lucasbradstreet23:11:28

ok, nothing changed there then

fellows23:11:46

I tried both types of post-evictor, actually, and neither had an obvious effect.

lucasbradstreet23:11:41

state-event is the fourth argument to trigger/sync and trigger/emit

fellows23:11:13

Ah, sorry, I've been calling that window-data.

fellows23:11:18

I'll take a look.

lucasbradstreet23:11:36

no worries. Yeah, if you can get me more info about the scenario of the fires that’d help

fellows23:11:48

Bingo. The first time it's :new-segment, the second time it's :job-completed

fellows23:11:42

:extents is (1) the first time and nil the second

lucasbradstreet23:11:23

Cool, we do a better job of sealing on job-completed now. We do it like that because maybe you have a trigger set to fire on every 10 elements, but then one more element was added and you still want it to flush when you complete the job.

fellows23:11:27

So that will trigger even if it's going to emit a state that's identical to the previous one? Is there a way to turn that off, or do I need to specifically check for the event-type every time?

lucasbradstreet23:11:21

There’s no way for it to know what the previous state was, otherwise the memory consumption would increase by default. You should ignore the job completed event, or dedupe yourself, or use an evictor and combine your outputs as they’re evicted/synced
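
The first suggestion (ignore the job-completed event) could be sketched as below. The five-argument shape with state-event in fourth position follows what is said earlier in this conversation; the function names, the `synced` atom standing in for an external sink, and the sample state maps are all hypothetical and only there to make the example self-contained.

```clojure
(ns example.triggers)

;; Hypothetical helper: true unless this firing was caused by job completion.
(defn actionable-fire?
  [{:keys [event-type]}]
  (not= event-type :job-completed))

;; Hypothetical stand-in for an external sink (a database, a queue, ...).
(def synced (atom []))

;; Sketch of a :trigger/sync function; state-event is the fourth argument.
(defn sync-window-state
  [event window trigger state-event extent-state]
  (when (actionable-fire? state-event)
    (swap! synced conj extent-state)))

;; Simulate the two firings reported above.
(sync-window-state {} {} {} {:event-type :new-segment   :extents [1]} {:count 11})
(sync-window-state {} {} {} {:event-type :job-completed :extents nil} {:count 11})

@synced ;; => [{:count 11}]
```

Note the trade-off: skipping :job-completed also skips the final flush described above, so this only suits triggers that already fire on every state change. When the last few segments might not have been emitted yet, deduping downstream is probably the safer choice.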