2020-07-17
Channels
- # announcements (11)
- # beginners (29)
- # calva (2)
- # clara (12)
- # cljsjs (1)
- # cljsrn (7)
- # clojure (39)
- # clojure-europe (6)
- # clojure-nl (7)
- # clojure-spec (6)
- # clojure-sweden (1)
- # clojure-uk (15)
- # clojuredesign-podcast (6)
- # code-reviews (2)
- # conjure (29)
- # cursive (3)
- # datomic (13)
- # duct (15)
- # emacs (1)
- # figwheel-main (2)
- # fulcro (7)
- # graalvm (16)
- # lambdaisland (4)
- # luminus (1)
- # meander (15)
- # observability (15)
- # off-topic (27)
- # parinfer (7)
- # pathom (2)
- # reitit (2)
- # rum (11)
- # shadow-cljs (57)
- # spacemacs (6)
- # sql (56)
- # tools-deps (36)
- # xtdb (3)
Could someone here who uses next.jdbc help me with something? I created a query, but for some reason the properties are not in the same order as in the query.
{:select [:customer.customer_number :contact.name :contact.street :contact.house_number :contact.phone :contact.mobile :contact.fax :contact.email
[:city.name :city] :city.zip_code [:mlv.ml_value :country]]
:from [[:contacts :contact]]
:join [[:customers :customer] [:= :contact.id :customer.contact_id]
[:cities :city] [:= :contact.city_id :city.id]
[:countries :country] [:= :country.id :city.country_id]
[:multi_language_values :mlv] [:= :country.name_id :mlv.ml_string_id]]
:where [:= :mlv.language_id 1]}
That's my query
{:cities/city aaaaa, :contacts/house_number 11, :cities/zip_code 1111, :contacts/phone 11111, :contacts/mobile 1111111, :contacts/fax , :customers/customer_number 111111, :multi_language_values/country Aaaa, :contacts/name aaaaa, :contacts/street Caaa, :contacts/email [email protected]}
And the results are not coming back in the order they should be.
There is a #sql channel with plenty of people familiar with next.jdbc (including the maintainer of that library) for further questions 🙂
@UNST81B9P If I understand your question correctly, you are surprised by the order of the returned columns. @U015FHSUCGM already pointed out the unordered nature of maps here. I want to offer an alternative in case you need different behaviour: you can use https://github.com/seancorfield/next-jdbc/blob/develop/doc/result-set-builders.md to modify how the result set is built and e.g. return an array of fields instead of a map.
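A minimal sketch of that builder-fn approach (not from the thread): it assumes a datasource ds with made-up connection details and a simplified query. next.jdbc.result-set/as-arrays returns the column names as the first row followed by the row values, preserving the column order of the SELECT.
(require '[next.jdbc :as jdbc]
         '[next.jdbc.result-set :as rs])

;; hypothetical datasource, for illustration only
(def ds (jdbc/get-datasource {:dbtype "postgresql" :dbname "example"}))

;; as-arrays preserves the SELECT column order:
;; the first row is the column names, the rest are row values
(jdbc/execute! ds
               ["SELECT name, street, email FROM contacts"]
               {:builder-fn rs/as-arrays})
;; => [[:contacts/name :contacts/street :contacts/email]
;;     ["aaaaa" "Caaa" "[email protected]"]
;;     ...]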
@UNST81B9P Does that solve your problem somehow?
@U0135SQQC82 let me check that. Thanks
In my case it won't work, because I use a map to turn this into an Excel file.
We log json to stdout. Sometimes we accidentally log a value that is huge. This causes the log line to span multiple lines in a log aggregation tool. Since it spans multiple lines, the tool does not recognize the status of the message (e.g., ERROR). This breaks alerts that look for ERROR messages. Unfortunately, huge log lines most often occur only when an error occurs. Logging a huge value is a bug, but the result is terrible -- I never even know that it happened! I could cap the logged json string but that may also break the log aggregation tool due to unparsable json. How do folks logging json typically deal with very long log lines?
it sounds like you're logging in some format like:
ERROR: {"my":"json"}
it sounds like you want to log in an append-only format where each log message is some easily deserializable data. if you can change the logging format, then it will reduce the complexity in the later stages of the pipeline
No. The whole log line is json. Moving away from json log lines makes lots of other things trickier. Everything downstream would then need to implement a log line parser. Keeping logs as easily parsable data is quite valuable.
why would the json span multiple lines then?
can't any json be represented without any line breaks?
Afaik, all log aggregation tools cap line length at some number. If a line crosses that threshold, it gets broken into multiple pieces.
> Keeping logs as easily parsable data is quite valuable.
that was the point I was trying to make.
breaking the log line into multiple lines seems undesirable.
that is not an enforced behavior that I'm aware of. anyway, at some point your logging tool is converting the log message into json; you could have it truncate/shorten/summarize the data at that point, before it's serialized into json
Curious, which tool are you using that does not have limits like this? Yeah - that's the piece I'm after. It's not clear how to do that in an efficient manner. Walking the structure is expensive.
the structure must be walked to serialize it
for more traditional logging, my experience is using python tools. I've also written some myself. for similar use cases, i've also used tools like kafka
depending on your use case, simply using kafka might work
I'm not really sure what your setup is like so it's hard to say
I haven't typically used specialized "logging" tools. I typically just use regular data processing tools. the priorities usually are 1. get the logs to durable storage like s3 as soon as possible 2. have the logs searchable
Interesting. That's far different from my experience. We are currently using Datadog logs. I have used CloudWatch logs, Honeycomb, and, what I'd call, a traditional ELK logging stack. ELK does give you more configurability but at a steep management cost. I've never shipped logs to s3.
makes sense. I still pick and choose managed services. I've never been super happy with the logging managed services to date. there are some managed services that work well.
we've hit the same problem with our logging code. we've resorted to simply truncating the part of the data we know to be problematic. at the moment this works just fine, as we get the necessary data from each log that we need for alerts/dashboards/etc
@UGRJKK74Y Do you always know the part that is problematic?
we do. we use structured logs and know exactly what goes into each log line. the logs that are troublesome are the ones for generic errors that contain everything in the kafka record (which typically consists of vectors of 100s of things)
do you have an ETL pipeline for data analytics? every company I've worked at ends up having some sort of data pipeline, and we just end up having logs as another source rather than logs as a separate thing
at the moment most of our logs are more traditional logs. though the case i mentioned above is our first push towards more event-based logging
which lends itself to the "logs as a data source" idea you mentioned
We do the same. The problem is that some of the maps we log happen to contain arbitrarily large maps. We should certainly be more diligent about logging this stuff, but I was hoping to protect against it happening.
yeah if you don't know what is arbitrarily large i'm not sure of a good solution off the top of my head. you could pipe into ES instead of a logging aggregator. that should be able to handle things well enough
if you're not having performance issues currently, then in the part of your logging code where it serializes the data into json, you should be able to just check how long the string is and decide whether you want to reserialize it smaller/truncated/summarized at that point. that means you're only analyzing the messages when they're too large. if there aren't that many, then it shouldn't have much of a performance impact
additionally, if the messages that are bloated are similar, then you can potentially just check the problematic keys/values before serializing and that should be pretty quick
yeah that's what we're doing:
(log/error exception ::exception (update data :value truncate))
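A minimal sketch of the length-check idea described above (not from the thread): it assumes cheshire for JSON serialization, and truncate-data is a hypothetical helper that just caps string and sequence sizes.
(require '[cheshire.core :as json]
         '[clojure.walk :as walk])

(def max-log-chars 8192) ; assumed aggregator limit; tune to your setup

(defn truncate-data
  "Hypothetical helper: caps string values and sequence lengths so the
  data serializes to something smaller."
  [data]
  (walk/postwalk
    (fn [x]
      (cond
        (string? x)     (subs x 0 (min 256 (count x)))
        (sequential? x) (vec (take 10 x))
        :else           x))
    data))

(defn serialize-log [data]
  (let [s (json/generate-string data)]
    (if (<= (count s) max-log-chars)
      s
      ;; only pay the cost of walking the structure when the message is too large
      (json/generate-string (truncate-data data)))))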
😛
@U083D6HK9 perhaps you can use fipp (directly or as an inspiration) for truncating the data payload?
(fipp.edn/pprint {:a (range 10)
                  :b (zipmap (range 10)
                             (range 10))}
                 {:print-length 3})
;; {:a (0 1 2 ...), :b {0 0, 7 7, 1 1, ...}}
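A small usage sketch (not from the thread): since fipp.edn/pprint prints to *out*, the length-limited rendering can be captured as a string and used as the logged payload.
(require '[fipp.edn :as fedn])

;; capture fipp's truncated rendering as a string for the log line
(defn truncated-edn [data]
  (with-out-str
    (fedn/pprint data {:print-length 3})))

(truncated-edn {:a (range 100)})
;; => "{:a (0 1 2 ...)}\n"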