#fulcro
2022-09-08
janezj19:09:21

When working with reports, I found that entering a string into a report control works as expected as long as it contains no characters outside Latin1. When I enter Dončić into the control, "Dončić" appears in the db. But when I hit start-report!: 1. only "Don" is passed as the param to the load transaction (:app.ui.sup.trade-items/label "Don"), and 2. this error is logged: react_devtools_backend.js:4026 ERROR [com.fulcrologic.rad.routing.history:102] - DOMException: Failed to execute 'btoa' on 'Window': The string to be encoded contains characters outside of the Latin1 range. at goog.crypt.base64.encodeString. Should I probably convert the string to UTF-8 before base64? But first I would like someone to confirm the problem in their own app with html5-history.

tony.kay20:09:14

I can tell you that Fulcro in general makes no promises if you work outside of UTF-8. Everything assumes that’s what you’re using, and I have no plans on dealing with other encodings in any of the internals.

tony.kay20:09:45

So I strongly suggest you set up your html/js env to always use UTF-8

roklenarcic07:09:00

Perhaps routing history only works with latin1 in url?

janezj07:09:58

test:

(require '[com.fulcrologic.fulcro.algorithms.do-not-use :refer [base64-encode base64-decode]])
(require '[goog.crypt.base64 :as b64])
(require '[goog.crypt :as crypt])
(require '[clojure.string :as str])

;; correct: first convert the JavaScript string from UCS-2/UTF-16 to UTF-8 bytes,
;; then base64-encode a string built from those bytes (one char per byte)
(let [bytes (crypt/stringToUtf8ByteArray "Dončić")]
  (b64/encodeString (str/join "" (map char bytes))))

;; error: fails because the input contains characters outside Latin1
(base64-encode "Dončić")

tony.kay22:09:13

Good catch. I remember changing that because the old way was super slow on large values... I was hoping it would not matter, but clearly I had thought otherwise earlier. Oh well. I should probably change it back and keep the new implementation under a different name for the performance-critical case.

tony.kay19:09:52

Released in 3.5.25

janezj20:09:13

Thanks. fast-base64, great, so there are both implementations. Are you sure you won't have problems from not supporting non-Latin1 characters in your apps? I am just asking because I don't know how foreign names are written in the US. For example, does HR record names differently from how they are written in legal documents? That is probably the case, and it saves a lot of problems. I am going to ask some HR guys how they handle foreign alphabets.

tony.kay20:09:29

Well, the fast version is only on if you turn it on. I had a use-case where it mattered a lot, and there were no encodings involved, so it can be useful. But this is a do-not-use ns, and it should only affect internals like this one thing you found.

Lennart Buit10:09:59

Haha I originally wrote those b64 functions. I can imagine them being super slow. JS engines are not helping with their base64-decode-functions-being-only-defined-for-latin-1-characters.

roklenarcic10:09:51

Why would base64 have anything to do with whether it’s latin-1 or not? Base64 is by nature a byte encoding scheme, so what data it’s encoding should not matter. Look at any Base64 encoder in Java: it takes byte[]. The source of the bytes is where there can be a difference… An ASCII string can be used as bytes directly, but a string with characters outside that range needs to be converted to bytes (e.g. as UTF-8) first.

Lennart Buit12:09:18

I must admit that the details escaped me over the years 😛.

tony.kay14:09:52

Because it is base64 encoding the bytes of the string, and the encoding of the string has to be known. In js, evidently, UTF-8 internal strings are not the guaranteed default, so you have to first ask it to re-encode the string in UTF-8 so that you have a consistent, known thing to decode later.
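
To make the byte-oriented view concrete, here is a minimal ClojureScript sketch of a round trip that goes through UTF-8 bytes on both sides. It only uses Google Closure functions (goog.crypt/stringToUtf8ByteArray, goog.crypt/utf8ByteArrayToString, goog.crypt.base64/encodeByteArray, goog.crypt.base64/decodeStringToByteArray). The namespace and the two helper names are hypothetical, and this is not necessarily how Fulcro's own base64-encode is implemented.

```clojure
(ns example.utf8-base64
  ;; Hypothetical namespace, for illustration only.
  (:require [goog.crypt :as crypt]
            [goog.crypt.base64 :as b64]))

(defn utf8-base64-encode
  "Hypothetical helper: string -> UTF-8 bytes -> base64 text.
  Encoding the byte array directly avoids building an intermediate string."
  [s]
  (b64/encodeByteArray (crypt/stringToUtf8ByteArray s)))

(defn utf8-base64-decode
  "Hypothetical helper: base64 text -> bytes -> string, reading the bytes as UTF-8."
  [encoded]
  (crypt/utf8ByteArrayToString (b64/decodeStringToByteArray encoded)))

;; Round trip with characters outside Latin1:
(utf8-base64-decode (utf8-base64-encode "Dončić"))
```

Both directions have to agree on UTF-8: decoding the bytes back with a plain atob-style string decode would return one character per byte rather than the original text.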