Formatting code should be unnecessary https://maxleiter.com/blog/formatting
Interesting article! For me, having a formatting standard is only useful in terms of collaboration: 1. I don't want to see lines in the PR that were not changed, only reformatted (and when multiple people use different formatting it happens a lot...) 2. When reading a codebase, it is more tiring when multiple different formatting styles are used. In practice, to accomplish that I always add some tool like zprint that you can hook up to every IDE, or just run a command at the end of the PR to format all files. But indeed it is annoying not to be able to view code as you please. Maybe it could be a somewhat good idea to run your own formatter over all files at the beginning of work on a new branch, and then run the team one after you finish.
In Clojure I feel I've rarely found formatting to be much of an issue. There don't seem to be as many dramatically different styles. Like, no one puts parens on their own line, or parens on their own line indented by 8 spaces, or anything like that haha.
> so you could view the source however you wanted
And you can still do it without any issues.
> We didn't store source code
Were they using any VCS, I wonder?
> it'd be easier if we just pushed minified code
Ah yes, defenestrate git blame and anything else that works with files on a line basis. Of course, the knee-jerk answer could be "the tools should adapt", but the new question is - which tools? All of them?
Keeping it simple allows for a great flexibility.
The easiest fix to stop fighting about formatting is to... stop fighting about formatting. Maybe vote for a central standard, maybe flip a coin - that's it. Can be faster than reading that whole article, let alone writing it.
And linters got some undeserved flak there. Even if you only store IR, it doesn't mean that there's nothing to lint there.
> We didn't store source code
>> Were they using any VCS, I wonder?
It appears so:
> The R1000 had a lot of bleeding-edge features: incremental compilation, semantic analysis, version control, and first-class debugging all built-in.
Regarding tools and change: I don't think the tools need to change. You could still use git blame where you diff hunks in your own "favorite" formatting. You'd just do it downstream of the actual diff (which the lang itself could provide, e.g. like https://www.unison-lang.org)
> It appears so: [...]
Was it something specific only to R1000 that couldn't possibly be used with anything else?
I just don't see how else it would work, unless it was a truly universal tool that understood everything.
> You could still use git blame where you diff hunks in your own "favorite" formatting.
Source->IR->source does not produce the same source in general.
It can be that a single line changed in the source produces multiple changes in the IR and results in multiple lines being changed in the newly generated source.
I guess the lang needs the ability to produce semantic diffs for this to be possible, which Unison is an example of. But for our mainstream langs, I guess we're out of luck.
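A minimal Clojure sketch of the round-trip point above. The assumption here is that the "IR" is simply the data structure returned by the Clojure reader, which is only a stand-in for illustration and not what the R1000 or Unison actually store:

```clojure
;; Assumption: the "IR" is just what the Clojure reader returns.
(def original "(map inc ; bump each item\n     [1 2 3])")

;; Source -> IR -> source: the comment and the line break do not survive.
(pr-str (read-string original))
;; => "(map inc [1 2 3])"

;; Reader shorthand is expanded as well, so the regenerated text can
;; differ from what was typed even when there are no comments.
(pr-str (read-string "'x"))
;; => "(quote x)"
```

So a blame or diff computed on regenerated text only lines up with what the author typed if the IR also keeps comments and layout hints, or if the diff is done on the IR itself.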
> The easiest fix to stop fighting about formatting is to... stop fighting about formatting. Maybe vote for a central standard, maybe flip a coin - that's it.
I agree with this. prettier demonstrated that a formatting tool, which takes care of all formatting, can end the debate effectively.
It's really confusing, but if I understand correctly, you weren't able to get the Ada source back? You could only see a pretty-printing of the DIANA parse tree? That would kind of suck? I'm not even sure how you'd go about changing the code afterwards, if there's no more Ada code?
But ya, this is a fun problem that people like to have. That's the truth haha
You can easily run a format step pre-commit, and run a different format step post-checkout. So remote is in some standard format and local is in whatever format you want.
You can certainly do that easily, but then again git blame flies out the window. Unless there's extra tooling that's aware of your formatter and can map between remote and local versions.
Git blame you do on remote, so it would work no?
Local you know you're always to blame 😂
> Git blame you do on remote
I do it locally. But even if it were remote, you'd still have to do the mapping in your head. Not quite an ergonomic experience.
Ya, that's fair. But at least each person can decide whether they'd rather reformat to their own preference locally, or have an easily mapped git blame 😂
And then the differences in preferences get kicked down the road as a rusty can, with some people writing stuff like ;; TODO (John): ... and ;; Commented out for now. - Mark, and others shouting at them not to pollute the code with names since git blame already shows who has written that line. :)
I think it's both true that we could do a lot better than "files in folders", but that there is also a ton of infrastructure and tooling built on "files in folders" that you would have to recreate to be productive in a world not based on "files in folders"
go fmt fan here. A language already has so many preferences, might as well bake in formatting too. Add it to the build pipeline and your pre-push hooks and you're done. If compilers/interpreters were to format code files as a first step, then there would be no discussion 🤷
Like, one of the problems I'm thinking about right now is Clojure stacktraces. A Java StackTraceElement has:
• getClassName()
• getFileName()
• getLineNumber()
• getMethodName()
• isNativeMethod()
There's not really any direct way to reference a Clojure form on the JVM unless you want to do something hacky.
but it does support filenames and line numbers
@smith.adriane true... say more?
Files offer more than meets the eye though. The flexibility and ease of working with them is often underestimated, but it's also why they're used and are still being used. It always reminds me of Windows registry versus Linux just using files for config.
> Files offer more than meets the eye though. Hah, yeah... Perhaps too much flexibility.
I don't have any solutions, but I thought this could stir a discussion lol
I also don't have any solutions, but I can list lots of problems.
Like when Clojure requires a namespace via (require ...), it goes to the classpath and searches for munged path names (either in a jar or on the filesystem). There's no way to tell the compiler to use some code in a database or something.
Wouldn't a custom classloader help with this?
oh, maybe
That could be neat
Apart from that, at least on Linux it would still be possible even without classloaders. Since you can mount some "classloader" as a "directory" with "files" in there.
My mouse is a "file". Doesn't mean that it's a file though. :)
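For the "code in a database" idea, here is a rough sketch that avoids the classpath entirely. Note that this is not a custom classloader: it swaps in the simpler technique of evaluating stored source text with `load-string`. The `code-db` map and `require-from-db` function are hypothetical names made up for illustration, with a plain map standing in for the database:

```clojure
;; Hypothetical stand-in for a database: namespace name -> source text.
(def code-db
  {"demo.greet" "(ns demo.greet)\n(defn hello [who] (str \"hello, \" who))"})

(defn require-from-db
  "Evaluates the stored source for ns-name. Assumption: the source is
  trusted, since load-string evaluates whatever text it is given."
  [ns-name]
  (if-let [src (get code-db ns-name)]
    (load-string src)
    (throw (ex-info "namespace not found in code-db" {:ns ns-name}))))

(require-from-db "demo.greet")

;; The namespace now exists even though no file backs it.
((resolve 'demo.greet/hello) "world")
;; => "hello, world"
```

One caveat with this approach: a later `(require 'demo.greet)` does not know the namespace was loaded this way, so it would still go looking on the classpath. That gap is where a classloader or a patched loader would come in.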
I find notes are another good example. Note-taking started with powerful apps with complex structured data models, often backed by some database. Over time, those apps were replaced by files in folders. Just plain markdown or org files.
@smith.adriane Isn't that what source-maps are for? You can index form -> file, line, col
One not very polished futuristic idea is that if the changes (diffs) to a minified source code could be expressed as CRDTs that are associative, commutative, and idempotent, then the formatting & version control problem could be solved.
But again... passing the CRDT "checker" does not guarantee working code, so we could be back to square one.
> You can index form -> file, line, col
Well, you need to go from file, line -> form. StackTraceElements don't include column numbers.
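A small sketch of the file, line -> form direction, using only what the stock Clojure reader records. It assumes that looking at top-level forms is enough and that the closest form starting at or before the line is the one you want; `top-level-forms` and `form-at-line` are hypothetical helper names:

```clojure
(import '(clojure.lang LineNumberingPushbackReader)
        '(java.io StringReader))

(defn top-level-forms
  "Reads every top-level form in src. Lists read this way carry
  :line and :column metadata."
  [src]
  (let [rdr (LineNumberingPushbackReader. (StringReader. src))]
    (loop [forms []]
      (let [form (read rdr false ::eof)]
        (if (= ::eof form)
          forms
          (recur (conj forms form)))))))

(defn form-at-line
  "Returns the last top-level form that starts at or before line.
  Assumption: forms do not overlap and appear in source order."
  [forms line]
  (->> forms
       (filter #(some-> (meta %) :line (<= line)))
       last))

(def src "(ns demo)\n\n(defn f [x]\n  (inc x))\n\n(defn g [x]\n  (/ x 0))\n")

;; A stack frame reporting line 7 maps back to the definition of g.
(second (form-at-line (top-level-forms src) 7))
;; => g
```

Without a column number this can only narrow things down to a line, so two forms on the same line stay ambiguous, which is the limitation mentioned above.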
At least the rebase problem will (probably?) be gone
Ya, but how much of this is just a Java complication, versus anything to do with code as text files?
> One not very polished futuristic idea is that if the changes (diffs) to a minified source code could be expressed as CRDTs that are associative, commutative, and idempotent, then the formatting & version control problem could be solved.
This would not work. CRDTs implement conflicting merges by deterministically clobbering one of the options. That's fine for realtime text editing, but not great for code.
> Ya, but how much of this is just a Java complication, versus anything to do with code as text files?
There's a bunch of tooling and infrastructure that implicitly expects code as text files. Java is one, but there's also version control, diff tools, search tools, comments, meaningful whitespace, you need convenient ways to add/remove namespaces, tools.deps, maven, etc etc.
> CRDTs implement conflicting merges by deterministically clobbering one of the options.
Yes, good observation. And not only for text, but also for data structures, I think.
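To make the "clobbering" concrete, here is a sketch of one common CRDT, a last-writer-wins register. The `:ts`, `:replica` and `:value` keys and the `lww-merge` name are assumptions for illustration, and other CRDT designs resolve conflicts differently:

```clojure
(defn lww-merge
  "Keeps whichever write has the larger [timestamp replica-id] pair.
  The replica id only breaks ties between equal timestamps."
  [a b]
  (if (neg? (compare [(:ts a) (:replica a)]
                     [(:ts b) (:replica b)]))
    b
    a))

(def alice {:ts 10 :replica "alice" :value '(map dec [1 2 3])})
(def bob   {:ts 11 :replica "bob"   :value '(mapv inc [1 2 3])})

;; Commutative and idempotent, as a CRDT merge has to be...
(= (lww-merge alice bob) (lww-merge bob alice)) ;; => true
(= alice (lww-merge alice alice))               ;; => true

;; ...but Alice's change from inc to dec is silently dropped.
(:value (lww-merge alice bob))
;; => (mapv inc [1 2 3])
```

The merge always converges, which is great for a shared text buffer, but for code the result here is neither what Alice nor what a human reviewer would have wanted.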
I think search tools are fine (read), only the ones that meaningfully change (write) the actual code are problematic (even comments, perhaps can be solved, but again, new tooling probably)
> CRDTs implement conflicting merges by deterministically clobbering one of the options.
I still wonder if there are gains to be had, though. But that's a very good thing to keep in mind. It will happen, but I think if the changes are truly granular (vs line, for example) it will happen less often, and can be dealt with more easily than with today's options (aka git)
more simply 🙂 hopefully
git already allows you to use whatever merge tool you want, so I don't think using text or some other format makes a huge difference in this particular case.
well but git is still fundamentally file/text/line based, isn't it?
A change to the line means the whole line is "new"
sort of. It's fundamentally a key value store with very odd restrictions on keys and values. There are a bunch of tools that work with the git key/value store that are oriented around text. You can write tools that treat the keys and values as more than text.
Like, take (map inc [1 2 3])
If I change this to (map dec [1 2 3])
You change to (mapv inc [1 2 3])
... we cannot merge this without conflict (I believe)
Technically, there's no "conflict"
It depends on your merge tool, which is configurable
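As a sketch of what a more granular merge tool could do with that example: a three-way merge per token instead of per line. This assumes all three versions have the same number of top-level tokens and compares them position by position, which a real structural merge would not be able to rely on. `merge-token` and `merge-form` are hypothetical names:

```clojure
(defn merge-token
  "Classic three-way rule applied to a single token."
  [base ours theirs]
  (cond
    (= ours theirs) ours                 ; both sides agree
    (= base ours)   theirs               ; only they changed it
    (= base theirs) ours                 ; only we changed it
    :else           {:conflict [ours theirs]}))

(defn merge-form
  "Merges three same-length forms token by token."
  [base ours theirs]
  (map merge-token base ours theirs))

(merge-form '(map inc [1 2 3])    ; base
            '(map dec [1 2 3])    ; my change
            '(mapv inc [1 2 3]))  ; your change
;; => (mapv dec [1 2 3])
```

A line-based merge sees the same line edited on both sides and reports a conflict. At token granularity the two edits touch different positions and combine cleanly, and only a case where both sides change the same token to different things ends up as a `:conflict` entry.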
Uhm... ok... It will help me merge it, but it will still be recorded like the whole line was replaced, twice
Right?
> You can write tools that treat the keys and values as more than text.
That's interesting, I need to explore it more.
So merge tools sort of paper over the problem by easing the pain of merging; but the changes are still recorded as whole-line changes (if I understand correctly)
> Uhm... ok... It will help me merge it, but it will still be recorded like the whole line was replaced, twice
It will logically act as if the whole "file" was replaced. There are some techniques for actually storing the result on disk more efficiently.
Oh, damn, lol... didn't realize that... But that makes sense
I actually recommend reading https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain. I think it's interesting.
So every time you make a change to a file, it's a full copy-on-write of the whole file?
by design, git does not store diffs. It stores content. There are techniques for efficiently storing two similar versions of a file without actually storing two full copies of a file.
> So every time you make a change to a file, it's a full copy-on-write of the whole file?
Logically speaking, but the implementation is slightly more intelligent than that (slightly, it's called git for a reason).
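A sketch of what "stores content, not diffs" means at the object level: git names a file's contents (a blob) by the SHA-1 of a short header plus the full bytes. The `git-blob-id` helper below is a hypothetical name that only reproduces the id calculation with the JDK's `MessageDigest`; it does not touch compression or packfiles:

```clojure
(import '(java.security MessageDigest))

(defn git-blob-id
  "SHA-1 of \"blob <byte-count>\\0\" followed by the content bytes,
  which is how git derives the id of a blob object."
  [^String content]
  (let [body   (.getBytes content "UTF-8")
        header (.getBytes (str "blob " (count body) (char 0)) "UTF-8")
        md     (MessageDigest/getInstance "SHA-1")]
    (.update md ^bytes header)
    (.update md ^bytes body)
    (apply str (map #(format "%02x" %) (.digest md)))))

;; Same value as `echo 'test content' | git hash-object --stdin`.
(git-blob-id "test content\n")
;; => "d670460b4b4aece5915caf5c68d12f560a9fe3e4"

;; Changing one token gives a completely unrelated key for the whole file.
(= (git-blob-id "(map inc [1 2 3])\n")
   (git-blob-id "(map dec [1 2 3])\n"))
;; => false
```

So at this level there is no notion of "line 1 changed": each version of the file is its own value under its own key, and the space savings from similar versions come later from how objects are packed.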
I can see how that's the simplest way to implement. Given the constraints of "files"
> by design, git does not store diffs. It stores content.
Yeah. I think that's the crux of the problem. Again, no refined solutions on my end, at the moment.
Did Linus name it after himself? 😹
(Nothing against him, joking)
> Did Linus name it after himself? 😹
He did.
> "I'm an egotistical bastard, and I name all my projects after myself. First 'Linux', now 'git'."
hahah... so "yes"
Never realized git has a meaning
TIL
> Yeah. I think that's the crux of the problem. Again, no refined solutions on my end, at the moment.
I think storing content vs storing diffs is a wise decision. In practice, you don't care about the diffs, you care about the content. It's much easier to make sure you're correctly storing and retrieving the thing you stored vs correctly reproducing the thing you care about based on diffs. It also has some efficiency implications for projects with large histories. If you store the content, it's easy to jump to the latest commit since you don't need to "replay" thousands of commits.
Easier, sure. Always easier to copy the whole thing. But we could have said that about immutability also: much easier to copy the entire collection 😉
If collections are just a few items and a few bytes, that makes sense. For text files, it also (possibly) makes sense.
I do sometimes, when it's likely that I want to add something at the end, Rich comments for example.
(comment
  ...
  ;; experimentation
  ...
  ;
  )
I don't find last line hanging parens on a more nested form that bad. I think I do that quite often with threading macros.
,,,
,,,
(cond->> data
  (cond1 ,,,) (,,,)
  (cond2 ,,,) (,,,)
  )))
(
map
(
fn [x]
(
do-something
x
)
)
coll
)
This is the way
(I expect you didn't take that seriously 😂)
I didn't, but nevertheless feared eye cancer. 😜
Ya I was imagining something more drastic like that one haha. Occasional hanging closing on last line is fine I think.