#clojure-uk
2018-11-10
alexlynham08:11:39

how big is too big for a namespace?

alexlynham08:11:50

about ~200 lines and I start thinking “hmm, I should break this down”, but should I instead just turn my monitor sideways?

yogidevbear08:11:36

IMO it depends on the problem space you're dealing with. I don't see an issue with a file growing larger than 200 loc if all the code is very relevant to the namespace

👍 8
yogidevbear08:11:05

I'd rather look in one file for that code than jump between 5 different files

3Jane08:11:29

IMO: No function should be larger than a screen (horizontal). For a namespace, count the number of publicly available functions and cap it at an aesthetically reasonable number.

practicalli-johnny09:11:07

I like this idea of a function never being larger than the screen (with text still readable of course). Thanks Jane

folcon14:11:42

Definitely subscribe to this, really don’t like functions being big, whereas I break down namespaces conceptually. So for example I may have a really large namespace that’s all about say page rendering, I’ll group functions based on their conceptual use within that namespace and if a particular part of the page gets complex and thought about as its own thing (like a file uploader or editor), I’ll split that out into its own namespace =)… I find it really fiddly to open 20 files to read a thing that someone has split up because it was the right thing to do. Whereas if I can reason about it independently of everything else, that makes sense for it to live in its own file :)…

👍 4
mccraigmccraig09:11:26

i start thinking about breaking it down at about ~300 lines @alex.lynham - although i have some namespaces around which have grown much bigger than that and haven't yet had the treatment

dominicm09:11:13

I've seen 2k namespaces

dominicm09:11:44

The refresh on that time wasn't fun

practicalli-johnny09:11:31

@alex.lynham I suggest there is no 'size' limit to a namespace (you can always code fold and have much less code showing). To me, namespaces are a way to logically group behaviour (functions) and information (data structures), so as you think about the aspects or components of your codebase then namespaces should evolve fairly naturally. I start with one namespace and split out into more when it makes sense to logically separate parts of the system. One caveat I would say, if your namespace is too big to load into your editor comfortably, then it's probably too big 🙂

👍 4
alexlynham10:11:35

yeah was just thinking about it because most clojure codebases I’ve seen have sorta-mostly smallish namespaces and then at least one whopper

practicalli-johnny10:11:05

Interesting to know. At the last company, we had a fairly small main namespace (which I think is generally a good sign of a well organised project) and about a dozen or more namespaces. I think there was around 8,000 lines of code. Probably the largest single code file contained all the configuration/environment variables (using aero to manage multiple regions and hardware environments, prod, qa, uat, dev).

alexlynham10:11:22

I’ve always wondered if it’s because like ‘god classes’ in OOP there’s often a concept in your business domain that is more important or key to everything in the domain than other nouns/concepts

3Jane11:11:31

God classes come into being (IMO) for lack of function capabilities

3Jane11:11:47

Each and every Util would be better as separate functions

mccraigmccraig11:11:50

we grew some very large namespaces quite early on - i remember splitting one monster down into 9 separate namespaces at one point. i think there is less of a tendency to large namespaces in yapster now, perhaps because there are so many namespaces already that there is no draw to “keep things simple” by avoiding creating further namespaces

3Jane11:11:11

More to the point I’m just reading simple vs easy. A screen-sized function is easy (to read, because you need never to cache its above-fold contents in your memory.) A namespace is simple if it has a single concern.

❤️ 12
mccraigmccraig11:11:17

i think our largest namespaces are unit-test namespaces now, because they are constrained by relation to the code being tested

3Jane11:11:12

@mccraigmccraig interesting, what prompted you guys to split? Ease of reading?

mccraigmccraig11:11:07

for me it’s usually ease of comprehension

mccraigmccraig11:11:48

i think the single-concern thing is right - i need to be able to forget about as much as possible to focus on what’s important, and it’s much easier to forget about simple stuff

otfrom11:11:01

I actually find that lein var-graph bit I mentioned earlier to be really useful as a way of visualising if I can/should break up a namespace

otfrom11:11:07

and morning

otfrom11:11:22

(goodness. this is a Saturday chat?)

otfrom11:11:06

anyway, while it looks squamous and rugose, it is better than it was before. Still a ways to go yet.

folcon14:11:00

Hmm, not heard of lein var-graph, what kind of things do you look for?

otfrom14:11:07

I look for things where there are arrows coming in from lots of different namespaces into one thing and I look for things where the arrow comes into a namespace from only one other (a helper function that isn't very local)
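
The two heuristics above can be sketched over a plain dependency map. This is not how lein var-graph works internally and not its output format; the `deps` data, the `app.*` names and the function names are all hypothetical, and the sketch works at namespace level rather than var level to stay short.

```clojure
(ns example.fan-in)

;; Hypothetical, hand-written data: namespace -> namespaces it requires.
(def deps
  {'app.core    #{'app.util 'app.orders 'app.billing}
   'app.orders  #{'app.util 'app.pricing}
   'app.billing #{'app.util}
   'app.pricing #{}
   'app.util    #{}})

(defn dependents
  "Inverts the map: namespace -> set of namespaces that require it."
  [deps]
  (reduce-kv (fn [acc from tos]
               (reduce (fn [a to] (update a to (fnil conj #{}) from))
                       acc
                       tos))
             {}
             deps))

(defn used-by-many
  "Namespaces required by at least n others (arrows coming in from lots of places)."
  [deps n]
  (set (for [[target callers] (dependents deps)
             :when (>= (count callers) n)]
         target)))

(defn used-by-one
  "Namespaces required by exactly one other (a helper that may belong nearer its caller)."
  [deps]
  (set (for [[target callers] (dependents deps)
             :when (= 1 (count callers))]
         target)))
```

With this data, `(used-by-many deps 3)` picks out `app.util`, and `(used-by-one deps)` picks out `app.orders`, `app.billing` and `app.pricing`; whether any of those should actually move is still a judgment call.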

👍 8
rickmoynihan14:11:25

@alex.lynham: I’d say it often depends on what you’re writing… as @mccraigmccraig said, single responsibility is usually the key. But I think it’s worth mentioning there are two different styles of slicing functionality/responsibilities. One is vertically, the other is horizontally. I think vertically sliced usually belongs to applications; and it means that you want to slice namespaces per feature; not by layer; i.e. I consider it a bit of a smell to have in a web app mywebapp.handlers, mywebapp.models, mywebapp.templates… unfortunately as far as I’ve seen most getting started templates tend to slice apps this way… which I think scales poorly; and results in a mixing of concerns. In apps I much prefer to see mywebapp.follow, mywebapp.product.search, mywebapp.product.purchase etc… Things can then be split horizontally within those; if required. Splitting this way makes it much easier to write tests; and much easier to verify where you have test coverage and where you don’t… as you should basically have at least 1 test ns per namespace. If things are split horizontally, different features get mixed up and split in weird ways, and it becomes harder to verify they’re tested etc.
Libraries tend to be for more cross-cutting concerns; they’re by definition usually intended to be reusable. So I think it’s more natural to split them horizontally… e.g. a library for database access, a library for async stuff, a library for string manipulation. i.e. you expect a bucket of miscellaneous functions for doing X in a namespace.
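
A minimal sketch of the per-feature layout described above, assuming a Ring-style request map. The function names, the product data and the test are hypothetical. In a real project each `ns` form would live in its own file (for example `src/mywebapp/product/search.clj` and a matching test file); they are shown together here only to keep the example self-contained.

```clojure
;; Feature namespace: everything for "product search" lives together.
(ns mywebapp.product.search
  (:require [clojure.string :as str]))

(defn matching-products
  "Feature logic: a pure function over plain data."
  [products query]
  (let [q (str/lower-case query)]
    (filter #(str/includes? (str/lower-case (:name %)) q) products)))

(defn handler
  "HTTP-facing layer of the same feature: request map in, response map out."
  [products request]
  {:status 200
   :body   (vec (matching-products products (get-in request [:params :q] "")))})

;; One test namespace per feature namespace.
(ns mywebapp.product.search-test
  (:require [clojure.test :refer [deftest is]]
            [mywebapp.product.search :as search]))

(deftest finds-products-by-name
  (is (= [{:name "Red Kettle"}]
         (search/matching-products [{:name "Red Kettle"} {:name "Toaster"}]
                                   "kettle"))))
```

The horizontal split (logic versus handler) still exists here, but inside the feature; if the handler side grew, it could move to something like `mywebapp.product.search.handler` without touching other features.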

👍 4
otfrom14:11:16

+1 to all of that @rickmoynihan

otfrom14:11:57

I'm trying to make witan.send more vertical and see if there is a library (around monte carlo methods and markov chains and others) that can be factored out

otfrom14:11:08

I'm iteratively refactoring from the entry point (and hitting some of the obvious ones in other parts)

otfrom14:11:14

and just doing the really mechanical ones first

rickmoynihan15:11:51

Yeah, I’ve been on that journey more than a few times… My wish for the clojure ecosystem would be that web app templates would set you out on a vertical structure from the start… The problem is that they all want to show you “this is what a handler looks like”, “this is a template” etc… so they put framework concerns before app concerns. I think a feature first templating system would be harder to write though.

dominicm15:11:14

Why does vertical work better than horizontal? Doesn't it actually result in a mixing of feature logic with http semantics?

rickmoynihan16:11:47

A number of reasons:
1. The app’s layout immediately tells you about what the app is doing; rather than how it is doing it… i.e. the what is brought up to the top of the app, rather than being buried and split across leaf namespaces. This I think has huge benefits for onboarding, and grokking a new unfamiliar app.
2. Typically one works on a feature at a time; which means the majority of changes occur together. I personally find it easier to work on things when all the files are co-located, rather than split across a large tree… though this latter point is perhaps more subjective.
3. Split by feature typically results in fewer conflicts when multiple developers are adding features…. though cross cutting changes will clearly touch lots of things… yes there are trade offs 🙂
4. Easier to confirm things are tested/untested.
5. Probably fewer bespoke rules about where things go, leading to greater consistency. I think it’s easier to say to people “all feature stuff goes together”, and have your expectations met rather than horizontally dividing things, where people tend to make more wildly different decisions (in my experience).
On the mixing HTTP semantics front… it depends what you mean by semantics. I think HTTP semantics tend to get removed pretty quickly in the handlers… but yes in REST, features tend to map roughly to routes, and the rules roughly follow the route form… though I think that’s more a coincidence of organising features into trees (paths on filesystem and http) rather than being about HTTP semantics. Within a large feature splitting horizontally; or by input/output operations also works well IMHO. Obviously you have the problem of where to put shared stuff… I typically like to organise this in a common namespace; often at various levels of hierarchy…. The more common the stuff is, the closer to the root common it moves. common stuff is then also a good candidate for factoring out into libraries.

rickmoynihan16:11:59

( got to go do some decorating - bbl)

alexlynham16:11:19

I feel like I should write this thread up as a blogpost lol

✍️ 16
rickmoynihan16:11:57

I’ve been meaning to do that for years - lol!

alexlynham16:11:28

Fwiw iirc the django approach in python is supposed to be vertical slicing, but it tends toward a collection of monoliths, so you end up having to use e.g. flask to do microservices instead

alexlynham16:11:11

Have you gone the vertical route for zib or one of the bigger new projects then?

rickmoynihan19:11:26

zib started out horizontally because I was off when it began, and that was how the duct templates were structured. But new stuff is vertically factored, and older features have mostly been refactored to be vertical too… New stuff will certainly be vertically arranged if I have anything to do with it.

alexlynham16:11:33

Interesting...

dominicm17:11:15

By http semantics, I mean that you put together your counter and ring handler in one area. It's harder to have the layer split perhaps?

rickmoynihan19:11:55

Possibly… but it’s not something I’ve seen happening. I certainly agree that you don’t want response codes etc being mixed with business logic; but people seem to understand that’s the job of the handler/middleware layer - or your “resource” abstraction if you’re using something like liberator/yada/compojure-api etc. I’m personally not too prescriptive about every feature having an app.feature.handler namespace as some trivial features may just be app.feature, which contains the handler and the data access etc in a single namespace. When you do that the functions should be clearly layered, but there’s no point adding extra boilerplate/files etc if it’s not serving a purpose…. basically features & apps should be organised at a level that’s appropriate for their complexity. At the point you start sprouting more than a handful of http helper functions to handle http things, and/or the same for data access / business logic etc, you should definitely start splitting horizontally but within the initial vertical feature layering. Sometimes you might want to group features into top level feature categories etc… but these are bridges you should cross as you come to them.

dominicm17:11:04

It's taken a while for this to sink in, but I think: > you should definitely start splitting horizontally but within the initial vertical feature layering. hits my contention point quite well. I've experienced a lot of applications which are unREPLable because they depend on information injected into the req by some earlier middleware that is complexly derived from the request in some other way. There's no good way to call this business logic function without the ring request happening! This has made me want to organize code such that logic is as far away from http as possible. I haven't actually done this though yet, but Edge has been updated to reflect that I expect companies to create modules with limited dependencies (no ring dependency 😈), and then integrate that library/module into your http framework.
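
One way to read the point about unREPLable code is to keep the logic function ignorant of the request and let a thin handler do the unpacking. The sketch below assumes a Ring-style request map; the namespace, the basket example, and the `:user` key (standing in for something an earlier middleware might inject) are all hypothetical.

```clojure
(ns example.basket)

(defn basket-total
  "Business logic: takes only the data it needs, so no request is required."
  [items discount-rate]
  (let [subtotal (reduce + (map #(* (:price %) (:qty %)) items))]
    (* subtotal (- 1 discount-rate))))

(defn total-handler
  "Thin adapter: pulls values out of the request map, delegates, builds the response.
  :user is assumed to have been added by earlier middleware."
  [request]
  (let [items (get-in request [:body :items])
        rate  (get-in request [:user :discount-rate] 0)]
    {:status 200
     :body   {:total (basket-total items rate)}}))

;; Callable at the REPL with plain data, no request or middleware involved:
(basket-total [{:price 10 :qty 2} {:price 5 :qty 1}] 0)
;; => 25
```

Only `total-handler` knows where the discount rate comes from; if the middleware changes, `basket-total` and its tests are unaffected.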

dominicm17:11:07

I guess what I'm saying is that I actually think you should vertically split your business layer, and horizontally split http/db/business logic.

dominicm18:11:47

That's surprisingly lackluster

dominicm18:11:34

Vapid article https://hackernoon.com/package-by-features-not-layers-2d076df1964d but has an interesting structure listed.

dominicm18:11:40

I'm trying to find extended writing on this 😊 I'm certain it exists.

rickmoynihan23:11:04

I’ll take a look at the articles later… thanks for sharing. TBH I’m not sure if we’re agreeing or disagreeing. I think that hinges on what we think of as “http/db/business logic”. If all we’re saying is that cross cutting concerns should be horizontal then we’re in agreement if we can agree on what a cross cutting concern is 🙂 Most middlewares are cross cutting because they affect many routes, likewise for authentication logic etc… If by DB we mean things like schemas, then yes I’d agree they’re typically horizontal. If a table were only used by a single feature, I could try and make a case for vertically arranging it; but would probably concede that as it needs to be initialised with the others etc, it should coexist with them. If we were to consider “initialisation” a feature of the app though I guess you could blur the boundaries and consider that as being a vertical slicing too. In fact I might be inclined to do that…. I’ll ponder that one some more… 🤔 Business logic is tricky because it covers such a multitude of things, and is often applied at various layers… e.g. some constraints might be enforced in an auth middleware, others in a database constraint etc. So it might need to be horizontal if only for expedience; but if it’s feature specific business logic, all other things being equal I’d rather it was colocated with the feature. I should also say the approach you’re advocating can work well too. I’ve done it more than a few times myself. The difficulty with the approach is that splitting into a library often creates friction (version bumping tedium etc); tools can certainly help with those issues. It’s particularly painful when every change requires a library and app change. Perhaps monorepo is your answer to this, in which case yes, but you then lose some of the separation/discipline/isolation that putting it into a real library can bring.
I definitely agree UNREPLable is bad, and I agree that letting the request escape unchecked into your system is a problem. We typically avoid that by repackaging the salient parts into a new “request” object, with the important bits transformed by various coercers into datastructures. I’ve come to agree with Rich and often use namespaced keywords to pass options through the layers; letting intermediates just ignore the parameters rather than repackage them. In my experience the library for business logic approach can require a repackaging/unpackaging layer, which brings the complexity of remapping terms. It’s good past a certain complexity point where it brings you utility; it can also get in the way and create friction at the REPL as you need to remember all the various parameter renames.
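
A minimal sketch of passing options through layers with namespaced keywords, so each layer reads its own keys and ignores the rest. The namespace, function names and keys (`:report/title`, `:db/limit`, `:http/request-id`) are made up for illustration.

```clojure
(ns example.layers)

(defn fetch-rows
  "Innermost layer: reads only the :db/* key it cares about."
  [opts]
  (take (:db/limit opts 10) (range)))

(defn report
  "Middle layer: reads :report/title and passes opts through untouched."
  [opts]
  {:title (:report/title opts "untitled")
   :rows  (vec (fetch-rows opts))})

;; The caller builds one flat map; no layer has to rename or repackage keys.
(report {:report/title "Weekly" :db/limit 3 :http/request-id "abc"})
;; => {:title "Weekly", :rows [0 1 2]}
```

Because the keys are qualified, `:db/limit` cannot collide with a `:limit` meant for another layer, and the same key name is used at the REPL and in the app.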

rickmoynihan23:11:45

I’d also say there’s no one right way for all times and all situations; and that my argument largely hinges on the tools we use and the workflows and environments we find ourselves in. If these things change then the app layout may want to too.

rickmoynihan23:11:32

FYI: the layout in the vapid article is essentially what I’m advocating. Though I’d be tempted to remove `feature` altogether, and put everything that isn’t the main entry point or a feature in a `concerns`/`horizontal`/`common`/`whatever` directory. i.e. bring the features up a level and make them even more prominent.

rickmoynihan23:11:34

> The typical characteristic of organization by layer is that the logical coupling is stronger within the logical components that span across the layers than within the layers themselves.
This is a good point from that article on 4 ways to lay out code, slightly better articulated than in my ramble I’m sure, but it is certainly one of my main issues with organising by layer.

rickmoynihan00:11:39

clicking through some of the links from those articles and beyond it looks like uncle bob came to the same conclusions as me: https://youtu.be/Nsjsiz2A9mg?t=420 The talk is a pretty typical uncle bob rant about random OO things, not really relevant… but the 1-2 minute rant about this is also what I was saying.

dominicm06:11:34

You're probably struggling to know where I stand because I don't know where I stand either.

dominicm06:11:32

I'm more interested in trying to explore this space, find lots of prior art and use that to help me figure out what is right. Ultimately that will make edge better.

dominicm08:11:47

Seems like he's saying the same as me, keep your use cases away from the web

rickmoynihan16:11:41

Well, the one thing I do know is there are no absolutes… so we’re really talking about rules of thumb here… i.e. usually it’s better to do this. “Usually” depends on who’s asking, and what they usually do, of course. IIRC juxt have lots of banks as customers etc, it’s a domain that maybe demands more rigour than, say, a throw-away API for some data viz. Your division into a domain lib makes perfect sense in some situations… in fact it might even be the theoretical ideal… the problem is that theory and practice are only the same in theory 🙂. So practical problems from that separation may make it more painful than it’s worth… i.e. it may mean you spend forever bumping versions, and lose commits being isolated. At some point you have to acknowledge it really depends on what you want to optimise for. For me I want apps that communicate what they do. I want changes to more often than not be isolated to features, and not split across layers. I don’t think it’s always necessary to push this stuff into a lib. It’s an option, but an option that can be overkill for many apps.

dominicm17:11:11

I don't work on the bank projects 🙂 I work on the more startup SPA applications. So my view is actually skewed closer to yours than you think.

dominicm17:11:43

My experience of using libs is that I develop more isolated, generic parts which then assemble more cleanly. When I need to work on that part it has its own small test suite, and I can work on it in isolation with very little context loaded up. My role involves me jumping around a lot of projects, so I find being able to isolate something extremely useful. 🔥

dominicm17:11:17

I actually think what you're talking about is really important. There's no real reason why a folder should incur a cost. It's just that tooling can get in the way of making this painless.

dominicm17:11:15

However, if you work on smaller services (e.g. service oriented architecture / microservices) then it might be that your project size is too small to warrant more than one folder.

rickmoynihan17:11:56

Yeah, I’m a big fan of isolation too… and I do like to work that way too. So we’re in total agreement about good engineering being about isolating things for dev/testing/understanding etc. I find this works best when you have a good understanding of the domain. Maybe you’re on v2, or you’ve been through some spikes or have carved out some hammock time and you can see the shape clearly. It’s something I want to do with some of our logic; but the main motivation there is allowing reuse, because we now want that logic to work in other contexts. However I don’t think we could’ve built v1 and shipped a working app if we’d done that work first. The customer would’ve left. Rigour and separation of concerns have a cost, and let’s not kid ourselves: “doing it right” usually means more work, or more work at the wrong time. There can be value in that work, but if it doubles the time it takes before you can ship the app, sometimes it doesn’t matter if it’ll cost more in the long run. These are organisational level decisions and trade offs, rather than code ones. These boundaries often require a lot of bike-shedding to get right (good), and can also introduce friction that results in poor decisions (bad)… e.g. at a previous employer we had a shared library FFI to a C++ library we’d written that was exposed through bindings into a bunch of different environments (android (java), ios (objective-c), windows (C#), mac (obj-c)). Anyway the binding layer was complicated to say the least, which meant exposing library functions became too much of a chore, so people would marshall data through other channels etc.

rickmoynihan17:11:17

> There’s no real reason why a folder should incur a cost. It’s just that tooling can get in the way of making this painless. Agreed.

dominicm18:11:16

So the problem I see here is that we think the strategies we have for doing this separation will cost us too much time in initial development? We should define initial development here. A month? 6 months? A year?

dominicm18:11:38

Fwiw, I've definitely experienced the rush of impressing a client early on. I'm still on those projects, they're hard to work on and we have slowed down. So I think the balance is wrong.

dominicm18:11:13

I've not read past the title: https://blog.acolyer.org/2016/09/05/on-the-criteria-to-be-used-in-decomposing-systems-into-modules/ > “This paper discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time.”

rickmoynihan21:11:03

ah yes IIRC I’ve read this paper before (I follow that blog), though can’t recall what it says.

rickmoynihan22:11:13

> Fwiw, I’ve definitely experienced the rush of impressing a client early on. I’m still on those projects, they’re hard to work on and we have slowed down. So I think the balance is wrong.
This can definitely be one of the forces - but I don’t really consider it the main motivator. It’s just that you can needlessly kill a lot of time thinking about where things belong; and futzing about at the boundaries, without really learning anything about the real problem or domain. At the start of a greenfield project you’re probably going to ship a lot of shallow features, to scaffold the app out. A business logic library boundary can slow this bit down unnecessarily when you have most room to move quickly and freely. “How do you want to spend that time?” is a good question. On the flip side foundations and structural issues are important, so whatever you do should support you as you go on. I have a development methodology that I call POD (Pain Oriented Development), which is: wait for things to start becoming painful before treating them. i.e. don’t prematurely optimise, YAGNI etc. The point at which you start to suffer the boundary blurring, or not having the functionality shared etc, is the point at which you can extract it. I don’t advocate absolute rules though, so if you know better from experience or foresight by all means take preventative action. POD/YAGNI is just another heuristic. Anyway I actually think the app/business-logic-lib division is a digression from the vertical vs horizontal discussion, because it’s orthogonal to it. You can be vertical and have a business-logic lib from the start, extract the lib from the app later, or never extract the lib. I’ve been in plenty of scenarios where each of those 3 options makes sense. So I think our only real bone of contention on this point is that I don’t want to assume it’s always the right thing to do, because why take good options and flexibility off the table?
Another way to structure projects is: https://en.wikipedia.org/wiki/Conway%27s_law Structuring this way can be really good or really bad, depending on how (dys)functional the teams are. But a good example I have of this is many years ago I took a job at a software agency because a friend told me I needed to work with his mate who was an incredible designer. When I arrived everything he’d said was true, the designer was truly brilliant; like really deep, lateral thinking brilliant, and his CSS/HTML skills are the best I’ve ever seen (he also happens to be great at code too, though swears blind he can’t do it). Anyway we had a big software project, and as the dev lead I thought the best thing I could possibly do was to make him as productive as possible. We made a bunch of paper UI prototypes, and then he wanted to start working on the UI as soon as possible, so I told him to build the layout and widgets on two HTML pages, and whilst he did that I wrote a crude parser/compiler called Stiller that would extract chunks of DOM from between special HTML comments and compile them into javascript functions that would render the UI. It meant that he could work in pure HTML/CSS for the whole project and maximize his time building out and refining the UI, and consuming his changes was almost no work, as there was no retranslation needed. He said at the time it was the best workflow he’d ever had, and it just so happens I still work with him (at a different company), and he still raves about how great that workflow was. We’ve done many similar things on various projects with him since then too. Anyway this worked really, really well and I think was pretty ahead of its time, though I think front enders have started adopting similar workflows these days too.

dominicm06:11:41

I think the problem with POD is learned helplessness. Developers accept the pain as this project. Or worse, the client never stops to let you fix things.

dominicm06:11:50

I felt the horizontal split was relevant. Because taken directly, one can argue that a lot of layers should go into a feature. So I will restructure my question: How much should go into a feature?

rickmoynihan08:11:20

> I think the problem with POD is learned helplessness.
I disagree. You should avoid the pain getting bad; when it’s a niggle you should sort it out.
> I felt the horizontal split was relevant. Because taken directly, one can argue that a lot of layers should go into a feature.
Yes. You should still be looking for opportunities to refactor/share commonality though.
> So I will restructure my question: How much should go into a feature?
There’s no short answer to this question; and there are many ways you can decide what is in or out. Essentially keep them as small as necessary. Other hard questions can be what constitutes a feature etc… But basically at a minimum it’s a vertical slice of relatively self contained functionality. Obviously in the real world everything is tangled and connected, so you have to do some untangling, and accept that there will be coupling between other things - ultimately you can’t remove all of that. I’m not absolute about this, if practical reasons make it awkward to arrange something vertically then you may need to make some compromise. Feature orientation is nothing like the most important design goal, it’s just something that can help on a number of fronts if a significant percentage of code is arranged this way.

rickmoynihan11:11:34

I think these problems apply to all design methodologies though. You can read some of my thoughts on why this is here: https://clojurians.slack.com/archives/C064BA6G2/p1542102751654400 Likewise one has to be careful to focus on what is really important; the methods and methodologies should typically play second fiddle to achieving the real objectives.