I was having a long chat with Bing about Clojure and a question I'd asked here a little while ago (how to get the nth result of iterating a function call). I tried to get it to tell me the tradeoffs of its suggestions and it was telling me that calling `(nth (iterate f init) n)` potentially uses a lot of memory and so wouldn't be a good option for large n.
So I asked it, doesn't `iterate` return a lazy sequence, in which case `iterate` isn't using a lot of memory? And it told me: oh, yes, but it will generate all the items up until the nth when you call `nth`.
So I asked, isn't that only going to be a memory problem if `nth` holds on to the head? And then it told me that's right and that if `nth` doesn't hold on to the head, then there's no problem. It then reiterated that `nth` can have memory issues for large n and I should consider functions that operate on lazy sequences such as `drop`.
So, for you non-robot folks: does `nth` actually hold on to the head? From what I can tell poking around a bit in the Java code, it doesn't look like it does, and I can't imagine any reason why it would.
And I know, I know: what did I expect, right? 🙂
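For what it's worth, here is a quick REPL sketch of the behaviour in question (using `inc` as a stand-in for the iterated function; the index is arbitrary):

```clojure
;; Walking far into an unrealized lazy sequence with nth runs in roughly
;; constant memory: nth only needs the cell it is currently looking at,
;; and the cells behind it become garbage as it goes.
(nth (iterate inc 0) 100000000)
;; => 100000000  (slow, but no memory blow-up)

;; The memory problem only appears when something else keeps a reference to
;; the head while nth walks, e.g. a local that is still needed afterwards:
(let [xs (iterate inc 0)]
  [(nth xs 100000000) (first xs)])
;; Here xs is still live during the nth call, so every realized cell stays
;; reachable and a large enough index can exhaust the heap.
```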
I think my actual question got lost in the storytelling: does `nth` hold on to the head? Is it memory-inefficient for large sequences?
Further: is there any (easy) way of knowing whether a function (core or otherwise) is inefficient in this way?
I think the answer has to do with how lazy sequences are handled. The function call will happen n times, and if `iterate` produces a lazy sequence then `nth` will only cause the iteration to happen n times. If `iterate` did not create a lazy sequence, you would see more than n calls to the function. Although Clojure is calling the function, the usual view is to look at the data rather than the function calls; hence my suggestion about lazy sequences.
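To make the comparison with `drop` concrete: for a lazy sequence like the one `iterate` returns, both spellings below walk the sequence once and neither retains the head, so they behave much the same memory-wise (sketch; `inc` again stands in for the real function):

```clojure
;; nth on the lazy seq produced by iterate
(nth (iterate inc 0) 1000000)             ; => 1000000

;; the drop-based spelling of the same thing
(first (drop 1000000 (iterate inc 0)))    ; => 1000000
```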
I have a bunch of operations that I'd like to group in a certain way. Let's say the problem is CRUD operations on local files, a database, and S3 (it's not, but I'm trying to keep the example as simple as possible; I think it's a mostly accurate reduction of the actual problem). In my experience there are two orthogonal ways to achieve that.
1. Multimethods achieve what theory calls runtime polymorphism. In practice, they group multiple implementations of the same operation together, from the perspective of the user. So e.g. you can have a `load-to-memory` operation, which you apply across sources, e.g. S3, database, etc., based on the contents of the input (see the first sketch after this list).
2. Records and protocols do the opposite. You define an abstract entity (e.g. S3Storage, LocalStorage, DBStorage) and specify a bunch of operations that are allowed on it (`load-to-memory`, `store`, `update`, `delete`), and when you need new functionality you're expected to implement these operations. In principle, this is similar to interfaces and inheritance in OOP (see the second sketch after this list).
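To make the two options concrete, here is a minimal sketch of each, assuming the storage kind can be read off the input; the keys, field names, and bodies are illustrative, not from the original code.

```clojure
;; Option 1: a multimethod dispatching on a value in the input map.
(defmulti load-to-memory :storage)

(defmethod load-to-memory :local
  [{:keys [path]}]
  (slurp path))

(defmethod load-to-memory :s3
  [{:keys [bucket object-key]}]
  ;; stand-in body; a real implementation would call an S3 client here
  (str "bytes of " object-key " from " bucket))

(defmethod load-to-memory :db
  [{:keys [conn id]}]
  (str "row " id " from " conn))

;; (load-to-memory {:storage :local :path "/tmp/example.txt"})
```

And the protocol/record version of the same operations:

```clojure
;; Option 2: a protocol implemented by one record per storage kind.
;; The * suffixes are only there so this sketch can live alongside the
;; defmulti above without clashing with it or with clojure.core/update.
(defprotocol Storage
  (load-to-memory* [this loc])
  (store   [this loc data])
  (update* [this loc data])
  (delete  [this loc]))

(defrecord S3Storage [bucket]
  Storage
  (load-to-memory* [_ loc] (str "bytes of " loc " from " bucket))
  (store   [_ loc data] :stored)
  (update* [_ loc data] :updated)
  (delete  [_ loc] :deleted))

;; LocalStorage and DBStorage would implement Storage the same way.
```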
So my question is, is either of these tools more idiomatic than the other overall? If not, how do you choose which one to use? (Some opinions in thread)
• I've lost count of how many times I've faced this dilemma and after a lot of consideration mostly ended up with plain functions. It works for the most part but needs more discipline to keep code organized neatly, and I feel I'm missing out on something, like I'm underutilizing Clojure.
• I like multimethods and I've used them a bunch because they promote a more data-oriented approach to solving problems than records/protocols, but on the other hand you have to rely on namespaces and your eye to group operations together. That's not necessarily a bad thing, but it's menial work that you don't have to think about if you use protocols.
• Namespace (re)loading also gets weirder the more you use either of these (you have to make sure you've loaded all the namespaces that contain the multimethod implementations, even though they don't seem to be used, which can also throw linters off), so it's probably an additional thing to consider if your workflow depends on that.
I'm aware of the performance aspects of each (i.e. protocols are faster than multimethods) but I suppose what I'm trying to understand here is, performance and other technical considerations aside, which approach leads to simpler code that is easier to develop/refactor/accrete over time?
I have an alternative perspective for you. Protocols and multimethods just define a named switch statement that is extensible after it's defined. Performance aside, the least complex thing is to not use these extensible systems and just use `if`/`cond`/etc. Multimethods allow you to define the function to switch on, while protocols only allow you to use Java's type dispatch mechanism. There are reasons extensibility is useful, for example code ownership (if the switch statement is in a library, not part of the application, then being open to extension can be useful) or code co-locality (a bunch of different things seem to naturally fit together so you want to write their implementations next to each other), but I think it's always more straightforward to start with normal functions and only add the more complex structures when you really need them. Don't worry that you're missing out 🙂
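For contrast, the closed, plain-function version of the same dispatch might look like this (again a sketch with illustrative keys and stand-in bodies):

```clojure
;; The same load-to-memory operation, written as one ordinary function.
;; All of the dispatch logic lives right here.
(defn load-to-memory
  [{:keys [storage] :as source}]
  (cond
    (= storage :local) (slurp (:path source))
    (= storage :s3)    (str "bytes from S3 for " (:object-key source)) ; stand-in
    (= storage :db)    (str "row " (:id source) " from the database")  ; stand-in
    :else (throw (ex-info "Unknown storage kind" source))))
```

Adding a new storage kind here means editing this one function, which is exactly the open-vs-closed trade-off discussed below.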
Good point about code ownership!
> code co-locality (a bunch of different things seem to naturally fit together so you want to write their implementations next to each other)
Yes, that's it, I suppose I'm trying to find a way to marry convenience to simplicity in this scenario. As in, if you use conditionals, you have to describe the dispatch logic as you're writing the code, but with multimethods/protocols you design it once and then it's easier to use because it's abstracted away from the user (it's more convenient but complex).
> I think it's always more straightforward to start with normal functions and only add the more complex structures when you really need them
That's what I've been doing for the most part, but I'm not sure I'll ever "really need" any of these. Anything that you can do with multimethods/protocols you can do with conditionals, so how do you tell when to switch?
Doesn't it boil down to open vs closed? You'll need to know all options in your conditional, whereas other options are more dynamic and extensible, even outside of your own code. Then protocols dispatch on type, so you might prefer multimethods when dispatch is more complex, perhaps dispatching on a value or some other logic?
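A small illustration of the "extensible even outside of your own code" point, assuming the `load-to-memory` multimethod from the sketch above lives in a hypothetical `my.app.storage` namespace:

```clojure
(ns other.team.ftp-storage
  (:require [my.app.storage :as storage]))   ; hypothetical namespaces

;; A new storage kind can be added without touching the original defmulti
;; or asking the original authors to merge anything.
(defmethod storage/load-to-memory :ftp
  [{:keys [host path]}]
  (str "bytes of " path " fetched from " host))   ; stand-in body
```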
> Then protocols dispatch on type, so you might prefer multimethods when dispatch is more complex, perhaps dispatching on a value or some other logic?
Good point... Assume that both are applicable in this case (let's say S3Storage/LocalStorage/DBStorage are custom types, so you can use those to dispatch with protocols)
The nice thing is that I don't think it's that difficult to change your approach later. Unlike when I used to code in more static languages, where it feels as if perhaps you're more locked in to the pattern, it's not difficult to switch and clients don't necessarily need to change much, if at all. So get something working and then make it more complex if you really need it.
Sounds to me like a protocol is your best bet.
You're likely to then have some configuration options, so you'll have a factory for the type of thing and then all other code just calls your protocol(s) with the opaque thing (roughly as in the sketch below). If you changed stuff in the future, most of the polymorphism is kind of opaque to callers anyway… they just work to whatever abstraction you've designed.
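A sketch of that shape, reusing the illustrative `Storage` protocol and record names from earlier; the config keys and the extra record constructors are assumptions:

```clojure
;; One factory turns configuration into some Storage implementation...
(defn make-storage
  [{:keys [kind] :as config}]
  (case kind
    :s3    (->S3Storage (:bucket config))
    :local (->LocalStorage (:root config))   ; assumed records, implemented
    :db    (->DBStorage (:conn config))))    ; the same way as S3Storage above

;; ...and the rest of the code only talks to the protocol, never the concrete type:
;; (load-to-memory* (make-storage {:kind :s3 :bucket "my-bucket"}) "some/key")
```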
> So get something working and then make it more complex if you really need it.
That's good advice...
> You'll need to know all options in your conditional, whereas other options are more dynamic and extensible, even outside of your own code.
Not an issue, it's an internal thing, so only people with access to the existing code will try to extend it.
> Sounds to me like a protocol is your best bet.
Even over plain conditionals? This is a fresh perspective 🙂 Would you prefer conditionals if neither extensibility nor complex dispatching is a goal, or do you find merit in protocols/multimethods besides those two?
:man-shrugging: … no "right" answer, but I can only keep so much in my head at the same time, so what you've described sounds to me like an important boundary. When I'm wearing a "client" hat I don't want to know the impl details, and when I'm wearing the "impl" hat, it feels to me as if you have types that are little machine-type things and you already have multiple impls, so keeping things open and extensible resonates… but I think this is more art than science? Sometimes you only grok the proper boundaries later, after a few implementations, so you end up with a coding iteration that is quite different in Clojure compared to, say, Java, where your natural tendency is to think you know everything upfront and start writing interfaces with only one implementation. That's why I've so enjoyed Clojure: that iteration aspect is so easy and probably gives better results?
Is that an example of deferring decisions until as late as possible… probably.
I like plain functions unless there's a specific reason they aren't up to the job.
<https://www.youtube.com/watch?v=t6ktSfInNhU> is a talk from way back where chouser talks about the expression problem (which is sort of a special case within the concept of polymorphism). The coverage of multimethods and protocols starts at about 14:00, and talks about the prerequisites and trade-offs of each.