I have a question about multi-threading. I’m not at all an expert at this, but mathematically what I want to do is pretty simple. Are there some tricks I can use to make the coding simple as well. What it is: I have an infinite sequence of computations. I want to continue computing them (and plotting the results) until some exit condition such as time-out or user-interrupt, not sure yet. The sequence of computations forms a partially ordered set. I.e. each computation can begin at any time, but I dont’ want to collect and plot the result out of order. each result depends on some (but not all) previous result. So I want to collect the results in a greedy an order as possible, but obeying the partial order. The easiest way to describe this (for the purpose of clojurians) is that each computation has an index which is a positive integer, and each index, i, is “dependent” on one particular other index, j. In every case i < j, but i does not depend on all indices smaller, just j. and j is in turn depending on some other index, etc etc.
@lambeauxworks I agree that my question is not really well defined. I’m still in the brainstorming phase. the balanced-ness of the tree is a good question. I’m not sure how to measure it.
to give a bit more information about the underlying problem. I want to graph a mathematical function by colorizing a rectangle. the function maps the plane (or in this case a rectangle in the plane) to the real numbers. I will chose a gradient of colors to illustrate a range of values. most points in the rectangle will be fast to compute, but some points will be slow or even verrry slow. The idea is to compute the value in the lower-left corner of the rectangle, use that value to colorize the entire rectangle, then divide the rectangle into 4 quadrants, colorize each recursively. ( I realize now that this is slightly different than I described above, sorry_) this computation order has the disadvantage that most points will be re-computed too many times. but it has the advantage, that the human-eye sees very quickly a vague out-of-focus colorization, which sort of comes into focus. I.e. points which are fast to compute colorize their regions, and points that are slow to compute are slow to colorize.
additionally if for any quadrant, its lower-left point is slow to compute, then the other 3 quadrants will not be delayed because of this slowness.
That's okay, sometimes we need to elaborate a bit (even to ourselves) to fully grasp what we are dealing with.
Is your goal to visualize job completion in the monitoring/metrics sense? Or does the rectangle actually serve a geometric purpose elsewhere?
Is the data fully generated by you or are these batch jobs or records coming from elsewhere?
no the data is coming from a mathematical formula which computes a quantity iteratively. sometimes it converges very slowly.
I will have a think about it before taking up too much of anyone’s time.
I could just start a bunch of threads, and then somehow ask which ones are “ready” and then collect only if its dependency has already been collected. That seems wrong to me. There ought to be a way to collect lazily but obeying the partial order.
There are libraries that do computations based on a specified graph. Usually with some amount of syntax sugar. But I'm yet to use them myself. Without such libraries, assuming you don't have more than hundreds of operations or you have enough memory to keep all their independent input data, you can achieve it by using futures and deref'ing them. Something like:
(let [task1 (future
(perform-task1))
task2 (future
(perform-task2))
task3 (future
(perform-task3 {:task1-result @task1
:task2-result @task2}))
task4 (future
(perform-task4 {:task3-result @task3
:task2-result @task2}))
all-tasks [task1 task2 task3 task4]]
(doseq [task all-tasks]
(let [task-result @task]
(do-something task-result))))
Adding a global timeout in a way that can interrupt tasks that are currently being executed is less trivial.
At that point, I'd probably switch to core.async because it makes it very easy with alts! and timeout. But then I'd probably replace futures with core.async stuff as well. After all, it's for CSP - communicating sequential processes, which perfectly models parallelizable computation graphs.
Let me know if you'd like to see some code for it, I'll see if my brain has enough flex to remember how to do it given that I've been doing only electrical work for the past month or so.Thanks for the suggestion. I need to do a bit more thinking to try to really understand what I want to do. Or at least try to put in works the computation that is in my head.
Why not just a heap (priority queue) by "index" and a simple executor/thread pool that pops from the queue and stores results in target data structure? Simple, explicit, easy to tweak/control. Am I missing something?
@Tomek, if I understand what you mean it wont work. but maybe I misunderstand (I hope I misunderstand). Here is how I envision it working. each job computs some return value which might be fast or slow to compute, but each job also starts 3 dependent jobs. job0 starts jobs 1 2 and 3. job 1 starts 4 5 and 6, job 2 starts 7 8 and 9, job 3 starts 10 11 and 12, job 4 starts 13, 14, and 15, job 5 starts 16, 17 and 18. job 6 starts .19, 20, and 21 etc. I want to collect the values in any order which satisfies the constraint that parent is collected before child. One possibility is to have each job only start its dependent jobs at the time it is collected. Therefore every job started will already have its dependency met by construction.
For simplicity, assuming it won't cause bottlenecks, I like your idea about collecting also triggering. Is all this happening in-process or distributed?
What are the properties of the final dependency tree? Is it balanced? Dense or sparse? Particularly deep or wide? That would significantly help with algorithm identification.
Since it's infinite, assume "final" means some snapshot of the tree after [N, N 10, N 100, ...] completed jobs.
Is there a way to pretty-print a type name? For example I would want (pprint (type :a)) to print "keyword" and (pprint (type 'a)) to print "symbol".
thanks!
I guess I can use (str/lower-case (.getSimpleName (type :a))) but that seems kinda hacky
Depending on your "why" (esp. the kinds of things you'll be asking the types of) you may end up with something like maria.friendly.kinds: https://github.com/mhuebert/maria/blob/5e932c6e6d64ee1de588f283f975cb7df4cf5964/friendly/src/maria/friendly/kinds.cljs
oh huh, the kind function there looks neat
thank you!
sure thing!
whoa Bling looks really cool
i might use that for error reporting