The other day when merging Johanna's code to clusterize the
configuration framework, I noticed this code in there:

    # [Send id=val to everyone else]
    Broker::publish(change_topic, Config::cluster_set_option, ID, val, location);

    if ( Cluster::local_node_type() != Cluster::MANAGER )
        Broker::relay(change_topic, change_topic, Config::cluster_set_option, ID, val, location);

It took me a bit to understand that ... The goal here is that a change
in a configuration value gets propagated out to all nodes in the
cluster. The Broker::publish() sends it to a node's immediate
neighbors, but not further. That means that for workers it goes (only)
to their manager; for the manager, it goes to all workers. If
we're not a manager, we then separately (through Broker::relay()) ask
our neighbors (that's the manager) to forward the change to *their*
neighbors (that's the other workers), without reraising it locally.

I remember we have discussed this API before, but I wanted to bring it
up again as I keep finding it confusing. I believe the code above
could be simplified by using the newer Broker::publish_and_relay(),
which was added to combine the two operations. Still, I'm realizing
now that I don't like thinking about this in terms of separate
publishing and relaying operations.

None of this will get easier once we add multi-hop routing to the mix
(which is in the works). And on top of all that, we also have
Cluster::publish_rr, Cluster::publish_hrw, Cluster::relay_rr, and
Cluster::relay_hrw -- another set of separate publishing & relay
operations.

I'm wondering if we should give it another try to simplify this API
while we still can (i.e., before 2.6 goes out). To me, the most
intuitive publish operation is "send to topic T and propagate to
everybody subscribed to that topic". I'd structure the API around
that, making that the main publish function for that simply:
That would send to all neighbors, which then process locally and relay
to their neighbors. Right now, that would propagate just across one
hop, but once we have multihop that'd start being broadcast out
further.

To support the other use cases, we can then add modifiers & functions
to tweak this default, e.g.:

- Give publish() another argument "relay: bool &default=T" to prevent
  it from going beyond the immediate receiver. Or maybe instead:
  "relay_hops: int &default=-1" to specify the max number of hops
  to relay across, with -1 meaning no limit. (I recall concerns
  about loops being too easy to create; we could set the default
  here to F/0 to default to no forwarding, although conceptually I
  don't really like that.)

- Give publish() another argument "relay_topic: string &default="""
  to change the topic when relaying on the 1st hop.

- Give publish() another argument "process_on_relays: bool &default=T"
  to change whether a relaying hop also sees the event locally.

- Add a second function publish_pool() that has all the same
  options, but receives a pool type instead of a topic (just an
  enum: RR, HRW).
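
To make the list above a bit more concrete, here is a rough sketch of
what the declarations might look like. This is purely hypothetical:
none of these parameters exist today, the module and type names are
made up, and I'm picking the "relay_hops" variant over "relay". Passing
the event as a Broker::Event (built with Broker::make_event()) is also
just an assumption, to keep the optional arguments apart from the
event's own arguments.

    # Hypothetical sketch of the proposal, not an existing API.
    module PublishSketch;

    export {
        # Pool selection: round-robin or highest-random-weight.
        type PoolType: enum { RR, HRW };

        # relay_hops: max number of hops to relay across; -1 means
        #     no limit, 0 means stop at the immediate receiver.
        # relay_topic: if non-empty, the topic to switch to when
        #     relaying on the first hop.
        # process_on_relays: whether a relaying hop also raises the
        #     event locally.
        global publish: function(topic: string, ev: Broker::Event,
                                 relay_hops: int &default=-1,
                                 relay_topic: string &default="",
                                 process_on_relays: bool &default=T): bool;

        # Same options, but addressed to a pool type, not a topic.
        global publish_pool: function(pool: PoolType, ev: Broker::Event,
                                      relay_hops: int &default=-1,
                                      relay_topic: string &default="",
                                      process_on_relays: bool &default=T): bool;
    }

With something like this, the common case would be a plain
publish(topic, Broker::make_event(...)), and the one-hop case would
pass 0 as the third argument.
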

What I'm not quite sure about is whether some of these modifiers are
better left for the receiver to specify (e.g., whether to raise events
received on a given topic locally, or just forward them). I think I
can see it either way.
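
For the receiver-side variant, a minimal sketch using only calls that
exist today could look like the following. The module, event, and
topic names are made up for illustration, and it assumes every node
already subscribes to the topic. Here it's the manager's handler that
decides to forward, rather than the sender asking for a relay.

    # Sketch only; RelayDemo, option_changed, and the topic name
    # are hypothetical.
    module RelayDemo;

    export {
        const change_topic = "demo/relay/change";
        global option_changed: event(ID: string, val: string);
    }

    event RelayDemo::option_changed(ID: string, val: string)
        {
        # Every node receiving the event would apply the change here.

        # The receiver decides about forwarding: only the manager
        # re-publishes, and only events that came in from a peer.
        if ( Cluster::local_node_type() == Cluster::MANAGER &&
             is_remote_event() )
            Broker::publish(change_topic, RelayDemo::option_changed,
                            ID, val);
        }

As far as I can tell, the originating worker would get its own change
back from the manager in this form; suppressing that would need extra
logic.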