[nanomsg] Rules-based topology description

  • From: Paul Colomiets <paul@xxxxxxxxxxxxxx>
  • To: nanomsg@xxxxxxxxxxxxx
  • Date: Thu, 19 Sep 2013 17:14:53 +0300

Hi,

Continuing my experiments with the name service. Today we are going
to implement a pretty complex topology [1] (with an arbitrary number
of workers) in less than a hundred lines of configuration file [2].
Here is the project URL:

https://github.com/tailhook/rulens


Divide & Connect Overview
----------------------------------------

An overview of the technique (a schematic sketch follows the list):

1. Select similar parts of the topology and make them a separate
sub-topology. I call this a "layout".

2. Instantiate each layout multiple times.

3. Make a higher-level layout to connect those instances.

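To make these steps concrete, here is a schematic sketch of what
such a configuration could look like. The keys and names below are
invented for illustration only and are not the actual rulens syntax
(see [2] for that); the underscored "_input" node is a slot, which is
explained in the next section:

    # Hypothetical sketch, not the real rulens syntax.
    layouts:
      worker-pool:                      # step 1: a reusable sub-topology
        _input: {slot: true}            # attachment point for parent layouts
        worker: {socket: pull, connect: _input}
      site:                             # step 3: a higher-level layout ...
        pool-a: {layout: worker-pool}   # step 2: ... built from two instances
        pool-b: {layout: worker-pool}
        frontend:
          socket: push
          connect: [pool-a._input, pool-b._input]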

The Real Steps
----------------------

Looking at [1], we first cut off the subtree at the "wally" node.
That layout is simple; it is defined under the name "subcluster" in
[2]:

https://docs.google.com/a/colomiets.name/file/d/0Byd_FTQksoKLeWhkSzZjX1M4SVk/edit

All nodes shown in the layout pictures are named by "roles", not
hostnames, so every arrow can potentially describe a many-to-many
connection between nodes. The octagon-style nodes with _underscored
names are the points where the layout is connected to higher-level
layouts (I'll call them "slots" below).

Next we split out the internal datacenter layout, which is shared
between the data centers:

https://docs.google.com/a/colomiets.name/file/d/0Byd_FTQksoKLTDg2V2NZQi1wMUE/edit

It's also simple enough, but it already has a failover path and
slots for connecting to other datacenters.

Finally, we make a trivial DC-to-DC layout:

https://docs.google.com/a/colomiets.name/file/d/0Byd_FTQksoKLS1RWVkNJdzNtS1k/edit

It merely specifies that all "api" slots are joined under the single
name "public" and that all cross-data-center gateways are connected
to each other. There is a special property described in the YAML that
avoids loops in this connection.
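
As a purely hypothetical illustration (the real property lives in [2]
under a different name), that loop-avoiding cross-connection could be
as small as:

    # Hypothetical sketch: mesh all cross-DC gateways, but never relay
    # traffic received from one DC on to another.
    gateway:
      connect: _gateway
      no-relay: true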

We need to define the ports and IP addresses for all sockets that
will nn_bind(). They are defined in the config because we can't
update an address on the fly yet, so the important addresses must be
fixed ahead of time.
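
For example, the pinned bind addresses could look like this (the
addresses and key names here are invented, not taken from [2]):

    # Hypothetical sketch: fixed addresses for the sockets that nn_bind().
    addresses:
      dc1.frontend: tcp://10.0.1.10:20001
      dc1.gateway:  tcp://10.0.1.20:20100
      dc2.gateway:  tcp://10.0.2.20:20100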

The only tiny bits left in the configuration file are how we combine
the topologies and how we define the public topology on top of the
internal one. Look for the "topologies" section in [2].
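
Schematically (again with invented keys rather than the actual syntax
of [2]), stacking the public topology on top of the internal one
could look like:

    # Hypothetical sketch: "public" reuses a slot exposed by "internal".
    topologies:
      internal:
        layout: dc-to-dc
      public:
        layout: public-api
        attach-to: internal._api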

Here is what the instantiated topologies look like (i.e. with nodes
attached and running):

https://drive.google.com/folderview?id=0Byd_FTQksoKLYWR6aW9KQmhFaGs&usp=sharing

All of them kind of work (not tested thoroughly). The graphs are
generated by a script, so they may not be perfect, but they clearly
indicate the potential. Also, the first cluster is the biggest one;
the others are smaller just for the sake of a smaller graph, but they
have the same structure as the first.

Here is a diff that shows how much work is needed to grow the
configuration from a single-cluster to a multi-cluster one:

https://gist.github.com/tailhook/6622757

So it's a pretty easy task with this sort of configuration.


Random Remarks
--------------------------

1. In reality each DC will probably have its own name service. That
may be achieved by replicating the config, or by providing "slots"
(described above) and building the cross-data-center connections
around those slots.

2. I'll probably cut the NS protocol down to (<url> <socket_type>)
pairs. All additional data can be specified in the URL as query
parameters ("?name=value"); a sketch of such a reply appears after
this list.

3. I don't believe that applications can live without the name of
the topology in their configuration files. E.g. suppose there is an
image resizing service. It could join the topology under the name
"image-resizer", but in reality it will join the topology as
"image-resizer.mywebsite.com" for two reasons:

  3a) I might run the same service for multiple projects.

  3b) I would like my development cluster's "image-resizer" service
to be unable to connect to the production cluster, even if I make a
mistake.

4. There is a way to build a separate topology on top of an existing
topology's slots (see how "public" is defined on top of "internal"
in the example), so that an external entity connecting to the public
topology can never mess up the internal connections.

5. As you can see in the pictures, it's possible to draw the full
topology graph from the name service's requests alone. The only
missing piece is the nodes that died after making a name request;
that would be provided by a monitoring system.

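For remark 2, a name-service reply reduced to such pairs could look
roughly like this (the addresses and the "priority" parameter are
invented for illustration):

    # Hypothetical sketch of a reply: (<url> <socket_type>) pairs,
    # with all extra data packed into the query string.
    - ["tcp://10.0.1.10:20001?priority=1", REQ]
    - ["tcp://10.0.2.10:20001?priority=2", REQ]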

What's next
------------------

As discussed on IRC, we will provide a separate library called
"nanoconfig", built on top of the nn_connect()/nn_bind() API, that
makes the name service requests. It will live outside of the core
until the semantics and the protocol are stable enough and match
everybody's expectations. We will encourage language bindings to
support nanoconfig out of the box.

The "rulens" service will be rewritten in C to be easier to set up.
So I'm very interested in discussing the configuration file format
and semantics, but not the actual implementation.

Any comments?


[1] 
https://docs.google.com/a/colomiets.name/file/d/0Byd_FTQksoKLY3lyZnlQalFtRnc/edit
[2] https://github.com/tailhook/rulens/blob/master/examples/twodc/topology.yaml

--
Paul
