Hi,

This thread tries to summarize what should be implemented in nanomsg for state-of-the-art monitoring support. Everything is IMO, so feel free to discuss.

There are basically three separate tasks for monitoring:

1. Logging
2. Statistics
3. Topology info

I think they are different enough that it would be wrong to mix them in a single protocol, so I discuss each separately. Note that, as outlined in the ticket for logging [1], all three interfaces are targeted at administrators rather than programmers, so they should be as transparent as possible for programmers.

Logging
==========

Logging should be used for erroneous situations that can't be delivered otherwise through the API. Examples:

1. A DNS name can't be resolved
2. An extra connection attempt on an NN_PAIR socket

But I think logging should not cover situations that (a) the library can fix without intervention and (b) can occur thousands of times a second, for example:

1. Socket disconnect in the middle of a message
2. Connection limit reached

Point (a) is pretty weak, so common sense may be applied. Point (b), however, is necessary to avoid filling the logs while being DoS-attacked. E.g. if a socket disconnected in the middle of a message, the network might be overloaded, and logging messages about it only adds traffic and doesn't help solve the problem. So this kind of warning should be generated by the monitoring system itself, using the statistical data discussed in the next section. In other words: correctly delivered numeric values are a much more reliable proof of a system's health than an empty log.

So the logging subsystem should mostly give hints about misconfigurations of the system, rather than identify transient errors or check the current health of the system. This limitation on what log messages are gives us two important decisions:

1. There are no log levels. You may think of it as the log level always being ERROR.
   This means less runtime configuration is needed, which is A Good Thing.
2. The log messages are text. That's what all admin tools work with.

The next thing is the API. In [1] Luca proposes a programmatic API for logging, which is (simplified):

    typedef void (*nn_log_callback)(char *msg);
    NN_EXPORT void nn_log_register(nn_log_callback cb);

It's very hard to write a correct callback, because of the following requirements:

1. The callback must be thread-safe, and may be called from a worker thread
2. The callback must be reentrant
3. The callback must be non-blocking
4. In general, the callback must not use any API from nanomsg (except perhaps nn_strerror)

So it's unclear how to write a callback that sends messages to syslog. Writing messages to a file is not safe either, because that may block for some time too. So I think it's better to define a socket-based API. It may look like:

    int sock = nn_socket(AF_SP, NN_PUSH);
    nn_connect(sock, "ipc://log.socket");
    nn_log_register(sock);

There are several other possible variations of registering a socket, but they can be discussed once it's clear that sockets should be used for log messages. Similarly, the exact format of a log message is up for discussion. Probably it should be what rsyslog-zeromq uses, or something similar.

The pros of the socket-based API:

* Logging messages can be delivered to the right thread (via the inproc transport), to a local syslog-like daemon (via the ipc transport), or over the network via any nanomsg transport
* Reuse of socket configuration (addresses, buffers, ...). An app that uses an *.ini file to configure its other sockets can use the same code to configure logging
* Reuse of infrastructure, e.g. devices for forwarding logs over the network, the same encryption keys once encryption is eventually implemented, any nanomsg transport, and so on
* The logs are easy to read with nanocat, no matter how they are actually stored

Statistics
===============

Examples of statistical data:

1. Number of messages per second sent/received through a socket
2. Number of connections per socket/app
3. Number of connection attempts/rejected connections/dropped connections over a period of time

Anything that might be helpful for knowing the health of the system belongs here. To keep the subsystem simple, I assume that only numerical data is collected.

The same argument for a socket-based API probably holds here. So this data should be delivered through a PUB socket at regular intervals, using some simple-to-parse protocol, for example graphite [2] or ESTP [3]. A graphite message looks like:

    socket.1.messages 4 1378318683

The nanomsg pubsub protocol allows multiple monitoring services to keep an eye on these values, filter messages, and so on, so it seems to be a perfect fit. (Why not SNMP is discussed below.)

Note: I don't propose using SURVEYOR here, because that would add random jitter to the time when a value is reported. Using multiple surveyors is also more resource-consuming than using several subscribers, and each surveyor would get slightly different data.

For the case of a DoS attack or network overload, the port used for statistics can be prioritized over other data. The traffic for statistics is fairly low and, more importantly, its volume is predictable (a fixed number of records per socket). Also, important figures like messages per second should be sent over the wire as incrementing counters of messages sent since the creation of the socket, rather than as plain messages-per-second values, so that losing a single message doesn't hurt the statistics (that's why I prefer ESTP over the graphite protocol).

Topology Info
============

It would be nice to have the topology graph drawn for us. Building it from the log (as proposed in [1]) is not feasible, because in case of a disaster the log may miss a few important records. Also, as outlined above, I consider it wrong to log each connect and disconnect attempt.
The statistical data is not enough to build the graph either, because it shows only the number of connections, not the actual peers. So the collection of sockets and their peers should be sent separately. Compared with the statistics data, this topology info:

1. Changes sparsely and is not sensitive to survey time jitter
2. Potentially contains lots of data (e.g. lots of inactive connections)
3. Occasionally needs to be updated immediately (a button in a web panel, or a command-line tool)

So I believe it fits the SURVEYOR pattern nicely. I.e. the monitoring system sends a periodic (say once a minute) survey, and every process replies with a serialized list of endpoints and peers. If a user opens the web panel and pushes the refresh button, the survey can be sent right away. In the future a delta protocol can be invented, so that the survey contains the timestamp of the topology it last received and only changes are transferred.

The subtle things here are:

1. Naming the sockets
2. Devices

Naming the sockets will be discussed in a separate section, as it is relevant for logging and statistics too.

Without info on devices, the graph can contain only node hostnames and the number of connections between each pair of nodes. That's not very interesting, as in my projects an almost complete graph would be drawn. Standard devices (those created by nn_device) are easy to report, but there are plenty of use-cases for other devices. So to correctly display devices we need one of the following:

1. A naming convention for sockets (see below)
2. Annotating the socket with details about the device using setsockopt
3. An API for attaching arbitrary annotations to the topology info
4. As a survey can be replied to any number of times, the application can answer the survey itself, alongside the internal reply by nanomsg. This effectively gives us #3, but seems quite ugly.

I'm not sure what to choose. Note, however, that a device can consist of any number of sockets, not just two. E.g. it can be a socket which packs multiple topologies into a single cross-data-center connection.

Also, a serialization format has to be chosen. I'm pretty sure there is no existing format suitable for topology data, so I think we should make something up based on msgpack or json.

Identifying the Sockets
================

The data is only useful if all sockets can be identified easily in log messages, statistical data and topology info. I propose to borrow the format from ESTP, which declares basically the following structure for the name:

    <hostname>:<app_name>:<resource>

(the :<metric> part is skipped, as it's relevant only for statistics). Basically we can generate names like this:

    org.nanomsg.example:nanomsg.1234:socket.anonymous.7

Where 1234 is the pid of the process and 7 is the socket number. The hostname should probably be the (reversed) name returned by gethostname() or `hostname --fqdn`, and doesn't need to be configurable in nanomsg. The app_name and resource as automatically generated above clearly identify the socket at any given moment, but are garbage in the long term. So they should be overridable with socket options:

    nn_setsockopt(7, NN_SOL_SOCKET, NN_SOCK_NAME, "request_db", 10);
    nn_setsockopt(7, NN_SOL_SOCKET, NN_SOCK_APP_NAME, "myapp.1", 7);

The above should turn the name into:

    org.nanomsg.example:myapp.1:socket.request_db

I'm not sure about the app name. It may be set on the statistics-submitter socket, or via nn_setsockopt(-1, NN_GLOBAL, ...), or another global API function may be invented.

Random Thoughts
=============

I use the term "monitoring software" without describing what it is. I think it should be obvious. In the long term, specialized software may be written for handling all the nanomsg-specific stuff. In the short term, existing solutions should work well. E.g. for logging, rsyslog is the obvious candidate; a tiny plugin for submitting the data has to be written, however (there is one for zeromq).
For statistics, either graphite or collectd may be used, again with a tiny plugin (there are ones for zeromq too). Or even nanocat can be used to submit data to these or to a variety of other monitoring systems (e.g. nagios mostly uses command-line utils to submit everything).

AFAIK, there is no existing software to draw the topology. But with the proposed solution, a 15-minute command-line script in python can produce a *.dot file for the topology. In the long term, better software will appear.

Between the application process and the monitoring software, all the standard devices can be used, e.g. to gather data from all processes locally and send it through a single socket to the monitoring. No special software needed.

Why not SNMP? That's a very good question. Note that SNMP could only be used for the statistical data, AFAICS. The reasons why I don't want SNMP are:

1. It doesn't support all the infrastructure we have in nanomsg: devices, transports (and the encryption that will eventually be added), etc. It has its own devices and encryption, see #2.
2. It's another thing to know. It might be argued that SNMP is already known by admins, but it's also another *complex* (see below) thing for nanomsg developers to know. I would say it's more complex than nanomsg itself, given the amount of legacy the protocol has collected over the decades.
3. AFAIU, a separate daemon is needed on each node to answer SNMP requests.
4. OID management is ugly. OIDs are long dotted integers that are hard to read, hard to search for, and hard to create for your own app.
5. The protocol is pretty incomprehensible to me, and probably to 99% of nanomsg users too (subjective, yeah). The wikipedia page lists 27 RFCs for SNMP. Ah, well, that's not the real reason. The real reason is: it's hard to explain how it works and how to find out those OIDs in five minutes, unlike for ESTP.
Anyway, the proposed solution leaves a gap where one can write a daemon that collects statistics via pubsub locally and turns it into SNMP. Such a daemon would be required even if nanomsg implemented some SNMP-related functionality itself, so nothing is lost.

I believe the patterns described here match their function well. Maybe special pattern(s) for monitoring data could be invented, but I see the current patterns as low-level building blocks, and it's a big win that the whole monitoring system can be built on top of existing patterns. It's also a good example of dissecting a task into small patterns.

For logging and statistics it's nice that the same socket can be used for delivering application-specific data, so that nanomsg establishes a standard for logging and statistics for applications built on top of it. Nevertheless, the standard is fully optional to use.

Thoughts?

[1] https://github.com/250bpm/nanomsg/issues/81
[2] http://graphite.readthedocs.org/en/1.0/feeding-carbon.html
[3] https://github.com/estp/estp

--
Paul