ShimmerCat's Logging

Pragmatic logging

As it is conventional with logging, ShimmerCat separates logs using different severities. On top of that, ShimmerCat's logging infrastructure adopts these principles:

To achieve these goals, ShimmerCat assigns a code to each log message, and it allows to turn on and off logging messages corresponding to each code. The codes have the form 'Mxxxxx', where the 'x' represents a digit. Different codes are used for different situations, and the first digit can be used as a hint of the severity of the message. We assign the codes manually, and they should remain stable even as we improve the text representation.

Choosing what to log

Choosing what to report is done through the option --log-codes. Keep in mind that this option controls what logs are generated in the first place. If a message is disabled with this option, you can not capture it in any way.

The --log-codes option establishes a white-listed set of codes.

It accepts as argument a sentence that computes the white-listed set. Let's explain with examples!

Logging everything

--log-codes=all

The example above enables all log messages. This includes many annoying messages, so use with care. In this line, all denotes a named set of codes, namely, all the logging codes that ShimmerCat implements.

Other named sets are default (the one which is used when the command line option is not specified), empty (meaning "don't log anything"), and connection, whose codes include information about connecting clients. More of these named sets may be introduced in the future.

Excluding a set

--log-codes=all\\connection

This value suppresses some of the annoying messages when using the SOCKS5 functionality. In particular, it blocks all the messages reporting the attempts by Google Chrome and Firefox to call home.

Notice the use of the \\. This is how you escape a single \ in bash. This XKCD comic explains better the escaping issue. The backslash in the --log-codes value denotes set difference. Literally, the example above means "all codes minus the connection codes".

Ranges of codes and set union

--log-codes=M40000..M49999UM50000

This example uses uppercase "U" to join sets. It means that all the messages with codes from M40000 to M49999 should be created and reported, plus the code 50000. The two codes separated by .. are a range.

In general, any set can be specified as unions and differences between sets, with individual codes, ranges or named sets as building blocks. Union is denoted by U and difference by \, although the later needs to be escaped in bash. The set operations associate to the left.

Severity levels

Severity levels are somehow subjective: something that is inconsequential in one scenario can be fatal in another. Therefore, you may be better off by classifying the logs by code according to your needs.

That said, we have the following severity levels, from more to less critical in our own subjective assessment:

Logging for devops engineers

ShimmmeCat allows for advanced log collection, attending to the following:

By default, all the generated log messages go to standard output. In devlove mode, that's it, not much more to do.

However, there is another operation mode for the server, internet. This mode is very similar right now to devlove, but it will progressively diverge by adopting defaults which are better tuned to run a production web server. Even now there are a few differences, and one of them has to do with the way processes and logs are handled.

In internet mode, the actual work of serving web assets is done by worker processes which are supervised by a master process. The master process does by design very little, so that it can be replaced by something else by users with advanced needs. But there is no need to replace the master process to do advanced log collection.

Log messages go to queues in Redis

In internet mode the workers send the log information to the master via messages in Redis queues. You can listen to those queues with your own service, and from there enter the logging data in your logs workflow.

An example goes a long way on explaining how this works. First ShimmerCat should be ran in internet mode:

$ shimmercat internet
...
.low.      wrk-0    06:57:36,2016-05-08 M41021 Starting SOCKS5 server at address 127.0.0.1:2006
.low.      wrk-2    06:57:36,2016-05-08 M41021 Starting SOCKS5 server at address 127.0.0.1:2006
.info.     wrk-0    06:58:33,2016-05-08 M31002 C/0 Proxied (Appl. response status: 200) HEAD  /en/info/articles/getting-started/
.info.     wrk-0    06:58:33,2016-05-08 M50000 C/0 rr GET www.shimmercat.com /en/info/articles/getting-started/ HTTP/2    (200/OK) (pushing 0 streams)
...

Then one can0 connect to the Redis database that ShimmerCat uses, and listen for messages in channels that start with sc_log_*. This can be accomplished by executing the following commands, from the same directory where ShimmerCat is running:

$ redis-cli -s .shimmercat.loves.devs/redis.sock
redis .shimmercat.loves.devs/redis.sock> PSUBSCRIBE "sc_log_*"

Here is some same output from that command:

Reading messages... (press Ctrl-C to quit)
...
1) "pmessage"
2) "sc_log_*"
3) "sc_log_shimmerc_M50000_i_wrk-2..."
4) "wrk-0...\xfc73(\x01\x00\x00\x00\xe0\x0b\x0e\x00\x03\x1ay\x1e\x00\x00\x00\x00\x00\x00\x000\x00200\x00GET\x00/en/timing/do-noise/"
...

The easiest way to use the information is by using a scripting language with support for Redis, there is a bare-bones example a bit below. But first let's explain the information present in the queues.

Names of the logging queues

Redis' PSUBSCRIBE command above can be used to subscribe to a set of queues by pattern-matching on their names. To profit from this, we send log messages to queues whose names are created by taking the following fields and separating them with underscores (_). In the example above, the queue name of a message is sc_log_shimmerc_M50000_i_wrk-2... (the last three dots are part of the name) and it contains the following fields:

  1. A sc_log_ prefix.
  2. A namespace, for the case when multiple swarms of ShimmerCat servers are running using the same Redis cluster. The default namespace is shimmerc. Namespace names are always eight characters in length. If longer, they are trimmed to that length, and if shorter, they are padded with dots to the right.
  3. The code of the message, M50000 in the example above.
  4. A one letter code for the severity. This letter is the same than the initial (lowercase) letter of the name of the severity level, see previous section.
  5. A worker name. This is because a swarm can have multiple workers. The worker name is taken to eight characters in the same way than the namespace. In the example above, the worker name is wrk-2....

By using different types of patterns in the PSUBSCRIBE command, it is possible to tune to specific message codes, or all messages with specific severities, or messages from a specific source, etc. Notice that some of this information is repeated in the log message itself, which we describe in the following subsection.

Data in the logging queues

We use a binary format to transfer the log data in the queues, because it is shorter and easier to parse. The text form of a message can be re-constructed by using their message codes and our text-file database with a mapping from code to message. We have examples in the next sub-section that do just that, but first let's explain the binary format.

The binary log data of a log message contains two parts. The first part is fixed for all messages, and it contains fields which are present in all messages. We describe now those fields, notice that all numbers are encoded in little-endian format:

  1. Eight bytes for the worker name. See the previous subsection to understand how these names are formed.
  2. A 64-bits unsigned integer with the number of seconds since the 17th of November of 1858 (modified Julian Date).
  3. A 32-bits unsigned integer with the number of microseconds.
  4. One byte indicating the severity code. Here we use the value 3 to denote low, 4 for info, 5 for attention and 6 for error.
  5. A 16-bits unsigned integer with the message code, that is, with the number after the "M" in e.g. M50000.
  6. A 32-bits unsigned integer with the length of the second part of the message containing data specific to this log code.

If we were to describe the above information with a C struct, it would look like this:

...
#pragma pack(1)
typedef struct _log_header {
    char worker_name[8];
    uint64_t modified_julian_seconds;
    uint32_t microseconds;
    uint8_t severity_code;
    uint16_t message_code;
    uint32_t specific_fields_length;
} log_header_t;
...

The second part changes from message to message, as it contains the data which is specific to each. In this second part, the message-specific information is almost always encoded in ASCII, unless there are good reasons to use a binary format. We don't have any binary fields so far. The fields themselves are separated by a zero octet.

The next subsection contains a basic Python example to process the logs.

Reading off the logs with Python

Here is a very basic example for reading the logs using Python, but you can build similar code using the programming language of your choice.

This example uses Python's standard struct library and redis-py, an external library that can be installed via pip.

import struct
import redis

LOG_MESSAGE_FORMAT=b"<8sQLBHI"

sr = redis.StrictRedis(
    unix_socket_path="../frontend/.shimmercat.loves.devs/redis.sock"
    )

ps = sr.pubsub()
ps.psubscribe("sc_log_*")

for msg in ps.listen():
    if msg['type'] == 'pmessage':
        data = msg['data']
        log_header_data = data[:27]
        log_data_fields = data[27:].split(b'\0')
        log_header = struct.unpack(LOG_MESSAGE_FORMAT, log_header_data)
        print(log_header, log_data_fields)

When you try this example, don't forget to adjust the path to the Redis socket according to your setup. Here is how the output of the above script looks:

(b'wrk-2...', 4969695455, 315955, 5, 31002, 36) [b'2', b'200', b'GET', b'/en/info/articles/logging/']
(b'wrk-2...', 4969695455, 319853, 5, 50000, 90) [b'2', b'GET', b'www.shimmercat.com', b'/en/info/articles/logging/', b'HTTP/2  ', b'(200/OK) (pushing 27 streams)']
(b'wrk-2...', 4969695455, 352353, 5, 50000, 81) [b'2', b'GET', b'www.shimmercat.com', b'/jsapp/timing_for_info.js?vh=39d02dc16', b'HTTP/2  ', b'(200/OK)']
(b'wrk-2...', 4969695455, 397045, 5, 50000, 61) [b'2', b'GET', b'www.shimmercat.com', b'/sass/main.css.map', b'HTTP/2  ', b'(200/OK)']