Asynchronous handlers with callbacks
In the early days of Python, such a state machine would typically be written with callbacks or class methods. An object-oriented implementation uses two classes: a Channel class and a Parser class. The Channel class reads data from a network link and notifies a listener when some data is available. Here is a simple implementation of this class:
class Channel(object):
    def set_data_callback(self, callback):
        self.callback = callback

    def notify(self, value):
        self.callback(value)
It contains two methods:
- The set_data_callback method allows you to register a callback (it is also often called a listener or an observer) which is called whenever some data is available
- The notify method is a hook that is used here to simulate incoming data
In a real implementation, the Channel class would wait for incoming data to become available and then call the callback.
The Parser class implements the state machine. Here is its implementation:
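As an illustration of what such a real implementation might look like, here is a minimal sketch, assuming Python 3 and a blocking read loop. The SocketChannel class and its run method are hypothetical names that do not appear in the original code, and a local socket pair stands in for the network link:

```python
import socket


class SocketChannel(object):
    """Hypothetical Channel variant that reads bytes from a connected socket."""

    def __init__(self, sock):
        self.sock = sock
        self.callback = None

    def set_data_callback(self, callback):
        self.callback = callback

    def run(self):
        # Block until data arrives, then forward each byte to the listener.
        while True:
            chunk = self.sock.recv(1024)
            if not chunk:  # an empty result means the peer closed the link
                break
            for value in chunk:  # iterating over bytes yields integers
                self.callback(value)


# A local socket pair simulates the two ends of a network link.
reader, writer = socket.socketpair()
received = []

channel = SocketChannel(reader)
channel.set_data_callback(received.append)

writer.sendall(bytes([42, 3, 33, 44, 24]))
writer.close()

channel.run()  # returns once the writer side has been closed
reader.close()
print(received)
```

The listener is still registered through set_data_callback, so a parser written against the simulated Channel class could be plugged into this variant without any change.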
class Parser(object):
    def __init__(self, transport):
        transport.set_data_callback(self.on_data)
        self.state = self.sync
        self.remaining_size = 0

    def on_data(self, data):
        self.state(data)

    def error(self, data):
        print("error: {}".format(data))

    def sync(self, data):
        if data == 42:
            self.state = self.size
        else:
            self.state = self.error

    def size(self, data):
        self.remaining_size = data
        self.state = self.payload

    def payload(self, data):
        if self.remaining_size > 0:
            print("payload: {}".format(data))
            self.remaining_size -= 1
            if self.remaining_size <= 0:
                self.state = self.sync
It is composed of the constructor and five methods. Four of these methods correspond to the four states of the state machine; the fifth one, on_data, is the data callback. The constructor has a transport input parameter. This parameter contains an object of type Channel or any other object that exposes the same API as the Channel class. In the constructor, the parser registers itself as a listener of the transport object.
Note that the self.on_data syntax registers a bound method as a listener. This means that when the transport object calls the callback, the on_data method of the object referenced by self is called. This is a built-in feature of Python, but it is not the case for all programming languages. The C language has no notion of bound methods, and C++ only gained convenient support for them in C++11, through std::function, std::bind, and lambda expressions. Without such a feature, registering a listener is more complex. Either the registration method must have two parameters, one to provide the function to call and another one to give the context to the listener, or the Parser class must inherit from a Listener base class (as defined by the observer design pattern). So writing this kind of code with Python has always been easier than with some other programming languages.
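To make the difference concrete, here is a small sketch that contrasts the two registration styles. The ContextChannel and Recorder classes are hypothetical and only serve this illustration; the first style mimics, in Python, what a language without bound methods would require:

```python
class ContextChannel(object):
    """Hypothetical channel that stores the function and its context separately."""

    def set_data_callback(self, callback, context):
        self.callback = callback
        self.context = context

    def notify(self, value):
        # The context has to be handed back explicitly on every call.
        self.callback(self.context, value)


class Recorder(object):
    def __init__(self):
        self.values = []

    def on_data(self, data):
        self.values.append(data)


def on_data(context, data):
    # Plain function: it only knows its object through the context argument.
    context.values.append(data)


recorder = Recorder()

# Style 1: a function and its context are registered as two parameters.
explicit = ContextChannel()
explicit.set_data_callback(on_data, recorder)
explicit.notify(1)

# Style 2: a bound method already carries its object with it.
bound = recorder.on_data
bound(2)

print(bound.__self__ is recorder)  # True
print(recorder.values)             # [1, 2]
```

The __self__ attribute shows that the bound method keeps a reference to its object, which is why a single callable is enough when registering self.on_data.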
The constructor then initializes the current state to sync and sets the remaining payload size to 0. Bound methods are also used to store the current state as a reference to a method of the Parser object. This allows you to dispatch incoming data to the correct state with a single function call.
The on_data method is the data callback. Each time a new value is available on the transport object, this method is called. This method is just a trampoline that forwards the data to the current state handler.
Then each state is implemented as a method. The error state just prints an error. The sync state transitions to the size or error state, depending on the value of the received data. The size state initializes the number of values to receive in the payload and transitions to the payload state. Finally, the payload state prints the data of the payload and goes back to the sync state once all the payload data has been received.
This implementation can be tested with the following code:
s = Channel()
p = Parser(s)
s.notify(42)
s.notify(3)
s.notify(33)
s.notify(44)
s.notify(24)
s.notify(43)
s.notify(4)
Note that the calls to the notify method simulate events that would be asynchronous in a real application. The following values should be printed:
payload: 33
payload: 44
payload: 24
error: 4
The payload of the first packet is printed, but then the sync value of the second packet (43) is not correct, so the parser switches to the error state. The sync handler itself prints nothing: the error message only appears when the next value (4) is dispatched to the error handler. This is an implementation of a very simple state machine, but following the code flow is not straightforward. One has to read the code of each state handler to understand which handler will be active next. When such code is used on big state machines or hierarchical state machines, it is hard to debug and to evolve. However, the benefit of such a design is that a state machine is deterministic, so one can test all its transitions with unit testing.
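As an illustration of such unit tests, here is a minimal sketch based on the standard unittest module, assuming Python 3. The Channel and Parser classes are repeated so that the example is self-contained; the ParserTransitionTest class and its feed helper are hypothetical names introduced for this example:

```python
import contextlib
import io
import unittest


class Channel(object):
    def set_data_callback(self, callback):
        self.callback = callback

    def notify(self, value):
        self.callback(value)


class Parser(object):
    def __init__(self, transport):
        transport.set_data_callback(self.on_data)
        self.state = self.sync
        self.remaining_size = 0

    def on_data(self, data):
        self.state(data)

    def error(self, data):
        print("error: {}".format(data))

    def sync(self, data):
        if data == 42:
            self.state = self.size
        else:
            self.state = self.error

    def size(self, data):
        self.remaining_size = data
        self.state = self.payload

    def payload(self, data):
        if self.remaining_size > 0:
            print("payload: {}".format(data))
            self.remaining_size -= 1
            if self.remaining_size <= 0:
                self.state = self.sync


class ParserTransitionTest(unittest.TestCase):
    def setUp(self):
        self.channel = Channel()
        self.parser = Parser(self.channel)

    def feed(self, *values):
        # Send the values through the channel and return what was printed.
        output = io.StringIO()
        with contextlib.redirect_stdout(output):
            for value in values:
                self.channel.notify(value)
        return output.getvalue()

    def test_sync_marker_leads_to_size(self):
        self.feed(42)
        # Bound methods of the same object compare equal with ==.
        self.assertEqual(self.parser.state, self.parser.size)

    def test_bad_sync_leads_to_error(self):
        self.feed(43)
        self.assertEqual(self.parser.state, self.parser.error)

    def test_size_leads_to_payload(self):
        self.feed(42, 3)
        self.assertEqual(self.parser.remaining_size, 3)
        self.assertEqual(self.parser.state, self.parser.payload)

    def test_full_payload_returns_to_sync(self):
        printed = self.feed(42, 2, 10, 11)
        self.assertEqual(printed, "payload: 10\npayload: 11\n")
        self.assertEqual(self.parser.state, self.parser.sync)


# Run the tests explicitly so that the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParserTransitionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each test drives the parser from its initial state with a known sequence of values and then checks the resulting state, which is possible precisely because the transitions are deterministic.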