PyCounters is pure Python. All you need to do is run easy_install (or pip):
easy_install pycounters
Of course, you can always check out the code from Bitbucket at https://bitbucket.org/bleskes/pycounters
PyCounters is a library to help you collect interesting metrics from production code. As a case study for this tutorial, we will use a simple Python-based server (taken from the Python docs):
import SocketServer
class MyTCPHandler(SocketServer.BaseRequestHandler):
    """
    The RequestHandler class for our server.
    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
For this server we will measure two metrics: the number of requests per second and the average time for handling a request. Both are connected to the handle() method of the MyTCPHandler class in the example. The number of requests per second the server serves is exactly the number of times handle() is called per second. The average time for handling a request is exactly the average execution time of handle().
Both metrics are measured by decorating handle() with the shortcut decorators frequency and time:
import SocketServer
from pycounters import shortcuts
class MyTCPHandler(SocketServer.BaseRequestHandler):
    ...
    @shortcuts.time("requests_time")
    @shortcuts.frequency("requests_frequency")
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
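To build intuition for what the two decorators measure, the following stand-alone sketch computes the same two quantities by hand, using only the standard library. It is not how PyCounters is implemented; the names measured and stats are hypothetical and exist only for this illustration.

```python
import functools
import time


def measured(stats):
    """Hypothetical decorator: record call count and total run time in `stats`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs even if func raises, so failed calls are counted too.
                stats["calls"] += 1
                stats["total_time"] += time.time() - start
        return wrapper
    return decorator


stats = {"calls": 0, "total_time": 0.0, "started": time.time()}


@measured(stats)
def handle(data):
    # Stand-in for MyTCPHandler.handle(): upper-case the payload.
    return data.upper()


for payload in ["hello", "world", "again"]:
    handle(payload)

elapsed = max(time.time() - stats["started"], 1e-9)   # avoid division by zero
requests_frequency = stats["calls"] / elapsed         # calls per second
requests_time = stats["total_time"] / stats["calls"]  # average seconds per call
```

The frequency is a count divided by elapsed wall-clock time, while the request time is an average over individual calls, which is why the two are reported as separate values.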
Now that the metrics are being collected, they need to be reported. This is the job of the reporters. In this example, we’ll save a report every 5 minutes to a JSON file at /tmp/server.counters.json (check out the Reporters section for other options). To do so, create an instance of JSONFileReporter when the server starts:
import SocketServer
from pycounters import shortcuts, reporters, start_auto_reporting, register_reporter
....
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    JSONFile = "/tmp/server.counters.json"
    reporter = reporters.JSONFileReporter(output_file=JSONFile)
    register_reporter(reporter)
    start_auto_reporting()
    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
Note
To make PyCounters periodically output a report, you must call start_auto_reporting().
By default, auto reports are generated every 5 minutes (change that by using the seconds parameter of start_auto_reporting()). After five minutes the reporter will save its report. Here is an example of the contents of /tmp/server.counters.json:
{"requests_time": 0.00039249658584594727, "requests_frequency": 0.014266581369872909}
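Because the report is plain JSON, it can be read back with the standard json module. The sketch below parses a report with the same shape as the example above and converts the two values into friendlier units. It assumes, based on how the counters are described in this tutorial, that requests_time is in seconds and requests_frequency is in requests per second; the helper name summarize_report is hypothetical.

```python
import json


def summarize_report(report_text):
    """Parse a JSON report and derive millisecond and per-minute figures."""
    report = json.loads(report_text)
    return {
        "avg_request_ms": report["requests_time"] * 1000.0,
        "requests_per_minute": report["requests_frequency"] * 60.0,
    }


# Same content as the example report shown above.
sample = ('{"requests_time": 0.00039249658584594727, '
          '"requests_frequency": 0.014266581369872909}')
summary = summarize_report(sample)
```

In a real setup the text would be read from /tmp/server.counters.json rather than taken from a literal string.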
Average request time and request frequency were both nicely measured by decorating MyTCPHandler::handle(). Some metrics do not fit as nicely into the decorator model.
The server in our example receives a string from a client and returns it upper-cased. Say we want to measure the average number of characters the server processes per request. To achieve this we can use another shortcut function, value:
import SocketServer
from pycounters import shortcuts
class MyTCPHandler(SocketServer.BaseRequestHandler):
    ...
    @shortcuts.time("requests_time")
    @shortcuts.frequency("requests_frequency")
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # measure the average length of data
        shortcuts.value("requests_data_len", len(self.data))
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
Until now, the shortcut decorators and functions were perfect for what we wanted to do. Naturally, this is not always the case. Before going on, it is handy to explain more about these shortcuts and how PyCounters works (see Moving Parts for more about this).
PyCounters is built of three main building blocks:
Events - to report what happens in the code (in the example: incoming requests, the time it took to process them and the number of bytes processed).
Counters - to capture events and analyse them (in the example: measuring requests per second, averaging request processing time and averaging the number of bytes processed per request).
Reporters - to periodically generate a report of all active counters.
PyCounters’ shortcuts both report events and create a counter to analyse them. Every shortcut has a default counter type, but you can override it (see Shortcuts). For example, say we wanted to measure the total number of bytes the server has processed rather than the average. To achieve this, the “requests_data_len” counter needs to be changed to a TotalCounter. The easiest way is to add a parameter to the shortcut: shortcuts.value("requests_data_len",len(self.data),auto_add_counter=TotalCounter) (don’t forget to change your imports too). However, we will go about it another way.
PyCounters’ event reporting is very lightweight. It does practically nothing if no counter is defined to capture those events. Because of this, it is a good idea to report all important events throughout the code and choose later exactly what you want analyzed. To do this we must separate event reporting from the definition of counters.
Note
When you create a counter, it will by default listen to one event, named exactly as the counter’s name. However, if the events parameter is passed to a counter at initialization, it will listen only to the specified events.
Note
This approach also means you can analyze things differently on a single thread, by installing thread-specific counters. For example, you could trace a specific request more heavily because of some debug flag. Thread-specific counters are not currently available but will be in the future.
Reporting an event without defining a counter is done by using one of the functions described under Event reporting. Since we want to report a value, we will use pycounters.report_value():
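The separation between reporting events and counting them can be illustrated with a toy, stand-alone model. It is a sketch for intuition only and not PyCounters' implementation: every name prefixed with toy_ or Toy is hypothetical. It shows that a reported event is ignored when no counter listens, that a counter listens by default to the event named like itself, and that the events argument lets several counters analyse the same event.

```python
class ToyAverageCounter(object):
    """Toy counter: average of all reported values."""
    def __init__(self, name, events=None):
        self.name = name
        # Listen to the event named like the counter unless told otherwise.
        self.events = events if events is not None else [name]
        self.total = 0.0
        self.count = 0

    def report(self, value):
        self.total += value
        self.count += 1

    def get_value(self):
        return self.total / self.count if self.count else None


class ToyTotalCounter(ToyAverageCounter):
    """Toy counter: running sum of all reported values."""
    def get_value(self):
        return self.total


toy_registry = {}  # event name -> list of counters listening to it


def toy_register_counter(counter):
    for event in counter.events:
        toy_registry.setdefault(event, []).append(counter)


def toy_report_value(event, value):
    # With no listener this is a single dictionary lookup and nothing more.
    for counter in toy_registry.get(event, ()):
        counter.report(value)


toy_report_value("requests_data_len", 100)  # nobody listens yet: ignored

avg = ToyAverageCounter("requests_data_len")
total = ToyTotalCounter("requests_data_total", events=["requests_data_len"])
toy_register_counter(avg)
toy_register_counter(total)

for length in (5, 11, 8):
    toy_report_value("requests_data_len", length)
```

After the loop, avg holds the mean of the three lengths and total holds their sum, even though the reporting code never mentioned either counter.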
import SocketServer
from pycounters import shortcuts, reporters, report_value
class MyTCPHandler(SocketServer.BaseRequestHandler):
    ...
    @shortcuts.time("requests_time")
    @shortcuts.frequency("requests_frequency")
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # report the length of data as an event; no counter is created here
        report_value("requests_data_len", len(self.data))
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
To add the TotalCounter counter, we change the initialization part of the code:
import SocketServer
from pycounters import shortcuts, reporters, report_value, counters, register_counter, start_auto_reporting, register_reporter
....
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    JSONFile = "/tmp/server.counters.json"
    data_len_counter = counters.TotalCounter("requests_data_len") # create the counter
    register_counter(data_len_counter) # register it, so it will start processing events
    reporter = reporters.JSONFileReporter(output_file=JSONFile)
    register_reporter(reporter)
    start_auto_reporting()
    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
Here is the complete code with all the changes so far (also available at the PyCounters repository):
import SocketServer
from pycounters import shortcuts, reporters, register_counter, counters, report_value, register_reporter, start_auto_reporting
class MyTCPHandler(SocketServer.BaseRequestHandler):
    """
    The RequestHandler class for our server.
    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    @shortcuts.time("requests_time")
    @shortcuts.frequency("requests_frequency")
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # report the length of data; the TotalCounter below sums it
        report_value("requests_data_len", len(self.data))
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    JSONFile = "/tmp/server.counters.json"
    data_len_counter = counters.TotalCounter("requests_data_len") # create the counter
    register_counter(data_len_counter) # register it, so it will start processing events
    reporter = reporters.JSONFileReporter(output_file=JSONFile)
    register_reporter(reporter)
    start_auto_reporting()
    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
In the example above, the MyTCPHandler::handle method is decorated with two shortcut decorators: frequency and time. This is the easiest way to set up PyCounters to measure things, but it has some downsides. First, every shortcut decorator throws its own events, which means that every execution of the handle method sends four events. That is inefficient. Second, and more importantly, it means that counter definitions are spread around the code.
In bigger projects it is better to separate event throwing from counting. For example, we can decorate the handle function with report_start_end:
    @pycounters.report_start_end("request")
    def handle(self):
        # self.request is the TCP socket connected to the client
And define two counters to analyze ‘different’ statistics about this function:
avg_req_time = counters.AverageTimeCounter("requests_time",events=["request"])
register_counter(avg_req_time)
req_per_sec = counters.FrequencyCounter("requests_frequency",events=["request"])
register_counter(req_per_sec)
Note
Multiple counters with different names can be set up to analyze the same event using the events argument in their constructor.
Doing things this way has a couple of advantages:
- It is conceptually cleaner - you report what happened and measure multiple aspects of it
- It is more flexible - you can easily analyse more things about your code by simply adding counters.
- You can decide at runtime what to measure (by changing registered counters)
In this example we will create a few counters listening to the same event. Let's say we want the maximum, minimum, average and sum of the request data length over a 15-minute window. To achieve this, we need to create four counters, all of them listening to the ‘requests_data_len’ event.
import SocketServer
from pycounters import shortcuts, reporters, register_counter, counters, report_value, register_reporter, start_auto_reporting
class MyTCPHandler(SocketServer.BaseRequestHandler):
    """
    The RequestHandler class for our server.
    It is instantiated once per connection to the server, and must
    override the handle() method to implement communication to the
    client.
    """
    @shortcuts.time("requests_time")
    @shortcuts.frequency("requests_frequency")
    def handle(self):
        # self.request is the TCP socket connected to the client
        self.data = self.request.recv(1024).strip()
        print "%s wrote:" % self.client_address[0]
        print self.data
        # report the length of data; the window counters below analyse it
        report_value("requests_data_len", len(self.data))
        # just send back the same data, but upper-cased
        self.request.send(self.data.upper())
if __name__ == "__main__":
    HOST, PORT = "localhost", 9999
    JSONFile = "/tmp/server.counters.json"
    # create the average window counter (window_size=900 covers the 15 minute window)
    data_len_avg_counter = counters.AverageWindowCounter("requests_data_len_avg",
        events=["requests_data_len"], window_size=900)
    register_counter(data_len_avg_counter) # register it, so it will start processing events
    # create the window sum counter
    data_len_total_counter = counters.WindowCounter("requests_data_len_total",
        events=["requests_data_len"], window_size=900)
    register_counter(data_len_total_counter)
    # create the max window counter
    data_len_max_counter = counters.MaxWindowCounter("requests_data_len_max",
        events=["requests_data_len"], window_size=900)
    register_counter(data_len_max_counter)
    # create the min window counter
    data_len_min_counter = counters.MinWindowCounter("requests_data_len_min",
        events=["requests_data_len"], window_size=900)
    register_counter(data_len_min_counter)
    reporter = reporters.JSONFileReporter(output_file=JSONFile)
    register_reporter(reporter)
    start_auto_reporting()
    # Create the server, binding to localhost on port 9999
    server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)
    # Activate the server; this will keep running until you
    # interrupt the program with Ctrl-C
    server.serve_forever()
You can change the size of the window by passing a different window_size parameter when creating a counter.
In the example so far, we have written the collected metrics to a JSON file. Using that JSON file, we can easily build simple tools to report the metrics further. The Utilities reference package contains a set of utilities to help build such tools.
At the moment, PyCounters comes with a utility to help write munin plugins. Here is an example of a munin plugin that takes the JSON report produced by the tutorial and presents it in a way munin understands:
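What a time window means for these four statistics can be sketched without PyCounters. The class below is a hypothetical illustration, not the library's implementation: it keeps only samples younger than window_size seconds (900 seconds for the 15 minute window above) and derives maximum, minimum, sum and average from what remains. Timestamps are passed in explicitly so the behaviour is easy to follow.

```python
import collections
import time


class ToyWindow(object):
    """Keep (timestamp, value) pairs no older than window_size seconds."""
    def __init__(self, window_size=900.0):
        self.window_size = window_size
        self.samples = collections.deque()

    def report(self, value, now=None):
        now = time.time() if now is None else now
        self.samples.append((now, value))

    def values(self, now=None):
        now = time.time() if now is None else now
        # Drop samples that have fallen out of the window.
        while self.samples and self.samples[0][0] < now - self.window_size:
            self.samples.popleft()
        return [value for _, value in self.samples]


window = ToyWindow(window_size=900)
window.report(40, now=0)     # older than 900 seconds at t=1000, so dropped
window.report(5, now=200)
window.report(11, now=600)
window.report(8, now=1000)

recent = window.values(now=1000)  # only samples with timestamp >= 100 remain
window_stats = {
    "max": max(recent),
    "min": min(recent),
    "sum": sum(recent),
    "avg": sum(recent) / float(len(recent)),
}
```

A smaller window_size reacts faster to changes in traffic but is based on fewer samples, which is the trade-off to keep in mind when choosing the value.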
#!/usr/bin/python
from pycounters.utils.munin import Plugin
config = [
    {
        "id" : "requests_per_sec",
        "global" : {
            # graph global options: http://munin-monitoring.org/wiki/protocol-config
            "title" : "Request Frequency",
            "category" : "PyCounters example"
        },
        "data" : [
            {
                "counter" : "requests_frequency",
                "label" : "requests per second",
                "draw" : "LINE2",
            }
        ]
    },
    {
        "id" : "requests_time",
        "global" : {
            "title" : "Request Average Handling Time",
            "category" : "PyCounters example"
        },
        "data" : [
            {
                "counter" : "requests_time",
                "label" : "Average time per request",
                "draw" : "LINE2",
            }
        ]
    },
    {
        "id" : "requests_total_data",
        "global" : {
            "title" : "Total data processed",
            "category" : "PyCounters example"
        },
        "data" : [
            {
                "counter" : "requests_data_len",
                "label" : "total bytes",
                "draw" : "LINE2",
            }
        ]
    }
]
p = Plugin("/tmp/server.counters.json", config) # initialize the plugin
p.process_cmd() # process munin command and output requested data or config
Try it out (after the server has run for more than 5 minutes and a report has been written to the JSON file) by running python munin_plugin config and python munin_plugin.
Some applications (like a web server) do not run in a single process. Still, you want to collect global metrics like the ones discussed earlier in this tutorial.
PyCounters supports aggregating information from multiple running processes. To do so, call pycounters.configure_multi_process_collection() in every process you want to aggregate data from. The parameters to this method tell PyCounters what port to use for aggregation and, if running on multiple servers, which server to collect data on.