.. _tutorial:

==========================
Tutorial
==========================

---------------------
Installing pycounters
---------------------

PyCounters is pure Python. All you need is to run easy_install (or pip): ::

    easy_install pycounters

Of course, you can always check out the code from BitBucket at https://bitbucket.org/bleskes/pycounters

---------------------
Introduction
---------------------

PyCounters is a library to help you collect interesting metrics from production code. As a case study for this tutorial,
we will use a simple Python-based server (taken from the `python docs `_): ::

    import SocketServer

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        """
        The RequestHandler class for our server.

        It is instantiated once per connection to the server, and must
        override the handle() method to implement communication to the
        client.
        """

        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data
            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999

        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

----------------------
Step 1 - Adding Events
----------------------

For this basic server, we will add events to report the following metrics:

* Number of requests per second
* Average time for handling a request

Both of these metrics are connected to the handle method of the MyTCPHandler class in the example.

The number of requests per second the server serves is exactly the number of times the handle() method is called each second.
The average time for handling a request is exactly the average execution time of handle().

Both of these metrics are measured by decorating handle() with the :ref:`shortcut ` decorators :meth:`frequency ` and :meth:`time `: ::

    import SocketServer
    from pycounters import shortcuts

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        ...

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data
            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

.. note::
    * Every decorator is given a name ("requests_time" and "requests_frequency"). These names will come back in the report
      generated by PyCounters. More on this in the next section.
    * The shortcut decorators actually do two things - report events and add counters for them.
      For now this is OK, but you might want to separate the two. More on this later in the tutorial.

------------------------
Step 2 - Reporting
------------------------

Now that the metrics are being collected, they need to be reported. This is the job of the :ref:`reporters `. In this example,
we'll save a report every 5 minutes to a JSON file at /tmp/server.counters.json (check out the :ref:`reporters` section for other options).
To do so, create an instance of :class:`JSONFileReporter ` when the server starts: ::

    import SocketServer
    from pycounters import shortcuts, reporters, start_auto_reporting, register_reporter

    ....

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()

        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

.. note::
    To make pycounters periodically output a report you must call start_auto_reporting().

By default, auto reports are generated every 5 minutes (change that by using the seconds parameter of start_auto_reporting()).
After five minutes the reporter will save its report. Here is an example of the contents of /tmp/server.counters.json: ::

    {"requests_time": 0.00039249658584594727, "requests_frequency": 0.014266581369872909}

----------------------------------------------------------
Step 3 - Counters and reporting events without a decorator
----------------------------------------------------------

Average request time and request frequency were both nicely measured by decorating MyTCPHandler::handle(). Some metrics do
not fit as nicely into the decorator model.

The server in our example receives a string from a client and returns it upper-cased. Say we want to measure the
average number of characters the server processes. To achieve this we can use another shortcut function :meth:`value `: ::

    import SocketServer
    from pycounters import shortcuts

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        ...

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # measure the average length of data
            shortcuts.value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

Until now, the shortcut decorators and functions were perfect for what we wanted to do. Naturally, this is not always the case.
Before going on, it is handy to explain more about these shortcuts and how PyCounters works (see :ref:`moving_parts` for more about this).

PyCounters is built of three main building blocks:

* *Events* - to report values and occurrences in your code (in the example: incoming requests, the time it took to process them and the number of bytes processed).
* *Counters* - to capture events and analyse them (in the example: measuring requests per second, averaging request processing time and averaging the number of bytes processed per request).
* *Reporters* - to periodically generate a report of all active counters.

PyCounters' shortcuts will both report events and create a counter to analyse them.
Every shortcut has a default counter type but you can override it (see :ref:`shortcuts`). For example, say we wanted to measure the
*total* number of bytes the server has processed rather than the average. To achieve this, the "requests_data_len" counter needs to be
changed to :class:`TotalCounter `. The easiest way to achieve this is to add a parameter to the shortcut
``shortcuts.value("requests_data_len",len(self.data),auto_add_counter=TotalCounter)`` (don't forget to change your imports too).
However, we will go another way about it.

PyCounters' event reporting is very lightweight. It practically does nothing if no counter is defined to capture those events.
Because of this, it is a good idea to report all important events through the code and choose later what exactly you want analyzed.
To do this we must separate event reporting from the definition of counters.

.. note::
    When you create a counter, it will by default listen to one event, *named exactly as the counter's name*.
    However, if the events parameter is passed to a counter at initialization, it will listen *only* to the specified events.

.. note::
    This approach also means you can analyze things differently on a single thread, by installing thread specific counters.
    For example, trace a specific request more heavily due to some debug flag. Thread specific counters are not
    currently available but will be in the future.

Reporting an event without defining a counter is done by using one of the functions described under :ref:`event_reporting`.
Since we want to report a value, we will use :meth:`pycounters.report_value`: ::

    import SocketServer
    from pycounters import shortcuts, reporters, report_value

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        ...

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # report the length of the data; a counter decides how to analyze it
            report_value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

To add the :class:`TotalCounter ` counter, we change the initialization part of the code: ::

    import SocketServer
    from pycounters import shortcuts, reporters, report_value, counters, register_counter, start_auto_reporting, register_reporter

    ....

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        data_len_counter = counters.TotalCounter("requests_data_len") # create the counter
        register_counter(data_len_counter) # register it, so it will start processing events

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()

        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

---------------------------
Step 4 - A complete example
---------------------------

Here is the complete code with all the changes so far (also available at the PyCounters `repository `_ ): ::

    import SocketServer
    from pycounters import shortcuts, reporters, register_counter, counters, report_value, register_reporter, start_auto_reporting

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        """
        The RequestHandler class for our server.

        It is instantiated once per connection to the server, and must
        override the handle() method to implement communication to the
        client.
        """

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # report the length of the data; the TotalCounter below sums it
            report_value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        data_len_counter = counters.TotalCounter("requests_data_len") # create the counter
        register_counter(data_len_counter) # register it, so it will start processing events

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()

        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

---------------------------------------
Step 5 - More about Events and Counters
---------------------------------------

In the above example, the MyTCPHandler::handle method is decorated with two shortcut functions:
:meth:`frequency ` and :meth:`time `. This is the easiest way to set up PyCounters to measure things, but it has
some downsides. First, every shortcut decorator throws its own events. That means that for every execution of the handle method,
four events are sent. That is inefficient. Second, and more importantly, it also means that counter definitions are
spread around the code.

In bigger projects it is better to separate event throwing from counting.
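
The idea can be sketched without PyCounters: report one event per call, and let any number of independent counters listen to it. The names below (``toy_listen``, ``toy_report_duration``, ``ToyCallCount``, ``ToyAverageTime``) are hypothetical and exist only for this illustration; the real library's internals may well differ:

```python
import functools
import time

# Toy registry: event name -> list of listeners.
# Hypothetical illustration only; not PyCounters' internals.
_toy_listeners = {}


def toy_listen(event, listener):
    _toy_listeners.setdefault(event, []).append(listener)


def toy_report_duration(event):
    """Decorator that reports one event per call, carrying the call duration."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                for listener in _toy_listeners.get(event, ()):
                    listener(elapsed)
        return wrapper
    return decorator


class ToyCallCount(object):
    """Counts how many times the event was reported."""

    def __init__(self):
        self.calls = 0

    def __call__(self, elapsed):
        self.calls += 1


class ToyAverageTime(object):
    """Averages the durations carried by the event."""

    def __init__(self):
        self.calls = 0
        self.total = 0.0

    def __call__(self, elapsed):
        self.calls += 1
        self.total += elapsed

    @property
    def value(self):
        return self.total / self.calls if self.calls else None


@toy_report_duration("request")
def handle(data):
    return data.upper()


# Two independent listeners analyze the same "request" event.
call_count = ToyCallCount()
average_time = ToyAverageTime()
toy_listen("request", call_count)
toy_listen("request", average_time)

results = [handle(word) for word in ("a", "b", "c")]
```

The decorated function knows nothing about what is measured; listeners can be added or removed without touching it. PyCounters provides the same separation with its own event functions and counter classes.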
For example, we can decorate the handle function with :meth:`report_start_end `: ::

    @pycounters.report_start_end("request")
    def handle(self):
        # self.request is the TCP socket connected to the client

And define two counters to analyze 'different' statistics about this function: ::

    avg_req_time = counters.AverageTimeCounter("requests_time", events=["request"])
    register_counter(avg_req_time)

    req_per_sec = counters.FrequencyCounter("requests_frequency", events=["request"])
    register_counter(req_per_sec)

.. note::
    Multiple counters with different names can be set up to analyze the same event using the events argument in their constructor.

Doing things this way has a couple of advantages:

* It is conceptually cleaner - you report what happened and measure multiple aspects of it.
* It is more flexible - you can easily analyse more things about your code by simply adding counters.
* You can decide at runtime what to measure (by changing registered counters).

-----------------------------------------------------
Step 6 - Another example of using Events and Counters
-----------------------------------------------------

In this example we will create a few counters listening to the same event. Let's say we want to get the maximum, minimum,
average and sum of the request data length in a 15-minute window. To achieve this, we need to create 4 counters, all of them
listening to the 'requests_data_len' event. ::

    import SocketServer
    from pycounters import shortcuts, reporters, register_counter, counters, report_value, register_reporter, start_auto_reporting

    class MyTCPHandler(SocketServer.BaseRequestHandler):
        """
        The RequestHandler class for our server.

        It is instantiated once per connection to the server, and must
        override the handle() method to implement communication to the
        client.
        """

        @shortcuts.time("requests_time")
        @shortcuts.frequency("requests_frequency")
        def handle(self):
            # self.request is the TCP socket connected to the client
            self.data = self.request.recv(1024).strip()
            print "%s wrote:" % self.client_address[0]
            print self.data

            # report the length of the data; the window counters below analyze it
            report_value("requests_data_len", len(self.data))

            # just send back the same data, but upper-cased
            self.request.send(self.data.upper())

    if __name__ == "__main__":
        HOST, PORT = "localhost", 9999
        JSONFile = "/tmp/server.counters.json"

        # window_size is given in seconds: 900 seconds = 15 minutes
        data_len_avg_counter = counters.AverageWindowCounter("requests_data_len_avg",
            events=["requests_data_len"], window_size=900) # create the average window counter
        register_counter(data_len_avg_counter) # register it, so it will start processing events

        data_len_total_counter = counters.WindowCounter("requests_data_len_total",
            events=["requests_data_len"], window_size=900) # create the window sum counter
        register_counter(data_len_total_counter)

        data_len_max_counter = counters.MaxWindowCounter("requests_data_len_max",
            events=["requests_data_len"], window_size=900) # create the max window counter
        register_counter(data_len_max_counter)

        data_len_min_counter = counters.MinWindowCounter("requests_data_len_min",
            events=["requests_data_len"], window_size=900) # create the min window counter
        register_counter(data_len_min_counter)

        reporter = reporters.JSONFileReporter(output_file=JSONFile)
        register_reporter(reporter)

        start_auto_reporting()

        # Create the server, binding to localhost on port 9999
        server = SocketServer.TCPServer((HOST, PORT), MyTCPHandler)

        # Activate the server; this will keep running until you
        # interrupt the program with Ctrl-C
        server.serve_forever()

You can change the size of the window by passing a different window_size parameter when creating a counter.

------------------------
Step 7 - Utilities
------------------------

In the examples so far, we've written the collected metrics to a JSON file.
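
Judging by the sample output shown in Step 2, the report is a flat JSON object that maps counter names to values, so the standard library is enough to read it back. The following is a minimal sketch under that assumption; ``load_report`` and ``format_report`` are hypothetical helpers written for this illustration, not part of PyCounters, and the sample values are made up:

```python
import json
import os
import tempfile


def load_report(path):
    """Return the counter name -> value mapping stored in a JSON report file."""
    with open(path) as report_file:
        return json.load(report_file)


def format_report(report):
    """Render one 'name value' line per counter, sorted by counter name."""
    return "\n".join("%s %s" % (name, report[name]) for name in sorted(report))


# Write a made-up report to a temporary file so the sketch runs on its own.
sample = {"requests_time": 0.0004, "requests_frequency": 0.0143}
path = os.path.join(tempfile.mkdtemp(), "server.counters.json")
with open(path, "w") as sample_file:
    json.dump(sample, sample_file)

report = load_report(path)
text = format_report(report)
print(text)
```

Each counter ends up on its own line, which is a convenient starting point for passing the numbers on to other programs.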
Using that JSON file, we can easily build simple tools to report the metrics further.
The :ref:`pycounters_utils` package contains a set of utilities to help build such tools.

At the moment, PyCounters comes with a utility to help write `munin `_ plugins.
Here is an example of a munin plugin that takes the JSON report produced by the tutorial and presents it in the way munin understands: ::

    #!/usr/bin/python

    from pycounters.utils.munin import Plugin

    config = [
        {
            "id" : "requests_per_sec",
            "global" : {
                # graph global options: http://munin-monitoring.org/wiki/protocol-config
                "title" : "Request Frequency",
                "category" : "PyCounters example"
            },
            "data" : [
                {
                    "counter" : "requests_frequency",
                    "label" : "requests per second",
                    "draw" : "LINE2",
                }
            ]
        },
        {
            "id" : "requests_time",
            "global" : {
                "title" : "Request Average Handling Time",
                "category" : "PyCounters example"
            },
            "data" : [
                {
                    "counter" : "requests_time",
                    "label" : "Average time per request",
                    "draw" : "LINE2",
                }
            ]
        },
        {
            "id" : "requests_total_data",
            "global" : {
                "title" : "Total data processed",
                "category" : "PyCounters example"
            },
            "data" : [
                {
                    "counter" : "requests_data_len",
                    "label" : "total bytes",
                    "draw" : "LINE2",
                }
            ]
        }
    ]

    p = Plugin("/tmp/server.counters.json", config) # initialize the plugin

    p.process_cmd() # process munin command and output requested data or config

Try it out (after the server has run for more than 5 minutes and a report has been written to the JSON file) by running
``python munin_plugin config`` and ``python munin_plugin``.

-----------------------------
Step 8 - Multiprocess support
-----------------------------

Some applications (like a web server) do not run in a single process. Still, you want to collect global metrics like the ones
discussed earlier in this tutorial.

PyCounters supports aggregating information from multiple running processes. To do so, call :meth:`pycounters.configure_multi_process_collection`
in every process you want to aggregate data from.
The parameters to this method tell PyCounters which port to use for aggregation and, if running on multiple servers,
which server to collect the data on.
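
How values from several processes are combined depends on the kind of counter. As a rough illustration of the arithmetic only (this is not PyCounters' collection mechanism and says nothing about the parameters of ``configure_multi_process_collection``), totals can simply be summed, while an average needs each process's sum and count. ``merge_totals`` and ``merge_averages`` are hypothetical helpers invented for this sketch:

```python
def merge_totals(per_process_totals):
    """Combine per-process totals into one global total."""
    return sum(per_process_totals)


def merge_averages(per_process_sums_and_counts):
    """Combine (sum, count) pairs from each process into one global average."""
    pairs = list(per_process_sums_and_counts)
    total = sum(value_sum for value_sum, _ in pairs)
    count = sum(value_count for _, value_count in pairs)
    return float(total) / count if count else None


# Process A handled 3 requests totalling 300 bytes, process B 1 request of 50 bytes.
global_total = merge_totals([300, 50])
global_average = merge_averages([(300, 3), (50, 1)])
print(global_total, global_average)  # 350 87.5
```

Averaging the two per-process averages (100 and 50) would give 75, which over-weights the process that handled a single request.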