One factor is crucial when planning capacity for a proxy server: the ‘peak load’. When it occurs varies greatly between organisations, yet it is essential that it is identified and measured correctly first. For some networks it might be first thing in the morning as employees start work, particularly if the proxy is used to access corporate internet-based applications. In other situations you might find that a proxy is relatively idle until lunchtime, when users start browsing the web in their free time. For an ISP it might be completely different, with a relatively quiet day followed by a surge in the evenings when users log on and start downloading torrents.
What’s important is that this time is identified and the maximum capacity required of the server is determined accurately. To do this you need to look at the proxy logs, which should contain most of the information you require. Unfortunately these can be difficult to follow, particularly in their raw format: the fields tend to run into each other and can be hard to read even when they are separated by delimiters. You can import the logs into a spreadsheet program, but it’s usually simpler to get access to a decent log analyser which will highlight specific details and issues. You might find that a few users are responsible for server lag by streaming the American version of Netflix to their desktops! Most analysers will provide hourly statistics, and many of the paid ones will automatically determine the peak periods of use; the peak 5 minutes and the peak hour, for example, are valuable pieces of information.
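If no analyser is to hand, the peak hour and peak 5 minutes can be approximated with a short script. The sketch below is only an illustration: it assumes each log line carries a Common Log Format timestamp such as [10/Oct/2023:13:55:36 +0000], which your proxy may not use, and the function name peak_periods is made up for this example. It counts requests in fixed clock-aligned buckets rather than a sliding window, so the true busiest 5 minutes could straddle two buckets.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed timestamp layout (Common Log Format), e.g. [10/Oct/2023:13:55:36 +0000].
# Adjust the pattern and the strptime format if your proxy logs differently.
TIMESTAMP = re.compile(r"\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}:\d{2}) [+-]\d{4}\]")


def peak_periods(lines):
    """Return ((hour_start, count), (five_min_start, count)) for the busiest
    clock-aligned hour and 5-minute bucket, or None if no line has a timestamp."""
    by_hour = Counter()
    by_five_min = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if not match:
            continue  # skip lines without a recognisable timestamp
        # The UTC offset is ignored: times are bucketed as they appear in the log.
        # Note that %b expects English month names under the default locale.
        ts = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S")
        by_hour[ts.replace(minute=0, second=0)] += 1
        by_five_min[ts.replace(minute=ts.minute - ts.minute % 5, second=0)] += 1
    if not by_hour:
        return None
    return by_hour.most_common(1)[0], by_five_min.most_common(1)[0]
```

An open file can be passed straight in, since the function only needs something that yields one log line at a time.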
If you don’t have access to an analyser for your proxy logs, you can still work out the request rate by hand. Save the log entries for the period you are interested in, such as the peak hour, to a separate file. The Unix wc command will then count the lines in that file, and since each line normally records one request, this gives you the number of requests in that period. Dividing the count by the length of the period in seconds (3600 for an hour) gives the average number of requests per second; 90,000 lines in an hour, for example, works out at 25 requests per second. Although logs will give you a lot of the information you need, it’s a good idea to monitor the performance of the proxy in real time too. Logs only give you a historical account, but you can find tools which will monitor how the proxy is performing at any given time. Tools that have been available for this include Netscape’s excellent sitemon utility, which displays the status of the server processes on your proxy plus the current load.
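The line-counting arithmetic is simple enough to script. This is a minimal sketch which assumes one log line per request and a file that covers exactly the period given; both function names are invented for the example.

```python
def count_lines(path):
    """Count the lines in a file, roughly what `wc -l` reports."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def requests_per_second(line_count, period_seconds=3600):
    """Average request rate, assuming each log line records one request."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return line_count / period_seconds
```

Bear in mind this is an average over the whole period. Short bursts within the hour can be considerably higher, which is why the peak 5 minutes is worth measuring separately.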
Different versions of these tools are available for Unix and Windows systems, and some are specific to servers that can act as proxies, such as Apache and IIS. In corporate networks it’s worth checking what is available through SNMP (Simple Network Management Protocol), as many performance statistics are exposed this way and they will likely cover the proxy server too. If all else fails, the simple Unix command netstat will give you a decent indication of how the proxy is performing in real time.
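One rough way to use netstat for this is to count the proxy’s TCP connections by state. The sketch below assumes you have captured the text output of a plain netstat -an (no extra columns such as process IDs) and that you know which port the proxy listens on; the function name connection_states is hypothetical. Column layout differs between platforms, so treat the parsing as a starting point rather than something guaranteed to fit your system.

```python
from collections import Counter


def connection_states(netstat_output, port):
    """Count TCP connection states for sockets whose local address uses `port`.

    Assumes plain `netstat -an` text in which the last three columns of each
    TCP line are local address, remote address and state.
    """
    # Linux and Windows write address:port, BSD and macOS write address.port
    suffixes = (f":{port}", f".{port}")
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue  # skip headers, UDP sockets and anything unrecognised
        local_address, state = fields[-3], fields[-1]
        if local_address.endswith(suffixes):
            states[state] += 1
    return states
```

A large and growing number of ESTABLISHED connections at the busy time of day is a reasonable hint that the proxy is approaching its peak load, although what counts as large depends entirely on your hardware and configuration.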