As part of Networking Field Day 4, Statseeker presented to the delegates on their network performance monitoring software.
Statseeker is a product has an interesting three pronged proposition:
- Poll all ports all the way to the edge (not just uplinks and WAN)
- Poll them every 60 seconds
- Keep all the data gathered – don’t roll it up and lose the detail
If you run network performance monitoring software right now – maybe SolarWinds, What’s Up Gold, MRTG, VitalNet or similar, you probably recognize that these are somewhat unusual features for a product in this space. But is it any good, and why would you want it?
Statseeker’s main focus is to allow you to monitor every port in your network, right down to the access layer, to track key statistics for each port once a minute, and to retain that data indefinitely (assuming you have enough storage).
Why Edge Ports?
Statseeker’s argument is that because of scaling problems, many people end up removing monitoring on edge ports and limit themselves to key uplinks and WAN ports, thus potentially missing the source of network issues. My experience bears this out – I’ve often had to set up temporary monitoring on specific ports in order to get data to support a troubleshooting effort, because we just haven’t been able to justify polling every port all the time. So at least from the perspective of looking back at recent events, having data on every port could certainly be helpful.
Why Once A Minute?
Monitoring once a minute gives a pretty good level of granularity for the data collection. My experience though has been that trying to set up my NMS tools to poll that often has led to problems on the server side trying to process so much data, and the more ports being polled, the worse it got. The easy solution was to back off the polling to be less frequent.
Despite this, Statseeker claim the product can scale up to pretty much any number of ports without reducing polling intervals, although the server hardware must play a part in that equation. Given that a well loaded access layer switch could have upwards of 500 ports, and knowing how inefficient SNMP can be, I asked whether Statseeker had seen any problems on the polled devices with CPU as a result of this intense polling. The answer was that Statseeker doesn’t poll all interfaces as a block, but instead uses an undisclosed algorithm to try and determine the capability of the device to answer SNMP queries, then spreads out the polling requests over each 60 second period in order to minimize the instantaneous effect of any polling and try to keep the request level to below half of the determined capability. This could be very important not least because you may also still be running other management products that are polling the same devices, so any software that just sucks data as fast as possible could be a real problem. In fact this is likely to be the case, because as we will see, Statseeker is not going to answer every question you have.
Why Retain The Data?
Data “roll up” is something that every other product I’ve used does as standard – when you look back at data from 3 months back, even if data were originally captured on a per minute basis, that detail will have been lost and you’ll be viewing aggregated (i.e. averaged, lower resolution) data instead. With the cost of storage in the past, that policy made a lot of sense, but storage is cheaper now and Statseeker estimates that you’ll need only 1GB of storage for each 1000 interfaces per year of data.
Lots Of Data == Slow Access?
I was surprised, because no – the demo system wasn’t even close to being slow. You access Statseeker via a web browser and can run a range of reports on all monitored elements, or you can drill down to regions, sites, devices, and so forth. Here’s the main interface page (sorry for quality of the picture), and you can see the list of report types on the left hand side, then the drill-down filters in the next columns:
Given the number of ports being monitored (over 20,000 on the demo site), it was fast. I’m used to seeing products that are slow to generate tables of results, and in some cases extremely slow to generate charts from the collected data. Statseeker caught me off guard by generating every page – even those where it was sorting data from all ports - in 2 to 5 seconds. I should note that we viewed this demo in San Jose accessing their live demo system which is based in Australia. The claim is that due to the specially developed “time-series database”, typically any report can be displayed in just a couple of seconds, and based on what I saw this does seem to be accurate. I do not know, of course, how phenomenally powerful their demo server might be and how much investment might be required on a customer’s part to see similar performance. Fast is not necessarily a unique selling point for Statseeker of course – and given a sufficiently well-specified server, other products may be able to perform similarly – but given that there’s no roll up of data, to keep performing that way with per-minute data going back into the past is really quite impressive.
What Doesn’t Statseeker Do?
There’s quite a lot that Statseeker doesn’t do actually, and oddly that’s part of the charm. Statseeker does not claim to be the one solution for all your problems; rather it has a sweet spot – monitoring every port, regularly, and retaining the data – and it does that very well. The presenters were refreshingly candid about where the product excelled, and where it didn’t, and it’s acknowledged that for some things, you may want other products (presumably in addition to theirs, not instead of!).
Trending is not an option in Statseeker – you will not see reports telling you that a port will hit capacity in 3 months. We’re told, somewhat tongue in cheek, that the math is “too hard”. I don’t know quite how to take that response, quite honestly. Other vendors have managed this in one way or another, but I do appreciate the wild approximations that go into any trend line, and that there are a lot of assumptions and compromises required in order to offer trending. It seems a huge shame though to have so much great data and not to be able to forecast at all. The closest you get is the ability to set thresholds that you can use to highlight potential problem ports. SLA and threshold violations can be set to trigger action scripts to send emails or other alerts as needed.
For me, from a usability perspective, Statseeker’s big miss is that there isn’t a single place I can look to find out the health of my network. I know, it’s all a bit “pointy haired boss” to want a traffic light page, but if the person ultimately paying for the tool can see how the network is doing at a glance, presented in a way that won’t require much effort on their part, I believe it assists in justifying the purchase they made.
Arguably, what defines ‘healthy’ may be down to personal preference, but I found myself questioning how best to use the tool. If you have nothing better to do than to spend the day drilling down on the TopN reports in order to eliminate bad things from the network, you can do so quickly and efficiently. On the other hand, when trying to find out where you should start spending your time digging, we come up a bit short. There are threshold reports available, so you can see ports that have crossed thresholds that you define, but it felt like you would have to page through multiple types reports to figure out whether you needed to fix anything, and you might be missing a huge problem that’s a wild exception to previous data because it’s on a report further down the list that you have yet checked.
Statseeker offers a useful report that links MAC addresses with their associated IP and the interface they connect to. This is very handy for troubleshooting, but is only updated every 60 minutes, with no historical data being retained. There are definitely compromises involved in tracking this kind of data, but the ability to look back and say that on a particular day and time, I know where a specific device was plugged in, or which MAC address had a particular IP, could be very powerful. Tie that with the ability to then see what the device did on its connected port on a per-minute data basis, and this could add another dimension to troubleshooting certain kinds of problem.
Picking aside, Statseeker does have a few very neat features. For example, when you look at historical data – say, the last 30 days’ of utilization for a port, in other software I have in the past been presented with the same size chart but with the data for 30 days squeezed into the same x-axis. Statseeker has realized that this just doesn’t work, and they show daily charts all nicely aligned by weekday so that you can easily compare week to week, day to day:
Simple and effective; you can spot patterns very easily this way, and again, the charts appear quickly.
What Gets Monitored?
Statseeker monitors interfaces using a specific set of what they believe are the most common OIDs related to network problems. That list is generally not editable, and that’s kind of the catch throughout the product; while you can customize searches and charting, everything is so optimized in order to provide good performance and data storage, you largely have to accept the decisions made by Statseeker. Whether that’s a bad thing depends on how you view this product. As I said before, it has a relatively limited set of functionality that it appears to perform extremely well. If you’re an inveterate tweaker and customizer though, you’ll want to spend some time with the product and the sales/tech folks in order to determine whether or not you’ll be frustrated down the line.
You provide the hardware to run Statseeker – they don’t offer an appliance. We asked whether it ran well in a VM, and the answer was that while it will run, we should be aware that they use knowledge of the server’s storage controllers to optimize data read/write operations, and those didn’t work on virtualized controllers. In other words, you really want to run this on dedicated hardware. I know, how very 2000s, right?
Do I Need Statseeker?
Statseeker were pretty up front and said that if you have less than 1000 ports to monitor, you’re much better served by other products.1000 ports is where their licensing begins – which would cost somewhere around $8000, and it goes upwards at least to 500,000 ports. It doesn’t just monitor network ports; you can monitor server interfaces, and you can even monitor printer toner status, although whether it’s useful to deploy a tool to monitor toner status by the minute is highly questionable to me. But maybe your toner is more mission critical than mine.
Do you need Statseeker? I have to say that for troubleshooting port behavior, and comparing current activity to previous behavior, this idea of monitoring everything and keeping all the data is really attractive. If you can spend the time browsing for problems and digging into them, it could be a great tool to identify problems. However, it isn’t a proactive tool in my opinion – it isn’t warning you that a port has dramatically changed its behavior; it isn’t giving you a heads up that you’re rapidly heading towards a capacity problem on a link; it isn’t giving you an “at a glance” dashboard on your network status. On the other hand, I’d argue that monitoring the number of ports it can handle, as frequently as it does, while retaining all data indefinitely, gives it a unique place in your NMS arsenal. But whatever happens, you need to be clear that it probably won’t be your only tool.
What Statseeker does, it does well – and it may be the fastest tool I’ve seen so far, with drill downs, data display, charting, and re-sorting all operating amazingly quickly. What it doesn’t do, well, you’re going to have to fill the gap with something else. And if you have less than 1000 ports, frankly you can make mrtg or similar do the same thing without having to scale your NMS server up to an insane degree. Either way, I can see that Statseeker will definitely be on my list of vendors to include next time I’m involved in an a network performance monitoring product selection process.
Watch The Presentation from NFD4
NFD4 presentations are streamed live, then offered for online viewing later. If you’re interested to hear more about the product, have a look at the video of the presentations at the link below, and make up your own mind:
Get Some Other Points of View
Here are some blog posts from other NFD4 delegates so you can get their take on Statseeker:
- Network Management System NMS – Statseeker (Brent Salisbury, @networkstatic)
- When You Get Into A Fight The Person With The Best Notes Wins (Anthony Burke, @pandom_)
- Statseeker – Does It Fit Your Organization? (Paul Stewart, @packetu)
I’ll update this list as I become aware of other posts, and these are good blogs to subscribe to even if Statseeker isn’t your primary interest!
Statseeker was a presenter at Networking Field Day 4, and while I received no compensation for my attendance at this event, my travel, accommodation and meals were provided. I was explicitly not required or obligated to blog, tweet, or otherwise write about or endorse the sponsors, but if I choose to do so I am free to give my honest opinions about the vendors and their products, whether positive or negative.
Please see my Disclosures page for more information.