Customer Success Story: When Automation Fails

Posted on October 09, 2017

Almost every vendor, Flowmon included, claims its NPMD solution delivers automation, machine learning, context analytics and other modern features. So, it is easy for admins to handle networks today, right? Well, it is not, and the feedback I got from two Level 3+ engineers at a bank with 50 thousand employees proves that sometimes automation is not enough.

“Is there anyone interesting you could suggest to me for a customer testimonial?” I asked the team in our office. As it happened, two Level 3+ engineers from a bank with 50 thousand employees, who had recently become Flowmon customers, were visiting our offices in Brno. After a short hesitation, so typical of technical guys, they agreed to be interviewed.

We are talking about a globally successful banking company – the kind of institution that really needs to keep confidentiality and security in mind. So, it was natural that the interviewed engineers did not want to disclose their names or even the name of the bank.

“We are a team of two Level 3+ engineers. Our job is root cause analysis of network issues that no other team before us could solve,” says the good-looking man of about 40 in a short-sleeved shirt sitting in front of me. Let us call him Tim. He explains that he and his colleague sitting to my right, whom we will call Clarke, are “scene investigation” experts of a sort, a special team that provides answers to “why and how”. They are not involved in fixing bugs, administration or configuration. “We deal with hard, complicated issues. For example, last time we had to figure out why a VPN connection between a customer and the bank, located on different continents, had outages. Neither our network operations center, nor our security operations center, nor any engineering team could deal with that. So, we performed forensics using packet analysis.”

It is highly probable that every CTO would pay their weight in gold for these guys. During our interview I quickly recognized that Tim and Clarke are not really involved in the “automation” and “context rich analysis” game of today’s NPMD scene. The work of these packet analysis experts cannot be automated. It will always be hard manual work. To be honest, I was a bit confused about why they had deployed Flowmon. “Of course, Flowmon has features for packet capture, but it is not a solution for deep packet analysis. So, what were the reasons behind choosing Flowmon?” I asked myself.

Petabytes of data crunched by Wireshark

Sitting down the entire day, digging in Wireshark… this is a nightmare when you have to go through petabytes of data from hundreds of different systems in a complicated and heterogeneous environment. Before actual packet forensics, Tim and Clarke need to narrow down to the point of interest as much as possible - to a specific time-frame or flow data sample - to minimize the mean-time-to-investigate.

“Ten years ago, we built a whole platform to help us do this. It was based on a commercial tool for continuous packet capture, customized open source software for flow monitoring, and an SNMP-based tool showing data transfer heat maps in real time. We were a kind of special unit, so everything was tailored to our specific needs,” says Clarke. Soon it became clear that maintaining, supporting and upgrading the homemade solution to fit everyday operations was too expensive and time-consuming. “So, we looked to NetFlow/IPFIX technology to replace the original solution.”

The Objectives

The problem was that no vendor could cover the features that Tim and Clarke desired. “We looked for a solution to completely replace the original one. Although budget was not an issue, the choice was not as easy as we thought. Sometimes the problem was data aggregation, sometimes the lack of virtualization, but it was always the slowness in terms of the time needed to provide measured results,” says Clarke, naming the most common deal breakers.

I mentioned all the vendors I could remember. Each time, the engineers sitting in front of me said they had already installed that product and that it was in use by the NOC and other operations teams. For their own purpose, though, none of the solutions was good enough. What were the objectives? They needed a solution that:

provided not only context-aware top-level dashboards, but also manual drill-down to any flow.

did not aggregate stored data and kept the raw flows for as long as the storage lasted.

was not necessarily dependent on its own sensors, since they could not afford to be locked into a single technology due to the heterogeneous environment.

could be virtualized, so management and migration would be as flexible as possible.

combined flow monitoring with on-demand full packet capture.

provided, most importantly, outputs of the measured statistics of the flow data faster than the platform they had built themselves ten years ago from an open source tool.

Then the team came across Flowmon. “Since the proof-of-concept project, Flowmon has become the fundamental tool in our set. It has become the root cause investigation workflow itself, as we start with the dashboard, go into top-level statistics, go deeper into the levels of NetFlow and then focus only on a small portion of traffic, for which we have a full packet trace.”
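For readers who like to see such a drill-down in code, here is a minimal Python sketch of the idea: narrow raw flow records to a time window, rank what is left, and derive a packet capture filter for the one flow worth a full trace. It is only an illustration under stated assumptions, not Flowmon's API or the bank's actual tooling. The FlowRecord fields, the helper names and the sample addresses (taken from documentation IP ranges) are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FlowRecord:
    # Hypothetical, simplified flow record; real NetFlow/IPFIX has many more fields.
    start: datetime
    end: datetime
    src: str
    dst: str
    proto: str      # "tcp" or "udp"
    dst_port: int
    octets: int     # bytes transferred in this flow


def flows_in_window(flows, window_start, window_end):
    """Keep only the flows that overlap the time window of interest."""
    return [f for f in flows if f.start <= window_end and f.end >= window_start]


def top_talkers(flows, n=3):
    """Rank flows by transferred bytes, largest first, and keep the top n."""
    return sorted(flows, key=lambda f: f.octets, reverse=True)[:n]


def capture_filter(flow):
    """Build a pcap-filter (BPF) expression matching one flow's endpoints."""
    return f"host {flow.src} and host {flow.dst} and {flow.proto} port {flow.dst_port}"


# Made-up sample data: two flows inside the outage window, one outside it.
flows = [
    FlowRecord(datetime(2017, 10, 9, 10, 0), datetime(2017, 10, 9, 10, 5),
               "192.0.2.10", "198.51.100.20", "udp", 4500, 1_200_000),
    FlowRecord(datetime(2017, 10, 9, 10, 2), datetime(2017, 10, 9, 10, 3),
               "192.0.2.11", "198.51.100.21", "tcp", 443, 300_000),
    FlowRecord(datetime(2017, 10, 9, 12, 0), datetime(2017, 10, 9, 12, 30),
               "192.0.2.12", "198.51.100.22", "tcp", 22, 9_000_000),
]

window_start = datetime(2017, 10, 9, 10, 0)
window_end = datetime(2017, 10, 9, 10, 10)

candidates = flows_in_window(flows, window_start, window_end)
suspect = top_talkers(candidates, n=1)[0]
bpf = capture_filter(suspect)
print(bpf)
```

The resulting expression could then be handed to a capture tool or used to pre-filter a trace opened in Wireshark, so that manual packet analysis starts from one conversation instead of the whole data set.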

Complete network visibility pays off

I always thought of Flowmon as the solution that saves time by automating tasks like detecting incidents, presenting data in an understandable way with out-of-the-box dashboards, etc. From our experience, this is what the market wants in general – everything is becoming faster, and admins need quick, appropriate answers to see the full picture by investigating data on multiple levels: network operations, network security, application, cloud. And everything should be automated as much as possible. For the majority of incidents this is the way to get things done.

For many, though, the ability to browse masses of raw flow data manually and with ease, without interpretation by algorithms and AI, is irreplaceable. Guys like Tim and Clarke do not pay attention to claims about automation, machine learning, etc. To deal with hard-core network issues, the only thing they need is complete visibility into network traffic. And Flowmon is able to provide them with a workflow that delivers it.

It is quite unusual to hear guys with such experience in packet analysis telling you that Flowmon has changed their workflow and helps them to do their work better. They are not typical end-users of Flowmon – and this is why I like this customer testimonial so much. Our technology is made by professionals with passion, for passionate professionals.

Artur Kane

Artur was a Progress employee.
