Provide historical n/w metrics in TS7700 historical view

We recently experienced performance issues in our VTS environment and had trouble isolating the cause. It turned out to be WAN saturation, so our 'sync across the WAN' workloads were the only ones being slowed down. However, we only came to this conclusion because we knew that several of our VTSs were experiencing the problem. There was nothing in the VTS historical panels that would allow us to easily diagnose the issue, and point-in-time commands like STATUS,GRIDLINK didn't give us a historical perspective. We would like additional historical metrics in the VTS performance history panels. These should include network latency at a minimum. Ideally, we would be able to view this on a cluster-to-cluster basis so we could isolate the problem.

Idea priority

Medium

Guest

Jul 1, 2019

The actual problem was WAN saturation causing delayed replication between our two sites. The vast majority of our replication between sites is deferred. However, some workloads (like HSM migration and DB2 log offloads) require sync mode between sites. Those workloads were affected and proceeded very slowly while the WAN was saturated. We suspected network issues but had no trend data to compare the current response times against.

Having network stats available for trending, like those that appear at the link level in the output of the GRIDLINK STATUS commands, would have pointed us to a general network issue. Statistics like latency, packets sent/retransmitted, read/write/total MB/s, etc. would be a good start. While having those at the cluster level would be a great start, having them at the link level would be even better. We've had instances where one link was operating at a much lower level than the others, and being able to trend that could help in diagnosing routing issues out of a specific LAN leg.
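
To illustrate the kind of trending we have in mind, here is a rough sketch in Python. It assumes per-link samples have already been collected somewhere; the record layout (link id, latency in ms, packets sent, packets retransmitted), the function names, and the 2x latency threshold are all made up for illustration and are not actual TS7700 output or an existing tool.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample records: (link_id, latency_ms, packets_sent, packets_retransmitted).
# The layout is an assumption for illustration, not actual TS7700 output.

def summarize_links(samples):
    """Group samples by link; return median latency and overall retransmit ratio per link."""
    by_link = defaultdict(list)
    for link_id, latency_ms, sent, retrans in samples:
        by_link[link_id].append((latency_ms, sent, retrans))

    summary = {}
    for link_id, rows in by_link.items():
        total_sent = sum(r[1] for r in rows)
        total_retrans = sum(r[2] for r in rows)
        summary[link_id] = {
            "median_latency_ms": median(r[0] for r in rows),
            "retransmit_ratio": total_retrans / total_sent if total_sent else 0.0,
        }
    return summary

def degraded_links(baseline, current, latency_factor=2.0):
    """Return links whose current median latency exceeds the baseline by latency_factor."""
    flagged = []
    for link_id, stats in current.items():
        base = baseline.get(link_id)
        if base and stats["median_latency_ms"] > latency_factor * base["median_latency_ms"]:
            flagged.append(link_id)
    return sorted(flagged)
```

With a baseline summary from a normal period and a current summary from the slow period, degraded_links(baseline, current) would list the links whose latency has at least doubled, which is the comparison we could not make without historical data.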


Guest

Jun 28, 2019

Could you provide more specific examples of stats you think would have helped? What was the actual problem and symptom? How did the team figure out why it was occurring?
