IBM System Storage Ideas Portal


This portal is to open public enhancement requests against IBM System Storage products. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).


Status Delivered
Created by Guest
Created on Sep 26, 2012
Merged idea
This idea has been merged into another idea. To comment or vote on this idea, please visit SCSI-I-1033 Better hardware alerting \ callhome.

Hardware monitoring and alerting of SVC nodes Merged

The SVC software detects a failed memory DIMM at boot.
During operation, however, the IMM sometimes predicts hardware failures: the DIMM is still working correctly but is starting to fail. SVC does not currently detect this predictive failure. It would be helpful if SVC code caught it and raised an alert accordingly. That would allow us to plan maintenance instead of doing an urgent replacement when the node is already impacted.

Currently, the only way to detect this situation is to look at the amber LED on the SVC node and the LDP.

We have only experienced this scenario with a DIMM, so we are not sure how it is handled for other node hardware components (power supply, AC drop, fans, CPU, ...), but this request for node hardware monitoring applies to all components.
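
As a rough illustration of the requested behavior, the sketch below turns per-component health states into alerts, with a predictive failure reported as a warning before it becomes a hard failure. This is not SVC code: the `Health` states, the `ComponentStatus` record, the severity mapping, and the node and component names are all hypothetical and only show the idea.

```python
from dataclasses import dataclass
from enum import Enum


class Health(Enum):
    OK = "ok"
    PREDICTIVE_FAILURE = "predictive_failure"  # still working, but starting to fail
    FAILED = "failed"


@dataclass(frozen=True)
class ComponentStatus:
    node: str       # hypothetical node name
    component: str  # e.g. "DIMM 3", "PSU 1", "fan 2"
    health: Health


# Hypothetical mapping from health state to alert severity.
SEVERITY = {
    Health.PREDICTIVE_FAILURE: "warning",  # plan a maintenance window
    Health.FAILED: "error",                # urgent replacement
}


def build_alerts(statuses):
    """Return one alert line per component that is not healthy."""
    alerts = []
    for s in statuses:
        severity = SEVERITY.get(s.health)
        if severity is None:
            continue  # healthy component, nothing to report
        alerts.append(f"{severity}: {s.node} {s.component} is {s.health.value}")
    return alerts


if __name__ == "__main__":
    sample = [
        ComponentStatus("node1", "DIMM 3", Health.PREDICTIVE_FAILURE),
        ComponentStatus("node1", "fan 2", Health.OK),
        ComponentStatus("node2", "PSU 1", Health.FAILED),
    ]
    for line in build_alerts(sample):
        print(line)
```

The point of the separate warning severity is that a predictive failure can be routed through the usual alerting path early enough to schedule the replacement.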

Idea priority Medium
  • Guest
    Jun 12, 2015

    Due to processing by IBM, this request was reassigned to have the following updated attributes:
    Brand - Servers and Systems Software
    Product family - Storage
    Product - IBM System Storage SAN Volume Controller (SVC) / Spectrum Virtualize

    For record keeping, the previous attributes were:
    Brand - Tivoli
    Product family - Storage
    Product - IBM System Storage SAN Volume Controller (SVC) / Spectrum Virtualize

  • Guest
    Jan 31, 2013

    We have also experienced a hardware (memory) issue that went unnoticed until someone physically saw the attention light on. In another instance, we found a node unbootable in the middle of a code upgrade because it had a defective USB drive in the front panel.

    Hardware monitoring is an expected feature of any enterprise storage device. Please implement this. In today's environments you cannot expect equipment admins to be close to the equipment they manage.

  • Guest
    Nov 2, 2012

    During a random visit to the computer room, we noticed an amber light on 2 SVC nodes. The SVC management interface did not report any error, and we did not receive an alert.

    We asked IBM to investigate, and it appeared that a memory DIMM had failed or was about to fail. IBM couldn't tell us specifically what the issue was. We don't know how long the issue had been present. We scheduled a maintenance window to replace the DIMMs.

    If no one had seen the LED, we would only have found out when the DIMM problem actually impacted the node by causing a crash.

    SVC code should alert us, through the usual error alerting, that a node component needs to be replaced, so that we can be proactive rather than have a node crash and then suffer a longer node outage for repair.

  • Guest
    Oct 30, 2012

    Another consideration is that the node is rebooted during a code upgrade. If the DIMM is marginal, the node may not complete the reboot, and you are then doing a hardware replacement in the middle of a code upgrade.

  • Guest
    Oct 10, 2012

    To improve customer confidence in hardware reliability, I think this RFE is really necessary. Thanks in advance for your support.
