IBM System Storage Ideas Portal



Status Functionality already exists
Created by Guest
Created on Sep 24, 2024

Monitoring and Alerting Local Filesystems Usage in Advance for IBM 9500 Nodes

Recently, node error 565 ("Node internal disk is failing") was raised because /dumps was full. We then discovered the impact: the nodes were reported in service state and had to be rebooted.


panel_name  cluster_id        cluster_name  node_id  node_name  relation  node_status  error_data
01-2        00000XXXXXXXXXXX  Flash840_Sec  2        node2      local     Service      565 Disk full: /dumps
01-1        00000XXXXXXXXXXX  Flash840_Sec  3        node1      partner   Active


There is no mechanism to monitor these filesystems and alert in advance when usage crosses a threshold (>80%). Because no alert was raised and no cleanup action was taken, the filesystem reached 99% and impacted the node services.


This issue can be addressed by providing an appropriate monitoring and alerting mechanism that notifies customers and the IBM support team in advance.


Please treat this idea as highly critical.


superuser>fs_usage
Filesystem                  Size  Used  Avail  Use%  Mounted on
/dev/mapper/SVC_Encrypted1  8.0G  637M  7.0G   9%    /
/dev/mapper/SVC_Encrypted5  14G   2.7M  13G    1%    /tmp
/dev/mapper/SVC_Encrypted4  18G   4.1G  13G    25%   /opt
/dev/md2p4                  101G  66G   30G    70%   /dumps
/dev/md2p5                  1.4G  91M   1.2G   8%    /var
/dev/md2p2                  6.7G  281M  6.1G   5%    /upgrade
/dev/mapper/SVC_Encrypted2  3.3G  468M  2.7G   15%   /compass
/dev/md2p3                  2.0G  48M   1.8G   3%    /data
/dev/mapper/SVC_Encrypted3  8.0G  4.1M  7.6G   1%    /home
/dev/sda2                   192G  70M   182G   1%    /hdata1
/dev/sdb2                   192G  28K   182G   1%    /hdata2
/dev/sdc1                   7.4G  260K  7.4G   1%    /run/do_usb_16087
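The kind of threshold check requested here could look roughly like the following. This is only a minimal sketch, not an IBM-provided tool: it assumes the usage report is available as df-style text in the layout shown in this post, and the function names `parse_fs_usage` and `over_threshold` are hypothetical.

```python
# Sketch only: assumes df-style text with columns
# Filesystem, Size, Used, Avail, Use%, Mounted on (as in the output in this post).
SAMPLE = """\
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/SVC_Encrypted1 8.0G 637M 7.0G 9% /
/dev/md2p4 101G 66G 30G 70% /dumps
/dev/md2p5 1.4G 91M 1.2G 8% /var
"""


def parse_fs_usage(text):
    """Return a list of (mount_point, percent_used) pairs."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[4].endswith("%"):
            continue  # blank line, prompt line, or unexpected layout
        try:
            percent = int(fields[4].rstrip("%"))
        except ValueError:
            continue  # header row ("Use%") has no numeric value
        rows.append((fields[5], percent))
    return rows


def over_threshold(rows, threshold=80):
    """Return the filesystems whose usage is strictly above the threshold."""
    return [(mount, pct) for mount, pct in rows if pct > threshold]


if __name__ == "__main__":
    for mount, pct in over_threshold(parse_fs_usage(SAMPLE), threshold=60):
        print(f"WARNING: {mount} is at {pct}% used")
```

With the 80% threshold mentioned above, nothing in this sample would be flagged; /dumps at 70% would only be reported with a lower threshold. How the alert is delivered (event log, email, call home) is left open here.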


Monitoring, alerting, and metrics should be available for the infrastructure and should include:

  • CPU (aggregate and by core) - % used

  • Memory (aggregate and by partition to function) - % used

  • Disk (all disks needed to operate; array disks used as storage capacity are not included here) - % used

  • Network Rates (again, aggregate to the array and specific by purpose) - bytes

  • Network Errors (same) - count

  • Network Latency (same) - time

  • Network (same) - % used







Idea priority Urgent
  • Admin
    Philip Clark
    Oct 25, 2024

    Filesystem monitoring is addressed in another Idea.

    The other metrics referenced (CPU, memory, network, etc.) are already available via call home telemetry, or via a support snap when debugging network issues.

  • Guest
    Oct 2, 2024

    Checking in on this idea. The lack of observability on the platform is a serious concern for us. When will this be reviewed?