This portal is to open public enhancement requests against IBM System Storage products. To view all of your ideas submitted to IBM, create and manage groups of Ideas, or create an idea explicitly set to be either visible by all (public) or visible only to you and IBM (private), use the IBM Unified Ideas Portal (https://ideas.ibm.com).
Hi
We had an NFS Ganesha node that failed, and we were not able to recover it. Only a reboot solved it.
Sysmon attempted to remove the F flag three times, but all attempts to acquire the lock failed.
As a result, the node remained in a failed state despite being healthy.
Restarting the node allowed the process to retry successfully once the lock became available.
NFS failure: The RPC null checks and the stat checks both failed, which is why the node was reported as nfs_not_active. The stat check collects the I/O count twice and fails if the two values are the same. This was fixed soon afterward.
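The stat check described above can be sketched as follows. This is only an illustration of the "sample twice, fail if unchanged" rule as stated in this post, not the monitor's actual code. The names `stat_check`, `read_io_count`, and `interval_s` are hypothetical.

```python
import time
from typing import Callable


def stat_check(read_io_count: Callable[[], int], interval_s: float = 0.0) -> bool:
    """Return True (healthy) if the I/O counter changed between two samples.

    read_io_count is a hypothetical callable that returns the current
    I/O count; interval_s is an assumed pause between the two samples.
    """
    first = read_io_count()
    time.sleep(interval_s)
    second = read_io_count()
    # Per the description: identical values mean the test fails.
    return first != second


# Simulated counters: one that does not move, one that does.
stuck = iter([100, 100])
moving = iter([100, 250])
stuck_result = stat_check(lambda: next(stuck))
moving_result = stat_check(lambda: next(moving))
```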
Why the node stayed in a failed state even though NFS was healthy: To remove the failed state, we need a failover lock, and we did not get it. The monitor tries three times. Later, when the lock became available, the attempts had already been exhausted. This is working as designed, but ideally we may want to redesign it.
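The retry exhaustion described above can be illustrated with a small simulation. It is a sketch under assumptions, not the product's implementation: `clear_failed_flag`, `make_lock`, and the attempt counts other than the three tries mentioned in this post are hypothetical. The larger retry budget in the last line is only one possible direction for a redesign, which the post itself leaves open.

```python
from typing import Callable


def clear_failed_flag(try_lock: Callable[[], bool], max_attempts: int = 3) -> bool:
    """Return True if the lock was acquired within max_attempts tries."""
    for _ in range(max_attempts):
        if try_lock():
            return True  # lock acquired, failed state can be removed
    return False  # attempts exhausted, node stays in failed state


def make_lock(available_from_attempt: int) -> Callable[[], bool]:
    """Hypothetical lock that becomes free on the Nth acquisition attempt."""
    calls = {"n": 0}

    def try_lock() -> bool:
        calls["n"] += 1
        return calls["n"] >= available_from_attempt

    return try_lock


# The lock frees up on the 5th attempt, after the 3 tries are used up.
node_still_failed = not clear_failed_flag(make_lock(5), max_attempts=3)
# With a larger (assumed) retry budget the same lock would be acquired.
recovered_with_more_tries = clear_failed_flag(make_lock(5), max_attempts=10)
```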
Idea priority: Medium
Hi
Can we add monitoring of the following errors in the ganesha.log file to the mmhealth monitor (or have all CRIT messages monitored)?
2025-01-26 12:20:48 : epoch 001c0612 : ess4-proto6 : gpfs.ganesha.nfsd-2163267[svc_4386] fsal_find_fd :FSAL :CRIT :Open for locking failed for access Read/Write
2025-01-26 12:20:48 : epoch 001c0612 : ess4-proto6 : gpfs.ganesha.nfsd-2163267[svc_2687] fsal_find_fd :FSAL :CRIT :Open for locking failed for access Read/Write
2025-01-26 12:20:48 : epoch 001c0612 : ess4-proto6 : gpfs.ganesha.nfsd-2163267[svc_4005] fsal_find_fd :FSAL :CRIT :Open for locking failed for access Read/Write
2025-01-26 12:20:49 : epoch 001c0612 : ess4-proto6 : gpfs.ganesha.nfsd-2163267[svc_4612] fsal_find_fd :FSAL :CRIT :Open for locking failed for access Read/Write
2025-01-26 16:07:25 : epoch 0010060f : ess4-proto2 : gpfs.ganesha.nfsd-2657517[svc_749] mdcache_lru_fds_available :INODE LRU :CRIT :FD Hard Limit (943718) Exceeded (open_fd_count = 943719), waking LRU thread.
2025-01-26 16:07:25 : epoch 0010060f : ess4-proto2 : gpfs.ganesha.nfsd-2657517[svc_989] mdcache_lru_fds_available :INODE LRU :CRIT :FD Hard Limit (943718) Exceeded (open_fd_count = 943723), waking LRU thread.
2025-01-26 16:07:25 : epoch 0010060f : ess4-proto2 : gpfs.ganesha.nfsd-2657517[svc_752] mdcache_lru_fds_available :INODE LRU :CRIT :FD Hard Limit (943718) Exceeded (open_fd_count = 943723), waking LRU thread.
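To make the request concrete, here is a minimal standalone sketch of picking CRIT entries out of lines in the format shown above. It is not an mmhealth extension and does not use any mmhealth interface. The regular expression is inferred only from the sample lines in this post, and `LINE_RE`, `count_crit`, and `SAMPLE` are hypothetical names.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern inferred from the sample lines above (an assumption, not a spec):
# timestamp : epoch <id> : <host> : <proc>[<thread>] <func> :<component> :<level> :<msg>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) : epoch (?P<epoch>\S+) : "
    r"(?P<host>\S+) : (?P<proc>[^\[\s]+)\[(?P<thread>[^\]]+)\] "
    r"(?P<func>\S+) :(?P<component>[^:]+?) :(?P<level>\w+) :(?P<msg>.*)$"
)


def count_crit(lines: Iterable[str]) -> Counter:
    """Count CRIT entries per (host, function); skip lines that do not parse."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "CRIT":
            counts[(match.group("host"), match.group("func"))] += 1
    return counts


# Two of the lines quoted in this post, used as sample input.
SAMPLE = [
    "2025-01-26 12:20:48 : epoch 001c0612 : ess4-proto6 : "
    "gpfs.ganesha.nfsd-2163267[svc_4386] fsal_find_fd :FSAL :CRIT :"
    "Open for locking failed for access Read/Write",
    "2025-01-26 16:07:25 : epoch 0010060f : ess4-proto2 : "
    "gpfs.ganesha.nfsd-2657517[svc_749] mdcache_lru_fds_available :INODE LRU :CRIT :"
    "FD Hard Limit (943718) Exceeded (open_fd_count = 943719), waking LRU thread.",
]

crit_counts = count_crit(SAMPLE)
```

Counting per host and function keeps bursts of identical messages, such as the repeated fsal_find_fd lines, from producing one alert each.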