IBM System Storage Ideas Portal



Status Not under consideration
Created by Guest
Created on Feb 23, 2017

Optimize migration algorithm of sequential storage pool to collocated storage pool

Short:
When migrating data from a file pool to a collocated tape pool, a large number of mounts occurs in the tape pool, causing throughput to drop and preventing maintenance from completing in time.

Setup:
Data flow:
Client > (backup) > file pool > (migration) > tape pool
Both the file pool and the tape pool use collocation=group. The file pool uses 50 GB volumes that are not preallocated.
Around 60 collocation groups are defined, each about 6 TB in size; about 600 clients are registered on the server.

Problem:
During migration from the file pool to the tape pool, we found that the migration process issues a large number of tape mounts in the destination device class. Each tape is written to for only a short period of time and then dismounted again. This results in low throughput for the migration process, since it spends much of its time waiting for tapes to be mounted, and it also leaves the tape library with a large mount queue.
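
As a rough back-of-the-envelope illustration of the mount amplification (the figures use the setup numbers above; the worst-case assumption is ours, not a measurement):

    # Rough worst-case estimate, assuming each migrated 50 GB file volume
    # can force a (re)mount of a different destination tape when source
    # volumes are processed in access-date order rather than per group.
    group_size_tb = 6
    file_vol_gb = 50
    groups = 60

    vols_per_group = group_size_tb * 1024 // file_vol_gb  # ~122 file volumes
    worst_case_mounts = vols_per_group * groups           # ~7,320 tape mounts

    print(vols_per_group, worst_case_mounts)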

Cause (in our opinion):
From the manual on how migration works:
When migrating data from a random disk storage pool to a sequential storage pool, and collocation is by node or file space, nodes or file spaces are automatically selected for migration based on the amount of data to be migrated. The node or file space with the most data is migrated first. If collocation is by group, all nodes in the storage pool are evaluated to determine which node has the most data. The node with the most data is migrated first along with all the data for all the nodes that belong to that collocation group. This process takes place, regardless of how much data is stored in the file spaces of nodes and regardless of whether the low migration threshold was reached.

However, when migrating collocated data from a sequential storage pool to another sequential storage pool, the server orders the volumes according to the date when the volume was last accessed. The volume with the earliest access date is migrated first, and the volume with the latest access date is migrated last.
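
To make the difference concrete, here is a minimal Python sketch of the two selection strategies as the manual describes them; the Node and Volume types and the function names are illustrative, not actual TSM internals:

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class Node:
        name: str
        group: str           # collocation group the node belongs to
        stored_bytes: int    # amount of data the node has in the source pool

    @dataclass
    class Volume:
        name: str
        last_access: date    # date the volume was last accessed

    def order_from_random_pool(nodes, collocate_by="group"):
        """Random disk pool -> sequential pool: the node with the most data
        is migrated first; with collocation by group, its whole group comes
        along with it."""
        biggest = max(nodes, key=lambda n: n.stored_bytes)
        if collocate_by == "group":
            return [n for n in nodes if n.group == biggest.group]
        return [biggest]

    def order_from_sequential_pool(volumes):
        """Sequential pool -> sequential pool: volumes are processed oldest
        access date first, regardless of which collocation group they hold."""
        return sorted(volumes, key=lambda v: v.last_access)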

This behavior makes sense if the source pool is a random disk pool or a "real" sequential tape pool. If the source pool is a sequential file pool, however, the volumes ordered by access date typically belong to many different collocation groups, so nearly every source volume can force a different destination tape to be mounted; in that case it would be much better to use the random disk pool behavior.

Suggestion:
Change the migration algorithm in the following case:
If migration detects that
  • the source pool is sequential and collocated, and
  • the destination pool is sequential and collocated,
then
the node or file space with the most data is migrated first. If collocation is by group, all nodes in the storage pool are evaluated to determine which node has the most data. The node with the most data is migrated first, along with all the data for all the nodes that belong to that collocation group. A sketch of this rule follows below.
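
A minimal sketch of the proposed rule (the Pool type and its field names are ours for illustration, not TSM internals):

    from dataclasses import dataclass

    @dataclass
    class Pool:
        is_sequential: bool
        collocated: bool

    def choose_selection(source: Pool, dest: Pool) -> str:
        """Proposed rule: if both pools are sequential and collocated, select
        by node/collocation group (as for random disk pools) instead of
        ordering source volumes by last access date."""
        if source.is_sequential and source.collocated \
                and dest.is_sequential and dest.collocated:
            return "group-aware selection"   # requested behavior
        return "access-date volume order"    # current behavior

    # Example: file pool (sequential, collocated) -> tape pool (sequential,
    # collocated) would use the group-aware selection.
    print(choose_selection(Pool(True, True), Pool(True, True)))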

Idea priority Urgent
  • Guest (Mar 1, 2017)

    I would like to add that, if the source pool is sequential and on disk, data movement should in my opinion always be per file space and not per node or collocation group,
    because when writing to tape, each change of backup date, node name, file space, or even management class triggers a new transaction
    -> which triggers a tape commit
    -> which triggers the tape buffer being written completely to tape
    -> which triggers a backhitch (stop of tape, rewind, reaccelerate)
    -> which results in poor performance.
    And once the data is written to tape, every subsequent tape operation such as reclamation or MOVE DATA also suffers from this poor performance.
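
    As a small sketch of why the write order matters (the object fields mirror the triggers listed above; all names are illustrative):

        from dataclasses import dataclass

        @dataclass(frozen=True)
        class Obj:
            backup_date: str
            node: str
            filespace: str
            mgmt_class: str

        def count_tape_commits(objects):
            """Each change of (backup date, node, file space, management
            class) starts a new transaction, i.e. one tape commit and one
            backhitch."""
            commits, prev = 0, None
            for o in objects:
                key = (o.backup_date, o.node, o.filespace, o.mgmt_class)
                if key != prev:
                    commits += 1
                    prev = key
            return commits

        mixed = [Obj("d1", "n1", "fs1", "mc1"),
                 Obj("d1", "n2", "fs1", "mc1"),
                 Obj("d1", "n1", "fs1", "mc1")]
        print(count_tape_commits(mixed))  # 3 commits in mixed order
        print(count_tape_commits(sorted(
            mixed, key=lambda o: (o.node, o.filespace))))  # 2 commits

    Grouping the same objects per file space before writing reduces the number of transaction boundaries, and therefore the number of commits and backhitches.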

    And data movement from disk to tape is the only point in time where TSM is able to rearrange the structure of the data efficiently. So either solve it this way, or TSM should stop doing so many unnecessary tape commits, as requested in this RFE I created:
    http://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe&CR_ID=101746