Data migration is the process of transferring data between storage types, formats and computer systems. It is usually performed programmatically so that the migration is automated and staff are freed from tedious manual work. Database migration becomes necessary when organizations or individuals change their computer systems, upgrade to new ones, or merge existing systems.
As one of the core data management activities, data migration has been around for as long as we have been using computers. Even so, it is often one of the most neglected items on an IT department's to-do list, which results in low-quality data in the target system. As a result of this neglect, studies have shown that an estimated 84% of data migration projects fail due to poor management. A failed data migration project typically shows one or more of the following:
- Breakdown of the target systems
- Poor or low data quality in target environment
- Loss of opportunity for business
- Cost overruns
This article looks at the steps needed to carry out a large-scale data migration project successfully. These steps have been tested over the years in real migration projects, and you are free to customize them to your needs.
Step 1: Investment in Profiling Source Data
The first key to a successful database migration is well-understood source data. You need to understand the characteristics of your source data in order to identify undocumented data relationships, data volume, data quality and data anomalies. Data profiling gives you an x-ray view of your source data sets and helps you understand their strengths and weaknesses. This investment directly affects the effectiveness of downstream processes and software code components. It is also important to define the scope of the data profiling exercise upfront in order to avoid overspending on it. You can do basic data profiling with your own scripts. However, if you are working with large and complex data sets, it is worth investing in an industrial-strength data profiling tool.
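As an illustration of what a basic home-grown profiling script might look like, here is a minimal Python sketch that uses only the standard library. The function name `profile_rows`, the column names and the sample rows are all hypothetical and stand in for a real source extract. It counts rows, empty values and distinct values per column, which is only a small part of what a dedicated profiling tool would report.

```python
import csv
import io
from collections import Counter


def profile_rows(rows):
    """Return per-column row count, empty-value count, distinct count and top value."""
    profile = {}
    total = 0
    for row in rows:
        total += 1
        for column, value in row.items():
            stats = profile.setdefault(column, {"nulls": 0, "values": Counter()})
            # Treat missing and blank values as nulls.
            if value is None or value.strip() == "":
                stats["nulls"] += 1
            else:
                stats["values"][value.strip()] += 1
    return {
        column: {
            "rows": total,
            "nulls": stats["nulls"],
            "distinct": len(stats["values"]),
            "most_common": stats["values"].most_common(1),
        }
        for column, stats in profile.items()
    }


# Hypothetical sample data; a real script would open the source extract instead.
SAMPLE = """customer_id,country,email
1,DE,a@example.com
2,,b@example.com
3,DE,
3,FR,c@example.com
"""

report = profile_rows(csv.DictReader(io.StringIO(SAMPLE)))
for column, stats in report.items():
    print(column, stats)
```

Even on this toy sample the report surfaces the kind of findings profiling is meant to catch: a repeated `customer_id`, a missing country and a missing email address.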
Step 2: Draw a Model of the Data Migration Process
Data migration can be anything from a single step with just one data source and one target system to highly complex data processing with multiple source systems, multiple steps and multiple targets. To avoid confusion, it is important to draw or lay out a detailed process model showing every step of the data migration process. This artifact serves as the roadmap for moving the data and as an agreement among the stakeholders. The model also serves as an input to the downstream administration, configuration management and software development processes. The migration process model must include interim steps that validate the volume and the quality of the data being transferred through the process. With these embedded checkpoints in place, data analysts can ensure that exceptions stay within accepted limits and that there are no hidden surprises.
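To make the idea of embedded checkpoints concrete, here is a small Python sketch. Everything in it is an assumption made for illustration: the step names, the row counts, the `run_checkpoint` helper and the 1% reject limit are hypothetical, and a real project would agree on its own limits with the stakeholders. Each checkpoint checks that every incoming row is accounted for and that the share of rejected rows stays within the limit.

```python
from dataclasses import dataclass


@dataclass
class CheckpointResult:
    step: str
    rows_in: int
    rows_out: int
    rejected: int
    passed: bool


def run_checkpoint(step, rows_in, rows_out, rejected, max_reject_ratio=0.01):
    """Pass only if all rows are accounted for and rejects stay within the limit."""
    # Volume check: every incoming row was either passed on or explicitly rejected.
    accounted = rows_out + rejected == rows_in
    # Quality check: the reject ratio does not exceed the agreed limit (assumed 1%).
    within_limit = rows_in == 0 or rejected / rows_in <= max_reject_ratio
    return CheckpointResult(step, rows_in, rows_out, rejected, accounted and within_limit)


# Hypothetical counts collected after each step of a three-step process.
results = [
    run_checkpoint("extract", rows_in=10_000, rows_out=10_000, rejected=0),
    run_checkpoint("transform", rows_in=10_000, rows_out=9_950, rejected=50),
    run_checkpoint("load", rows_in=9_950, rows_out=9_700, rejected=250),
]

for result in results:
    status = "OK" if result.passed else "STOP AND INVESTIGATE"
    print(f"{result.step}: {result.rows_out}/{result.rows_in} rows, "
          f"{result.rejected} rejected, {status}")
```

In this made-up run the load step rejects about 2.5% of its rows, so its checkpoint fails and the problem is caught before anyone signs off on the target data.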
Step 3: Define Roles & Responsibilities Upfront
Data migration is complex and daunting work that involves every stakeholder as well as IT managers and team leaders. A formal handshake at every critical step is imperative. As part of project planning, the engineers and the project manager must identify all the possible roles and define the responsibilities that go with each role. Later, the project manager should formally assign the roles to individual staff members. When roles and responsibilities are assigned upfront, project leaders can make sure that the entire data migration life cycle is supported and that appropriate accountability is established.
Step 4: Divide, then Conquer
As with any large and complex task, data migration should be clearly divided into logical groupings of data, such as business section, geography or cost. The choice of groupings depends largely on the business context of the data migration task. It is recommended to migrate the smallest data set first and move on to larger data sets later. With this approach, the team can learn and fine-tune the migration process quickly and with minimal risk. The migration of each data set can be treated as a release, which reduces the communication pressure on the team. Every data release should be followed by a formal evaluation: document the findings, educate the team, and then refine the process for the next release.
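The smallest-first ordering is simple enough to sketch in a few lines of Python. The data set names and row counts below are hypothetical, as is the `plan_releases` helper. It assumes the data has already been grouped, here by geography, and that an approximate row count is known for each group.

```python
def plan_releases(dataset_sizes):
    """Return (release_number, name, rows) tuples, smallest data set first."""
    ordered = sorted(dataset_sizes.items(), key=lambda item: item[1])
    return [(number, name, rows) for number, (name, rows) in enumerate(ordered, start=1)]


# Hypothetical groupings by geography with approximate row counts.
sizes = {
    "emea_orders": 4_200_000,
    "apac_orders": 350_000,
    "latam_orders": 1_100_000,
}

plan = plan_releases(sizes)
for number, name, rows in plan:
    print(f"Release {number}: {name} ({rows:,} rows)")
```

Each entry in the plan can then be run, evaluated and documented as its own release before the next, larger one starts.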
Step 5: Investing in Technology and Training
If you are dealing with large-scale data migration projects, it is best to invest in technology or tools that have been proven over the years. Some tools available on the market offer strong automation, metadata collection, scheduling and error handling. In addition to having the tools, it is necessary to invest in formal training for the IT staff who are responsible for developing and executing the data migration. Investing in training reduces the risk when dealing with real projects. Such training also gives the IT staff an opportunity to talk to the vendor's technical support staff and clear up any confusion.
Step 6: Run Performance Test
In any large-scale database migration, the size of the data sets being moved from the source to the destination data stores can be overwhelming. Due to business or operational requirements, the majority of data migration projects have a short, predefined time window for moving the data. The code components therefore need to perform at an acceptable level. Always conduct a full performance test of the code components at production scale. The staff involved in the migration should keep fine-tuning the software or tool and its configuration parameters until the desired throughput is achieved. Repeated performance testing also helps the IT staff become familiar with the technology and the database migration process.
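A performance test ultimately answers one question: at the measured throughput, does the full data volume fit into the available time window? The Python sketch below shows one way to measure rows per second and turn that into an estimated duration. The function names are hypothetical, and `fake_migrate` is only a stand-in workload so the example runs on its own. In a real test it would be replaced by the actual migration code running against production-scale data.

```python
import time


def measure_throughput(migrate_batch, batches):
    """Run migrate_batch over every batch and return (rows, seconds, rows_per_second)."""
    start = time.perf_counter()
    rows = 0
    for batch in batches:
        migrate_batch(batch)
        rows += len(batch)
    elapsed = time.perf_counter() - start
    rate = rows / elapsed if elapsed > 0 else float("inf")
    return rows, elapsed, rate


def hours_needed(total_rows, rows_per_second):
    """Estimate how many hours a full run takes at the measured throughput."""
    return total_rows / rows_per_second / 3600


def fake_migrate(batch):
    """Stand-in workload; replace with the real migration step."""
    sum(len(str(record)) for record in batch)


# 50 batches of 1,000 hypothetical records each.
batches = [list(range(i, i + 1000)) for i in range(0, 50_000, 1000)]
rows, elapsed, rate = measure_throughput(fake_migrate, batches)

print(f"Migrated {rows} rows in {elapsed:.3f} s ({rate:,.0f} rows/s)")
print(f"Estimated time for 36,000,000 rows: {hours_needed(36_000_000, rate):.2f} h")
```

For example, at a sustained 10,000 rows per second, 36 million rows take about one hour. If the estimate does not fit the migration window, that is the signal to keep tuning and test again.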