Let’s look at how we can plan the migration of a Big Data Platform to a new data center.
High Level Migration Plan
- Hardware Procurement for your new data center
- Infrastructure Build: server racking, cabling, OS installation and configuration of your new cluster (master, slaves and gateway nodes)
- Network: Firewall rules, F5 configuration, DR and Business Continuity
- Installation of the Hadoop Platform: what flavor will you use? e.g. Cloudera, Hortonworks, IBM…
- Grant access to all the relevant user accounts
- Configuration and deployment of all the artifacts
- Smoke testing and connectivity testing
- Regression testing: prove that your ingestion-enrichment-extraction process works as expected.
- Data Migration – How many TB will you copy and how long will it take?
- Migration Event (mini sequence of events)
  - Shut down the source environment
  - Copy the delta data
  - Cutover to the new environment
  - Test and validate all the business applications
- Decommission the source environment
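For the Data Migration step, a back-of-the-envelope calculation gives a first answer to "how long will it take?". The sketch below is only illustrative: the function name `estimate_copy_hours` and the `efficiency` factor (the share of the nominal link speed you actually achieve) are assumptions, and the real figure should come from a measured test copy.

```python
def estimate_copy_hours(data_tb: float, link_gbps: float, efficiency: float = 0.6) -> float:
    """Estimate the hours needed to copy `data_tb` terabytes over a network link.

    `efficiency` is an assumed fraction (0 to 1) of the nominal bandwidth
    that the copy really achieves; 0.6 is a placeholder, not a measurement.
    """
    if data_tb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("expected data_tb >= 0, link_gbps > 0 and 0 < efficiency <= 1")
    total_bits = data_tb * 1e12 * 8                 # decimal terabytes to bits
    effective_bits_per_second = link_gbps * 1e9 * efficiency
    return total_bits / effective_bits_per_second / 3600


if __name__ == "__main__":
    # Hypothetical figures: 500 TB over a 10 Gbps link at 60% efficiency.
    hours = estimate_copy_hours(500, 10, 0.6)
    print(f"{hours:.0f} hours, about {hours / 24:.1f} days")
```

With these hypothetical figures the copy takes roughly 185 hours, which is why the bulk copy usually runs ahead of the migration event and only the delta is copied during the cutover.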
The plan above covers the main steps, but you will also need to identify the critical pre-requirements for a successful migration.
Pre-Requirements
The list below is generic, so you will need to reassess each item on a case-by-case basis.
- Data Validation: how do you prove that all the thousands of tables containing billions of records were successfully copied?
- Connectivity: what if the new data center is behind a firewall? Are your interface and application lists accurate and up to date?
- Performance: in addition to any new security features, moving to a new geographical location will affect latency and throughput. Can your applications cope with that?
- New Releases: a migration project takes several months, and in the meantime business releases are likely to add new functionality and new connections to the platform. Keep track of every release so that none of them is missed.
- Business Testing: inform all the application owners about the project upfront. Ask them to provide their testing success criteria and to ring-fence their resources for connectivity and business testing when required.
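For the Data Validation point, one common approach is to compute a fingerprint per table (row count plus an order-independent checksum) on both clusters and compare the two sets. The sketch below works on in-memory rows purely for illustration. The names `table_fingerprint` and `find_mismatches` are hypothetical, and on a real platform the fingerprints would more likely be produced by queries run on each cluster.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Fingerprint = Tuple[int, int]  # (row_count, checksum)


def table_fingerprint(rows: Iterable[tuple]) -> Fingerprint:
    """Return (row_count, checksum) for a table, independent of row order."""
    count = 0
    checksum = 0
    for row in rows:
        digest = hashlib.sha256(repr(row).encode("utf-8")).digest()
        # Adding per-row hashes modulo 2**64 gives the same total in any order.
        checksum = (checksum + int.from_bytes(digest[:8], "big")) % (1 << 64)
        count += 1
    return count, checksum


def find_mismatches(source: Dict[str, Fingerprint],
                    target: Dict[str, Fingerprint]) -> List[str]:
    """Return the sorted names of source tables that are missing or differ on the target."""
    return sorted(name for name, fp in source.items() if target.get(name) != fp)


if __name__ == "__main__":
    # Hypothetical sample data: same rows, different order on the target.
    source = {"orders": table_fingerprint([(1, "a"), (2, "b")])}
    target = {"orders": table_fingerprint([(2, "b"), (1, "a")])}
    print(find_mismatches(source, target))  # prints []
```

Matching row counts alone do not prove the content is identical, which is why the checksum is included. A full column-by-column comparison is stronger still, but it is far more expensive across billions of records.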