Jim Lawrence
accessd at shaw.ca
Fri Oct 28 07:32:38 CDT 2005
Hi John: You seem to have to process well in hand so I doubt if there is anything that I could help you with. (Data migration has always been my strong suit.) Traditionally, I am leery of having clients managing data importing as it can lead to a lot of extra work; maybe not a bad thing, from the developers perspective but some clients take it very personally, when errors are brought to their attention and costs are spiraling up. Cleaning up dirty data can be very time consuming. Here are my ten commandments, steps for data importing and I invite any comments for improvement. 1. Clean up the basic dirty data at the legacy database end. This type of data is mostly blank or unfinished records. Most of that work can and should be done by the client. This would be a good spot for your routines. 2. Pull all the raw data into tables in the destination database in the same table layout as the source structure. This will require a set of basic import queries or DTS routines. Save these import queries for later. 3. With the clients help, create the definitive table and relational structure of the new database; the schema. 4. Build a set of Queries that will create and remove the KEYS and Relationships. Separate the Create and Remove queries into two groups and save for later. 5. Build the queries that will extract the data from the raw tables and manipulate the data into it final structures. Start with just SELECT queries. Watch for duplications where there should not be any like 'two customers with the same id'. These issues may have to be handled at the legacy data end as the client has to be made aware. 6. Populate the root tables first; i.e. Inventory table as their new keys will be used to generate the complex and compound table; i.e. Invoice table. 7. Some custom tables may have to be created at this point that will hold new unique data that will be used in conjunction with the import routines; i.e. accounting or ledger codes. 8. When the SELECT statements are finally manipulating the raw data into the correct form it is time to replace the SELECT statement with the INSERT statement or add the INSERT statement. Be sure to remove all the KEYS and RELATIONSHIPS before testing the data importing as it can slow things dramatically. 9. When the raw data imports cleanly, the queries that created the keys and relationships can be used. Steps 4 to 7 may have to be repeated a few times until the results are clean. 10. Now it is time to create the master import form or procedure. When run the following process will take place: a.) All tables will be cleared, keys and relationships removed. b.) The raw import queries will be ran populating the raw tables. c.) The queries that import the data from the raw to the destination tables will be run, in order, from simple tables (ones with one key) to the complex tables (ones with multiple indexes or foreign keys). d.) The queries that will add the keys and relationships will then be run. When this process can run with just one button the job is complete. Time to go and create all the forms and reports. This is my two cents worth. Jim -----Original Message----- From: accessd-bounces at databaseadvisors.com [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of John Colby Sent: Thursday, October 27, 2005 8:54 PM To: 'Access Developers discussion and problem solving' Subject: [AccessD] Is anyone migrating data? I am right in the middle of a data migration job. I am building a simple two table (so far) tool to assist me in organizing the data migration. This will be a recurring migration, fairly complex. The client hired people to build the system and get the data migrated once but that developer did not document the process, nor save the queries etc. Thus I am having to learn the whole process from scratch using the "sit with the client and ask questions" method. They want it documented this time naturally. I have migrated many different databases over the years but mostly the migrations were "one-shot" migrations designed to get the data from denormalized tables in to a new system I was building to replace the old system. In this case, the old system will stay (mostly) with the same data having to be imported every month, or quarter - something like that. The tool is pretty simple, just a form to list the table names, in the order the tables need to be migrated, and some attributes to describe what that step is doing, then a child table to hold the query / SQL statement / code to run to migrate the data into that table. Anyone who has done this stuff knows that the process is usually order sensitive in terms of which tables get migrated when, and also the order that the queries are run for any given table. My objective is just to document the table / order / requirement / data source, plus the queries / order used for each table, with comments. In the end I will have a little program that pulls each record and applies the queries in the order they are in the table, such that the process is "push button" (with luck) and the client can do it themselves. Documentation will be in the comments in the table and can be pulled into a report. If anyone else is doing this kind of stuff and wishes to collaborate with me, contact me offline. John W. Colby www.ColbyConsulting.com Contribute your unused CPU cycles to a good cause: http://folding.stanford.edu/ -- AccessD mailing list AccessD at databaseadvisors.com http://databaseadvisors.com/mailman/listinfo/accessd Website: http://www.databaseadvisors.com