| Register | FAQ | Calendar | Search | Today's Posts | Mark Forums Read |
|
#1
|
| Had a question about how folks are managing "change management" on their DW projects. One of the biggest problems we face on the DW, is most of our data sources have separate ongoing development and rework of their systems. So we have to be very communicative on change. If a system changes table structure in usually will break our ETL process at the first steps of copying into PreStaging. My ETL was using SQL Server Objects Task to copy table data from the source system into PreStaging inside my SSIS Package. However the first time a table say had 10 columns now has 12 because 2 new were added, it breaks this step. The SMO task was fast however I'm starting to think I need to use Select statements to manage what tables and columns I bring across. If my select statement always asked for the first 10 columns from a table - when someone adds a couple more - the DW will just ignore those columns. What steps do you do to populate PreStaging and manage change management on your data sources? |
|
#2
|
| Hi Joe, I think during dev stages, this is just something you have to take on the chin. Hopefully if the design has been done well, there shouldn't be too much overhead as there won't be that many metadata changes - there will always be some though... Personally, I always develop my tasks to use SQL statements with named columns in the sources. There are a few reasons for this: 1) I like to know exactly what columns are being returned. 2) Usually I'll be joining on 1 or more other tables when retrieving the data, and usually this is quicker in the database. 3) Some conversions and/or transformations are easier to do in the database (particularly date formatting) 4) It's faster. Something to do with the type of connection that SSIS opens on the server ... Google it ![]() The approach of just taking the first columns are added to the middle of the table instead of the end? What if column types or lengths are changed? These things need to be accounted for, and there is really no short-cut better than knowing the structure of the data you're transporting. Yes there are ways to work around this, but in the long run these often end up adding time to the project as things break due to unexpected behaviour. Good luck! J |
|
#3
|
| Rather than querying directly from tables in the data source, instead ask the developers of those systems to implement views that provide you with the data columns you need in an agreed upon format. I've always felt that datamarts and operational data stores should exchange data using well defined specifications or a contract, much like in oject oriented programming. When schemas, meta data, etc. change, then the developers should make whatever revisions are needed in the views to match the specifications. If there is no contract regarding the format and content of the source data to be extracted, then that's the root of the problem. "Joe H" wrote: > Had a question about how folks are managing "change management" on their DW > projects. One of the biggest problems we face on the DW, is most of our > data sources have separate ongoing development and rework of their systems. > So we have to be very communicative on change. If a system changes table > structure in usually will break our ETL process at the first steps of > copying into PreStaging. > > > > My ETL was using SQL Server Objects Task to copy table data from the source > system into PreStaging inside my SSIS Package. However the first time a > table say had 10 columns now has 12 because 2 new were added, it breaks this > step. The SMO task was fast however I'm starting to think I need to use > Select statements to manage what tables and columns I bring across. If my > select statement always asked for the first 10 columns from a table - when > someone adds a couple more - the DW will just ignore those columns. > > > > What steps do you do to populate PreStaging and manage change management on > your data sources? > > > > > > > |
|
#4
|
| It's called project management. No, seriously, it didn't become entirely obsolete with the new millenium, honest! Josh On Thu, 3 Jul 2008 08:26:43 -0700, "Joe H" wrote: >Had a question about how folks are managing "change management" on their DW >projects. One of the biggest problems we face on the DW, is most of our >data sources have separate ongoing development and rework of their systems. >So we have to be very communicative on change. If a system changes table >structure in usually will break our ETL process at the first steps of >copying into PreStaging. > > > >My ETL was using SQL Server Objects Task to copy table data from the source >system into PreStaging inside my SSIS Package. However the first time a >table say had 10 columns now has 12 because 2 new were added, it breaks this >step. The SMO task was fast however I'm starting to think I need to use >Select statements to manage what tables and columns I bring across. If my >select statement always asked for the first 10 columns from a table - when >someone adds a couple more - the DW will just ignore those columns. > > > >What steps do you do to populate PreStaging and manage change management on >your data sources? > > > > > |
|
#5
|
| Someone had commented about making the source pull from views on the outside applications. This really made sense - it puts ownership of those devlopers on those systems to ensure the views don't break or if they do to contact us. I agree it's all project management but was just wondering on other peoples approaches. Some of the applications are in different citys, different govt agencies, ect - so it's a nightmare to keep managment under control. "JXStern" news:jmmh749ue2scgrrjtfj2i28jketn8n7phf-at-4ax.com... > It's called project management. > > No, seriously, it didn't become entirely obsolete with the new > millenium, honest! > > Josh > > > On Thu, 3 Jul 2008 08:26:43 -0700, "Joe H" > wrote: > >>Had a question about how folks are managing "change management" on their >>DW >>projects. One of the biggest problems we face on the DW, is most of our >>data sources have separate ongoing development and rework of their >>systems. >>So we have to be very communicative on change. If a system changes table >>structure in usually will break our ETL process at the first steps of >>copying into PreStaging. >> >> >> >>My ETL was using SQL Server Objects Task to copy table data from the >>source >>system into PreStaging inside my SSIS Package. However the first time a >>table say had 10 columns now has 12 because 2 new were added, it breaks >>this >>step. The SMO task was fast however I'm starting to think I need to use >>Select statements to manage what tables and columns I bring across. If my >>select statement always asked for the first 10 columns from a table - when >>someone adds a couple more - the DW will just ignore those columns. >> >> >> >>What steps do you do to populate PreStaging and manage change management >>on >>your data sources? >> >> >> >> >> > |
|
#6
|
| I once worked on an ETL and datamart reporting project for a federal agency that involved receiving flat file submissions from multiple state agencies and territories (~72 total), and of course each had their own developers or database admins on back end pulling the extracts from a broad range of platforms, so I know what you're talking about. It helped when they started submitting extracts via a standard SDN application we provided. It would check the file format prior to upload and then notify them immediately if anything was out of spec. Before that, the submission / import / feedback / re-submission cycle could take days or even weeks. "Joe H" wrote: > Someone had commented about making the source pull from views on the outside > applications. This really made sense - it puts ownership of those devlopers > on those systems to ensure the views don't break or if they do to contact > us. > > I agree it's all project management but was just wondering on other peoples > approaches. Some of the applications are in different citys, different govt > agencies, ect - so it's a nightmare to keep managment under control. > > "JXStern" > news:jmmh749ue2scgrrjtfj2i28jketn8n7phf-at-4ax.com... > > It's called project management. > > > > No, seriously, it didn't become entirely obsolete with the new > > millenium, honest! > > > > Josh > > > > > > On Thu, 3 Jul 2008 08:26:43 -0700, "Joe H" > > wrote: > > > >>Had a question about how folks are managing "change management" on their > >>DW > >>projects. One of the biggest problems we face on the DW, is most of our > >>data sources have separate ongoing development and rework of their > >>systems. > >>So we have to be very communicative on change. If a system changes table > >>structure in usually will break our ETL process at the first steps of > >>copying into PreStaging. > >> > >> > >> > >>My ETL was using SQL Server Objects Task to copy table data from the > >>source > >>system into PreStaging inside my SSIS Package. However the first time a > >>table say had 10 columns now has 12 because 2 new were added, it breaks > >>this > >>step. The SMO task was fast however I'm starting to think I need to use > >>Select statements to manage what tables and columns I bring across. If my > >>select statement always asked for the first 10 columns from a table - when > >>someone adds a couple more - the DW will just ignore those columns. > >> > >> > >> > >>What steps do you do to populate PreStaging and manage change management > >>on > >>your data sources? > >> > >> > >> > >> > >> > > > > > |
|
#7
|
| That's a good idea. We have direct connections to all the sources - so we will be pulling. That's why I kind of like using views on their side. It's going to be a challenge - I'm sure. "Eric Russell" news:8AE5424D-0449-4A7B-B039-8B9082B57872-at-microsoft.com... >I once worked on an ETL and datamart reporting project for a federal agency > that involved receiving flat file submissions from multiple state agencies > and territories (~72 total), and of course each had their own developers > or > database admins on back end pulling the extracts from a broad range of > platforms, so I know what you're talking about. > It helped when they started submitting extracts via a standard SDN > application we provided. It would check the file format prior to upload > and > then notify them immediately if anything was out of spec. Before that, the > submission / import / feedback / re-submission cycle could take days or > even > weeks. > > "Joe H" wrote: > >> Someone had commented about making the source pull from views on the >> outside >> applications. This really made sense - it puts ownership of those >> devlopers >> on those systems to ensure the views don't break or if they do to contact >> us. >> >> I agree it's all project management but was just wondering on other >> peoples >> approaches. Some of the applications are in different citys, different >> govt >> agencies, ect - so it's a nightmare to keep managment under control. >> >> "JXStern" >> news:jmmh749ue2scgrrjtfj2i28jketn8n7phf-at-4ax.com... >> > It's called project management. >> > >> > No, seriously, it didn't become entirely obsolete with the new >> > millenium, honest! >> > >> > Josh >> > >> > >> > On Thu, 3 Jul 2008 08:26:43 -0700, "Joe H" >> > wrote: >> > >> >>Had a question about how folks are managing "change management" on >> >>their >> >>DW >> >>projects. One of the biggest problems we face on the DW, is most of >> >>our >> >>data sources have separate ongoing development and rework of their >> >>systems. >> >>So we have to be very communicative on change. If a system changes >> >>table >> >>structure in usually will break our ETL process at the first steps of >> >>copying into PreStaging. >> >> >> >> >> >> >> >>My ETL was using SQL Server Objects Task to copy table data from the >> >>source >> >>system into PreStaging inside my SSIS Package. However the first time >> >>a >> >>table say had 10 columns now has 12 because 2 new were added, it breaks >> >>this >> >>step. The SMO task was fast however I'm starting to think I need to >> >>use >> >>Select statements to manage what tables and columns I bring across. If >> >>my >> >>select statement always asked for the first 10 columns from a table - >> >>when >> >>someone adds a couple more - the DW will just ignore those columns. >> >> >> >> >> >> >> >>What steps do you do to populate PreStaging and manage change >> >>management >> >>on >> >>your data sources? >> >> >> >> >> >> >> >> >> >> >> > >> >> >> |
![]() |
| Thread Tools | |
| Display Modes | |