Shamil Salakhetdinov
shamil at users.mns.ru
Fri Oct 21 04:09:58 CDT 2005
<<< It is intentionally (or intended to be) that simple. >>>

OK, I see now. And as far as I can see, the only source-data transformation method is to apply a specified format. That looks rather limited in applicability. I think that with a little additional effort this program could also use runtime-pluggable data formatters - then it would be more useful. I'd also add that some export files (for banks here in Russia, for example) have special header and footer sections - I'd add this feature to your spec. Of course, this feature can be added later, when there is a real customer request for it.

> There are only a handful of classes envisioned,
> possibly as few as two.

If you add the pluggable data formatter feature, then you can get hundreds of additional small classes, written by others for their own needs. You can combine them in one library database (MS Access), an ATL/COM library (VB6 or C++ ATL/COM), or a class library (.NET), or they can be used as separate pluggable units - and they will all be driven by your generic core code. A minimal sketch of what such a plug-in contract could look like follows below.
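Just to show what I mean by a runtime-pluggable formatter, here is a minimal VBA sketch. All the names (IFieldFormatter, clsUpperCaseFormatter, GetFormatter, the FormatterClass column) are hypothetical, invented for illustration only:

' Class module "IFieldFormatter" - the plug-in contract (hypothetical name).
' Every custom formatter class implements this single method.
Public Function FormatValue(ByVal varValue As Variant) As String
End Function

' Class module "clsUpperCaseFormatter" - one example plug-in.
Implements IFieldFormatter

Private Function IFieldFormatter_FormatValue(ByVal varValue As Variant) As String
    ' Any transformation could live here; this one trims and upper-cases.
    IFieldFormatter_FormatValue = UCase$(Trim$(Nz(varValue, "")))
End Function

' In the generic core: the export spec table names the formatter class;
' a small factory function (e.g. a Select Case over the class name) returns
' the right instance, and the core code itself never changes when new
' formatters are added.
Dim fmt As IFieldFormatter
Set fmt = GetFormatter(rstSpec!FormatterClass)  ' hypothetical factory
Print #intFile, fmt.FormatValue(rst.Fields(strFieldName).Value)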
> The entire system will likely
> be a pair of classes.

OK. But as far as I can see, it may grow to use different data sources, and then you will have duplicated code to keep in sync. The classes are small, but when you make changes even in small classes, they have to be retested. There should be a solution with a generic core that is ready for new data sources to be added - core code that will not need to be changed or retested when those new data sources arrive. A small change in your code design, and a little additional work in this project phase, will result in big savings in the next phases.

> If additions are needed later, then you do like
> you do in any project, you attempt to make
> them fit.

With this approach there is a high risk of rewriting the whole system while trying to make the new features fit - or of adding a lot of patch code here and there (as usually happens), until your simple code has migrated into "spaghetti" code. Yes, I see your system is small, but the mainstream requirement nowadays is to design a system to be as easily adaptable to future changes as possible. Yes, some additional work is needed to "foresee" future changes, but in the long run that small additional work will result in considerable savings. If in your case a "long run" (many future changes) is not planned, then of course there is no need for any additional work on your small system's architecture. You can go coding "blindly"...

> It just occurred to
> me that you might be able to scale it up using parallel processing, have 10
> machines each append to its own file, then append the 10 files together.

John, when I talk about scaling I am NOT asking whether your system can export a huge number of source rows as quickly as a small number (there, ten machines of "brute force" is one possible solution). I am asking whether your system can export huge data sources at all on ONE PC, without stopping all the other work on it. The time it spends on the export and transformation doesn't matter in that case.

> However, if you envision something that needs to export millions of records
> at a shot, then a custom highly optimized solution is probably preferred.

When your customer grows to having a million records to export, you can write a fast custom solution, or adapt yours to run in parallel on many threads/PCs/processors (in case you use VB.NET). But WHILE your customer is growing, you will not need to keep adjusting your solution if you make it a little bit more flexible NOW. Something as simple as the batching sketch below already keeps a big export from monopolizing the machine.
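To illustrate: even without going parallel, a generic core can walk the recordset in batches and yield control between batches, so a huge export does not lock up the PC. A rough sketch - BuildExportLine is an invented name standing in for whatever the per-row core routine is:

' Export in batches and yield between batches, so a multi-million-row
' export does not freeze all other work on the machine.
Const BATCH_SIZE As Long = 1000
Dim lngRowsInBatch As Long

Do While Not rst.EOF
    Print #intFile, BuildExportLine(rst)  ' hypothetical per-row core routine
    rst.MoveNext
    lngRowsInBatch = lngRowsInBatch + 1
    If lngRowsInBatch >= BATCH_SIZE Then
        lngRowsInBatch = 0
        DoEvents  ' let Windows and Access service other work
    End If
Loop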
You will sleep well, and your customer will sleep well while their data is exported at night. Yes, in the long run you will have less work for this customer on this particular project (patch coding will not be needed), but they will like you, they will deliver you more work in other areas, and they will recommend you to their partners. They gain in the long run, you gain in the long run, everybody gains - a "win-win" approach, for some small additional effort in the first stages. Well, maybe the picture I'm drawing is too bright :) If your customer is not ready to pay for this additional effort at the first stage, or you do not want to invest your time, then just go coding without trying to envision future changes. You (and I, and anybody else) can't foresee all the possible changes, of course - anyone who tries will end up in "analysis paralysis". But modern design and programming approaches make it possible to take the right design decisions without a detailed analysis of what changes may come in the future...

> I can time it to get a feel for performance, but like anything,
> performance will always "depend" on a lot of different variables. Your
> Mileage May Vary GREATLY.

I'm not talking about system performance - that depends, of course. I'm talking about programming that is ready for many (but not all, of course) future changes, without much trouble...

> It is just plain silly to code
> this over and over and over (and over).

Well, as Gustav noted, the hardcoded approach works well for him. Last year I programmed for my customer, in C++, a rather flexible approach based on XML metadefinitions. It was still semi-hardcoded, because I started from inherited legacy code and there were neither time nor resources to generalize it - and this semi-hardcoded approach worked rather well. But you, as far as I can see, are keen to make yours more generic, and ready to do that generic coding. OK. For your particular case: if you foresee ten customer requests for ten different export formats in the near future, and that work would take, say, one working week (5 days), but right now your customer is ready to pay for two working days to develop the code for two export formats, and in those two days you can build a generic solution that fits all ten formats - then of course go with the generic solution. Your customer saves money, and you get a generic solution applicable in other projects. And if you invest one additional day - or convince your customer to pay for one additional day - of work on your small system's architecture (that day's work spread over the three days), you will get an even more flexible, more generic solution, easily portable to VB.NET, and scalable. Does it make sense to spend that additional working day now or not? That is a question for you and your customer... (I say 5 working days, 1 additional working day, etc. only as an example, not as an estimate of the amount of work needed for this particular project or for the work on its architecture.)

> BTW, I will eventually be porting this to .NET
> as I have a requirement for that.

Here we go! Then later they will ask you to convert it to a Web Service, but they will not have enough money for that work, because the money will have been spent on patch coding, scaling, and the VBA->VB.NET conversion of your "cowboy coding" solution. And then they will find Indian, Pakistani, Romanian, Russian,... young guys who will take your code and convert it to a Web Service, because these young guys can work for a rate of 5 USD/hour or less, and they are rather good (sometimes very good) programmers after all. Your customer will have the money to pay these young guys, and maybe they will make him (your lost customer) happy, maybe not... But in the long run you will lose your customer. Your customer will think you are not a good developer, because your code was not flexible enough to adapt to the changes. The young Indian,... guys will spoil your customer with "free-cheese" kind of work, but the chances that they will make your customer happy in the long run are questionable... Everybody unhappy, resources wasted, mutual credibility low - isn't that the situation that so often happens in the IT industry? Maybe I'm generalizing too much for this small project, I know...

Shamil

----- Original Message -----
From: "John Colby" <jwcolby at ColbyConsulting.com>
To: "'Access Developers discussion and problem solving'"
<accessd at databaseadvisors.com>
Sent: Friday, October 21, 2005 4:22 AM
Subject: Re: [AccessD] Data Export Spec - Rev1


> Shamil,
>
> I guess I don't understand what you mean. This is a self contained system.
> Place data in the tables, call a function, out pops a text file. That is
> precisely the plan, to make a table driven export SYSTEM, where you place
> specifications for exporting data into a set of tables (well defined
> interface), instantiate a class, and call a method of the class. Out pops a
> text file. It is intentionally (or intended to be) that simple. At the
> same time it can be simultaneously used by 1 or a dozen (or a hundred)
> different users, exporting different data files, each exporting the files
> they choose by selecting which Export record(s) they use. I run this thing
> on a server, automatically, in the middle of the night, but that doesn't
> have to be.
>
> >- problem to share the work between team members;
>
> Work on this export wizard? There are only a handful of classes envisioned,
> possibly as few as two. I am not getting volunteers by the droves you
> might notice, so it does not appear that splitting up the work is going to
> be one of the major concerns.
>
> >- problem of duplicated code, which may become "nightmarish" to support;
>
> I guess I just don't understand what you see. The entire system will likely
> be a pair of classes. One class holds the data and methods for a field, the
> other holds the data and methods for the recordset export.
>
> >- problem for future extensions;
>
> Am I missing something? If you are going to define a program that performs
> a fixed functionality, then you always risk "problem for future extensions".
> This is not .NET, there is no inheritance. The best we can do is to open
> the discussion up and get as much input and as many ideas for future
> expansions as possible right up front, so that they can be planned for. If
> additions are needed later, then you do like you do in any project, you
> attempt to make them fit. This is not Windows however, or SQL Server, or
> Office, it is a pair of classes and 4 tables.
>
> >- problem for scaling up....
>
> Well... This one will indeed be a problem. If you intend to export millions
> of records using this method the results will likely be unsatisfactory.
> That said however, this method loads the field class instances into memory
> and just passes a pointer to the recordset to each one. We can certainly
> run timing analysis per record (per field) but what you get is what you get.
> Again, there is no magic involved here. In fact using DAO instead of ADO
> will likely INCREASE the speed rather than decrease it. It just occurred to
> me that you might be able to scale it up using parallel processing, have 10
> machines each append to its own file, then append the 10 files together.
>
> However, if you envision something that needs to export millions of records
> at a shot, then a custom highly optimized solution is probably preferred.
>
> >etc.
>
> Sorry, can't really address that one until it is fleshed out a bit. ;-)
>
> I have a "similar" system actually running. It is very application
> specific, but the concept is virtually identical. I export 4 files with it
> daily. I can time it to get a feel for performance, but like anything,
> performance will always "depend" on a lot of different variables. Your
> Mileage May Vary GREATLY.
>
> My intention here is to attempt to "genericize" a common requirement -
> delimited or fixed-width text file export. It is just plain silly to code
> this over and over and over (and over).
>
> BTW, I will eventually be porting this to .NET as I have a requirement for
> that.
>
> John W. Colby
> www.ColbyConsulting.com
>
> Contribute your unused CPU cycles to a good cause:
> http://folding.stanford.edu/
>
<<< tail trimmed>>>
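P.S. For anyone who wants to see how John's two-class, table-driven design reads from the calling side, here is my reading of it as a minimal VBA sketch. The names (clsExport, Init, ExportToFile, the spec key, the file path) are my own guesses for illustration, not John's actual spec:

' Hypothetical usage sketch of the table-driven export system:
' the specs live in tables; the caller instantiates one class and runs it.
Dim exp As clsExport              ' the recordset-level class (name assumed)
Set exp = New clsExport
exp.Init "DailyBankFile"          ' key into the export spec tables (assumed)
exp.ExportToFile "C:\Exports\DailyBankFile.txt"
Set exp = Nothing
' Internally clsExport would load one field-class instance per column and
' pass each one the DAO recordset, exactly as John describes above.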