[AccessD] Data Export Spec - Rev1

Shamil Salakhetdinov shamil at users.mns.ru
Fri Oct 21 04:09:58 CDT 2005


<<<
It is intentionally (or intended to be) that simple.
>>>
OK, I see now.
As far as I can tell, the only source data transformation method is to apply
a specified format. That looks of rather limited applicability. I think that
with a little additional effort this program could be made to use
runtime-pluggable data formatters as well - then it would be much more
useful.

I'd also add that some export files (for banks, e.g. here in Russia) have
special header and footer sections - I'd add this feature to your spec. Of
course, this feature can also be added later, when there is a real customer
request for it.

> There are only a handful of classes envisioned,
> possibly as few as two.
If you add the pluggable data formatters feature, then you can get hundreds
of additional small classes, written by others for their own needs. You can
combine them in one library database (MS Access), ATL/COM library (VB6 or
C++ ATL/COM), or class library (.NET). Or they can be used as separate
pluggable units. And they will all be driven by your generic core code.

> The entire system will likely
> be a pair of classes.
OK. But as far as I can see it may grow to use different data sources, and
then you'll have duplicated code to keep synchronized. The classes are
small, but when you change even small classes, they have to be retested.
There should be a solution with generic core code that is ready for new data
sources to be added - core code that does not need to be changed or retested
when those new data sources arrive. A small change in your code design, and
a small amount of additional work in this project phase, will result in big
savings in the next phases.
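What I have in mind could be sketched like this (Python, illustrative only; the adapter names are invented): the core iterates over any object that yields rows, so adding a new data source means writing one small adapter, and the core itself never changes or needs retesting:

```python
# Sketch: generic core code that is closed against new data sources.
# Each source is just an iterable of row tuples; adding a source
# means adding an adapter, not touching the core.

def export(source, delimiter=","):
    """Core routine: works with ANY row source - DAO, ADO, CSV, ..."""
    return "\n".join(delimiter.join(str(v) for v in row) for row in source)

def list_source(data):
    """Adapter for an in-memory list (stands in for a real recordset)."""
    yield from data

def csv_source(text):
    """Adapter for semicolon-separated text (another hypothetical source)."""
    for line in text.splitlines():
        yield tuple(line.split(";"))

print(export(list_source([(1, "a"), (2, "b")])))
print(export(csv_source("3;c\n4;d"), delimiter="|"))
```

In VBA the same effect can be had with a small interface class that each source adapter implements.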

>  If additions are needed later, then you do like
> you do in any project, you attempt to make
> them fit.
With this approach the risk is high that you will rewrite the whole system
in the attempt to make the new features fit it, or add a lot of patch code
here and there (as usually happens), migrating your simple code into
"spaghetti" code. Yes, I see your system is small, but nowadays the
mainstream requirement is to design a system to be as easily adaptable to
future changes as possible. Yes, to "foresee" future changes some additional
work is needed, but in the long run this small additional work results in
considerable savings. If in your case a "long run" (many changes in the
future) is not planned, then of course there is no need for any additional
work on your small system's architecture. You can go coding "blindly"...

> It just occurred to
> me that you might be able to scale it up using parallel processing, have
> 10 machines each append to its own file, then append the 10 files together.
John, when I talk about scaling I am NOT asking whether your system will be
able to export a huge number of source rows as quickly as a small number
(there, the 10-machine "brute force" is one possible solution). I am asking
whether your system will be able to perform that export of huge data sources
at all, on ONE PC, without stopping all the other work on it. The time it
spends on the export and transformation doesn't matter in this case.
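In other words, the question is whether the export streams rows or loads them all into memory first. A streaming sketch (Python, illustrative; the row generator is a stand-in for a real recordset): each row is written as soon as it is produced, so memory use stays flat no matter how large the source is:

```python
# Sketch: streaming export - constant memory use regardless of
# source size, so one PC can keep doing other work during the run.

def rows(n):
    """Stand-in for a huge data source; yields rows lazily."""
    for i in range(n):
        yield (i, f"name{i}")

def export_stream(source, write):
    """Write each row as soon as it is produced; never hold them all."""
    count = 0
    for row in source:
        write(";".join(str(v) for v in row) + "\n")
        count += 1
    return count

out = []
n = export_stream(rows(1000), out.append)
print(n)  # 1000
```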

> However, if you envision something that needs to export millions of
> records at a shot, then a custom highly optimized solution is probably preferred.
When your customer grows to have millions of records to export, then you
will write a speedy custom solution, or adapt yours to run in parallel on
many threads/PCs/processors (in the case you use VB.NET).

But WHILE your customer is growing, you will not need to keep adjusting your
solution - if you make it a little bit more flexible NOW.

You will sleep well, and your customer will sleep well while their data is
exported at night. Yes, in the long run you will have less work from this
customer on this particular project (patch coding work will not be needed),
but they will like you, they will bring you more work in other areas, and
they will recommend you to their partners - they gain in the long run, you
gain in the long run, everybody gains: a "win-win" approach, bought with
some small additional effort in the first stages.

Well, maybe the picture I'm drawing is too bright :)

If your customer is not ready to pay for this additional effort in the
first stage, or you do not want to invest your time, then just go coding
without trying to envision future changes. You (and I, and anybody else)
can't foresee all possible changes, of course - whoever tries will end up in
"analysis paralysis". But modern design and programming approaches allow
making the right design decisions without a detailed analysis of what
changes may come in the future...

> I can time it to get a feel for performance, but like anything,
> performance will always "depend" on a lot of different variables.  Your
> Mileage May Vary GREATLY.
I'm not talking about system performance - that depends, of course. I'm
talking about programming that is ready for many (though not all, of course)
future changes without much trouble...

> It is just plain silly to code
> this over and over and over (and over).
Well, as Gustav noted, for him the hardcoded approach works well.
Last year I programmed for a customer, in C++, a rather flexible approach
based on XML metadefinitions. It was still semi-hardcoded, because I started
from inherited legacy code and there were neither time nor resources to
generalize it. Even so, this semi-hardcoded approach worked rather well.
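The idea of those XML metadefinitions, very roughly (a Python sketch; the element and attribute names are invented for the example, not my customer's actual format): the export layout lives in data, not in code, so supporting a new format means writing a new XML file rather than a new program:

```python
# Sketch: export layout driven by an XML metadefinition instead of
# hardcoded field lists. Element/attribute names are hypothetical.
import xml.etree.ElementTree as ET

SPEC = """
<export delimiter=";">
  <field source="id" width="4"/>
  <field source="name" width="8"/>
</export>
"""

def export_row(spec_xml, record):
    root = ET.fromstring(spec_xml)
    parts = []
    for field in root.findall("field"):
        # pad each value to the fixed width declared in the spec
        value = str(record[field.get("source")])
        parts.append(value.ljust(int(field.get("width"))))
    return root.get("delimiter").join(parts)

print(export_row(SPEC, {"id": 7, "name": "Ivanov"}))
# prints "7   ;Ivanov  "
```

Your table-driven spec is the same idea, with tables instead of XML as the metadata store.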

But as far as I can see, you are keen to make it more generic, and ready to
go ahead with that generic coding. OK.

For your particular case: suppose you foresee that in the near future you
will have requests from 10 customers for 10 different export formats, and
that work would take, say, one working week (5 days). Right now your
customer is ready to pay for two working days to develop the code for two
export formats, and in those two days you can build a generic solution that
will fit all ten formats. Then of course you can go with the generic
solution: your customer saves money, and you get a generic solution
applicable in other projects.
If you invest one additional day - or convince your customer to pay for one
additional day - of work on your small system's architecture (this
additional day's work will be spread over the three days of work), then you
will get an even more flexible, more generic solution, easily portable to
VB.NET, and scalable.

Does it make sense to spend this additional working day now or not? That is
a question for you and your customer... (I use 5 working days, 1 additional
working day, etc. just as an example, not as an estimate of the amount of
work needed for this particular project or for the work on its
architecture.)

> BTW, I will eventually be porting this to .NET
> as I have a requirement for that.
Here we go!
Then later they will ask you to convert it to a Web Service, but they will
not have enough money for that work, because the money will have been spent
on patch coding, scaling, and the VBA->VB.NET conversion of your "cowboy
coding" solution.

And then they will find young Indian, Pakistani, Romanian, Russian,... guys
who will take your code and convert it to a Web Service, because these young
guys can work at a rate of 5 USD/hour or less, and after all they are rather
good (sometimes very good) programmers. Your customer will have the money to
pay them, and maybe they will make him (your lost customer) happy, maybe
not...

But in the long run you will lose your customer.
Your customer will think you're not a good developer, because your code
wasn't flexible enough to adapt to the changes.
The young Indian,... guys will spoil your customer with "free cheese" kind
of work, but the chances that they will make your customer happy in the long
run are questionable...

Everybody is unhappy, resources are wasted, mutual credibility is low -
isn't that the situation that happens so often in the IT industry?

Maybe I'm generalizing too much for this small project, I know...

Shamil

----- Original Message ----- 
From: "John Colby" <jwcolby at ColbyConsulting.com>
To: "'Access Developers discussion and problem solving'"
<accessd at databaseadvisors.com>
Sent: Friday, October 21, 2005 4:22 AM
Subject: Re: [AccessD] Data Export Spec - Rev1


> Shamil,
>
> I guess I don't understand what you mean.  This is a self contained system.
> Place data in the tables, call a function, out pops a text file.  That is
> precisely the plan, to make a table driven export SYSTEM, where you place
> specifications for exporting data into a set of tables (well defined
> interface), instantiate a class, and call a method of the class.  Out pops
> a text file.  It is intentionally (or intended to be) that simple.  At the
> same time it can be simultaneously used by 1 or a dozen (or a hundred)
> different users, exporting different data files, each exporting the files
> they choose by selecting which Export record(s) they use.  I run this
> thing on a server, automatically, in the middle of the night, but that doesn't
> have to be.
>
> >- problem to share the work between team members;
>
> Work on this export wizard?  There are only a handful of classes
> envisioned, possibly as few as two.  I am not getting volunteers by the droves you
> might notice, so it does not appear that splitting up the work is going to
> be one of the major concerns.
>
> >- problem of duplicated code, which may become "nightmarish" to support;
>
> I guess I just don't understand what you see.  The entire system will
> likely be a pair of classes.  One class holds the data and methods for a field,
> the other holds the data and methods for the recordset export.
>
> >- problem for future extensions;
>
> Am I missing something?  If you are going to define a program, that
> performs a fixed functionality, then you always risk "problem for future
> extensions".
> This is not .net, there is no inheritance.  The best we can do is to open
> the discussion up and get as much input and ideas for future expansions as
> possible right up front so that they can be planned for.  If additions are
> needed later, then you do like you do in any project, you attempt to make
> them fit.  This is not Windows however, or SQL Server, or Office, it is a
> pair of classes and 4 tables.
>
> >- problem for scaling up....
>
> Well... This one will indeed be a problem.  If you intend to export
> millions of records using this method the results will likely be unsatisfactory.
> That said however, this method loads the field class instances into memory
> and just passes a pointer to the recordset to each one.  We can certainly
> run timing analysis per record (per field) but what you get is what you get.
> Again, there is no magic involved here.  In fact using DAO instead of ADO
> will likely INCREASE the speed rather than decrease it.  It just occurred
> to me that you might be able to scale it up using parallel processing, have
> 10 machines each append to its own file, then append the 10 files together.
>
> However, if you envision something that needs to export millions of
> records at a shot, then a custom highly optimized solution is probably preferred.
>
> >etc.
>
> Sorry, can't really address that one until it is fleshed out a bit.  ;-)
>
> I have a "similar" system actually running.  It is very application
> specific, but the concept is virtually identical.  I export 4 files with
> it daily.  I can time it to get a feel for performance, but like anything,
> performance will always "depend" on a lot of different variables.  Your
> Mileage May Vary GREATLY.
>
> My intention here is to attempt to "genericize" a common requirement -
> delimited or fixed width text file export.  It is just plain silly to code
> this over and over and over (and over).
>
> BTW, I will eventually be porting this to .NET as I have a requirement for
> that.
>
> John W. Colby
> www.ColbyConsulting.com
>
> Contribute your unused CPU cycles to a good cause:
> http://folding.stanford.edu/
>
<<< tail trimmed>>>



