[AccessD] Data Export Spec - Rev1

John Colby jwcolby at ColbyConsulting.com
Fri Oct 21 08:56:20 CDT 2005


Shamil,

LOL, I NEVER want to be all those backwards things that you portray me to
be, so join the team and show me (all of us) how we accomplish this?

What I am discussing is getting data out of a row of a record, into a
delimited string of text, with any format you desire (or which format$() can
handle anyway).  Each field can have its own format string stored right in
the field definition, so you can specify the output format of the field.
For example, if a field needs to be currency, placing the format$
"$###,###.00" gives an output in the text string of $23,423.00, etc etc.
Every fld record has a place to store a format string like this, so each
field can be formatted as precisely as format strings allows you to format
that field.

Let's back up a minute.  

The objective is to create a system for creating fixed width / delimited
data files for exchanging data between our puny little micro computers and
those monsters of calculation mainframes.  Specifically for EXPORTING data
to these monsters of computing.  

The "typical" way that developers export data is:

1) Use the export wizard, which works well IF the format is well understood.
Call up the wizard, point to a query, fill in the blanks, use the export
specification with the OutputTo vba function.  This is fine, except that if
anything in the format needs to change, you have to step back through the
wizard, selecting the field, and then find and fix the format.  And of
course, if you need any formatting not provided by the wizard you are stuck.
The wizard is pretty cool, I use it, but I always find it a bit of a PITA.

I also have one HUGE issue with this, which is that the OutputTo
occasionally errors out for reasons completely unknown to me.  Come back
through and execute again and it works the second time.  Reliability sucks.
The exports that I do occur in the middle of the night because my client's
client needs them on their desk in the morning.  Every few weeks I come in
to find it not happening.  I even tried just looping back to do it again
until it works and for whatever reason that doesn't work.  So... Output to
is a fine solution as long as you don't need reliability.

2) Write a custom output module that takes a record, pulls each field, and
outputs it however it needs to be seen in the file, hard coding the quotes,
if any, the delimit property, the fixed width etc.  This is Gustav's
solution, and if your spec is carved in stone, and you don't need to do lots
of them, this method also works just fine.

In either case, what often happens is that any necessary FORMATTING is often
accomplished in the query that pulls the data.  IOW, I pull a currency
field, alias it, and right in the query use a format$() to get it in the
desired format.  This certainly works, but the issue is that if you want to
find out (and change) how it is formatted, you have to go find the specific
query that has the format$() statement in the field alias and change the
format there.  

My intention is to:

Move the formatting out of the queries entirely.  
Allow the query just selects data to export, nothing more.
Allow the system to export data using the same query (so that IDENTICAL data
is sent) into multiple files with varying DELIMITING format requirements.
Place the formatting (if any) in a table.  
Make the formatting field specific, i.e. have a place to specify that for
THIS field, I want THIS format.
Place the other "delimiting information" such as "fixed width" or "comma
delimited", or "Tab delimited" in a table, easy to see, easy to edit.
Place the other "export name", "file name" etc. in a table, easy to see,
easy to edit.
Break the process into pieces - specify the data source, specify the field
format, specify the delimiter format in the exported line of code.

All of this allows me to build a specific kind of output, commonly known as
fixed "width / delimited".  This is not a report generator, it is used
(typically) to exchange data between computers.  The receiving computer MUST
HAVE FieldX at PositionY, formatted EXACTLY Z.  Getting the client to change
how their program accepts the data is not an option.  

I have run into this a dozen times or more, and I see this same request come
by the list several times a year so others are seeing it as well.  Every
time I have this requirement, these things are what I needed to format,
never anything else, which is why I have "blinders on" I suppose.  I admit
to having a specific purpose in mind.

You obviously have something more far ranging in mind, so please take a few
minutes to expand what you have in mind, and how we can accomplish this.
Everything you have said so far is so generalized that I cannot even
understand what you have in mind.  I understand "plug-in modules".  My
questions are:

1) Given what I am attempting to accomplish how do you create the module
that you swap out.  
2) What piece(s) of this puzzle is modified when I swap out this module for
another.
3) How do the modules interface together.
4) How is this accomplished in VBA?
5) Give me a specific example of the kind of export / format that you are
running into which isn't fixed/delimited/formatted and how modularizing this
program would allow you to handle this kind of export.



It sounds like my blinders are crippling me but we all think about handling
the problems that we run into.  It appears that you are running into
problems that I am not seeing, and without ever seeing your problems, I
can't comprehend how to respond to them.

John W. Colby
www.ColbyConsulting.com 

Contribute your unused CPU cycles to a good cause:
http://folding.stanford.edu/

-----Original Message-----
From: accessd-bounces at databaseadvisors.com
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Shamil
Salakhetdinov
Sent: Friday, October 21, 2005 5:10 AM
To: Access Developers discussion and problem solving
Subject: Re: [AccessD] Data Export Spec - Rev1


<<<
It is intentionally (or intended to be) that simple.
>>>
OK. I see now.
And as far as I see the only source data transformation method is to apply
specified format. This looks of rather limited applicability. I think with a
few additional efforts this program can be made to use additionally runtime
pluggable data formatters - then it would be more useful.

I'd also add that some export files (for banks e.g here in Russia) have
special header and footer sections - I'd add this feature to your spec. Of
course this feature can be added later when there will be real customer's
request to add it.

> There are only a handful of classes envisioned,
> possibly as few as two.
If you add pluggable data formatters feature then you can get hundreds
additional small classes written by others for their own needs. You can
combine them in one library database (MS Access), ATL/COM library(VB6 or C++
ATL/COM) or class library (.NET). Or they can be used as separate pluggable
units. And hey will be used by your generic core code.

> The entire system will likely
> be a pair of classes.
OK. But it may grow as far as I see to use different data sources then
you'll have duplicated code to synchronize. Classes are small but when you
do changes even in small classes then they have to be tested. There should
be a solution with generic core code ready for new data sources to be added.
And this core code will not be needed to changes/retest when these new data
sources will be added. Small change in you code design and small additional
work on this project phase will results in big savings on next phases.

>  If additions are needed later, then you do like
> you do in any project, you attempt to make
> them fit.
With this approach the risk is high to rewrite whole system in attempts to
make the new features to fit it. Or add a lot of patch code here and there
(as it usually happens) getting migrated your simple code into "spaghetti"
code. Yes, I see your system is small but nowadays mainstream requirement is
to design the system as easy adaptable as possible to the future changes.
Yes, to "foresee" the future changes some additional work is needed but in
long run this small additional work will result in considerable savings. If
in your case "long run"(many changes in the future) is not planned then
there is no need of course in any additional work on your small system
architecture. You can go coding "blindly"...

> It just occurred to
> me that you might be able to scale it up using parallel processing, 
> have
10
> machines each append to it's own file, then append the 10 files 
> together.
John, when I'm talking on scaling I'm NOT talking will your system be able
to make export of huge amount of source rows as speedy as the small
amount(here 10 machines "brute force" is one of the possible solutions) -
I'm talking will your system be able to do that export of huge source data
sources at all on ONE PC without stopping all the other work on it? The time
it will spend on this export and transformation doesn't matter in this case.

> However, if you envision something that needs to export millions of
records
> at a shot, then a custom highly optimized solution is probably 
> preferred.
When your customer will grow to have million records to export then you will
write a speedy custom solution or adapt your one to run in parallel on many
threads/PCs/processors (in the case you use VB.NET).

But WHILE your customer is growing you will not need to adjust your solution
if you make it a little bit more flexible NOW.

You will sleep well, your customer will sleep well when their data will be
exported night time. Yes, you will have less work for this customer for this
certain project in long run(patch coding work will not be needed) but they
will like you and they will deliver you more work in other areas and they
will recommend you to their partners - they gain in long run, you gain in
long run, everybody gains - "win-win" approach with some small additional
efforts on first stages.

Well, maybe the picture I'm drawing is too bright :)

If your customer is not ready to pay for this additional efforts on first
stage or you do not want to invest your time - then just go coding without
trying to envision future changes. You ( and I and anybody else) can't
foresee all the possible changes of course - if they will try they will end
up in "analysis paralysis". But modern design and programming approaches
allow to make right design decisions without detailed analysis of what
changes may come in the future...

> I can time it to get a feel for performance, but like anything, 
> performance will always "depend" on a lot of different variables.  
> Your Mileage May Vary GREATLY.
I'm not talking about system performance - it depends of course. I'm talking
about programming ready for many(but not all of course) future changes
without many troubles...

> It is just plain silly to code
> this over and over and over (and over).
Well, as Gustav noted for him hardcoded approach works well.
I did program last year for my customer on C++ a rather flexible approach
based on XML metadefinitions - it was still a semi-hardcoded because I used
inherited legacy code I started with and there were no time and resources to
generalize it. And this semi-hardcoded approach worked rather well.

But you is keen to make it more generic as far as I see. And ready to go
with that generic coding. OK.

For your certain case if you foresee you will have in the near future 10
customers requests for 10 different export formats and this work will take,
say one working week (5 days) but now for developing the code for two export
formats your customers is ready to pay you for two working days and you will
make your generic solution in these two days, which will fit all the ten
formats then you of course can go with a generic solution. Then your
customer will save money and you will have a generic solution applicable in
other projects. If you invest one additional day or convince your customer
to pay you for this additional day to work on your small system
architecture(this additional day's work will be spread on three days work)
then you will get even more flexible more generic solution, easily portable
to VB.NET and scalable.

Does it makes sense to spend this additional working day now or not? This a
question to you and your customer.... (I say 5 working days, 1 additional
working day etc. just as an example not as an evaluation of the amount of
work needed for this certain project or for the work needed to work on this
project's architecture).

> BTW, I will eventually be porting this to .NET
> as I have a requirement for that.
Here we go!
Then later they will ask you to convert it to a Web Service but they will
not have enough money for that work because these money will be spend on
patch coding, scaling and VBA->VB.NET conversion of your "cowboy coding"
solution.

And then they will find Indian, Pakistanian, Romanian, Russian,... young
guys who will use your code and make it converted to Web service because
these young guys can work on 5USD/hour rate or less and they are rather
good(very good sometimes) programmers after all and your customer will have
money to pay these young guys and maybe they will make him(your lost
customer) happy maybe not...

But in long run you will loose your customer.
Your customer will think you're not a good developer because your code
wasn't flexible enough to adapt to the changes. Young Indian,... guys will
spoil your customer by "free-cheese" kind of work but the chances they will
make your customer happy in long run are questionnable...

Everybody unhappy, resources wasted, mutual credibility low - isn't that
situation what is so often happens in IT industry?

Maybe I'm generalizing too much for this small project, I know...

Shamil





More information about the AccessD mailing list