[AccessD] PITA and temporal database (was: Access and SQL Server)

Gustav Brock Gustav at cactus.dk
Tue Mar 8 09:53:48 CST 2011


Hi John

It is even worse - if you consider referential integrity. If so, you
have to record the object itself as the first and then, in separate
table(s), any attribute that may change over time. Thus, for example for
a customer table, as everything can change except the registration
number (VAT or whatever), your main table may end up containing only
this and the Id, while the name of the company, address, and phone
number, etc. must be kept in one or more child tables.

However, it may not turn out that labourious. Think about it: How often
do you need all info of a customer?
But you are right. Even though this is an extremely powerful storing
method, the client still has to pay.

/gustav


>>> jwcolby at colbyconsulting.com 08-03-2011 15:49 >>>
While this is a nice idea, and perhaps critical somewhere, I cannot see
it being usable in most 
environments.  I have a claim table with 50 or 100 fields and I am
going to store all of that again 
because a single date in one field is changed?

In an ideal world without real world constraints, go for it.  However
we are replacing natural keys 
with surrogates because of issues with storing just a handful of fields
"over again" and the 
negative impacts on speed and storage requirements that natural keys
bring.

If I were going to do a temporal kind of db I would go with a
"normalized" solution where I stored 
just the PKID of the record changing, name of the field (or ID of the
field) the "old" field data, 
and the change time.

Then you end up with a "change log" concept.  All of which demands
intense analysis to determine 
where the change log becomes less efficient than storing the whole
record.

One way or the other (in my experience), very few clients are willing
to pay for this kind of thing.

John W. Colby
www.ColbyConsulting.com 

On 3/8/2011 6:42 AM, Gustav Brock wrote:
> Hi Arthur
>
> This is what you brought up 4½ years ago:
>
> ---
>>>> artful at rogers.com 2006-11-22 13:14:50>>>
>
> There is a whole other subject on this, about which I have written,
but
> I googled it and it didn't come up, so perhaps I wrote it and forgot
to
> sell it to somebody.  The gist is this: it's called PITA, which
doesn't
> mean pain in the arse, but rather Point In Time Architecture.
Without
> PITA, the central problem with relational databases is that they
don't
> provide an instant "roll back to August 1" capability. With PITA,
they
> do. It's not all that complicated, but it does require a detailed
> walk-through so you can understand all the implications, the most
> critical of which is, "Nothing is ever updated. An updated row is
> actually replaced, and the updated row's EndDate column is updated
to
> reflect the datetime on which the row was "changed". Thus it becomes
> possible to issue a query that reflects the state of the database on
> August 1, 2005. Obviously this increases the size of the db
> significantly, but in certain environments (such as medical), this
is
> critical -- who was JWC's physician on that date, and what tests
were
> performed, and by which medicos, and so on.
>
> So. Today's job is to dig out that PITA article and pitch it to
> somebody.
> ---
>
> Somehow you must have succeeded because your writing can found here,
> dated 2007-02-22:
>
>
http://www.simple-talk.com/sql/database-administration/database-design-a-point-in-time-architecture/

>
> As I wrote back also on 2006-11-22, what you describe is a temporal
> database or - more precise - a bitemporal:
>
> <quote>
> In the literature, two time lines of interest have been mentioned,
> transaction time and valid time. The valid time line represents when
a
> fact is valid in modelled world (i.e. when it was believed) and the
> transaction time line represents when a transaction was performed. A
> bitemporal database is a combination of valid time and transaction
time
> databases where these two time lines are considered to be
orthogonal.
> (Snodgrass&  Ahn 1986)
> </quote>
>
> This is a fascinating area, and your article describes nicely how -
> using SQL Server - to deal with some of the issues for a practical
> implementation of this theory (from 1986).
>
> However, I miss the connection to your eggs.
>
> To me, the collection of eggs describes rather a batch: A given
> population of 200 hens produce each day a collected batch of eggs
which
> perhaps are stamped with a producer id and batch id but at least
packed
> with other eggs from the same batch only. The batch id is written on
the
> package.
> This way a bad egg at the consumer can be tracked back to the
package,
> the producer, the date, the packing machine, the population of hens,
and
> - perhaps - the possible bags of corn (or whatever) used to feed
these
> hens.
> You will record all associated data in a write-once/read-many
database,
> but as you by definition never will change or correct these data, I
see
> no scenario for a temporal or PITA database, it's more like a log
file.
> The only date field needed here is the packing date.
>
> And how about the autonumber and the index maintenance Jim brought
up?
>
> /gustav
>
>
>>>> fuller.artful at gmail.com 08-03-2011 01:00>>>
> Let us distinguish two problems:
>
> The first is the "egg" problem. I have 200 chickens each of which
lays
> several eggs per day, each of which is plonked into an
arbitrary-numbered
> case. At some point, it may be interesting to know which chicken laid
which
> eggs and into which cartons they were placed. Most often, this level
of
> detail is not interesting, but occasionally it is vital and
potentially
> life-saving.
>
> The second is the "serial number" problem. Every crankshaft or
windshield or
> manifold coming off an assembly line has a unique serial number,
unlike the
> aforementioned eggs. Each one of these parts can be traced to a shift
and a
> line and perhaps ultimately to a worker.
>
> Big difference in these problems, and big difference in which
attributes we
> choose to model.
>
> IME, I have dealt more with the egg problem than the serial number
problem,
> but in recent years this has changed. To further complicate things,
this
> latter problem has been compounded by the PITA issue (Point in Time
> Archictecture; for details on this problem and its solution see my
piece at
> Red Gate's site).
>
> Arthur



More information about the AccessD mailing list