[AccessD] PITA and temporal database (was: Access and SQL Server)

Gustav Brock Gustav at cactus.dk
Tue Mar 8 05:42:02 CST 2011


Hi Arthur

This is what you brought up 4½ years ago:

---
>>> artful at rogers.com 2006-11-22 13:14:50 >>>

There is a whole other subject on this, about which I have written, but
I googled it and it didn't come up, so perhaps I wrote it and forgot to
sell it to somebody.  The gist is this: it's called PITA, which doesn't
mean pain in the arse, but rather Point In Time Architecture. Without
PITA, the central problem with relational databases is that they don't
provide an instant "roll back to August 1" capability. With PITA, they
do. It's not all that complicated, but it does require a detailed
walk-through so you can understand all the implications, the most
critical of which is, "Nothing is ever updated. An updated row is
actually replaced, and the updated row's EndDate column is updated to
reflect the datetime on which the row was "changed". Thus it becomes
possible to issue a query that reflects the state of the database on
August 1, 2005. Obviously this increases the size of the db
significantly, but in certain environments (such as medical), this is
critical -- who was JWC's physician on that date, and what tests were
performed, and by which medicos, and so on.

So. Today's job is to dig out that PITA article and pitch it to
somebody. 
--- 

Somehow you must have succeeded because your writing can found here,
dated 2007-02-22:

http://www.simple-talk.com/sql/database-administration/database-design-a-point-in-time-architecture/

As I wrote back also on 2006-11-22, what you describe is a temporal
database or - more precise - a bitemporal:

<quote>
In the literature, two time lines of interest have been mentioned,
transaction time and valid time. The valid time line represents when a
fact is valid in modelled world (i.e. when it was believed) and the
transaction time line represents when a transaction was performed. A
bitemporal database is a combination of valid time and transaction time
databases where these two time lines are considered to be orthogonal.
(Snodgrass & Ahn 1986)
</quote>

This is a fascinating area, and your article describes nicely how -
using SQL Server - to deal with some of the issues for a practical
implementation of this theory (from 1986).

However, I miss the connection to your eggs. 

To me, the collection of eggs describes rather a batch: A given
population of 200 hens produce each day a collected batch of eggs which
perhaps are stamped with a producer id and batch id but at least packed
with other eggs from the same batch only. The batch id is written on the
package.
This way a bad egg at the consumer can be tracked back to the package,
the producer, the date, the packing machine, the population of hens, and
- perhaps - the possible bags of corn (or whatever) used to feed these
hens.
You will record all associated data in a write-once/read-many database,
but as you by definition never will change or correct these data, I see
no scenario for a temporal or PITA database, it's more like a log file.
The only date field needed here is the packing date.

And how about the autonumber and the index maintenance Jim brought up?

/gustav


>>> fuller.artful at gmail.com 08-03-2011 01:00 >>>
Let us distinguish two problems:

The first is the "egg" problem. I have 200 chickens each of which lays
several eggs per day, each of which is plonked into an
arbitrary-numbered
case. At some point, it may be interesting to know which chicken laid
which
eggs and into which cartons they were placed. Most often, this level
of
detail is not interesting, but occasionally it is vital and
potentially
life-saving.

The second is the "serial number" problem. Every crankshaft or
windshield or
manifold coming off an assembly line has a unique serial number, unlike
the
aforementioned eggs. Each one of these parts can be traced to a shift
and a
line and perhaps ultimately to a worker.

Big difference in these problems, and big difference in which
attributes we
choose to model.

IME, I have dealt more with the egg problem than the serial number
problem,
but in recent years this has changed. To further complicate things,
this
latter problem has been compounded by the PITA issue (Point in Time
Archictecture; for details on this problem and its solution see my
piece at
Red Gate's site).

Arthur

On Mon, Mar 7, 2011 at 11:58 AM, Gustav Brock <Gustav at cactus.dk>
wrote:

> Hi Jim
>
> I'm not so sure about that. As far as I know, maintenance cost of a
an
> autonumber index is close to zero for adding records, zero for
updates of
> other fields, and tables that large typically are for
appending/reading
> only.
>
> But, of course, scenarios exist where you have to optimise where
possible.
>
> /gustav



More information about the AccessD mailing list