[AccessD] Clustered Index Debate - Was: RE: Primary keys and entry points

Gustav Brock Gustav at cactus.dk
Fri Jun 26 16:42:16 CDT 2009


Hi Shamil

I think the idea is to have both - that you via the framework can use the designer to create your database schema without touching the normal database tools, and that the framework later - by the knowledge it has about the schema - is capable of creating on the fly the SQL commands needed to have the framework do what your code requests from it. The quality of this SQL is arguable, and purists may find it clumsy, bloated and less than optimized. 

However, if it works I couldn't care less what it feeds to the db engine as long as I get the results via the framework I expect and get them fast. This is, of course, an area of continuous improvement.

As for how the ADO.NET EF is specified, I don't know but I guess it should be easy to find out if you joined the open-source groups for either SQlite or FireBird.

/gustav


>>> shamil at smsconsulting.spb.ru 26-06-2009 19:18 >>>
<<<<
... one could choose to let the ADO.NET EF do the work and let it design the
database and leave to it the decisions for building indexes and keys...
>>>>
Hi Gustav,

Thank you for your prompt response.

I meant to let ADO.NET EF to *generate* "under the hood" join
expressions/SQL data select (where) clauses based on designs made by
application developer/DBA, as well as to let MS SQL Server to *choose* what
indexes to use to interpret/execute SQL statements generated by ADO.NET EF
given data model + indexes designed by application developer/DBA.

Is it getting more clear now in what direction I wanted to "navigate"/drive?
:) As you can see I'm navigating/driving with some "blind" spots on the map
- the task is to investigate/"workaround"/bypass those "blind" spots and to
safely reach the destination... whatever it will be the time we will say we
have got reached it...

As for keeping ADO.NET EF discussion here that shouldn't be an issue despite
the fact that this is MS Access related list? - AFAICG ADO.NET EF could
become (soon?) available to access/manipulate MS Access (and other
relational and not only relational) databases provided there will be an open
specification how to write ADO.NET EF Data Providers, or does this
specification exist already? AFAIS there are some other ADO.NET EF Providers
- http://msdn.microsoft.com/en-us/data/dd363565.aspx  - what open(?)
specification did their developers use?

--
Shamil

-----Original Message-----
From: accessd-bounces at databaseadvisors.com 
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
Sent: Friday, June 26, 2009 7:56 PM
To: accessd at databaseadvisors.com 
Subject: Re: [AccessD] Clustered Index Debate - Was: RE: Primary keys and
entry points

Hi Shamil

Yes, let's keep it here for a while - at least I cannot provide much more
useful input as I don't have experience with SQL Server and large databases.

As you mention, one could choose to let the ADO.NET EF do the work and let
it design the database and leave to it the decisions for building indexes
and keys. However, I'm not sure I'm prepared to go that far until the EF is
a bit more mature.

Thanks for all the links. I'll study them at a later occasion, time is too
limited currently.

/gustav


>>> shamil at smsconsulting.spb.ru 26-06-2009 17:10 >>>
Hi Gustav,

Yes, SQL Server table can have no more than one clustered index (see links
below).
And "if the table has a clustered index, or the index is on an indexed view,
the row locator is the clustered index key for the row."
The latter means (AFAIU feel free to correct me if you suppose I'm wrong)
that if we introduce surrogate key additionally to clustered index created
on natural (primary key) columns and if we also create an index on surrogate
key column (to e.g. speed-up joins execution one may suppose) then clustered
index will anyway be involved in sql selection operations to lookup actual
data rows - and that will result in joins of large data tables using
surrogate keys would run (considerably) slower than joins on tables without
surrogate keys. And not even joins: if you have a surrogate
Autonumber/Identity clustered primary key/index on a table (that's what we
usually have most of the time currenlty?) plus a natural (compound) (unique)
key/index then using where expression on the columns from the natural
key/index will involve two indexes on sql execution time, and if result set
is large then execution time will be bad...

Well, the above doesn't yet sound clear for me therefore I'd propose to have
this discussion kept here for some time to find all "pros" and contra" for
this or that approach. As e.g. it's done here on "the GUIDs as primary key"
subject:

"Is it a bad idea to use GUIDs as primary keys in MS SQL?"
http://stackoverflow.com/questions/537145/is-it-a-bad-idea-to-use-guids-as-p 
rimary-keys-in-ms-sql

<<<<<<<<<<<<<<<<<<<<
GUID Pros
* Unique across every table, every database, every server
* Allows easy merging of records from different databases
* Allows easy distribution of databases across multiple servers
* You can generate IDs anywhere, instead of having to roundtrip to the database
* Most replication scenarios require GUID columns anyway 

GUID Cons

* It is a whopping 4 times larger than the traditional 4-byte index value;
this can have serious performance and storage implications if you're not careful
* Cumbersome to debug (where userid='{BAE7DF4-DDF-3RG-5TY3E3RF456AS10}')
* The generated GUIDs should be partially sequential for best performance...
>>>>>>>>>>>>>>>>>>>

I mean I expect this thread will finally come to similar "Pros" and "Cons"
list - in fact that would be two lists:

1)
Natural Primary Keys with Clustered Index and No Surrogate Keys - Pros

Natural Primary Keys with Clustered Index and No Surrogate Keys - Contra

2)
Surrogate Primary Keys with Clustered index - Pros

Surrogate Primary Keys with Clustered index - Contra

As usual the "truth should be somewhere in between" - what I wanted to
achieve is to stop using surrogate keys almost "religiously" in all and
every table  as I do now - I think I can stop doing that now if I will use
ADO.NET Entity Framework as my main data access/manipulation technology
because ADO.NET Entity Framework will take care for me to "write/generate
lengthy join expressions on-the-fly (under the hood)" as well as it (ADO.NET
Entity Framework) will let me to not rewrite my join expressions in the case
of the data model changes/reorganization.

Thank you.

--
Shamil

P.S. Here are some links related to the subject of this thread:

---
The ADO.NET Entity Framework Overview
http://msdn.microsoft.com/en-us/library/aa697427(VS.80).aspx 

---
Table and Index Organization
http://msdn.microsoft.com/en-us/library/ms189051.aspx 

Tables and Index Data Structures Architecture
http://msdn.microsoft.com/en-us/library/ms180978.aspx 

Logical and Physical Operators Reference
http://msdn.microsoft.com/en-us/library/ms191158.aspx 

--
Clustered Index Structures
http://msdn.microsoft.com/en-us/library/ms177443.aspx 

"If the table has a clustered index, or the index is on an indexed view, the
row locator is the clustered index key for the row. "


Nonclustered Index Structures
http://msdn.microsoft.com/en-us/library/ms177484.aspx 

Heap Structures
http://msdn.microsoft.com/en-us/library/ms188270.aspx 

SQL Statement Processing
http://msdn.microsoft.com/en-us/library/ms190623.aspx 

---
Is it a bad idea to use GUIDs as primary keys in MS SQL?
http://stackoverflow.com/questions/537145/is-it-a-bad-idea-to-use-guids-as-primary-keys-in-ms-sql
---
Should a Composite Primary Key be clustered in SQL Server?
http://stackoverflow.com/questions/389348/should-a-composite-primary-key-be-clustered-in-sql-server

What happens when I drop a clustered primary key in SQL 2005
http://stackoverflow.com/questions/705504/what-happens-when-i-drop-a-clustered-primary-key-in-sql-2005

Where to place a primary key
http://stackoverflow.com/questions/575450/where-to-place-a-primary-key

What is the optimal indexing strategy for a relation table?
http://stackoverflow.com/questions/851908/what-is-the-optimal-indexing-strategy-for-a-relation-table
----

-----Original Message-----
From: accessd-bounces at databaseadvisors.com 
[mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Gustav Brock
Sent: Friday, June 26, 2009 12:40 PM
To: accessd at databaseadvisors.com 
Subject: Re: [AccessD] Clustered Index Debate - Was: RE: Primary keys and entry points

Hi Shamil and John

Reading these chatty fellows is a bit difficult as they refer to quite a lot
before and outside the article and many versions of SQL Server including
6.5, pack it in "if"s and different scenarios, and - as I'm not a DBA -
expressions unknown by me, but to me it looks more like a discussion on
indexes rather than keys with zero conclusion.

However - if that is true - if only one clustered index can be applied to a
table then, of course, it has to be designed with great care considering the
expected purpose and usage of the table and, of course, that may very well
include a natural (compound) key. But I can't see why that should indicate a
farewell to the surrogate key.

How the ADO.NET Entity Framework handles this is really out of our hands but
I guess the old rule still applies: the groundwork sets the quality of the
house.

/gustav


<<< snip >>>





More information about the AccessD mailing list