[dba-SQLServer] Using Databases on separate machines - performance

Francisco Tapia fhtapia at gmail.com
Wed May 16 11:21:18 CDT 2007


So in short the answer to your question really was "it depends"

from the MS article:

When possible, SQL Server pushes relational operations such as joins,
restrictions, projections, sorts, and group by operations to the OLE DB data
source. SQL Server does not default to scanning the base table into SQL
Server and performing the relational operations itself. SQL Server queries
the OLE DB provider to determine the level of SQL grammar it supports, and,
based on that information, pushes as many relational operations as possible
to the provider. For more information, see SQL Dialect Requirements for OLE
DB Providers <http://msdn2.microsoft.com/en-us/library/ms187442.aspx>.

On 5/16/07, Francisco Tapia <fhtapia at gmail.com> wrote:
>
> John,
>   You're describing distributed queries, which you can perform such as
>
>
> SELECT * FROM ServerName.Database.Owner.Table
>
>
> These instructions will pass the request to the OLEDB provider and pass
> the "work" to the linked server.  If you were hooked into some other type of
> engine ie, JET/Foxpro excel, then the work is performed by sql server as the
> oledb provider would have restrictions on the datasource.  More info on
> distributed queries in the links below.  I do use distributed queries in my
> environment, but in my situation I generally cause them to join against
> local tables, which then it has to pull data over to do the join locally.
> If instead I perform an entire query on the linked server the processing
> occurs on the remote server.
>
> such as:
>
> SELECT * FROM ServerName.Database.Owner.Table Where Field1 IN ('a','b',
> 'c', 'd')
>
>
> hth
>
> White Paper on performance:
>
> http://citeseer.ist.psu.edu/rd/93453806%2C732761%2C1%2C0.25%2CDownload/http%3AqSqqSqhome.comcast.netqSq%7EevilconroyqSqICDE2005-MicrosoftSQLServerDistributedQuery.pdf
>
> Architecture:
> http://msdn2.microsoft.com/en-us/library/ms191277.aspx
>
> Distributed Queries
> http://msdn2.microsoft.com/en-us/library/ms188721.aspx
>
> On 5/16/07, jwcolby <jwcolby at colbyconsulting.com> wrote:
> >
> > I now have two "high powered" servers built up.  Both run Windows 2003
> > and
> > SQL Server 2005.
> >
> > ATM I am running my entire database on a single machine.  The database
> > consists of a couple of largish "raw data" tables (65 million recs /
> > 700+
> > fields; 97 million recs / 149 fields), which contain the data pulled in
> > from
> > text files, with a autoincrement PK added for tracking purposes.  Each
> > raw
> > data table then has address / PK fields pulled out and "sent out" for
> > address validation.  The results are then reimported back in to SQL
> > Server,
> > into the same DB as the raw data table sits in.  I have created separate
> > DBF
> > files for each "database" (raw / validated).
> >
> > As I mentioned earlier, I am creating a new dbf file set for each
> > "order" I
> > receive from my customer, where I build up the views required to process
> > that specific order.  That is working quite well BTW.
> >
> > I have a bunch of questions re performance.  I have discovered that I
> > can
> > create queries / views that pull the data straight out of the desired db
> > /
> > table when I use that data in another db, simply by just referencing the
> >
> > database / table.  I think I can do the same thing if pieces are on
> > another
> > server instance.  What I am considering doing is placing these huge raw
> > /
> > validated database files out on the StoneHenge server, leaving the Azul
> > server to contain and process the orders.  Stonehenge is the newer
> > machine
> > and has a single partition with 1.6 tbytes open, and I will be adding
> > another 500 gb to that soon.  Thus this seems like the logical home for
> > the
> > big databases.
> >
> > I have a gbit switch between all the machines on my network.
> >
> > My question is, will there be any pros/cons to organizing things this
> > way.
> > I can get about 450mb burst streaming data off of my raid arrays which
> > is
> > considerably above the 1 gb switch capacity, but it seems unlikely that
> > SQL
> > Server would actually process data at that speed anyway.  So I want to
> > place
> > the big source databases on the new server and the order database on the
> >
> > original server.
> >
> > To give an example of a real order I created a set of queries:
> >
> > 1) One query talks to the tblHSIDRaw table (75 million records / 700
> > fields), asking for
> >
> > "SELECT PKID WHERE ... "
> >
> > The Where clause encompasses about 9 different fields.  All the fields
> > are
> > indexed, though how useful the indexes are (in all cases) is in doubt.
> >
> > 2) Another query uses a small table of 180 ZIPS provided by the client.
> > That ZIP table is joined on the ZIP column of tblAZHSID which is the
> > table
> > that has been processed for valid addresses.  tblAZHSID is ~50 million
> > records with about 20 fields.
> >
> > "SELECT PK from tblZIPS inner join tblAZHSID on ..."
> >
> > 3) A third query requests data from tblHSIDRaw for a where on a single
> > specific field.
> >
> > "SELECT PK FROM tblHSIDRaw WHERE FieldX IN ()
> >
> > I did that just because this has to be an AND with all the 9 fields in
> > the
> > first query.  Perhaps not the most efficient but it works.
> >
> > 4) A query that inner joins all the PKs and returns a result set.
> >
> > So what happens if I place these tblHSIDRaw and tblAZHSID out on
> > Stonehenge?
> > I assume that processor is passed the select clauses and performs the
> > processing required to return the PKs requested?
> >
> > Does Azul do any processing?  Perhaps the join of the result sets in
> > query 4
> > to create the final data set?
> >
> > Is the process faster or slower than if the whole shootin match ran on a
> > single machine?
> >
> >
> > John W. Colby
> > Colby Consulting
> > www.ColbyConsulting.com
> >
> > _______________________________________________
> > dba-SQLServer mailing list
> > dba-SQLServer at databaseadvisors.com
> > http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
> > http://www.databaseadvisors.com
> >
> >
>
>
> --
> -Francisco
> http://sqlthis.blogspot.com | Tsql and More...




-- 
-Francisco
http://sqlthis.blogspot.com | Tsql and More...



More information about the dba-SQLServer mailing list