[dba-SQLServer] Using Databases on separate machines - performance

Eric Barro ebarro at verizon.net
Wed May 16 11:24:41 CDT 2007


You do give up the chance to use locking hints with linked servers.

Ex:

SELECT <fields> FROM nativeTable (nolock)

This will not work with linked servers...

SELECT <fields> FROM LinkedServerName.DatabaseName.dbo.linkedserverTableName
(nolock)

(nolock) helps speed up SELECT queries considerably. I use them all the
time.



-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com
[mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of Francisco
Tapia
Sent: Wednesday, May 16, 2007 9:19 AM
To: dba-sqlserver at databaseadvisors.com
Subject: Re: [dba-SQLServer] Using Databases on separate machines -
performance

John,
  You're describing distributed queries, which you can perform such as


SELECT * FROM ServerName.Database.Owner.Table


These instructions will pass the request to the OLEDB provider and pass the
"work" to the linked server.  If you were hooked into some other type of
engine ie, JET/Foxpro excel, then the work is performed by sql server as the
oledb provider would have restrictions on the datasource.  More info on
distributed queries in the links below.  I do use distributed queries in my
environment, but in my situation I generally cause them to join against
local tables, which then it has to pull data over to do the join locally.
If instead I perform an entire query on the linked server the processing
occurs on the remote server.

such as:

SELECT * FROM ServerName.Database.Owner.Table Where Field1 IN ('a','b', 'c',
'd')


hth

White Paper on performance:
http://citeseer.ist.psu.edu/rd/93453806%2C732761%2C1%2C0.25%2CDownload/http%
3AqSqqSqhome.comcast.netqSq%7EevilconroyqSqICDE2005-MicrosoftSQLServerDistri
butedQuery.pdf

Architecture:
http://msdn2.microsoft.com/en-us/library/ms191277.aspx

Distributed Queries
http://msdn2.microsoft.com/en-us/library/ms188721.aspx

On 5/16/07, jwcolby <jwcolby at colbyconsulting.com> wrote:
>
> I now have two "high powered" servers built up.  Both run Windows 2003 
> and SQL Server 2005.
>
> ATM I am running my entire database on a single machine.  The database 
> consists of a couple of largish "raw data" tables (65 million recs / 
> 700+ fields; 97 million recs / 149 fields), which contain the data 
> pulled in from text files, with a autoincrement PK added for tracking 
> purposes.  Each raw data table then has address / PK fields pulled out 
> and "sent out" for address validation.  The results are then 
> reimported back in to SQL Server, into the same DB as the raw data 
> table sits in.  I have created separate DBF files for each "database" 
> (raw / validated).
>
> As I mentioned earlier, I am creating a new dbf file set for each "order"
> I
> receive from my customer, where I build up the views required to 
> process that specific order.  That is working quite well BTW.
>
> I have a bunch of questions re performance.  I have discovered that I 
> can create queries / views that pull the data straight out of the 
> desired db / table when I use that data in another db, simply by just 
> referencing the database / table.  I think I can do the same thing if 
> pieces are on another server instance.  What I am considering doing is 
> placing these huge raw / validated database files out on the 
> StoneHenge server, leaving the Azul server to contain and process the 
> orders.  Stonehenge is the newer machine and has a single partition 
> with 1.6 tbytes open, and I will be adding another 500 gb to that 
> soon.  Thus this seems like the logical home for the big databases.
>
> I have a gbit switch between all the machines on my network.
>
> My question is, will there be any pros/cons to organizing things this way.
> I can get about 450mb burst streaming data off of my raid arrays which 
> is considerably above the 1 gb switch capacity, but it seems unlikely 
> that SQL Server would actually process data at that speed anyway.  So 
> I want to place the big source databases on the new server and the 
> order database on the original server.
>
> To give an example of a real order I created a set of queries:
>
> 1) One query talks to the tblHSIDRaw table (75 million records / 700 
> fields), asking for
>
> "SELECT PKID WHERE ... "
>
> The Where clause encompasses about 9 different fields.  All the fields 
> are indexed, though how useful the indexes are (in all cases) is in doubt.
>
> 2) Another query uses a small table of 180 ZIPS provided by the client.
> That ZIP table is joined on the ZIP column of tblAZHSID which is the 
> table that has been processed for valid addresses.  tblAZHSID is ~50 
> million records with about 20 fields.
>
> "SELECT PK from tblZIPS inner join tblAZHSID on ..."
>
> 3) A third query requests data from tblHSIDRaw for a where on a single 
> specific field.
>
> "SELECT PK FROM tblHSIDRaw WHERE FieldX IN ()
>
> I did that just because this has to be an AND with all the 9 fields in 
> the first query.  Perhaps not the most efficient but it works.
>
> 4) A query that inner joins all the PKs and returns a result set.
>
> So what happens if I place these tblHSIDRaw and tblAZHSID out on 
> Stonehenge?
> I assume that processor is passed the select clauses and performs the 
> processing required to return the PKs requested?
>
> Does Azul do any processing?  Perhaps the join of the result sets in 
> query
> 4
> to create the final data set?
>
> Is the process faster or slower than if the whole shootin match ran on 
> a single machine?
>
>
> John W. Colby
> Colby Consulting
> www.ColbyConsulting.com
>
> _______________________________________________
> dba-SQLServer mailing list
> dba-SQLServer at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
> http://www.databaseadvisors.com
>
>


--
-Francisco
http://sqlthis.blogspot.com | Tsql and More...
_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com

No virus found in this incoming message.
Checked by AVG Free Edition. 
Version: 7.5.467 / Virus Database: 269.7.1/805 - Release Date: 5/15/2007
10:47 AM
 




More information about the dba-SQLServer mailing list