[AccessD] Needs analysis

Tue Jun 22 11:52:28 CDT 2010

http://en.wikipedia.org/wiki/GUID_Partition_Table#cite_note-microsoft-7
http://www.microsoft.com/whdc/device/storage/GPT_FAQ.mspx#ELD

Seems you can have only non-boot partitions on the 32 bit win2003 on over 
2TB disks.

L

----- Original Message ----- 
From: "Drew Wutka" <DWUTKA at marlow.com>
To: "Access Developers discussion and problem solving" 
<accessd at databaseadvisors.com>
Sent: Tuesday, June 22, 2010 5:45 PM
Subject: Re: [AccessD] Needs analysis

> Hmmm, our main file server and a backup server we have both have Windows
> 2003, with more then 2 terrabytes.  Both are 32 bit, and operate Windows
> 2003.
>
> Drew
>
> -----Original Message-----
> From: accessd-bounces at databaseadvisors.com
> [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of Lembit Soobik
> Sent: Tuesday, June 22, 2010 5:42 AM
> To: Access Developers discussion and problem solving
> Subject: Re: [AccessD] Needs analysis
>
>> If you happen to run across that 2 Terrabyte issue, send it my way,
> this
>> is the first I've heard about it on a Windows server.
>
> Drew, John,
> I just touched this issue when I configured my new workstation for video
>
> editing:
> I use two 1.5 TB disks for a 3 TB RAID 0.
> the problem is the Master Boot Record MBR can only handle 2 TB because
> it
> uses the LBA addressing (4 Byte)
> to address sectors. with 512 Byte sectors as in XP you get 2 TB.
> The newer operating systems use GPT (GUID Partition Table) instead.
> this is for WinXP 64 bit SP1, Win 2003 64 bit SP1, Win 2008, Vista, Win
> 7.
> Since my new WS has Win 7, 64 bit, I dont have a problem.
>
> So you might - depending on your OS - not have any problem at all.
>
> Lembit
>
> ----- Original Message ----- 
> From: "Drew Wutka" <DWUTKA at marlow.com>
> To: "Access Developers discussion and problem solving"
> <accessd at databaseadvisors.com>
> Sent: Tuesday, June 22, 2010 10:24 AM
> Subject: Re: [AccessD] Needs analysis
>
>
>> Thanks for the detail.  Clears some things up for me. You've
> definitely
>> have your work cut out for you.
>>
>> If you happen to run across that 2 Terrabyte issue, send it my way,
> this
>> is the first I've heard about it on a Windows server.
>>
>> As far as the RAID stuff goes, you do have some misinformation there.
>> First, the lure of RAID 6, with having 2 drive failure redundancy is a
>> common misnomer as to it's usefulness. RAID 5 is a VERY common setup,
>> but it's a setup SPECIFICALLY for maximizing disk space.  For example,
>> we have that for our main file server.  A 5 terrabyte volume, of which
>> we have around 2 Terrabytes actively used.  It's a file server, so the
>> performance hit of RAID 5 is minimized by the network speed.
>>
>> If you have a 3 drive RAID 5, you don't necessarily pull data from all
>> three drives at once.  First, it depends if your RAID is based on
> chunk
>> size or bits.  If it's bits, then yes, you actually read an entire
> byte
>> from 2 drives three times, and one drive two times.  If it's based on
>> chunk size, then it's possible that for a small read, you only hit one
>> drive.  RAID 6 was touted as being an improvement, but from all the
> pros
>> I have talked to, RAID 6 is the Pinto of RAID systems.  On paper it
>> sounds great, in practice, it's horrible.  Keep in mind, for the past
>> few years, this is what I have been involved with. (Not necessarily
> raid
>> drives, but servers and network administration)
>>
>> A RAID 1 actually can perform better then a single drive, but only for
> a
>> read.  With a RAID 1, the raid controller can read data from both
> drives
>> simultaneously because it's the same data.
>>
>> For your final reasoning, about the timing, I can't really sit here
> and
>> say, JWC, you're problems would be over with a RAID 10 replacing your
>> RAID 6.  What I CAN say, is that it is QUITE possible that you will
> see
>> a much bigger performance boost then just a 3x.
>>
>> Think of it this way, if  you were doing a straight read, like pulling
>> 40 gigs from a drive to memory, first, in a proper RAID 10, with 3 to
> 4
>> striped mirrors, that 40 gigs would be read very quickly.  Let's, for
>> simple math, say that it takes 20 seconds in your RAID 6.  And going
> to
>> a RAID 10 gives you a 2x boost.  Great, now you save 10 seconds.  But
>> that's in a straight data read, so defragmentation aside,  the actual
>> performance of how the data is being pulled from the drives in a
>> straight data pull would be a small factor of improvement.  Where the
>> REAL boost that you should really be concerned about is that SQL
> Server
>> isn't reading multiple files.  It is reading parts of a file, a part
>> here, a part there, an index is pointing to another part.  All of
> those
>> are small individual seeks (no matter how much data gets pulled in the
>> end).  That is where a RAID 6 slows things down in a MUCH higher
> factor.
>>
>> Ok, it's 3 in the morning, was up finishing up a system (with a RAID 0
>> +1...LOL), and decided to holler back.
>>
>> Good luck with your project, let me know when you plan to scrap one of
>> these systems, I may buy it from ya! ;)
>>
>> Drew
>>
>> -----Original Message-----
>> From: accessd-bounces at databaseadvisors.com
>> [mailto:accessd-bounces at databaseadvisors.com] On Behalf Of jwcolby
>> Sent: Monday, June 21, 2010 9:03 PM
>> To: Access Developers discussion and problem solving
>> Subject: Re: [AccessD] Needs analysis
>>
>> Drew,
>>
>> > First, you said you split your RAID volume into 2 partitions due to
> a
>> Windows limitation... what
>> limitation is that?  FAT 32 is limited to 2 terrabytes, but NTFS can
>> have WAY more than that.  We
>> have several servers here running Windows 2003, (standard and
> enterprise
>> mixed) which have 5+ terrbytes.
>>
>> When I was studying the issue, my take was that drives larger than 2
>> terabytes used something funky
>> that caused extreme performance issues.  Yea, it can be done, and yea,
>> you don't want to do that.
>> So I didn't do that.
>>
>> > I am specifically telling you to stay away from RAID 6 (and RAID 5).
>> Either go with RAID 0 (with
>> no redundancy) or a RAID 10.
>>
>> When I was studying the issue, RAID 6 allowed multiple drive failures.
>> Raid 6 is also one of the
>> fastest at streaming READ, simply because it is pulling data off of so
>> many drives simultaneously.
>> Making the assumption (yea, I know) that once it found data, the data
>> would be contiguous, high
>> streaming read is good.  Raid 6 and Raid 5 are approximately the same
>> speed reading.
>>
>> I had a very limited budget.  Back when I was doing this, each
> terabyte
>> drive (consumer grade) was
>> around $140 or $150.  I managed to scrape together the money for 10 of
>> them but basically what Raid
>> 6 allowed was to minimize the "overhead" of the raid parity
> information.
>> I did what I did strictly
>> because of budget.  Both raid 6 and raid 5 are significantly faster
> than
>> raid 1 and can approach
>> raid 0 FOR READS.  Not for writes of course because of the parity
>> calculations.
>>
>> Basically Raid1 is no faster than the individual drive.  It is
>> essentially just whatever your
>> individual drive specs are.  Raid 0, 5 or 6 read stripes of data off
> of
>> each drive in the set.  My
>> understanding wast that in the case of Raid5 and Raid 6 you even got
> the
>> advantage of the data
>> coming off of the parity drives.  IOW, raid 0 2 drive array reads data
>> from two drives
>> simultaneously, therefore 2X the individual drive specs in STREAMING
>> READ.  As I understood it, Raid
>> 5 actually read data off of all three drives so you got 3X the
>> individual drive speed.  Etc.
>>
>> Writes are another matter of course, but remember that I do not write
> -
>> AT ALL!!!.  OK, I write to
>> log files, but not back to the data disks.
>>
>> > Where I am confused about SQL Server, is that while I know it uses
>> whatever memory it can take, I
>> didn't think it runs strictly off of memory, whether you are simply
>> reading or not.  IE, if you
>> access xx number of records, sure, it would cache what it pulls, but I
>> don't think it pulls the
>> entire DB into memory from the get go.
>>
>> I think you are correct.  However I also think (and it is a big black
>> box really, so there is a lof
>> of "if I were doing this") that if you are reading data about two
>> tables, joined on a single field,
>> with where clauses on 4 or 5 other fields from one or both of those
> two
>> tables...
>>
>> SQL Server is going to be pulling as much as it can to work with the
>> joins, to filter the records etc.
>>
>> So if you have 13 gigs of RAM available and it has 40 gigs of index
> data
>> that needs processing...
>>
>> vs... you have 60 gigs of RAM available and 40 gigs of index data
>> needing processing.
>>
>> I have to believe that all other things being equal, 60 gigs available
>> memory will trump 13 gigs
>> available.  Maybe not, but I think so.  No matter how you slice it,
>> having 4 times as much ram for
>> SQL Server to cache and process the stuff it is working with is going
> to
>> go a long way.
>>
>> > I would assume (and you know what happens when we assume things!
>> LOL), that if you are truly
>> treating a db as read only, then the disks are only going to be
> accessed
>> for new information.
>>
>> Well, what does that mean?  Let's use an analogy.
>>
>> Assume I have two hundred feet of board.  On this board is painted a
>> random number (limited to
>> 10000) every inch.  I have another 50 foot section of board.  Again,
>> painted with a random number
>> every inch (limited to 10000).
>>
>> You are given a window within which you can lay the boards and slide
>> them back and forth trying to
>> match the numbers on one board with the numbers on the other board.
> The
>> window is 10 feet wide.
>> How much sliding of the boards will have to occur?
>>
>> Now get a window 40 feet wide.  How much sliding has to occur?
>>
>> There is no such thing as "only accessing new information".  The
> server
>> does not have enough memory
>> to process both entire indexes so it pulls a tiny piece of one index
> and
>> a tiny piece of another,
>> then another tiny piece etc.  It is sliding the pieces in and out of
>> memory trying to find the matches.
>>
>> > So here's a question for you, have you ever thought about splitting
>> the load between different
>> machines on a gigabit network?
>>
>> Yes, I have but not seriously.  And now, when I can get a motherboard
>>and 16 cores for $1000 (plus
>> memory) it makes even less sense.  If I had fiber between the machines
>> maybe but gigabit is really
>> pretty slow, especially when a good chunk is protocol.
>>
>> > I have noted many of the performance issues you've mentioned in the
>> past, and many of them are
>> functionally identical to when a RAID goes beserk, or can't handle the
>> load.
>>
>> LOL.  Yea, I/O limitations in the disk subsystem.  It is a known
> issue,
>> and I have it, and I know I
>> have it.  However just attacking the disk system will only get me so
>> far.  Let's assume I can triple
>> the disk speed.  If my queries are taking a half hour I am down to 10
>> minutes.  That's good but not
>> good enough.
>>
>> John W. Colby
>> www.ColbyConsulting.com
>>
>>
>> Drew Wutka wrote:
>>> Hmmm, couple of questions.  But before I begin, please understand
> that
>> I
>>> work with SQL Server, but nowhere near as in depth as you apparently
>>> are.  So PLEASE correct me if I am wrong, and I would appreciate any
>>> links/info you have to share on it.  I do understand quite a bit
> about
>>> hardware though, so that's why I am still tossing my two cents in
>> here.
>>>
>>> First, you said you split your RAID volume into 2 partitions due to a
>>> Windows limitation... what limitation is that?  FAT 32 is limited to
> 2
>>> terrabytes, but NTFS can have WAY more than that.  We have several
>>> servers here running Windows 2003, (standard and enterprise mixed)
>> which
>>> have 5+ terrbytes.
>>>
>>> Next, I am not saying 'get faster disks'.  LOL, I really am trying to
>>> just be helpful here.  In fact, right now, I'm in process of getting
>> two
>>> new servers for our systems, so I know how expensive things can be!
>> LOL.
>>> I am specifically telling you to stay away from RAID 6 (and RAID 5).
>>> Either go with RAID 0 (with no redundancy) or a RAID 10.  I certainly
>>> understand the cost involved, and I'm not trying to say 'tough, do
>> this
>>> or die!' LOL.  I am simply pointing out that while you are looking at
>>> memory and processor specs, factor in a drive set that will
> compliment
>>> them.
>>>
>>> Where I am confused about SQL Server, is that while I know it uses
>>> whatever memory it can take, I didn't think it runs strictly off of
>>> memory, whether you are simply reading or not.  IE, if you access xx
>>> number of records, sure, it would cache what it pulls, but I don't
>> think
>>> it pulls the entire DB into memory from the get go.  That sounds more
>>> like the 'new' dbs that have been talked about, which do run
>> completely
>>> in memory.
>>>
>>> I would assume (and you know what happens when we assume things!
> LOL),
>>> that if you are truly treating a db as read only, then the disks are
>>> only going to be accessed for new information.  But ANY write backs,
>>> would cause multiple memory <--> disk operations.  Memory is always
>>> going to be faster than disk, so the quicker the disk operations are,
>>> the less bottlenecking you'll face.
>>>
>>> So here's a question for you, have you ever thought about splitting
>> the
>>> load between different machines on a gigabit network?
>>>
>>> Again, sorry if I seem clunky in my thoughts here, I've perused your
>>> threads about this stuff, but have too much going on, on my own, to
>> keep
>>> too much detail of it in my head.
>>>
>>> I have noted many of the performance issues you've mentioned in the
>>> past, and many of them are functionally identical to when a RAID goes
>>> beserk, or can't handle the load.
>>>
>>> Drew
>>
>> -- 
>> AccessD mailing list
>> AccessD at databaseadvisors.com
>> http://databaseadvisors.com/mailman/listinfo/accessd
>> Website: http://www.databaseadvisors.com
>> The information contained in this transmission is intended only for
> the
>> person or entity
>> to which it is addressed and may contain II-VI Proprietary and/or
> II-VI
>> Business
>> Sensitive material. If you are not the intended recipient, please
> contact
>> the sender
>> immediately and destroy the material in its entirety, whether
> electronic
>> or hard copy.
>> You are notified that any review, retransmission, copying, disclosure,
>
>> dissemination,
>> or other use of, or taking of any action in reliance upon this
> information
>> by persons
>> or entities other than the intended recipient is prohibited.
>>
>>
>> -- 
>> AccessD mailing list
>> AccessD at databaseadvisors.com
>> http://databaseadvisors.com/mailman/listinfo/accessd
>> Website: http://www.databaseadvisors.com
>
> -- 
> AccessD mailing list
> AccessD at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/accessd
> Website: http://www.databaseadvisors.com
> The information contained in this transmission is intended only for the 
> person or entity
> to which it is addressed and may contain II-VI Proprietary and/or II-VI 
> Business
> Sensitive material. If you are not the intended recipient, please contact 
> the sender
> immediately and destroy the material in its entirety, whether electronic 
> or hard copy.
> You are notified that any review, retransmission, copying, disclosure, 
> dissemination,
> or other use of, or taking of any action in reliance upon this information 
> by persons
> or entities other than the intended recipient is prohibited.
>
>
> -- 
> AccessD mailing list
> AccessD at databaseadvisors.com
> http://databaseadvisors.com/mailman/listinfo/accessd
> Website: http://www.databaseadvisors.com