Henry Simpson
hsimpson88 at hotmail.com
Thu May 15 11:22:37 CDT 2003
My response of last night has not showed up so I'll try again: Speed/simplicity versus storage requiements is given as a reason for using a bigger datatype. If so, why didn't Microsoft make the boolean a four byte datatype? And when we get to 64 bit processors and busses? The time taken by the processor to mask a byte for use by a 32 bit processor compared to the time to transfer a byte from disk to memory is insignificant. The time taken to transfer that same byte over a LAN is many orders of magnitude greater. Now factor in a WAN or a dial up connection and the smaller datatype gains significant advantages in speed. I keep hearing that storage is cheap but even so, the data must move from storage to processor to memory and back to be used and it is the transit time that has the greatest impact on performance. Assume 100,000 records. Assume a requirement of 8 yes/no fields. Case 2 byte Yes/No datatype: 8*2*100,000 = 1,600,000 bytes Case 1 byte bit field: 1*100,000 = 100,000 bytes The difference is 1,500,000 bytes and the transit time is enormous compared to the processor time required to work with the byte in a 32 bit environment. You lose in simplicity in querying but this can still be handled with a bit of skill. Assume a requirement of 6 yes/no fields and you 'waste' a couple bits Case 2 byte Yes/No datatype: 6*2*100,000 = 1,200,000 bytes Case 1 byte bit field: 1*100,000 = 100,000 bytes Even with wasted bits, the saving is enormous Assume a requirement of 8 yes/no fields and you use a signed byte vs. the two byte boolean datatype. Case 2 byte Yes/No datatype: 8*2*100,000 = 1,600,000 bytes Case 1 byte boolean field: 8*100,000 = 800,000 bytes In this case, you still have better speed due to the overriding impact on performance of transfer time plus there is no impact on query complexity. This is both simple and faster than a two byte datatype. In addition, you have the benefit of a smaller file (less time to transfer by mail or otherwise between remote servers, lower backup requirements) plus you benefit from reduced overall network traffic so all users benefit no matter what applications they are running. I think that a smaller datatype is a no brainer on all counts and was merely wondering why Microsoft approached the boolean type with No Brain? Whatever their reason, it would appear that a signed byte would handle 0 and -1. Hen It's for speed/simplicity versus storage requirements. A 32 bit (long) data type is the quickest and easiest to retrieve/mainipuate/store in a 32 bit OS. If you used a single bit to store a yes/no field, it would reduce the storage requirement, but think about how you would then have to store/locate/ index etc that particualr bit. It would make it a very slow datatype to use in comparison. If you only have a single Yes/No field in the table, what are you going to do with the other 7 bits in the byte anyway - so you might as well just store it as a whole byte. Once you take that step, you might as well use a long for speed rather than a byte (You don't use single byte Integer variables do you <g>) _________________________________________________________________ MSN 8 with e-mail virus protection service: 2 months FREE* http://join.msn.com/?page=features/virus