Charlotte Foust
charlotte.foust at gmail.com
Tue Oct 30 13:57:32 CDT 2012
So you could create dimension tables for at least the most commonly used of those fields that would allow you to roll up the data quickly.. Charlotte On Tue, Oct 30, 2012 at 10:55 AM, jwcolby <jwcolby at colbyconsulting.com>wrote: > In these databases there are only ever two tables, a "name/address" table > and a "fact" table. The two are related 1-1 with a PKID originating in the > fact table and propagated into the Name/Address table. > > The Name/Address table originates in the fact table and I split it out as > the first step in processing the database. The name/address component is > sent out for Address validation / cleaning and then imported back into its > own separate table. I do that simply because it allows me to standardize > the address validation using my own field names as well as using additional > fields returned by the address validation software. > > Basically the fact table has information like "Marital status", "Length of > residence", "Teen Driver", "collectible antiques" etc etc. 270 fields of > etc etc. > > Every field is not filled in for every record, but some fields are filled > in for all records. There are 225 million fact records and only a small > portion of them are filled in for any given field. > > In fact the point of this exercise is to discover how many records are > filled in for each field. We may get 5% for some fields and 90% (or even > 100%) for others. > > > John W. Colby > Colby Consulting > > Reality is what refuses to go away > when you do not believe in it > > On 10/30/2012 12:56 PM, Charlotte Foust wrote: > >> Are you using dimension tables at all, John? I'd think a hybrid snowflake >> schema might help with that kind of grouping. >> >> Charlotte >> >> On Tue, Oct 30, 2012 at 7:04 AM, jwcolby <jwcolby at colbyconsulting.com>** >> wrote: >> >> If I didn't have time for iHumor I would go iCrazy. >>> >>> I am currently working on a system to build a data dictionary table and >>> populate it for a given database. Then I need to select which columns >>> (fields) to count the data items in - a groupby / count query on all >>> selected fields. >>> >>> The databases I use tend to have a ton of columns (277 fields in the one >>> I >>> am working on now), each column tends to have a 1/0, T/F or a limited >>> range >>> of codes 1-9 and a-z. Unfortunately other columns have names, zips, >>> address fields etc so I can't do a groupby on those fields or I will get >>> tens of millions of counts - not the intention. >>> >>> So I build a data dictionary table which reads out the standard stuff - >>> >>> [SchemaName] >>> ,[TableName] >>> ,[ColumnId] >>> ,[ColumnName] >>> ,[DataType] >>> ,[CharacterMaximumLength] >>> ,[ColumnDescription] >>> ,[ColumnDefault] >>> ,[IsNullable] >>> >>> plus a couple of extra fields of my own [PerformCount] and (denormalized) >>> [CountInfo]. >>> >>> I then populate that table, unselect the fields with things like names / >>> addresses, then run a groupby/count on the remaining fields, updating the >>> [CountInfo] with the resulting data. Push it out to a spreadsheet and >>> send >>> to the customer. >>> >>> Keeps me busy. >>> >>> >>> John W. Colby >>> Colby Consulting >>> >>> Reality is what refuses to go away >>> when you do not believe in it >>> >>> On 10/30/2012 9:34 AM, jack drawbridge wrote: >>> >>> iLike it John, but when do you get time... you always seem up to your >>>> iBalls in real work. >>>> jack >>>> >>>> On Tue, Oct 30, 2012 at 8:45 AM, jwcolby <jwcolby at colbyconsulting.com>* >>>> *** >>>> wrote: >>>> >>>> From: >>>> >>>>> >>>>> http://www.techradar.com/us/******news/phone-and-**communications/****<http://www.techradar.com/us/****news/phone-and-communications/****> >>>>> <http://**www.techradar.com/us/**news/**phone-and-communications/**<http://www.techradar.com/us/**news/phone-and-communications/**> >>>>> > >>>>> mobile-phones/apple-samsung-******smartphone-profits-add-up-**to-**** >>>>> 106-percent-1108455<http://**w**ww.techradar.com/us/news/**<http://www.techradar.com/us/news/**> >>>>> phone-and-communications/****mobile-phones/apple-samsung-** >>>>> smartphone-profits-add-up-to-****106-percent-1108455<http://** >>>>> www.techradar.com/us/news/**phone-and-communications/** >>>>> mobile-phones/apple-samsung-**smartphone-profits-add-up-to-** >>>>> 106-percent-1108455<http://www.techradar.com/us/news/phone-and-communications/mobile-phones/apple-samsung-smartphone-profits-add-up-to-106-percent-1108455> >>>>> > >>>>> >>>>> >>>>>> >>>>> Apple's sales may only be hampered by the fact that its 7.9-inch iPad >>>>> >>>>>> >>>>>> mini could be in short supply, as it has already sold out of its >>>>> initial >>>>> launch stock. Customers who pre-order the WiFi model of an iPad mini >>>>> online Monday face a two-week shipping wait time. >>>>> >>>>> Uhhh yep, they are soooo cool that Apple can't build enough for all the >>>>> iFolks standing in those long iLines. >>>>> >>>>> My biggest concern is all of the poor iFolks standing in iLines in the >>>>> northeast as the hurricane sweeps overhead. You know they won't >>>>> evacuate >>>>> and lose their iPlace in iLine. There are already reports of entire >>>>> iLines >>>>> of iDiots being washed out to sea. >>>>> >>>>> The upside is that the US average IQ is inching upwards. >>>>> >>>>> ;) >>>>> >>>>> BTW, before I even pressed send I was served with an iCease and iDesist >>>>> for using the iCopyrighted phrase iFolks, iPlace and iLine. For some >>>>> reason iDiot wasn't included in the iCease and iDesist. iStrange! >>>>> >>>>> ;) >>>>> >>>>> OK, iBack to iWork. >>>>> >>>>> -- >>>>> John W. Colby >>>>> Colby Consulting >>>>> >>>>> Reality is what refuses to go away >>>>> when you do not believe in it >>>>> >>>>> -- >>>>> AccessD mailing list >>>>> AccessD at databaseadvisors.com >>>>> http://databaseadvisors.com/******mailman/listinfo/accessd<http://databaseadvisors.com/****mailman/listinfo/accessd> >>>>> <htt**p://databaseadvisors.com/****mailman/listinfo/accessd<http://databaseadvisors.com/**mailman/listinfo/accessd> >>>>> > >>>>> <http:**//databaseadvisors.**com/**mailman/listinfo/accessd<http://databaseadvisors.com/**mailman/listinfo/accessd> >>>>> **<http://databaseadvisors.com/**mailman/listinfo/accessd<http://databaseadvisors.com/mailman/listinfo/accessd> >>>>> > >>>>> >>>>>> >>>>>> Website: http://www.databaseadvisors.******com<http://www.** >>>>> databaseadvisors.com <http://www.databaseadvisors.**com<http://www.databaseadvisors.com> >>>>> >> >>>>> >>>>> >>>>> -- >>> AccessD mailing list >>> AccessD at databaseadvisors.com >>> http://databaseadvisors.com/****mailman/listinfo/accessd<http://databaseadvisors.com/**mailman/listinfo/accessd> >>> <http:**//databaseadvisors.com/**mailman/listinfo/accessd<http://databaseadvisors.com/mailman/listinfo/accessd> >>> > >>> Website: http://www.databaseadvisors.****com<http://www.** >>> databaseadvisors.com <http://www.databaseadvisors.com>> >>> >>> > -- > AccessD mailing list > AccessD at databaseadvisors.com > http://databaseadvisors.com/**mailman/listinfo/accessd<http://databaseadvisors.com/mailman/listinfo/accessd> > Website: http://www.databaseadvisors.**com<http://www.databaseadvisors.com> >