[dba-SQLServer] Test for empty columns

jwcolby jwcolby at colbyconsulting.com
Mon Sep 21 14:18:05 CDT 2009


Elizabeth,

That did indeed give me a list of all tables, and fields in the tables.

Now...

The goal is to select a specific table, iterate through all of the fields in that table, and 
discover whether each field has any data, even in just a single row.

The first table I will be doing this to contains 50 million records, and ~640 fields.  Obviously 
there are not indexes on each field, though there are on perhaps a hundred of the 640 fields.  I 
need to store the results such that when the process is finished I have a record of which fields are 
empty (have no data in any record).  This table is completely static to this point, i.e. I never 
update any of the fields.  Thus by dropping any empty fields I can potentially speed up all further 
operations on this table, with no discernible downside.

So is an acceptable strategy to simply create a count query on-the-fly and count() each column? 
Some faster way (remember there will not be an index for most fields).  Would it be faster to make 
an index for one field, count the field, then drop the index?

Or... does the system database somewhere contain this information already?  Which fields contain data.

John W. Colby
www.ColbyConsulting.com


Elizabeth.J.Doering at wellsfargo.com wrote:
> This sproc gets at all the columns in all the tables:
> 
>  
> CREATE  procedure [dbo].[_DataDictionary]
> --Create on the fly data dictionary.
> as
> -build your destination table if necessary
> IF NOT EXISTS (Select name FROM sys.objects WHERE object_id = OBJECT_ID(N'[DataDictionary]') AND type in (N'U'))
> 		BEGIN
> 		CREATE TABLE [dbo].[DataDictionary](
> 			[table_qualifier] [varchar](50) NULL,
> 			[table_owner] [varchar](50) NULL,
> 			[table_name] [varchar](50) NOT NULL,
> 			[column_name] [varchar](50) NOT NULL,
> 			[data_type] [int] NULL,
> 			[type_name] [varchar](50) NULL,
> 			[precision] [varchar](50) NULL,
> 			[length] [varchar](50) NULL,
> 			[scale] [varchar](50) NULL,
> 			[radix] [varchar](50) NULL,
> 			[nullable] [varchar](50) NULL,
> 			[remarks] [varchar](50) NULL,
> 			[column_def] [varchar](50) NULL,
> 			[sql_data_type] [varchar](50) NULL,
> 			[sql_datetime_sub] [varchar](50) NULL,
> 			[char_octet_length] [varchar](50) NULL,
> 			[ordinal_position] [varchar](50) NULL,
> 			[is_nullable] [varchar](50) NULL,
> 			[ss_data_type] [varchar](50) NULL,
> 		 CONSTRAINT [PK_DataDictionary] PRIMARY KEY CLUSTERED 
> 		(
> 			[table_name] ASC,
> 			[column_name] ASC
> 		)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
> 		) ON [PRIMARY]
> 		END
> -but if the destination already exists, just empty it
> truncate table datadictionary
> 
> -select all the object names in the database into a cursor 
> 
> DECLARE tnames_cursor CURSOR
> FOR
>   --2000 version: select   name, type from sysobjects where (type = 'v' or type = 'u') order by name 
>   --2005 version: 
> select   name, type from sys.all_objects where is_ms_shipped = 0 and  (type = 'v' or type = 'u') and name <> 'sysdiagrams' order by name 
> OPEN tnames_cursor;
> DECLARE @tablename varchar(100);
> DECLARE @type varchar(5)
> 
> -loop through the cursor, pulling off all the column names into your table
> 
> FETCH NEXT FROM tnames_cursor INTO @tablename, @type;
> WHILE (@@FETCH_STATUS <> -1)
> BEGIN
>    IF (@@FETCH_STATUS <> -2)
>    BEGIN   
>       SELECT @tablename = RTRIM(@tablename); 
> 	  INSERT INTO [dbo].[DataDictionary]
>            ([table_qualifier]
>            ,[table_owner]
>            ,[table_name]
>            ,[column_name]
>            ,[data_type]
>            ,[type_name]
>            ,[precision]
>            ,[length]
>            ,[scale]
>            ,[radix]
>            ,[nullable]
>            ,[remarks]
>            ,[column_def]
>            ,[sql_data_type]
>            ,[sql_datetime_sub]
>            ,[char_octet_length]
>            ,[ordinal_position]
>            ,[is_nullable]
>            ,[ss_data_type]) exec sys.sp_columns @tablename
>  
>    END;
>    FETCH NEXT FROM tnames_cursor INTO @tablename, @type;
> END;
> CLOSE tnames_cursor;
> DEALLOCATE tnames_cursor; 
> 
> 
> 
> 
> 
> 
> Thanks, 
> 
> Liz 
> 
> 
> This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose, or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.
> 
> 
> -----Original Message-----
> From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of jwcolby
> Sent: Monday, September 21, 2009 12:42 PM
> To: Dba-Sqlserver
> Subject: [dba-SQLServer] Test for empty columns
> 
> My database from hell has at least some columns where the column simply contains no data in any 
> record.  I am wondering how I would find and list these columns so that I can delete them from my table.
> 
> Would I write a stored procedure to get the columns from the database and then just iterate through 
> that recordset building a dynamic query?  Is there already something out there that does this? 
> Anybody doing this?
> 
> TIA,
> 



More information about the dba-SQLServer mailing list