[dba-SQLServer] Test for empty columns

Elizabeth.J.Doering at wellsfargo.com Elizabeth.J.Doering at wellsfargo.com
Mon Sep 21 14:55:53 CDT 2009


John,

I thank God every day that I don't deal with any thing the magnitude of your db-f-h.

I would probably do the query on the fly and count(*) approach...but it would be all trial and error and I would probably be sorry on account of at least some of the trials.  

You can build a quick and dirty version of the queries on fly like this:

Create PROCEDURE [dbo].[procDataDictionarySearch]
	@TableName as varchar(32),
	@SearchString as varchar(32) = null,
	@LastDate as datetime = null
as
	declare @SQL as varchar(4000);
 

 SELECT @SearchString = '%' + @SearchString + '%'

 DECLARE rsTemp CURSOR
	for 
	Select distinct column_name, data_type from dbo.DataDictionary where table_name = @tablename
	OPEN rsTemp
	DECLARE @column_name varchar(30);
	DECLARE @data_type int

	FETCH NEXT FROM rsTemp INTO @column_name, @data_type;
	if @SearchString is null 
		begin
		set @SQL = 'Select * from ' + @tablename + ' where ('  + @column_name + ' is null'
		end
	else
		begin		
		set @SQL = 'Select * from ' + @tablename + ' where (' + @column_name + ' like ' + char(39) + @SearchString + char(39)
		end

	WHILE @@Fetch_Status =0
	begin
	
	if @SearchString is null 
		begin
		set @SQL = @SQL + ' OR ' + @column_name + ' is null'
		end
	else
		begin				
		set @SQL = @SQL + ' OR ' + @column_name + ' like ' + char(39) + @SearchString + char(39)
		end
	print @SQL
		FETCH NEXT FROM rsTemp INTO @column_name, @data_type;
	end
 
CLOSE rsTemp;
DEALLOCATE rsTemp;

set @sql = @sql + ')'

--print @sql
 execute (@SQL)


This version won't get you counts, just selects.  So you would want to make that improvement before you go on.

Good luck.

Liz



This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose, or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.


-----Original Message-----
From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of jwcolby
Sent: Monday, September 21, 2009 2:18 PM
To: Discussion concerning MS SQL Server
Subject: Re: [dba-SQLServer] Test for empty columns

Elizabeth,

That did indeed give me a list of all tables, and fields in the tables.

Now...

The goal is to select a specific table, iterate through all of the fields in that table, and 
discover whether each field has any data, even in just a single row.

The first table I will be doing this to contains 50 million records, and ~640 fields.  Obviously 
there are not indexes on each field, though there are on perhaps a hundred of the 640 fields.  I 
need to store the results such that when the process is finished I have a record of which fields are 
empty (have no data in any record).  This table is completely static to this point, i.e. I never 
update any of the fields.  Thus by dropping any empty fields I can potentially speed up all further 
operations on this table, with no discernible downside.

So is an acceptable strategy to simply create a count query on-the-fly and count() each column? 
Some faster way (remember there will not be an index for most fields).  Would it be faster to make 
an index for one field, count the field, then drop the index?

Or... does the system database somewhere contain this information already?  Which fields contain data.

John W. Colby
www.ColbyConsulting.com


Elizabeth.J.Doering at wellsfargo.com wrote:
> This sproc gets at all the columns in all the tables:
> 
>  
> CREATE  procedure [dbo].[_DataDictionary]
> --Create on the fly data dictionary.
> as
> -build your destination table if necessary
> IF NOT EXISTS (Select name FROM sys.objects WHERE object_id = OBJECT_ID(N'[DataDictionary]') AND type in (N'U'))
> 		BEGIN
> 		CREATE TABLE [dbo].[DataDictionary](
> 			[table_qualifier] [varchar](50) NULL,
> 			[table_owner] [varchar](50) NULL,
> 			[table_name] [varchar](50) NOT NULL,
> 			[column_name] [varchar](50) NOT NULL,
> 			[data_type] [int] NULL,
> 			[type_name] [varchar](50) NULL,
> 			[precision] [varchar](50) NULL,
> 			[length] [varchar](50) NULL,
> 			[scale] [varchar](50) NULL,
> 			[radix] [varchar](50) NULL,
> 			[nullable] [varchar](50) NULL,
> 			[remarks] [varchar](50) NULL,
> 			[column_def] [varchar](50) NULL,
> 			[sql_data_type] [varchar](50) NULL,
> 			[sql_datetime_sub] [varchar](50) NULL,
> 			[char_octet_length] [varchar](50) NULL,
> 			[ordinal_position] [varchar](50) NULL,
> 			[is_nullable] [varchar](50) NULL,
> 			[ss_data_type] [varchar](50) NULL,
> 		 CONSTRAINT [PK_DataDictionary] PRIMARY KEY CLUSTERED 
> 		(
> 			[table_name] ASC,
> 			[column_name] ASC
> 		)WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
> 		) ON [PRIMARY]
> 		END
> -but if the destination already exists, just empty it
> truncate table datadictionary
> 
> -select all the object names in the database into a cursor 
> 
> DECLARE tnames_cursor CURSOR
> FOR
>   --2000 version: select   name, type from sysobjects where (type = 'v' or type = 'u') order by name 
>   --2005 version: 
> select   name, type from sys.all_objects where is_ms_shipped = 0 and  (type = 'v' or type = 'u') and name <> 'sysdiagrams' order by name 
> OPEN tnames_cursor;
> DECLARE @tablename varchar(100);
> DECLARE @type varchar(5)
> 
> -loop through the cursor, pulling off all the column names into your table
> 
> FETCH NEXT FROM tnames_cursor INTO @tablename, @type;
> WHILE (@@FETCH_STATUS <> -1)
> BEGIN
>    IF (@@FETCH_STATUS <> -2)
>    BEGIN   
>       SELECT @tablename = RTRIM(@tablename); 
> 	  INSERT INTO [dbo].[DataDictionary]
>            ([table_qualifier]
>            ,[table_owner]
>            ,[table_name]
>            ,[column_name]
>            ,[data_type]
>            ,[type_name]
>            ,[precision]
>            ,[length]
>            ,[scale]
>            ,[radix]
>            ,[nullable]
>            ,[remarks]
>            ,[column_def]
>            ,[sql_data_type]
>            ,[sql_datetime_sub]
>            ,[char_octet_length]
>            ,[ordinal_position]
>            ,[is_nullable]
>            ,[ss_data_type]) exec sys.sp_columns @tablename
>  
>    END;
>    FETCH NEXT FROM tnames_cursor INTO @tablename, @type;
> END;
> CLOSE tnames_cursor;
> DEALLOCATE tnames_cursor; 
> 
> 
> 
> 
> 
> 
> Thanks, 
> 
> Liz 
> 
> 
> This message may contain confidential and/or privileged information. If you are not the addressee or authorized to receive this for the addressee, you must not use, copy, disclose, or take any action based on this message or any information herein. If you have received this message in error, please advise the sender immediately by reply e-mail and delete this message. Thank you for your cooperation.
> 
> 
> -----Original Message-----
> From: dba-sqlserver-bounces at databaseadvisors.com [mailto:dba-sqlserver-bounces at databaseadvisors.com] On Behalf Of jwcolby
> Sent: Monday, September 21, 2009 12:42 PM
> To: Dba-Sqlserver
> Subject: [dba-SQLServer] Test for empty columns
> 
> My database from hell has at least some columns where the column simply contains no data in any 
> record.  I am wondering how I would find and list these columns so that I can delete them from my table.
> 
> Would I write a stored procedure to get the columns from the database and then just iterate through 
> that recordset building a dynamic query?  Is there already something out there that does this? 
> Anybody doing this?
> 
> TIA,
> 
_______________________________________________
dba-SQLServer mailing list
dba-SQLServer at databaseadvisors.com
http://databaseadvisors.com/mailman/listinfo/dba-sqlserver
http://www.databaseadvisors.com





More information about the dba-SQLServer mailing list