SQL Server stores information about all objects and their properties as metadata that can be accessed through system views. In addition, some of the system views hold interesting nuances that can help to better understand how a DBMS works.
To see the system view body, just as for any other script object, the OBJECT_DEFINITION function is used:
- PRINT OBJECT_DEFINITION(OBJECT_ID('sys.objects'))
However, OBJECT_DEFINITION, as well as its analogue sp_helptext, has a significant disadvantage; it does not allow the return script description for a table object.
- IF OBJECT_ID('dbo.Table1', 'U') IS NOT NULL
- DROP TABLE dbo.Table1
- GO
-
- CREATE TABLE dbo.Table1 (ColumnID INT PRIMARY KEY)
- GO
-
- EXEC sys.sp_helptext 'dbo.Table1'
- SELECT OBJECT_DEFINITION(OBJECT_ID('dbo.Table1', 'U'))
When executing sp_helptext, we will get the following error:
Msg 15197, Level 16, State 1, Procedure sp_helptext, Line 107
There is no text for object 'dbo.Table1'.
Under the same conditions, a system function OBJECT_DEFINITION returns NULL.
Fetching from sys.sql_modules will also not solve the problem, since the same old OBJECT_DEFINITION function call is used inside this system view:
- CREATE VIEW sys.sql_modules AS
- SELECT object_id = o.id,
- definition = object_definition(o.id),
- ...
- FROM sys.sysschobjs o
Such behavior is rather disappointing. Sometimes it is neecessary to retrieve a script description of a table for some scripts. Well, let’s look at system views and create an OBJECT_DEFINITION function analogue for working with table objects.
To start, let's create a test table, in order for the process of script writing to be more clear:
- IF OBJECT_ID('dbo.WorkOut', 'U') IS NOT NULL
- DROP TABLE dbo.WorkOut
- GO
-
- CREATE TABLE dbo.WorkOut
- (
- WorkOutID BIGINT IDENTITY(1,1) NOT NULL,
- TimeSheetDate AS DATEADD(DAY, -(DAY(DateOut) - 1), DateOut),
- DateOut DATETIME NOT NULL,
- EmployeeID INT NOT NULL,
- IsMainWorkPlace BIT NOT NULL DEFAULT 1,
- DepartmentUID UNIQUEIDENTIFIER NOT NULL,
- WorkShiftCD NVARCHAR(10) NULL,
- WorkHours REAL NULL,
- AbsenceCode VARCHAR(25) NULL,
- PaymentType CHAR(2) NULL,
- CONSTRAINT PK_WorkOut PRIMARY KEY CLUSTERED (WorkOutID)
- )
- GO
And proceed to the first step, getting a list of columns and their properties.
Essentially, the list of columns can be obtained by simply referencing one of the several system views. Thus, it is important to fetch from the simplest system views, in order for the query execution time to be minimal.
Here are a few examples along with their execution plans made in dbForge Studio for SQL Server:
-
- SELECT *
- FROM INFORMATION_SCHEMA.COLUMNS c
- WHERE c.TABLE_SCHEMA = 'dbo'
- AND c.TABLE_NAME = 'WorkOut'
-
- SELECT c.*
- FROM sys.columns c WITH(NOLOCK)
- JOIN sys.tables t WITH(NOLOCK) ON c.[object_id] = t.[object_id]
- JOIN sys.schemas s WITH(NOLOCK) ON t.[schema_id] = s.[schema_id]
- WHERE t.name = 'WorkOut'
- AND s.name = 'dbo'
-
- SELECT *
- FROM sys.columns c WITH(NOLOCK)
- WHERE OBJECT_NAME(c.[object_id]) = 'WorkOut'
- AND OBJECT_SCHEMA_NAME(c.[object_id]) = 'dbo'
-
- SELECT *
- FROM sys.columns c WITH(NOLOCK)
- WHERE c.[object_id] = OBJECT_ID('dbo.WorkOut', 'U')
The presented plans show that the #1 and #2 approaches contain excessive amount of connections that will increase the query execution time, while the #3 approach leads to the complete scan of the index, making it the least efficient of all.
In terms of performance, the #4 approach remains the most attractive to me.
However, data contained in sys.columns (as well as in INFORMATION_SCHEMA.COLUMNS) is not enough to completely describe the table structure. This forces joins to other system views to be established as in the following:
- SELECT
- c.name
- , [type_name] = tp.name
- , type_schema_name = s.name
- , c.max_length
- , c.[precision]
- , c.scale
- , c.collation_name
- , c.is_nullable
- , c.is_identity
- , ic.seed_value
- , ic.increment_value
- , computed_definition = cc.[definition]
- , default_definition = dc.[definition]
- FROM sys.columns c WITH(NOLOCK)
- JOIN sys.types tp WITH(NOLOCK) ON c.user_type_id = tp.user_type_id
- JOIN sys.schemas s WITH(NOLOCK) ON tp.[schema_id] = s.[schema_id]
- LEFT JOIN sys.computed_columns cc WITH(NOLOCK) ON
- c.[object_id] = cc.[object_id]
- AND c.column_id = cc.column_id
- LEFT JOIN sys.identity_columns ic WITH(NOLOCK) ON
- c.[object_id] = ic.[object_id]
- AND c.column_id = ic.column_id
- LEFT JOIN sys.default_constraints dc WITH(NOLOCK) ON dc.[object_id] = c.default_object_id
- WHERE c.[object_id] = OBJECT_ID('dbo.WorkOut', 'U')
Accordingly, the execution plan will look not so optimistic, as before. Note that the column list is even read out 3 times:
Have a look inside sys.default_constraints:
- ALTER VIEW sys.default_constraints AS
- SELECT name, object_id, parent_object_id,
- ...
- object_definition(object_id) AS definition,
- is_system_named
- FROM sys.objects$
- WHERE type = 'D ' AND parent_object_id > 0
There is an OBJECT_DEFINITION call inside the system view. So, to retrieve the description of the default constraint, we don’t need to establish joining.
OBJECT_DEFINITION is still used in sys.computed_columns:
- ALTER VIEW sys.computed_columns AS
- SELECT object_id = id,
- name = name,
- column_id = colid,
- system_type_id = xtype,
- user_type_id = utype,
- ...
- definition = object_definition(id, colid),
- ...
- FROM sys.syscolpars
- WHERE number = 0
- AND (status & 16) = 16
- AND has_access('CO', id) = 1
We seem to have already avoided 2 joins. The case with sys.identity_columns is more curious:
- ALTER VIEW sys.identity_columns AS
- SELECT object_id = id,
- name = name,
- column_id = colid,
- system_type_id = xtype,
- user_type_id = utype,
- ...
- seed_value = IdentityProperty(id, 'SeedValue'),
- increment_value = IdentityProperty(id, 'IncrementValue'),
- last_value = IdentityProperty(id, 'LastValue'),
- ...
- FROM sys.syscolpars
- WHERE number = 0
- AND (status & 4) = 4
- AND has_access('CO', id) = 1
To retrieve information about IDENTITY properties, an undocumented property IDENTITYPROPERTY is used. After a check, its unchanging behavior on SQL Server 2005 and higher was ascertained.
As a result of calling these functions directly, the column list obtaining query becomes significantly simplified:
- SELECT
- c.name
- , [type_name] = tp.name
- , type_schema_name = s.name
- , c.max_length
- , c.[precision]
- , c.scale
- , c.collation_name
- , c.is_nullable
- , c.is_identity
- , seed_value = CASE WHEN c.is_identity = 1 THEN IDENTITYPROPERTY(c.[object_id], 'SeedValue') END
- , increment_value = CASE WHEN c.is_identity = 1 THEN IDENTITYPROPERTY(c.[object_id], 'IncrementValue') END
- , computed_definition = OBJECT_DEFINITION(c.[object_id], c.column_id)
- , default_definition = OBJECT_DEFINITION(c.default_object_id)
- FROM sys.columns c WITH(NOLOCK)
- JOIN sys.types tp WITH(NOLOCK) ON c.user_type_id = tp.user_type_id
- JOIN sys.schemas s WITH(NOLOCK) ON tp.[schema_id] = s.[schema_id]
- WHERE c.[object_id] = OBJECT_ID('dbo.WorkOut', 'U')
And the execution plan becomes more efficient:
Finally, instead of joining to sys.schemas, the SCHEMA_NAME system function can be called, that triggers much more faster than JOIN. This is true, provided that the number of schemes does not exceed the number of user objects. And since such a situation is unlikely, it can be neglected.
Next, get a list of columns included in the primary key. The most obvious approachis to use sys.key_constraints:
- SELECT
- pk_name = kc.name
- , column_name = c.name
- , ic.is_descending_key
- FROM sys.key_constraints kc WITH(NOLOCK)
- JOIN sys.index_columns ic WITH(NOLOCK) ON
- kc.parent_object_id = ic.object_id
- AND ic.index_id = kc.unique_index_id
- JOIN sys.columns c WITH(NOLOCK) ON
- ic.[object_id] = c.[object_id]
- AND ic.column_id = c.column_id
- WHERE kc.parent_object_id = OBJECT_ID('dbo.WorkOut', 'U')
- AND kc.[type] = 'PK'
In most cases, PRIMARY KEY is a clustered index and the Unique constraint.
At the metadata level, SQL Server sets index_id to 1 for all clustered indexes, so we can make a selection from sys.indexes filtering by is_primary_key = 1 or by index_id = 1 (not recommended).
Additionally, to avoid joining to sys.columns, the COL_NAME system function can be used:
- SELECT
- pk_name = i.name
- , column_name = COL_NAME(ic.[object_id], ic.column_id)
- , ic.is_descending_key
- FROM sys.indexes i WITH(NOLOCK)
- JOIN sys.index_columns ic WITH(NOLOCK) ON
- i.[object_id] = ic.[object_id]
- AND i.index_id = ic.index_id
- WHERE i.is_primary_key = 1
- AND i.[object_id] = object_id('dbo.WorkOut', 'U')
Now combine the obtained queries into one query to get the following final query:
- DECLARE
- @object_name SYSNAME
- , @object_id INT
- , @SQL NVARCHAR(MAX)
-
- SELECT
- @object_name = '[' + OBJECT_SCHEMA_NAME(o.[object_id]) + '].[' + OBJECT_NAME([object_id]) + ']'
- , @object_id = [object_id]
- FROM (SELECT [object_id] = OBJECT_ID('dbo.WorkOut', 'U')) o
-
- SELECT @SQL = 'CREATE TABLE ' + @object_name + CHAR(13) + '(' + CHAR(13) + STUFF((
- SELECT CHAR(13) + ' , [' + c.name + '] ' +
- CASE WHEN c.is_computed = 1
- THEN 'AS ' + OBJECT_DEFINITION(c.[object_id], c.column_id)
- ELSE
- CASE WHEN c.system_type_id != c.user_type_id
- THEN '[' + SCHEMA_NAME(tp.[schema_id]) + '].[' + tp.name + ']'
- ELSE '[' + UPPER(tp.name) + ']'
- END +
- CASE
- WHEN tp.name IN ('varchar', 'char', 'varbinary', 'binary')
- THEN '(' + CASE WHEN c.max_length = -1
- THEN 'MAX'
- ELSE CAST(c.max_length AS VARCHAR(5))
- END + ')'
- WHEN tp.name IN ('nvarchar', 'nchar')
- THEN '(' + CASE WHEN c.max_length = -1
- THEN 'MAX'
- ELSE CAST(c.max_length / 2 AS VARCHAR(5))
- END + ')'
- WHEN tp.name IN ('datetime2', 'time2', 'datetimeoffset')
- THEN '(' + CAST(c.scale AS VARCHAR(5)) + ')'
- WHEN tp.name = 'decimal'
- THEN '(' + CAST(c.[precision] AS VARCHAR(5)) + ',' + CAST(c.scale AS VARCHAR(5)) + ')'
- ELSE ''
- END +
- CASE WHEN c.collation_name IS NOT NULL AND c.system_type_id = c.user_type_id
- THEN ' COLLATE ' + c.collation_name
- ELSE ''
- END +
- CASE WHEN c.is_nullable = 1
- THEN ' NULL'
- ELSE ' NOT NULL'
- END +
- CASE WHEN c.default_object_id != 0
- THEN ' CONSTRAINT [' + OBJECT_NAME(c.default_object_id) + ']' +
- ' DEFAULT ' + OBJECT_DEFINITION(c.default_object_id)
- ELSE ''
- END +
- CASE WHEN cc.[object_id] IS NOT NULL
- THEN ' CONSTRAINT [' + cc.name + '] CHECK ' + cc.[definition]
- ELSE ''
- END +
- CASE WHEN c.is_identity = 1
- THEN ' IDENTITY(' + CAST(IDENTITYPROPERTY(c.[object_id], 'SeedValue') AS VARCHAR(5)) + ',' +
- CAST(IDENTITYPROPERTY(c.[object_id], 'IncrementValue') AS VARCHAR(5)) + ')'
- ELSE ''
- END
- END
- FROM sys.columns c WITH(NOLOCK)
- JOIN sys.types tp WITH(NOLOCK) ON c.user_type_id = tp.user_type_id
- LEFT JOIN sys.check_constraints cc WITH(NOLOCK)
- ON c.[object_id] = cc.parent_object_id
- AND cc.parent_column_id = c.column_id
- WHERE c.[object_id] = @object_id
- ORDER BY c.column_id
- FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 7, ' ') +
- ISNULL((SELECT '
- , CONSTRAINT [' + i.name + '] PRIMARY KEY ' +
- CASE WHEN i.index_id = 1
- THEN 'CLUSTERED'
- ELSE 'NONCLUSTERED'
- END +' (' + (
- SELECT STUFF(CAST((
- SELECT ', [' + COL_NAME(ic.[object_id], ic.column_id) + ']' +
- CASE WHEN ic.is_descending_key = 1
- THEN ' DESC'
- ELSE ''
- END
- FROM sys.index_columns ic WITH(NOLOCK)
- WHERE i.[object_id] = ic.[object_id]
- AND i.index_id = ic.index_id
- FOR XML PATH(N''), TYPE) AS NVARCHAR(MAX)), 1, 2, '')) + ')'
- FROM sys.indexes i WITH(NOLOCK)
- WHERE i.[object_id] = @object_id
- AND i.is_primary_key = 1), '') + CHAR(13) + ');'
-
- PRINT @SQL
Which, when executed, will generate the following script for the test table:
- CREATE TABLE [dbo].[WorkOut]
- (
- [WorkOutID] [BIGINT] NOT NULL IDENTITY(1,1)
- , [TimeSheetDate] AS (dateadd(day, -(datepart(day,[DateOut])-(1)),[DateOut]))
- , [DateOut] [DATETIME] NOT NULL
- , [EmployeeID] [INT] NOT NULL
- , [IsMainWorkPlace] [BIT] NOT NULL CONSTRAINT [DF__WorkOut__IsMainW__52442E1F] DEFAULT ((1))
- , [DepartmentUID] [UNIQUEIDENTIFIER] NOT NULL
- , [WorkShiftCD] [NVARCHAR](10) COLLATE Cyrillic_General_CI_AS NULL
- , [WorkHours] [REAL] NULL
- , [AbsenceCode] [VARCHAR](25) COLLATE Cyrillic_General_CI_AS NULL
- , [PaymentType] [CHAR](2) COLLATE Cyrillic_General_CI_AS NULL
- , CONSTRAINT [PK_WorkOut] PRIMARY KEY CLUSTERED ([WorkOutID])
- );
As you can see, the topic is too broad and it is not limited to a column list and primary key.
That's why generation of indexes, foreign keys and other statements are planned to be revealed in the next part of this topic.