Introduction
TranslateSQL is a utility for generating SQL-Server SQL scripts based on an existing Oracle schema, in essence 'translating' Oracle schemas into SQL-Server databases.
The advantage over the built-in SQL-Server import wizard is that this program imports the constraints, indexes and primary keys. On the other hand, it does not copy the data, as does the SQL-Server wizard.
Perhaps of greater interest, I have diligently documented the source code. Most open source projects I've examined have such poor documentation that it makes modifying the code very difficult, and in particular in the cases where you want to learn from the code.
The program uses the USER_* views in Oracle to get the list and definitions of:
- tables
- columns
- indexes
- primary keys
- foreign keys constraints
- check constraints
- unique constraints
All built-in column data types are supported except for the following, which have no direct equivalent in SQL-Server:
Default values for columns are transferred over and the following special values are converted to their SQL-Server equivalent:
- SYSDATE --> GetDate()
- USER --> system_user
The assumption is that the user is trying to build an SQL-Server equivalent to his Oracle build scripts. The baseline DML statements (INSERTs) should be usable as-is, although the stored procedures and triggers would need manual conversion.
Although the program can be used for a single-shot conversion, it also makes supporting both RDBMS's relatively simple. You would make modifications to your schema in Oracle, and whenever testing / rolling out for SQL-Server simply run TranslateSQL to get the exact replica. This method is much safer than trying to keep the two versions in synch, since sooner or later the developers will forget to duplicate a change.
Origin
The utility was originally written for 2ndSight Software while I worked there as a programmer. The company graciously gave me rights to this little utility program.
Usage
Figure 1. TranslateSQL Main Screen
The Source Data parameters are those needed to connect to Oracle.
The program can convert NUMBERs into ints and floats. This can offer performance gains, but risks breaking client code. For example, the .Net data access libraries are very picky about data types, and modifications to the code may be required to read ints instead of numerics.
Identity columns in Oracle are generally implemented as SEQUENCEs, using triggers on inserts to put the next value in the column. As such, there is no easy way to programmatically determine what columns should be made into identity columns. Instead, the user should enter a carriage return delimited list of [table_name].[column_name]. Any table column that matches the an entry in the list will be made an identity column.
The program stores all parameters (except for the password for the schema owner) in the registry, so values are kept between runs.
Notes
SQL-Server already has an import utility that converts Oracle tables, except that it is severely limited-it ignores primary keys, indexes and constraints. On the other hand, it copies all the data in the tables, which TranslateSQL does not do.
As an aside, it would be relatively easy to modify the project to copy all of the data using generic ADO commands. This would need to be done before the foreign keys are created, since otherwise the INSERTs would need to be done in a specific order to avoid foreign key violations.
N.B.: The program does not run against Oracle 8i databases, although it runs fine with Oracle 9 and 10. This is because the USER_* views seem to have some small differences. For those who need Oracle 8i compatibility, this should be relatively easy to fix in the source code.
The process is relatively simple:
- Query the Oracle USER_* views to get all the table, index, constraint and primary key details.
- For each object, create the appropriate CREATE / ALTER statement and save the results to file.
All database objects are derived from DbObject, whose only virtual member is GenerateSql().
DbSchema is the main public interface for getting everything done. Calling GetDefinition() does step 1 and calling GenerateSQL() does step 2.
Because many objects hold references to each other (ex.: a foreign key has a reference to a DbTable and a
DbConstraintPrimary), DbDictionary is used for lookups to find the appropriate object. However, because primary keys and unique constraints exist as both indexes and themselves, they need to be stored separately from other objects (and indexes in particular).
Data Type Conversions
Most built-in column data types are supported, with the exception of:
- Date / Time Intervals
- BFILEs
- Row Ids
User defined types (e.g. Object Types, REFs, Varrays, Nested Tables, ...) are not converted.
Table 1 lists how data types are being mapped. If you find any mistakes I would appreciate knowing of them....
Table 1. Data Type Conversions - Oracle to SQL-Server
Oracle |
SQL-Server |
CHAR |
char |
NCHAR |
nchar |
VARCHAR2 |
varchar |
NVARCHAR2 |
nvarchar |
NUMBER |
numeric* |
BINARY_FLOAT |
float(24) |
BINARY_DOUBLE |
float(53) |
DATE TIMESTAMP |
datetime |
LONG_RAW BLOB |
image |
LONG CLOB |
text |
NCLOB |
ntext |
RAW |
varbinary |
* If the checkbox 'Convert Numbers into ints and floats' is checked, NUMBERs are converted into tinyints, smallints, ints, bigints, floats, or kept as numerics they cannot fit in any of the others.
The DbColumn class contains all the code dealing with data type conversions:
Table 2. DbColumn Data Type Conversion Code
Member |
Description |
enum DataType |
Lists all data types. 'Other' is used for any data type that is not supported. Such columns are then skipped by GenerateSQL() |
DbColumn() |
Constructor - converts the data read from the USER_COLUMNS view into the appropriate enum DataType |
GenerateSQL() |
Generates the SQL string that will create the column |
DbObjects
As mentioned above, DbObjects are pretty straightforward. The only virtual function is GenerateSQL(), which all derived classes must implement.
Figure 1. DbObject and Derived Classes
N.B.: underlined names are static members, pure virtual functions are in italics, a '+' signifies a public member and '#' a protected member.
Derived classes mostly just have whatever extra data members they need to store all their parameters. DbColumn is different in that other classes (e.g. DbTable) need searchable ArrayLists of their columns. As such, DbColumn implements operator==(), operator!=() and GetHashCode().
In Conclusion
As described in the Overview section above, there is not much more to it than that:
- The Schema class starts and directs the whole process
- The DbObject derived classes each represent one of the database entities that needs to be translated
- Of these DbObjects, DbColumn is of particular interest since it decides how data types are mapped
- DbDictionnary is used throughout to map references between DbObjects
- However, DbDictionnary stores primary and unique constraints separately from other objects because these entities also exist as indexes.