Tuesday, June 17, 2008

Oracle Transparent Gateway SQL Server Encoding issues resolution

This article for all those who are involved in migration of data from Oracle to SQL server or vice versa.
It highlights a number of encoding issues that you might be facing.


You cannot represent the left and right quotation marks in Latin 8859-1.
They are represented as ‘(â\200\230) and ’ (\200\231).
Only when you set the format to UTF-8 (Unicode Transport Format), you can see them as ( ‘ ) and ( ’ ). These symbols are not found on the Windows keyboards but they are generated by tools like Microsoft Word.

When inserted inside the database, they may get messed up.

There is another problem while inserting these characters in SQL Server database using TG:
SS uses Collation Latin1_General_CI_AS by default. And UCS-2 is the UTF8 equivalent of Oracle in SS. But by no means, they are compatible.

So, a character needs to be converted to ANSI before moving over to SS.

Another problem with UTF8 is with the German characters involving Umlauts. If that is the last character in an insert statement, then 'ORA-01756: quoted string not properly terminated' is encountered.
eg: insert into tmp_enc@dblinkToSS values ('Umlauts ä ö ü ß Ä Ö Ü ');
would result in an error.

Adding a space before the terminating quote character will result in Success.
insert into tmp_enc@dblinkToSS values ('Umlauts ä ö ü ß Ä Ö Ü ');


This is being discussed at the following thread too:

http://www.orafaq.com/forum/t/89484/0/

No comments: