Oracle9i - Sorting Capabilities

Oracle provides linguistic sort capabilities that handle the complex sorting requirements of different languages and cultures. Different languages have different sort orders. What's more, different cultures or countries that use the same alphabet may sort words differently.
For example, in Danish, the letter Æ comes after Z, while Y and Ü are considered to be variants of the same letter. Sort order can be case sensitive or case insensitive, and can either respect or ignore accents. It can be phonetic, or it can be based on the appearance of the character, such as ordering by the number of strokes or by radicals for East Asian ideographs.
Another common sorting issue arises when letters are combined. For example, in traditional Spanish, "ch" is a distinct character, which means that the correct order would be: cerveza, Colorado, cheremoya, and so on. As a result, the letter "c" cannot be sorted until the next letter has been checked to see whether it is an "h". Oracle provides several different types of sort, and can achieve a linguistically correct sort as well as a multilingual sort based on the ISO standard for string ordering (ISO/IEC 14651), which is designed to handle many languages at the same time.
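The look-ahead needed for the traditional Spanish "ch" can be illustrated with a small sort key. This is only a sketch in Python, not Oracle's implementation: the function name `traditional_spanish_key` is hypothetical, and the key handles nothing but the "ch" rule (other traditional rules are left out).

```python
# Illustrative only: a hand-written sort key for the traditional Spanish rule
# that "ch" is a single letter ordered after "c". Not Oracle's algorithm.

def traditional_spanish_key(word):
    """Split a word into sortable units, treating "ch" as one letter."""
    lowered = word.lower()  # compare case-insensitively
    units = []
    i = 0
    while i < len(lowered):
        # A "c" cannot be ranked until we know whether an "h" follows it.
        if lowered.startswith("ch", i):
            units.append(("c", 1))  # sorts after plain "c" but before "d"
            i += 2
        else:
            units.append((lowered[i], 0))
            i += 1
    return units

words = ["cheremoya", "Colorado", "cerveza"]

print(sorted(words, key=traditional_spanish_key))
# ['cerveza', 'Colorado', 'cheremoya']

print(sorted(words, key=str.lower))
# ['cerveza', 'cheremoya', 'Colorado']
```

The second call compares letter by letter, so "ch" falls between "ce" and "co"; the first call treats "ch" as one unit and places it after every other word beginning with "c".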

Using Binary Sorts

Conventionally, when character data is stored, the sort sequence is based on the numeric values of the characters defined by the character encoding scheme. This is called a binary sort. Binary sorts are the fastest type of sort, and produce reasonable results for the English alphabet because the ASCII and EBCDIC standards define the letters A to Z in ascending numeric value. Note, however, that in the ASCII standard, all uppercase letters appear before any lowercase letters. In the EBCDIC standard, the opposite is true: all lowercase letters appear before any uppercase letters. When characters used in other languages are present, a binary sort generally does not produce reasonable results.
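The difference between the two encodings can be seen by sorting on the encoded byte values. The sketch below uses Python rather than Oracle, and it assumes code page `cp037` (one common EBCDIC code page) as the EBCDIC representative; other EBCDIC code pages may differ in detail.

```python
# Sort the same letters by the byte values that each encoding assigns to them.
letters = ["b", "A", "a", "B"]

ascii_order = sorted(letters, key=lambda s: s.encode("ascii"))
ebcdic_order = sorted(letters, key=lambda s: s.encode("cp037"))  # assumed EBCDIC code page

print(ascii_order)   # ['A', 'B', 'a', 'b']  uppercase first
print(ebcdic_order)  # ['a', 'b', 'A', 'B']  lowercase first
```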
For example, an ascending ORDER BY query would return the character strings ABC, ABZ, BCD, ÄBC in that sequence when Ä has a higher numeric value than B in the character encoding scheme. For languages that use Chinese characters, a binary sort is not linguistically meaningful.
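The same ordering can be reproduced outside the database. The following Python sketch is an analogy rather than an Oracle query: Python compares strings by Unicode code point, which behaves like a binary sort for this data.

```python
strings = ["ÄBC", "BCD", "ABZ", "ABC"]

# Default string comparison in Python uses code point values,
# so this is effectively a binary sort.
binary_order = sorted(strings)
print(binary_order)  # ['ABC', 'ABZ', 'BCD', 'ÄBC']

# Ä (U+00C4) has a higher numeric value than B (U+0042),
# which is why ÄBC lands after BCD.
print(ord("Ä") > ord("B"))  # True
```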
Note: To obtain the full article, mail r.krishnachaitanya@gmail.com
