Hi Everyone,

This is a little off-topic, but I'm at my wits' end on this one...

I've been asked to write an OME function that does a base64 encoding on nvarchar and nchar types. Now this seems simple enough:

* Allow for Ingres being little-endian when storing the Unicode (UTF-16) characters, i.e. U+671D is stored as 1D67.
* Allow for the standard rules on 'short strings' by padding with zero bytes and overwriting the output with the requisite number of '='.
* Divide the input into 6-bit chunks and then use each value as an offset into the standard base64 array of characters, i.e. A-Z, a-z, 0-9, +, /.

So 671D, followed by one zero byte of padding, is:

0110 (6) 0111 (7) 0001 (1) 1101 (D) 0000 0000

Which in groups of 6 becomes:

011001 (25) == Z, 110001 (49) == x, 110100 (52) == 0

Hence we should get a return of 'Zx0='.

Trouble is, that's not what MySQL gives my programmers on the same data. It insists that this is a string starting with 's6\'. I've cross-checked this conversion with some web-based conversion utilities and they seem to agree.

So it occurred to me that the problem was that MySQL must be using UTF-8 to represent the character. Which is cool, so I thought I could convert the UTF-16 into UTF-8 and convert the output of that into base64. Trouble is that in UTF-8, U+671D becomes E6 9C 9D, which when converted to base64 becomes the string '5pyd'. I've confirmed this UTF-16 --> UTF-8 conversion by using Ingres to copy the nvarchar into a file and running 'od -ax' on that file.

If I decode the 's6\' string, it means that my first UTF-8 character must be B3 AF D1. But that's not well-formed UTF-8!

Does anyone have any idea what I'm doing wrong?

Martin Bowes

--
Random Duckman Quote #114:
King Chicken: How dare you insult me in front of my wife, who's still dangerously coherent.
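For anyone wanting to reproduce the arithmetic in the post, here is a minimal Python sketch. It is not the OME function itself: `b64_manual` and `ALPHABET` are hypothetical names introduced only for illustration, and Python's standard `base64` module is used as an independent cross-check. It encodes U+671D as UTF-16 big-endian, UTF-16 little-endian, and UTF-8, and tests whether the bytes B3 AF D1 decode as UTF-8.

```python
import base64
import string

# Standard base64 alphabet: A-Z, a-z, 0-9, +, /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_manual(data: bytes) -> str:
    """Hand-rolled base64 following the steps in the post (illustrative only)."""
    pad = (-len(data)) % 3                 # zero bytes needed to reach a multiple of 3
    padded = data + b"\x00" * pad
    out = []
    for i in range(0, len(padded), 3):
        n = int.from_bytes(padded[i:i + 3], "big")
        for shift in (18, 12, 6, 0):       # four 6-bit chunks per 3-byte group
            out.append(ALPHABET[(n >> shift) & 0x3F])
    if pad:
        out[-pad:] = "=" * pad             # overwrite the padding positions with '='
    return "".join(out)

ch = "\u671d"  # U+671D

utf16_be = ch.encode("utf-16-be")  # bytes 67 1D
utf16_le = ch.encode("utf-16-le")  # bytes 1D 67 (byte order as stored, per the post)
utf8 = ch.encode("utf-8")          # bytes E6 9C 9D

b64_be = b64_manual(utf16_be)      # 'Zx0=' as worked out by hand in the post
b64_le = b64_manual(utf16_le)      # 'HWc=' if the stored byte order is not swapped
b64_utf8 = b64_manual(utf8)        # '5pyd'

# Cross-check the hand-rolled encoder against the standard library.
for raw in (utf16_be, utf16_le, utf8):
    assert b64_manual(raw) == base64.b64encode(raw).decode("ascii")

# The bytes B3 AF D1 from the post: 0xB3 is a continuation byte, so it
# cannot start a UTF-8 sequence and decoding fails.
suspect = bytes([0xB3, 0xAF, 0xD1])
try:
    suspect.decode("utf-8")
    suspect_is_utf8 = True
except UnicodeDecodeError:
    suspect_is_utf8 = False

# Re-encoding those bytes gives 's6/R' (forward slash). A backslash is not
# part of the standard base64 alphabet.
suspect_b64 = base64.b64encode(suspect).decode("ascii")

print(b64_be, b64_le, b64_utf8, suspect_is_utf8, suspect_b64)
```

None of the three encodings of U+671D begins with 's6', so the sketch reproduces the mismatch described above rather than explaining it; comparing the raw bytes MySQL holds for that column (for example with its `HEX()` function) against E6 9C 9D would be the next thing to check.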