Standard C Library Functions                  uconv_u16tou32(3C)



NAME
     uconv_u16tou32,     uconv_u16tou8,      uconv_u32tou16,
     uconv_u32tou8, uconv_u8tou16, uconv_u8tou32 - Unicode
     encoding conversion functions

SYNOPSIS
     #include <sys/types.h>
     #include <sys/errno.h>
     #include <sys/u8_textprep.h>

     int uconv_u16tou32(const uint16_t *utf16str, size_t *utf16len,
          uint32_t *utf32str, size_t *utf32len, int flag);


     int uconv_u16tou8(const uint16_t *utf16str, size_t *utf16len,
          uchar_t *utf8str, size_t *utf8len, int flag);


     int uconv_u32tou16(const uint32_t *utf32str, size_t *utf32len,
          uint16_t *utf16str, size_t *utf16len, int flag);


     int uconv_u32tou8(const uint32_t *utf32str, size_t *utf32len,
          uchar_t *utf8str, size_t *utf8len, int flag);


     int uconv_u8tou16(const uchar_t *utf8str, size_t *utf8len,
          uint16_t *utf16str, size_t *utf16len, int flag);


     int uconv_u8tou32(const uchar_t *utf8str, size_t *utf8len,
          uint32_t *utf32str, size_t *utf32len, int flag);


PARAMETERS
     utf16str    A pointer to a UTF-16 character string.


     utf16len    As an input  parameter,  the  number  of  16-bit
                 unsigned  integers in utf16str as UTF-16 charac-
                 ters to be converted or saved.

                 As an output parameter,  the  number  of  16-bit
                 unsigned  integers in utf16str consumed or saved
                 during conversion.


     utf32str    A pointer to a UTF-32 character string.


     utf32len    As an input  parameter,  the  number  of  32-bit
                 unsigned  integers in utf32str as UTF-32 charac-
                 ters to be converted or saved.

                 As an output parameter,  the  number  of  32-bit
                 unsigned  integers in utf32str consumed or saved
                 during conversion.


     utf8str     A pointer to a UTF-8 character string.


     utf8len     As an input parameter, the number  of  bytes  in
                 utf8str  as  UTF-8 characters to be converted or
                 saved.

                 As an output parameter, the number of  bytes  in
                 utf8str consumed or saved during conversion.


     flag        The possible conversion options  that  are  con-
                 structed  by  a bitwise-inclusive-OR of the fol-
                 lowing values:

                 UCONV_IN_BIG_ENDIAN

                     The input parameter is in  big  endian  byte
                     ordering.


                 UCONV_OUT_BIG_ENDIAN

                     The output parameter should be in big endian
                     byte ordering.


                 UCONV_IN_SYSTEM_ENDIAN

                     The input parameter is in the  default  byte
                     ordering of the current system.


                 UCONV_OUT_SYSTEM_ENDIAN

                     The  output  parameter  should  be  in   the
                     default byte ordering of the current system.


                 UCONV_IN_LITTLE_ENDIAN

                     The input parameter is in little endian byte
                     ordering.


                 UCONV_OUT_LITTLE_ENDIAN

                     The output parameter  should  be  in  little
                     endian byte ordering.


                 UCONV_IGNORE_NULL

                     The null or U+0000 character should not stop
                     the conversion.


                 UCONV_IN_ACCEPT_BOM

                     If the Byte Order Mark (BOM, U+FEFF) charac-
                     ter  exists  as  the  first character of the
                     input parameter, interpret  it  as  the  BOM
                     character.


                 UCONV_OUT_EMIT_BOM

                     Start the output parameter with a Byte Order
                     Mark (BOM, U+FEFF) character to indicate the
                     byte ordering if the output parameter is  in
                     UTF-16 or UTF-32.



DESCRIPTION
     The uconv_u16tou32() function reads the given  utf16str  in
     UTF-16  until  U+0000 (zero) in utf16str is encountered as a
     character or until the number of  16-bit  unsigned  integers
     specified  in  utf16len  is read. The UTF-16 characters that
     are read are converted into UTF-32 and the result  is  saved
     at  utf32str. After the successful conversion, utf32len con-
     tains the  number  of  32-bit  unsigned  integers  saved  at
     utf32str as UTF-32 characters.
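
     As an illustration of the UTF-16 to UTF-32 step, the follow-
     ing sketch combines one surrogate pair into a single Unicode
     scalar value. The helper name ex_surrogates_to_scalar() is
     hypothetical and is not part of this interface; the arith-
     metic follows the Unicode definition of UTF-16 and is not
     taken from the library source.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the uconv interface: combine a
 * well-formed UTF-16 surrogate pair into one Unicode scalar value.
 * The caller must pass hi in 0xD800..0xDBFF and lo in 0xDC00..0xDFFF;
 * no validation is done here.
 */
uint32_t
ex_surrogates_to_scalar(uint16_t hi, uint16_t lo)
{
	uint32_t top = (uint32_t)hi - 0xD800u;	/* upper 10 bits */
	uint32_t bottom = (uint32_t)lo - 0xDC00u;	/* lower 10 bits */

	return (0x10000u + (top << 10) + bottom);
}
```

     Two 16-bit input units therefore yield one 32-bit output
     unit, which is why the converted utf32len can be smaller
     than the consumed utf16len.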


     The uconv_u16tou8() function reads the  given  utf16str  in
     UTF-16  until  U+0000 (zero) in utf16str is encountered as a
     character or until the number of  16-bit  unsigned  integers
     specified  in  utf16len  is read. The UTF-16 characters that
     are read are converted into UTF-8 and the result is saved at
     utf8str.  After  the successful conversion, utf8len contains
     the number of bytes saved at utf8str as UTF-8 characters.


     The uconv_u32tou16() function reads the given  utf32str  in
     UTF-32  until  U+0000 (zero) in utf32str is encountered as a
     character or until the number of  32-bit  unsigned  integers
     specified  in  utf32len  is read. The UTF-32 characters that
     are read are converted into UTF-16 and the result  is  saved
     at  utf16str. After the successful conversion, utf16len con-
     tains the  number  of  16-bit  unsigned  integers  saved  at
     utf16str as UTF-16 characters.


     The uconv_u32tou8() function reads the  given  utf32str  in
     UTF-32  until  U+0000 (zero) in utf32str is encountered as a
     character or until the number of  32-bit  unsigned  integers
     specified  in  utf32len  is read. The UTF-32 characters that
     are read are converted into UTF-8 and the result is saved at
     utf8str.  After  the successful conversion, utf8len contains
     the number of bytes saved at utf8str as UTF-8 characters.


     The uconv_u8tou16() function reads  the  given  utf8str  in
     UTF-8  until  the null ('\0') byte in utf8str is encountered
     or until the number of bytes specified in utf8len  is  read.
     The UTF-8 characters that are read are converted into UTF-16
     and the result is saved at utf16str.  After  the  successful
     conversion,  utf16len contains the number of 16-bit unsigned
     integers saved at utf16str as UTF-16 characters.


     The uconv_u8tou32() function reads  the  given  utf8str  in
     UTF-8  until  the null ('\0') byte in utf8str is encountered
     or until the number of bytes specified in utf8len  is  read.
     The UTF-8 characters that are read are converted into UTF-32
     and the result is saved at utf32str.  After  the  successful
     conversion,  utf32len contains the number of 32-bit unsigned
     integers saved at utf32str as UTF-32 characters.


     During the conversion, the input and the  output  parameters
     are treated with byte orderings specified in the flag param-
     eter. When not specified, the default byte ordering  of  the
     system  is used. The byte ordering flag value that is speci-
     fied for UTF-8 is ignored.
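
     To make the effect of the byte ordering options concrete,
     the following sketch assembles one 16-bit unit from two
     bytes in either order. The helper ex_load_u16() is hypothet-
     ical and only illustrates what "big endian" and "little
     endian" mean for a UTF-16 unit; it does not show how the
     library itself reads its input.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the uconv interface: build one
 * 16-bit unit from two consecutive bytes. With big_endian nonzero
 * the first byte is the most significant one; otherwise the first
 * byte is the least significant one.
 */
uint16_t
ex_load_u16(const unsigned char *p, int big_endian)
{
	if (big_endian)
		return ((uint16_t)(((unsigned)p[0] << 8) | p[1]));
	return ((uint16_t)(((unsigned)p[1] << 8) | p[0]));
}
```

     The same two bytes 0x00 0x41 read as U+0041 in big endian
     order and as U+4100 in little endian order, so a wrong
     ordering flag silently changes every character.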


     When UCONV_IN_ACCEPT_BOM is specified in the flag  parameter
     and  the  first character of the string pointed to by the in-
     put parameter is the BOM character, the  value  of  the  BOM
     character  dictates  the byte ordering of the subsequent char-
     acters in that string, regardless of the supplied input byte
     ordering  option flag values. If UCONV_IN_ACCEPT_BOM is not
     specified, a BOM as the first character is treated as a reg-
     ular Unicode character: the Zero Width No-Break Space
     (ZWNBSP) character.
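
     The way a BOM can dictate byte ordering may be sketched as
     follows. The names ex_order and ex_classify_bom() are hypo-
     thetical. The sketch assumes the first 16-bit unit has al-
     ready been loaded using some assumed byte order; it relies
     only on the fact that U+FEFF byte-swapped is 0xFFFE, which
     is not a valid character.

```c
#include <stdint.h>

/* Hypothetical result type, not part of the uconv interface. */
enum ex_order {
	EX_ORDER_AS_READ,	/* BOM seen; assumed byte order is right */
	EX_ORDER_SWAPPED,	/* BOM seen byte-swapped; use other order */
	EX_ORDER_NO_BOM		/* first unit is not a BOM */
};

/*
 * Hypothetical helper: classify the first 16-bit unit of a UTF-16
 * string that was loaded with an assumed byte order.
 */
enum ex_order
ex_classify_bom(uint16_t first_unit)
{
	if (first_unit == 0xFEFF)
		return (EX_ORDER_AS_READ);
	if (first_unit == 0xFFFE)
		return (EX_ORDER_SWAPPED);
	return (EX_ORDER_NO_BOM);
}
```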





     When UCONV_IGNORE_NULL is specified, regardless  of  whether
     the  input  parameter  contains  U+0000  or  null  byte, the
     conversion continues until the  specified  number  of  input
     parameter  elements  at  utf16len,  utf32len, or utf8len are
     entirely consumed during the conversion.
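
     The stop rule can be summarized with a short sketch for
     UTF-16 input. The helper ex_units_to_read() is hypothetical.
     This page does not state whether the terminating U+0000 is
     itself counted as consumed; the sketch assumes that it is
     not.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the uconv interface: return how
 * many 16-bit units of s[0..len) a conversion would look at. With
 * ignore_null nonzero the whole length is used; otherwise reading
 * stops in front of the first U+0000 (an assumption, see above).
 */
size_t
ex_units_to_read(const uint16_t *s, size_t len, int ignore_null)
{
	size_t i;

	if (ignore_null)
		return (len);
	for (i = 0; i < len; i++) {
		if (s[i] == 0)
			break;
	}
	return (i);
}
```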


     As output parameters, utf16len, utf32len,  and  utf8len  are
     not changed if conversion fails for any reason.

RETURN VALUES
     Upon successful conversion, the  functions  return  0.  Upon
     failure,  the  functions  return  one of the following errno
     values:

     EILSEQ    The conversion detected an illegal or out of bound
               character value in the input parameter.


     E2BIG     The conversion  cannot  finish  because  the  size
               specified in the output parameter is too small.


     EINVAL    The conversion stops due to an incomplete  charac-
               ter at the end of the input string.


     EBADF     Conflicting byte-ordering option flag  values  are
               detected.
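
     One way to avoid E2BIG is to size the output for the worst
     case before calling a conversion function. The helpers below
     are hypothetical and are not provided by the library. Their
     bounds follow from the Unicode encoding forms (at most four
     UTF-8 bytes or two UTF-16 units per character), not from
     this page, and they do not guard against size_t overflow.

```c
#include <stddef.h>

/*
 * Hypothetical sizing helpers, not part of the uconv interface.
 */

/*
 * UTF-16 to UTF-8: a single 16-bit unit needs at most 3 bytes, and
 * a surrogate pair (2 units) needs 4, so 3 bytes per unit suffices.
 */
size_t
ex_max_u8_from_u16(size_t nunits)
{
	return (nunits * 3);
}

/* UTF-32 to UTF-8: each character needs at most 4 bytes. */
size_t
ex_max_u8_from_u32(size_t nchars)
{
	return (nchars * 4);
}

/*
 * UTF-32 to UTF-16: each character needs at most 2 units, plus one
 * more unit when a BOM is requested at the start of the output.
 */
size_t
ex_max_u16_from_u32(size_t nchars, int emit_bom)
{
	return (nchars * 2 + (emit_bom ? 1 : 0));
}
```

     For 100 UTF-32 input characters the UTF-16 bound is 200
     units, so a buffer of that size cannot be too small for the
     characters themselves.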


EXAMPLES
     Example 1 Convert a UTF-16 string in little-endian byte ord-
     ering into a UTF-8 string.

       #include <sys/types.h>
       #include <sys/errno.h>
       #include <sys/u8_textprep.h>
       .
       .
       .
       uint16_t u16s[MAXNAMELEN + 1];
       uchar_t u8s[MAXNAMELEN + 1];
       size_t u16len, u8len;
       int ret;
       .
       .
       .
       u16len = u8len = MAXNAMELEN;
       ret = uconv_u16tou8(u16s, &u16len, u8s, &u8len,
           UCONV_IN_LITTLE_ENDIAN);
       if (ret != 0) {
            /* Conversion error occurred. */
            return (ret);
       }
       .
       .
       .


     Example 2 Convert a UTF-32 string in big endian byte  order-
     ing into little endian UTF-16.

       #include <sys/types.h>
       #include <sys/errno.h>
       #include <sys/u8_textprep.h>
       .
       .
       .
       /*
        * A UTF-32 character can be mapped to a UTF-16 character with
        * two 16-bit integer entities as a "surrogate pair."
        */
       uint32_t u32s[101];
       uint16_t u16s[101];
       int ret;
       size_t u32len, u16len;
       .
       .
       .
       u32len = u16len = 100;
       ret = uconv_u32tou16(u32s, &u32len, u16s, &u16len,
           UCONV_IN_BIG_ENDIAN | UCONV_OUT_LITTLE_ENDIAN);
       if (ret == 0) {
            return (0);
       } else if (ret == E2BIG) {
            /* Use bigger output parameter and try just one more time. */
            uint16_t u16s2[201];

            u16len = 200;
            ret = uconv_u32tou16(u32s, &u32len, u16s2, &u16len,
               UCONV_IN_BIG_ENDIAN | UCONV_OUT_LITTLE_ENDIAN);
            if (ret == 0)
                 return (0);
       }

       /* Otherwise, return -1 to indicate an error condition. */
       return (-1);


     Example 3 Convert a UTF-8  string  into  UTF-16  in  little-
     endian byte ordering.


     Convert a UTF-8 string into  UTF-16  in  little-endian  byte
     ordering  with  a  Byte  Order  Mark  (BOM) character at the
     beginning of the output parameter.


       #include <sys/types.h>
       #include <sys/errno.h>
       #include <sys/u8_textprep.h>
       .
       .
       .
       uchar_t u8s[MAXNAMELEN + 1];
       uint16_t u16s[MAXNAMELEN + 1];
       size_t u8len, u16len;
       int ret;
       .
       .
       .
       u8len = u16len = MAXNAMELEN;
       /* Byte ordering flags apply to the UTF-16 output only. */
       ret = uconv_u8tou16(u8s, &u8len, u16s, &u16len,
           UCONV_OUT_LITTLE_ENDIAN | UCONV_OUT_EMIT_BOM);
       if (ret != 0) {
            /* Conversion error occurred. */
            return (ret);
       }
       .
       .
       .


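ATTRIBUTES
     See attributes(5) for descriptions of the  following  attri-
     butes:



     ___________________________________________________________
    |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
    |_____________________________|_____________________________|
    | Interface Stability         | Committed                   |
    |_____________________________|_____________________________|
    | MT-Level                    | MT-Safe                     |
    |_____________________________|_____________________________|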


SEE ALSO
     attributes(5), uconv_u16tou32(9F)


     The Unicode Standard (http://www.unicode.org)

NOTES
     Each UTF-16 or UTF-32 character maps to  a  UTF-8  character
     that might need from one to a maximum of four bytes.
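
     The one-to-four byte range can be computed directly from the
     scalar value. The helper ex_utf8_len() below is a hypotheti-
     cal sketch based on the UTF-8 definition in the Unicode
     Standard; it is not a function of this library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the uconv interface: return the
 * number of UTF-8 bytes needed for scalar value c (1 to 4), or 0 if
 * c is a surrogate code point or lies beyond U+10FFFF.
 */
size_t
ex_utf8_len(uint32_t c)
{
	if (c < 0x80)
		return (1);
	if (c < 0x800)
		return (2);
	if (c >= 0xD800 && c <= 0xDFFF)
		return (0);	/* surrogates are not scalar values */
	if (c < 0x10000)
		return (3);
	if (c <= 0x10FFFF)
		return (4);
	return (0);
}
```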


     One UTF-32 or UTF-8 character can yield two 16-bit  unsigned
     integers as a UTF-16 character, which is a surrogate pair if
     the Unicode scalar value is bigger than U+FFFF.
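
     The split into a surrogate pair can be sketched as follows.
     The helper ex_scalar_to_utf16() is hypothetical; it follows
     the UTF-16 definition in the Unicode Standard and produces
     units as plain integers, without regard to byte ordering.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the uconv interface: encode
 * scalar value c as UTF-16 into out[]. Returns the number of 16-bit
 * units written (1 or 2), or 0 if c is not a Unicode scalar value.
 */
size_t
ex_scalar_to_utf16(uint32_t c, uint16_t out[2])
{
	if ((c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF)
		return (0);
	if (c <= 0xFFFF) {
		out[0] = (uint16_t)c;
		return (1);
	}
	c -= 0x10000;				/* 20 bits remain */
	out[0] = (uint16_t)(0xD800 + (c >> 10));	/* high surrogate */
	out[1] = (uint16_t)(0xDC00 + (c & 0x3FF));	/* low surrogate */
	return (2);
}
```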


     Ill-formed UTF-16 surrogate pairs are seen as illegal  char-
     acters during the conversion.
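
     A caller that wants to pre-check its input could use a scan
     like the one below. The helper ex_utf16_well_formed() is hy-
     pothetical. It reports any unpaired surrogate the same way,
     whereas the conversion functions may distinguish an ill-
     formed pair (EILSEQ) from an incomplete character at the end
     of the input (EINVAL).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the uconv interface: return 1 if
 * every surrogate in s[0..len) belongs to a high-then-low pair, and
 * 0 otherwise. Units are taken as already byte-order corrected.
 */
int
ex_utf16_well_formed(const uint16_t *s, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (s[i] >= 0xD800 && s[i] <= 0xDBFF) {
			/* High surrogate: a low surrogate must follow. */
			if (i + 1 >= len ||
			    s[i + 1] < 0xDC00 || s[i + 1] > 0xDFFF)
				return (0);
			i++;	/* skip the low surrogate just checked */
		} else if (s[i] >= 0xDC00 && s[i] <= 0xDFFF) {
			/* Low surrogate without a preceding high one. */
			return (0);
		}
	}
	return (1);
}
```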
SunOS 5.11          Last change: 18 Sep 2007