convert one UTF-16 encoded character to UTF-8
#include <uchar.h>
size_t c16rtomb(char * restrict s,
char16_t c16, mbstate_t * restrict mbs);
This function converts one UTF-16 encoded character to UTF-8. In some cases, it is necessary to call the function twice to convert a single character.
First, call c16rtomb(), passing the first 16-bit code unit of the UTF-16 encoded character in
c16. If the return value is greater than 0, the
character is part of the UCS-2 range, the complete UTF-8 encoding consisting
of at most
MB_CUR_MAX bytes has been written to the
storage starting at s, and the function does not need
to be called again.
If the return value is 0, the first 16-bit code unit is a UTF-16
high surrogate and the function needs to be called a second time, this time
passing the second 16-bit code unit of the UTF-16 encoded character in
c16 and passing the same mbs
again that was also passed to the first call. If the second 16-bit code unit
is a UTF-16 low surrogate, the second call returns a value greater than 0,
the surrogate pair represents a Unicode code point beyond the basic
multilingual plane, and the complete UTF-8 encoding consisting of at most
MB_CUR_MAX bytes is written to the storage starting
at s.
The output encoding that c16rtomb()
uses in s is determined by the
LC_CTYPE category of the current locale.
OpenBSD only supports UTF-8 and ASCII output, and
this function is only useful for UTF-8.
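The two-call protocol described above can be sketched as follows. This is a minimal illustration, not part of the library: encode_utf16_char() and use_utf8_locale() are hypothetical helper names, and the locale names tried are an assumption about what the system provides.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Assumption: one of these locale names selects UTF-8 output. */
int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "C.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}

/*
 * Hypothetical helper: convert one character given as one or two
 * UTF-16 code units.  "second" is only used if the first call
 * returns 0.  "out" must have room for at least MB_CUR_MAX bytes.
 * Returns the number of bytes written, or (size_t)-1 on failure.
 */
size_t
encode_utf16_char(char *out, char16_t first, char16_t second)
{
	mbstate_t mbs;
	size_t n;

	assert(out != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* initial conversion state */

	n = c16rtomb(out, first, &mbs);
	if (n != 0)
		return n;	/* complete character, or (size_t)-1 */

	/* A high surrogate is pending in mbs; pass the second unit. */
	return c16rtomb(out, second, &mbs);
}
```

A caller would first select a UTF-8 locale, then pass for example 0xD83D and 0xDE00 (the surrogate pair for U+1F600) and expect four bytes in the output buffer.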
The following arguments cause special processing:
- c16 == 0
- A NUL byte is stored to *s and the state object pointed to by mbs is reset to the initial state. On operating systems other than OpenBSD that support state-dependent multibyte encodings, a special byte sequence (“shift sequence”) is written before the NUL byte to return to the initial state if that is required by the output encoding and by the current output encoding state.
- mbs == NULL
- An internal mbstate_t object specific to the c16rtomb() function is used instead of the mbs argument. This internal object is automatically initialized at program startup and never changed by any libc function except c16rtomb().
- s == NULL
- The object pointed to by mbs, or the internal object if mbs is a NULL pointer, is reset to its initial state, c16 is ignored, and 1 is returned.
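The special cases listed above might be used as in the following sketch. The names terminate_output() and discard_pending() are hypothetical wrappers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <uchar.h>
#include <wchar.h>

/*
 * Hypothetical wrapper for the c16 == 0 case: store a NUL byte to
 * *out and return mbs to the initial state.  Returns the number of
 * bytes written, or (size_t)-1 on failure.
 */
size_t
terminate_output(char *out, mbstate_t *mbs)
{
	assert(out != NULL && mbs != NULL);
	return c16rtomb(out, 0, mbs);
}

/*
 * Hypothetical wrapper for the s == NULL case: drop a pending high
 * surrogate without writing anything.  The second argument is
 * ignored by c16rtomb() in this case.
 */
size_t
discard_pending(mbstate_t *mbs)
{
	assert(mbs != NULL);
	return c16rtomb(NULL, 0, mbs);
}
```

After discard_pending() returns, mbsinit(3) should report that the state object is in its initial state again.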
c16rtomb() returns the number of bytes
written to s on success or
(size_t)-1 on failure, specifically:
- 0
- The first 16-bit code unit was successfully decoded as a UTF-16 high surrogate. Nothing was written to s yet.
- 1
- The first 16-bit code unit was successfully decoded as a character in the range U+0000 to U+007F, or s is NULL.
- 2
- The first 16-bit code unit was successfully decoded as a character in the range U+0080 to U+07FF.
- 3
- The first 16-bit code unit was successfully decoded as a character in the range U+0800 to U+D7FF or U+E000 to U+FFFF.
- 4
- The second 16-bit code unit was successfully decoded as a UTF-16 low surrogate, resulting in a character in the range U+10000 to U+10FFFF.
- greater
- Return values greater than 4 may occur on operating systems other than OpenBSD for output encodings other than UTF-8, in particular when a shift sequence was written.
- (size_t)-1
- UTF-16 input decoding or LC_CTYPE output encoding failed, or mbs is invalid. Nothing was written to s, and errno has been set.
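As an illustration of how these return values combine when converting a whole string, consider the following sketch. The function utf16_to_mbs(), the helper use_utf8_locale(), the locale names, and the (size_t)-2 "buffer too small" code are all hypothetical and not part of c16rtomb().

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Assumption: one of these locale names selects UTF-8 output. */
int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "C.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}

/*
 * Hypothetical helper: convert the NUL-terminated UTF-16 string src
 * into dst, including the terminating NUL byte.  Returns the number
 * of bytes written excluding the NUL, (size_t)-1 if c16rtomb()
 * fails, or (size_t)-2 if dst is too small.
 */
size_t
utf16_to_mbs(char *dst, size_t dstsize, const char16_t *src)
{
	char buf[MB_LEN_MAX];
	mbstate_t mbs;
	size_t n, total = 0;

	assert(dst != NULL && src != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* initial conversion state */

	for (;;) {
		n = c16rtomb(buf, *src, &mbs);
		if (n == (size_t)-1)
			return (size_t)-1;	/* errno has been set */
		if (n > dstsize - total)
			return (size_t)-2;	/* no room left in dst */
		/* n is 0 after a high surrogate: nothing to copy yet. */
		memcpy(dst + total, buf, n);
		total += n;
		if (*src == 0)
			return total - 1;	/* do not count the NUL */
		src++;
	}
}
```

Each character is staged in a small buffer first so that a failed or oversized conversion never leaves a partial character in dst.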
c16rtomb() causes an error in the following cases:
- EILSEQ
- UTF-16 input decoding failed because the first 16-bit code unit is neither a UCS-2 character nor a UTF-16 high surrogate, or because the second 16-bit code unit is not a UTF-16 low surrogate; or output encoding failed because the resulting character cannot be represented in the output encoding selected with LC_CTYPE.
- EINVAL
- mbs points to an invalid or uninitialized mbstate_t object.
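A caller might handle these failures along the following lines. This is only a sketch: try_convert() and use_utf8_locale() are hypothetical names, the locale names are an assumption, and resetting the state object after a failure is a design choice of this example rather than something c16rtomb() requires.

```c
#include <assert.h>
#include <errno.h>
#include <locale.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Assumption: one of these locale names selects UTF-8 output. */
int
use_utf8_locale(void)
{
	return setlocale(LC_CTYPE, "C.UTF-8") != NULL ||
	    setlocale(LC_CTYPE, "en_US.UTF-8") != NULL;
}

/*
 * Hypothetical helper: convert one code unit.  On success, store
 * the byte count in *len and return 0.  On failure, return the
 * errno value set by c16rtomb() and put mbs back into the initial
 * state so that the caller can continue with the next character.
 */
int
try_convert(char *out, char16_t c16, mbstate_t *mbs, size_t *len)
{
	size_t n;
	int saved;

	assert(out != NULL && mbs != NULL && len != NULL);
	errno = 0;
	n = c16rtomb(out, c16, mbs);
	if (n == (size_t)-1) {
		saved = errno;
		memset(mbs, 0, sizeof(*mbs));	/* start over */
		return saved;
	}
	*len = n;
	return 0;
}
```

For example, passing a lone low surrogate such as 0xDC00 as the first code unit is expected to fail with an encoding error in a UTF-8 locale, after which an ASCII character converts normally.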
mbrtoc16(3), setlocale(3), wcrtomb(3)
c16rtomb() conforms to ISO/IEC 9899:2011 (“ISO C11”).
c16rtomb() has been available since
The C11 standard only requires the c16
argument to be interpreted according to UTF-16 if the predefined environment
macro __STDC_UTF_16__ is defined with a value of 1.
On OpenBSD, <uchar.h> provides this
definition. Other operating systems which do not define
__STDC_UTF_16__ could theoretically use a different,
implementation-defined input encoding for c16 instead
of UTF-16. Using UTF-16 becomes mandatory in C23.
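Portable code that wants to rely on UTF-16 input could test the macro as sketched below. The function names c16_is_utf16() and require_utf16() are hypothetical.

```c
#include <assert.h>
#include <uchar.h>

/*
 * Hypothetical helper: return 1 if the implementation promises
 * that char16_t values are interpreted as UTF-16, 0 otherwise.
 */
int
c16_is_utf16(void)
{
#if defined(__STDC_UTF_16__) && __STDC_UTF_16__ == 1
	return 1;
#else
	return 0;
#endif
}

/* Hypothetical startup check: abort early rather than misconvert. */
void
require_utf16(void)
{
	assert(c16_is_utf16());
}
```

Calling require_utf16() once at program start makes the assumption explicit on systems where the input encoding is implementation-defined.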