mbrtowc - converts a multibyte character to a wide character (restartable)
#include <wchar.h>
size_t
mbrtowc(wchar_t * restrict wc, const char * restrict s, size_t n,
    mbstate_t * restrict mbs);
The mbrtowc() function examines at most n
bytes of the multibyte character byte
string pointed to by s, converts those bytes
to a wide character, and stores the wide character in the wchar_t object
pointed to by wc if wc is not NULL and s
points to a valid character.
Conversion happens in accordance with the conversion state described by the
mbstate_t object pointed to by mbs. The
mbstate_t object must be initialized to zero before the application's first
call to mbrtowc(). If the previous call to
mbrtowc() did not return (size_t)-1, the
mbstate_t object can safely be reused without reinitialization.
The behaviour of mbrtowc() is affected by the LC_CTYPE
category of the current locale. If
the locale is changed without reinitialization of the mbstate_t object pointed
to by mbs, the behaviour of mbrtowc() is undefined.
Unlike mbtowc(3), mbrtowc() will accept an incomplete byte
sequence pointed to by s which does not form
a complete character but is potentially part of a valid character. In this
case, mbrtowc() consumes all such bytes.
The conversion state saved in the mbstate_t object pointed to by
mbs will be used to restart the suspended
conversion during the next call to mbrtowc().
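The restart behaviour can be sketched as follows. This is only an illustration: feed_bytewise() is a hypothetical helper, not part of libc, and it assumes the caller hands over bytes of a single character one at a time, as a program reading from a slow stream might.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: pass the bytes of buf to mbrtowc() one at a time.
 * Returns the number of bytes consumed to complete one character,
 * or 0 on an encoding error or if the input ends mid-character.
 */
static size_t
feed_bytewise(wchar_t *wc, const char *buf, size_t len)
{
	mbstate_t mbs;
	size_t i, r;

	assert(buf != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* initial conversion state */

	for (i = 0; i < len; i++) {
		r = mbrtowc(wc, buf + i, 1, &mbs);
		if (r == (size_t)-1)
			return 0;	/* illegal sequence, mbs is now undefined */
		if (r == (size_t)-2)
			continue;	/* byte consumed, state saved in mbs */
		return i + 1;		/* character complete */
	}
	return 0;			/* ran out of input mid-character */
}
```

The same mbstate_t object is passed to every call, which is what lets each call pick up where the previous one stopped.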
In state-dependent encodings, s may point to a
special sequence of bytes called a “shift sequence”. Shift
sequences switch between character code sets available within an encoding
scheme. One encoding scheme using shift sequences is ISO/IEC 2022-JP, which
can switch e.g. from ASCII (which uses one byte per character) to JIS X 0208
(which uses two bytes per character). Shift sequence bytes correspond to no
individual wide character, so mbrtowc()
treats them as if they were part of the subsequent multibyte character.
Therefore they do contribute to the number of bytes in the multibyte
character.
Special cases in interpretation of arguments are as follows:
- wc == NULL
- The conversion from a multibyte character to a wide character is performed
and the conversion state may be affected, but the resulting wide character
is discarded.
This can be used to find out how many bytes are contained in the multibyte
character pointed to by s.
- s == NULL
- mbrtowc() ignores wc and n, and behaves equivalently to
      mbrtowc(NULL, "", 1, mbs);
which attempts to use the mbstate_t object pointed to by
mbs to start or continue conversion using
the empty string as input, and discards the conversion result.
If conversion succeeds, this call always returns zero. Unlike
mbtowc(3), the value
returned does not indicate whether the current encoding of the locale is
state-dependent, i.e. uses shift sequences.
- mbs == NULL
- mbrtowc() uses its own internal state
object to keep the conversion state, instead of an mbstate_t object
pointed to by mbs. This internal
conversion state is initialized once at program startup. It is not safe to
call mbrtowc() again with a NULL
mbs argument if
mbrtowc() returned (size_t)-1 because
at this point the internal conversion state is undefined.
Calling any other function in libc never
changes the internal conversion state object of mbrtowc().
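The wc == NULL case can be used to step through a string without storing any wide characters. The sketch below assumes a NUL-terminated input string; count_chars() is a hypothetical helper name introduced for illustration.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: count the multibyte characters in the
 * NUL-terminated string s, excluding the terminating NUL.
 * Returns (size_t)-1 if the string is not valid in the current locale.
 */
static size_t
count_chars(const char *s)
{
	mbstate_t mbs;
	size_t count = 0, left, r;

	assert(s != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* initial conversion state */
	left = strlen(s) + 1;		/* include the terminating NUL */

	for (;;) {
		/* wc is NULL: the wide character is discarded */
		r = mbrtowc(NULL, s, left, &mbs);
		if (r == 0)
			return count;		/* reached the NUL */
		if (r == (size_t)-1 || r == (size_t)-2)
			return (size_t)-1;	/* invalid or truncated */
		s += r;
		left -= r;
		count++;
	}
}
```

A private mbstate_t object is used rather than a NULL mbs argument, so the helper does not depend on the internal state object.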
The mbrtowc() function returns one of the following values:
- 0
- The bytes pointed to by s form a
terminating NUL character. If wc is not
NULL, a NUL wide character has been
stored in the wchar_t object pointed to by wc.
- positive
- s points to a valid character, and the
value returned is the number of bytes completing the character. If
wc is not
NULL, the corresponding wide character
has been stored in the wchar_t object pointed to by wc.
- (size_t)-1
- s points to an illegal byte sequence
which does not form a valid multibyte character in the current locale.
mbrtowc() sets errno to EILSEQ. The conversion state
object pointed to by mbs is left in an
undefined state and must be reinitialized before being used again.
Because applications using mbrtowc() are
shielded from the specifics of the multibyte character encoding scheme, it
is impossible to repair byte sequences containing encoding errors. Such
byte sequences must be treated as invalid and potentially malicious input.
Applications must stop processing the byte string pointed to by
s and either discard any wide characters
already converted, or cope with truncated input.
- (size_t)-2
- s points to an incomplete byte sequence
of length n which has been consumed and
contains part of a valid multibyte character. The character may be
completed by calling mbrtowc() again
with s pointing to one or more subsequent
bytes of the multibyte character and mbs
pointing to the conversion state object used during conversion of the
incomplete byte sequence.
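A conversion loop that distinguishes the four kinds of return value might look like the sketch below. The enum conv_status type and the convert() function are hypothetical names introduced for illustration. The policy of stopping at the first error follows the advice given for (size_t)-1 above.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical status codes for this sketch. */
enum conv_status { CONV_DONE, CONV_INVALID, CONV_INCOMPLETE, CONV_FULL };

/*
 * Hypothetical helper: convert up to srclen bytes of src into dst,
 * which has room for dstlen wide characters. *nconv receives the
 * number of wide characters stored, not counting a terminating NUL.
 */
static enum conv_status
convert(wchar_t *dst, size_t dstlen, const char *src, size_t srclen,
    size_t *nconv)
{
	mbstate_t mbs;
	size_t r;

	assert(dst != NULL && src != NULL && nconv != NULL);
	memset(&mbs, 0, sizeof(mbs));	/* initial conversion state */
	*nconv = 0;

	while (srclen > 0) {
		if (*nconv == dstlen)
			return CONV_FULL;	/* no room left in dst */
		r = mbrtowc(&dst[*nconv], src, srclen, &mbs);
		if (r == (size_t)-1)
			return CONV_INVALID;	/* errno is EILSEQ; stop */
		if (r == (size_t)-2)
			return CONV_INCOMPLETE;	/* input ended mid-character */
		if (r == 0)
			return CONV_DONE;	/* NUL stored in dst[*nconv] */
		src += r;
		srclen -= r;
		(*nconv)++;
	}
	return CONV_DONE;	/* input exhausted; dst is not NUL-terminated */
}
```

On CONV_INVALID the caller is left with the wide characters converted so far and must either discard them or cope with truncated input.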
The mbrtowc() function may cause an error in
the following cases:
- EILSEQ
- s points to an invalid multibyte
character.
- EINVAL
- mbs points to an invalid or uninitialized
mbstate_t object.
The mbrtowc() function conforms to ISO/IEC
9899/AMD1:1995 (“ISO C90, Amendment 1”). The restrict qualifier
was added in ISO/IEC 9899:1999 (“ISO C99”).
mbrtowc() is not suitable for programs that
care about internals of the character encoding scheme used by the byte string
pointed to by s.
It is possible that mbrtowc() fails because
of locale configuration errors. An “invalid” character sequence
may simply be encoded in a different encoding than that of the current locale.
The special cases for s == NULL and
mbs == NULL do not make any sense. Instead of
passing NULL for mbs, mbtowc(3)
can be used.
Earlier versions of this man page implied that calling
mbrtowc() with a NULL s
argument would always set mbs
to the initial
conversion state. But this is true only if the previous call to
mbrtowc() did not return (size_t)-1 or (size_t)-2.
It is recommended to zero the mbstate_t object instead.
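Zeroing the object can be done with memset(3), as in the minimal sketch below. reset_state() is a hypothetical helper name; mbsinit(3) can be used afterwards to confirm that the object describes the initial conversion state.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper: put *mbs back into the initial conversion
 * state, for example after mbrtowc() returned (size_t)-1.
 */
static void
reset_state(mbstate_t *mbs)
{
	assert(mbs != NULL);
	memset(mbs, 0, sizeof(*mbs));
}
```

After reset_state(&mbs), the object can be passed to mbrtowc() again as if it had never been used.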