NAME
mbtowc - converts a multibyte character to a wide character
SYNOPSIS
#include <stdlib.h>
int
mbtowc(wchar_t * restrict pwc, const char * restrict s, size_t n);
DESCRIPTION
The
    mbtowc()
    function converts the multibyte character pointed to by
    s to a wide character, and stores it in the wchar_t
    object pointed to by pwc. This function may inspect at
    most n bytes of the array pointed to by
    s.
Unlike mbrtowc(3), the first n bytes pointed to by s need to form an entire multibyte character. Otherwise, this function returns an error and the internal state will be undefined.
If a call to
    mbtowc()
    results in an undefined internal state, parsing of the string starting at
    s cannot continue, not even at a later byte, and
    mbtowc() must be called with s
    set to NULL to reset the internal state before it
    can safely be used again on a different string.
The behaviour of
    mbtowc()
    is affected by the LC_CTYPE category of the current
    locale. Calling any other functions in
    libc never
    changes the internal state of mbtowc(), except for
    calling setlocale(3) with the LC_CTYPE category set
    to a different locale. Such
    setlocale(3) calls cause the internal state of this function to be
    undefined.
In state-dependent encodings such as ISO/IEC
    2022-JP, s may point to a special sequence of bytes
    that changes the shift state. Because such byte sequences do not correspond to
    any individual wide character,
    mbtowc()
    treats them as if they were part of the subsequent multibyte character.
The following special cases apply to the arguments:
- s == NULL
- mbtowc() initializes its own internal state to the initial state, and determines whether the current encoding is state-dependent. mbtowc() returns 0 if the encoding is state-independent, otherwise non-zero. pwc is ignored.
- pwc == NULL
- mbtowc() behaves just as if pwc were not NULL, including modifications to internal state, except that the result of the conversion is discarded. This can be used to determine the size of the wide character representation of a multibyte string. Another use case is a check for illegal or incomplete multibyte sequences.
- n == 0
- In this case, the first n bytes of the array pointed
      to by s never form a complete character and
      mbtowc() always fails.
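The special cases above can be sketched as two small helpers. This is only an illustration: the names mb_reset_state() and mb_peek_len() are hypothetical and not part of any library.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: reset mbtowc()'s internal state to the initial
 * state. Returns 1 if the current encoding is state-dependent, else 0.
 */
int
mb_reset_state(void)
{
	/* pwc is ignored when s is NULL. */
	return mbtowc(NULL, NULL, 0) != 0;
}

/*
 * Hypothetical helper: inspect the next character in s without storing
 * the converted result (pwc == NULL). Returns whatever mbtowc() returns.
 * On failure the internal state is reset so that mbtowc() can safely be
 * used on a different string afterwards.
 */
int
mb_peek_len(const char *s, size_t n)
{
	int len;

	len = mbtowc(NULL, s, n);
	if (len == -1)
		(void)mbtowc(NULL, NULL, 0);
	return len;
}
```

With n set to 0, mb_peek_len() is expected to return -1 for any non-NULL s, matching the n == 0 case described above.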
RETURN VALUES
Normally, mbtowc() returns:
- 0
- s points to a null byte (‘\0’).
- positive
- Number of bytes for the valid multibyte character pointed to by
      s. There are no cases where the value returned is
      greater than the value of the MB_CUR_MAX macro.
- -1
- s points to an invalid or an incomplete multibyte character. errno is set to indicate the error.
When s is NULL,
    mbtowc() returns:
- 0
- The current encoding is state-independent.
- non-zero
- The current encoding is state-dependent.
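As a rough sketch of how these return values drive a parsing loop, the following helper counts the characters in a NUL-terminated multibyte string. The name mb_count_chars() is hypothetical, and the function assumes that the LC_CTYPE locale has already been set by the caller.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the multibyte characters in s.
 * Returns the number of characters, or -1 if s contains an invalid
 * or incomplete multibyte character.
 */
long
mb_count_chars(const char *s)
{
	size_t	 left = strlen(s);	/* bytes before the terminating NUL */
	long	 count = 0;
	int	 len;

	/* Start from the initial state. */
	(void)mbtowc(NULL, NULL, 0);

	while (left > 0) {
		/* pwc == NULL: validate and measure, discard the result. */
		len = mbtowc(NULL, s, left);
		if (len == -1) {
			/* Reset the now undefined state before giving up. */
			(void)mbtowc(NULL, NULL, 0);
			return -1;
		}
		if (len == 0)
			break;		/* null byte; not expected here */
		s += len;
		left -= (size_t)len;
		count++;
	}
	return count;
}
```

Because the loop passes only the bytes that remain before the terminating NUL, a character truncated at the end of the string is reported as an error rather than read past the end of the buffer.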
EXAMPLES
The following program parses a UTF-8 string and reports encoding errors:
#include <limits.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
int
main(void)
{
	char	 s[LINE_MAX];
	wchar_t	 wc;
	int	 i, len;
	setlocale(LC_CTYPE, "C.UTF-8");
	if (fgets(s, sizeof(s), stdin) == NULL)
		*s = '\0';
	for (i = 0, len = 1; len != 0; i += len) {
		switch (len = mbtowc(&wc, s + i, MB_CUR_MAX)) {
		case 0:
			printf("byte %d end of string 0x00\n", i);
			break;
		case -1:
			printf("byte %d invalid 0x%0.2hhx\n", i, s[i]);
			len = 1;
			break;
		default:
			printf("byte %d U+%0.4X %lc\n", i, wc, wc);
			break;
		}
	}
	return 0;
}
Recovering from encoding errors and continuing to parse the rest of the string as shown above is only possible for state-independent character encodings. For full generality, the error handling can be modified to reset the internal state. In that case, the rest of the string has to be skipped if the encoding is state-dependent:
		case -1:
			printf("byte %d invalid 0x%0.2hhx\n", i, s[i]);
			len = !mbtowc(NULL, NULL, MB_CUR_MAX);
			break;
ERRORS
mbtowc() will set
    errno in the following cases:
- [EILSEQ]
- s points to an invalid or incomplete multibyte character.
SEE ALSO
STANDARDS
The mbtowc() function conforms to
    ANSI X3.159-1989 (“ANSI C89”).
    The restrict qualifier was added in ISO/IEC 9899:1999
    (“ISO C99”). Setting errno
    is an IEEE Std 1003.1-2008 (“POSIX.1”)
    extension.
CAVEATS
On error, callers of mbtowc() cannot tell
    whether the multibyte character was invalid or incomplete. To treat
    incomplete data differently from invalid data, the
    mbrtowc(3) function can be used instead.
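A minimal sketch of that alternative follows. It relies on the standard mbrtowc() return values (size_t)-1 for an invalid sequence and (size_t)-2 for an incomplete one. The enum and the function name mb_classify() are hypothetical and exist only for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical result codes for this illustration. */
enum mb_class {
	MB_CLASS_CHAR,		/* a complete, valid character */
	MB_CLASS_END,		/* a null byte */
	MB_CLASS_INVALID,	/* an invalid sequence */
	MB_CLASS_INCOMPLETE	/* a valid prefix that needs more bytes */
};

/*
 * Hypothetical helper: classify the first multibyte character in the
 * n bytes starting at s, using a private conversion state.
 */
enum mb_class
mb_classify(const char *s, size_t n)
{
	mbstate_t	 st;
	wchar_t		 wc;
	size_t		 r;

	/* A zeroed mbstate_t describes the initial conversion state. */
	memset(&st, 0, sizeof(st));

	r = mbrtowc(&wc, s, n, &st);
	if (r == (size_t)-1)
		return MB_CLASS_INVALID;
	if (r == (size_t)-2)
		return MB_CLASS_INCOMPLETE;
	if (r == 0)
		return MB_CLASS_END;
	return MB_CLASS_CHAR;
}
```

A caller reading from a stream could, for example, fetch more input on MB_CLASS_INCOMPLETE and report an encoding error only on MB_CLASS_INVALID.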