unistr/u8-mbtouc: Improve handling of ill-formed UTF-8 input.
author Bruno Haible <bruno@clisp.org>
Sat, 13 Nov 2010 18:43:06 +0000 (19:43 +0100)
committer Bruno Haible <bruno@clisp.org>
Sat, 13 Nov 2010 18:43:06 +0000 (19:43 +0100)
commit 8cada094a301d3f78c086ef0291e8ca88cbe7a1d
tree 5fc00956d12f0416f701173c07f8e293948bb58a
parent a14bd223642bd4673e2561c477190875764583eb

* lib/unistr/u8-mbtouc.c (u8_mbtouc): For an invalid multibyte
character, return the number of bytes that belong together, not always
1.
* lib/unistr/u8-mbtouc-unsafe.c (u8_mbtouc_unsafe): Likewise.
* lib/unistr/u8-mbtouc-aux.c (u8_mbtouc_aux): Likewise.
* lib/unistr/u8-mbtouc-unsafe-aux.c (u8_mbtouc_unsafe_aux): Likewise.
* lib/unistr/u8-mbsnlen.c (u8_mbsnlen): Use u8_mbtouc to determine the
number of bytes of an invalid character.
* tests/unistr/test-u8-mbtouc.c (test_safe_function): New function.
(main): Invoke it.
* tests/unistr/test-u8-mbtouc.h (test_function): Update two test results.
* tests/unistr/test-u8-mbsnlen.c (main): Test various kinds of
malformed byte sequences.
* modules/unistr/u8-mbtouc (configure.ac): Bump version number.
* modules/unistr/u8-mbtouc-unsafe (configure.ac): Likewise.
* modules/unistr/u8-mbsnlen (configure.ac): Likewise.
Reported by Ben Pfaff and Paolo Bonzini.
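
The behaviour described above can be illustrated with a minimal sketch. This is not the code from lib/unistr; sketch_u8_mbtouc and sketch_u8_mbsnlen are hypothetical names, and the rule used here for "bytes that belong together" (the lead byte plus the continuation bytes that actually follow it, up to the length the lead byte announces) is an assumption made for illustration. The real functions may draw the boundary differently in some cases.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder, for illustration only (not gnulib's code).
   Stores the decoded character, or U+FFFD for ill-formed input, in *puc
   and returns the number of bytes consumed.  Requires n > 0.  */
static int
sketch_u8_mbtouc (uint32_t *puc, const uint8_t *s, size_t n)
{
  uint8_t c;
  int expected, i;
  uint32_t uc;

  assert (n > 0);
  c = s[0];

  if (c < 0x80)
    {
      *puc = c;
      return 1;
    }
  if (c >= 0xc2 && c < 0xe0)
    expected = 2;
  else if (c >= 0xe0 && c < 0xf0)
    expected = 3;
  else if (c >= 0xf0 && c < 0xf8)
    expected = 4;
  else
    {
      /* Stray continuation byte or invalid lead byte.  */
      *puc = 0xfffd;
      return 1;
    }

  /* Keep the payload bits of the lead byte.  */
  uc = c & (0xff >> (expected + 1));
  for (i = 1; i < expected; i++)
    {
      if ((size_t) i >= n || (s[i] & 0xc0) != 0x80)
        {
          /* Truncated or interrupted sequence: report only the bytes
             seen so far (assumed policy), not always 1.  */
          *puc = 0xfffd;
          return i;
        }
      uc = (uc << 6) | (s[i] & 0x3f);
    }

  /* Structurally complete, but overlong, a surrogate, or out of range.  */
  if ((expected == 3 && (uc < 0x800 || (uc >= 0xd800 && uc < 0xe000)))
      || (expected == 4 && (uc < 0x10000 || uc > 0x10ffff)))
    {
      *puc = 0xfffd;
      return expected;
    }

  *puc = uc;
  return expected;
}

/* Hypothetical character counter: each ill-formed group counts as one
   character because the decoder's return value decides how far to skip.  */
static size_t
sketch_u8_mbsnlen (const uint8_t *s, size_t n)
{
  size_t count = 0;

  while (n > 0)
    {
      uint32_t uc;
      int len = sketch_u8_mbtouc (&uc, s, n);

      s += len;
      n -= (size_t) len;
      count++;
    }
  return count;
}
```

Under these assumptions, the bytes E2 82 41 decode as one replacement character covering two bytes, followed by 'A', whereas a decoder that always returned 1 for invalid input would report two separate invalid characters before the 'A'.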
12 files changed:
ChangeLog
lib/unistr/u8-mbsnlen.c
lib/unistr/u8-mbtouc-aux.c
lib/unistr/u8-mbtouc-unsafe-aux.c
lib/unistr/u8-mbtouc-unsafe.c
lib/unistr/u8-mbtouc.c
modules/unistr/u8-mbsnlen
modules/unistr/u8-mbtouc
modules/unistr/u8-mbtouc-unsafe
tests/unistr/test-u8-mbsnlen.c
tests/unistr/test-u8-mbtouc.c
tests/unistr/test-u8-mbtouc.h