utf8.c: Add UTF-8 validation and utility functions
author Sean Bright <sean.bright@gmail.com>
Mon, 13 Jul 2020 20:06:14 +0000 (16:06 -0400)
committer Kevin Harwell <kharwell@digium.com>
Tue, 28 Jul 2020 14:45:29 +0000 (09:45 -0500)
commit 7d96b3e43746c2f3a16314acead2be53ee83f3d3
tree 4a733c938526c4d816e80a17f91736fc8393ecaf
parent c10ed8d4d665e4ec770db2f7c0cf695f334c0463

There are various places in Asterisk, particularly in database
integration, where UTF-8 validation would be beneficial. This patch
adds:

* Functions to validate that a given string contains only valid UTF-8
  sequences.

* A function to copy a string (similar to ast_copy_string) stopping when
  an invalid UTF-8 sequence is encountered.

* A UTF-8 validator that allows for progressive validation.
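
To illustrate how the three pieces listed above could fit together, here
is a minimal sketch. It is not the code from this patch: every name in it
(utf8_validator, utf8_validator_feed, utf8_is_valid, utf8_copy_valid) is
hypothetical, and instead of a table-driven DFA it uses plain range checks
on lead and continuation bytes, following the well-formed byte ranges in
the Unicode Standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; not the names used by the patch. */
enum utf8_result { UTF8_VALID, UTF8_INVALID, UTF8_INCOMPLETE };

/* Progressive validator state, kept between calls to feed(). */
struct utf8_validator {
    int need;          /* continuation bytes still expected */
    unsigned char lo;  /* lowest value allowed for the next continuation byte */
    unsigned char hi;  /* highest value allowed for the next continuation byte */
    int failed;        /* sticky: set once an invalid byte has been seen */
};

static void utf8_validator_init(struct utf8_validator *v)
{
    v->need = 0;
    v->lo = 0x80;
    v->hi = 0xBF;
    v->failed = 0;
}

/* Feed a chunk of bytes; may be called repeatedly as data arrives. */
static enum utf8_result utf8_validator_feed(struct utf8_validator *v,
    const char *data, size_t len)
{
    size_t i;

    for (i = 0; i < len && !v->failed; i++) {
        unsigned char c = (unsigned char) data[i];

        if (v->need > 0) {
            if (c < v->lo || c > v->hi) {
                v->failed = 1;
            } else {
                v->need--;
                v->lo = 0x80;
                v->hi = 0xBF;
            }
        } else if (c <= 0x7F) {
            /* ASCII, nothing to track */
        } else if (c >= 0xC2 && c <= 0xDF) {
            v->need = 1;
        } else if (c >= 0xE0 && c <= 0xEF) {
            v->need = 2;
            if (c == 0xE0) v->lo = 0xA0;  /* reject overlong forms */
            if (c == 0xED) v->hi = 0x9F;  /* reject UTF-16 surrogates */
        } else if (c >= 0xF0 && c <= 0xF4) {
            v->need = 3;
            if (c == 0xF0) v->lo = 0x90;  /* reject overlong forms */
            if (c == 0xF4) v->hi = 0x8F;  /* reject values above U+10FFFF */
        } else {
            v->failed = 1;                /* 0x80-0xC1 and 0xF5-0xFF */
        }
    }

    if (v->failed) {
        return UTF8_INVALID;
    }
    return v->need ? UTF8_INCOMPLETE : UTF8_VALID;
}

/* One-shot check: returns 1 if the whole string is valid UTF-8. */
static int utf8_is_valid(const char *s)
{
    struct utf8_validator v;

    utf8_validator_init(&v);
    return utf8_validator_feed(&v, s, strlen(s)) == UTF8_VALID;
}

/*
 * Copy src into dst (size bytes, always NUL-terminated), stopping at the
 * first invalid sequence and never leaving a partial sequence at the end.
 * Returns the number of bytes copied, excluding the terminator.
 */
static size_t utf8_copy_valid(char *dst, const char *src, size_t size)
{
    struct utf8_validator v;
    size_t i, good = 0;

    assert(dst != NULL && src != NULL);
    if (size == 0) {
        return 0;
    }
    utf8_validator_init(&v);
    for (i = 0; src[i] != '\0' && i < size - 1; i++) {
        enum utf8_result r = utf8_validator_feed(&v, &src[i], 1);

        if (r == UTF8_INVALID) {
            break;
        }
        dst[i] = src[i];
        if (r == UTF8_VALID) {
            good = i + 1;  /* end of the last complete sequence */
        }
    }
    dst[good] = '\0';
    return good;
}
```

In this sketch the one-shot check and the copy function are both thin
wrappers around the progressive validator, so swapping in a different
validation core would leave their signatures unchanged.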

All of this is based on the excellent UTF-8 decoder by Björn Höhrmann.
More information is available here:

    https://bjoern.hoehrmann.de/utf-8/decoder/dfa/

The API is designed so that the implementation can be replaced later
if we determine that we need something more comprehensive.

Change-Id: I3555d787a79e7c780a7800cd26e0b5056368abf9
include/asterisk/utf8.h [new file with mode: 0644]
main/asterisk.c
main/utf8.c [new file with mode: 0644]