In case of torn UTF-8 in the input data we might end up going
past the end of the string since we don't account for length.
While validation won't be performed on a sequence with a NULL
byte, it's better to avoid going past the end to begin with.
Fix by taking the length into consideration.
Author: Jacob Champion <[email protected]>
Reviewed-by: Daniel Gustafsson <[email protected]>
Discussion: https://postgr.es/m/CAOYmi+mTnmM172g=_+Yvc47hzzeAsYPy2C4UBY3HK9p-AXNV0g@mail.gmail.com
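
For illustration only, the length-aware loop described in this message can be
sketched as a standalone program fragment. The names sketch_utf8_mblen,
sketch_utf8_islegal and sketch_count_utf8_chars are hypothetical stand-ins and
are not part of the tree; in particular, the legality check here is
deliberately simplified (it does not reject overlong encodings or surrogates)
and is not the real pg_utf8_islegal().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for pg_utf_mblen(): sequence length implied by the
 * first byte. Bytes that cannot start a sequence report a length of 1.
 */
static int
sketch_utf8_mblen(const unsigned char *s)
{
    if ((*s & 0x80) == 0x00)
        return 1;
    if ((*s & 0xe0) == 0xc0)
        return 2;
    if ((*s & 0xf0) == 0xe0)
        return 3;
    if ((*s & 0xf8) == 0xf0)
        return 4;
    return 1;
}

/*
 * Simplified, hypothetical stand-in for pg_utf8_islegal(): checks only the
 * lead byte range and that the following bytes are continuation bytes.
 */
static int
sketch_utf8_islegal(const unsigned char *s, int len)
{
    int i;

    if (len == 1)
        return *s < 0x80;
    if (*s < 0xc2 || *s > 0xf4)
        return 0;
    for (i = 1; i < len; i++)
    {
        if ((s[i] & 0xc0) != 0x80)
            return 0;
    }
    return 1;
}

/*
 * Count characters in a NUL-terminated string, returning -1 on invalid or
 * torn input. The remaining length is compared against the claimed sequence
 * length before any byte of the sequence beyond the first is read.
 */
static int
sketch_count_utf8_chars(const char *source)
{
    const unsigned char *p;
    size_t len;
    int num_chars = 0;

    assert(source != NULL);
    p = (const unsigned char *) source;
    len = strlen(source);

    while (len > 0)
    {
        int l = sketch_utf8_mblen(p);

        /* A torn sequence claims more bytes than remain: stop here. */
        if (len < (size_t) l || !sketch_utf8_islegal(p, l))
            return -1;
        p += l;
        len -= (size_t) l;
        num_chars++;
    }
    return num_chars;
}
```

With this sketch, a string ending in a truncated three-byte sequence, such as
"a\xe2\x82", is rejected by the length comparison and the legality check is
never called on the incomplete sequence.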
const unsigned char *p = (const unsigned char *) source;
int l;
int num_chars = 0;
+ size_t len = strlen(source);
- while (*p)
+ while (len)
{
l = pg_utf_mblen(p);
- if (!pg_utf8_islegal(p, l))
+ if (len < l || !pg_utf8_islegal(p, l))
return -1;
p += l;
+ len -= l;
num_chars++;
}