Open
142 commits
8581253
Slightly tested UTF-8 support for Centrallix SQL
danielrothfus Jun 10, 2011
ee43333
Fixed exp_functions.c
danielrothfus Jun 28, 2011
9750ca5
Re-fixed exp-functions.c
danielrothfus Jun 28, 2011
bd74408
Added function stSeparate to Centrallix's stParse
danielrothfus Jun 29, 2011
3847811
Loading of charsetmap.cfg files
danielrothfus Jun 29, 2011
1eba9d2
Adding charsetmap.cfg file referenced earlier
danielrothfus Jun 29, 2011
ca6a132
Added HTTP stating of content charset type.
danielrothfus Jun 29, 2011
a3d634a
Adding JavaScript support for UTF-8
Jul 5, 2011
4afdf7a
Added multiple character set support to MySQL driver
Jul 5, 2011
58af1ea
Changed the internals of JavaScript to use (decode|encode)URL
Jul 5, 2011
2ef09bd
Re-fixed QPrintf documentation
Jul 5, 2011
faf809c
Migrated all multibyte string functionality to chr module
danielrothfus Jul 7, 2011
6fd75b5
Added ASCII to charsetmap
danielrothfus Jul 7, 2011
74b09ef
Getting rid of some UTF-8 based test code
danielrothfus Jul 7, 2011
04f45d9
Updating charset map with charset names for MIME/http, sybase, iso-88…
gbeeley Oct 8, 2011
0ab6724
mbrtowc and mbstowcs return value checks (not all strictly needed; fo…
gbeeley Oct 8, 2011
5549e9f
Adding chrValid() and fixing invalid seq check in UTF-8 version of ch…
gbeeley Oct 8, 2011
6bf4e36
Adding capability for admin to set the locale in the server config.
gbeeley Oct 8, 2011
c224a0c
Fix hints default expression decoding in javascript.
gbeeley Oct 8, 2011
abcf6cc
Allow ascii() SQL function to handle a NULL parameter (singlebyte and…
gbeeley Oct 8, 2011
dcd324e
Partial UTF-8 / international charset support for report writer PS/PDF
gbeeley Oct 10, 2011
669631a
Adding interim UTF-8 support in GetCharacterMetric (font metrics work…
gbeeley Oct 10, 2011
c864c53
Bugfix for character metric in UTF8; word wrapping for UTF8.
gbeeley Oct 10, 2011
2f29f07
Merging utf-8 branch with master
ttobrien Jun 17, 2020
dc1610e
End of Sprint 1
ttobrien Jun 22, 2020
8a8c99c
Ready to begin new work
ttobrien Jun 23, 2020
e3bdb3f
Pre SQL merge
ttobrien Jun 24, 2020
cd2cbd1
Merge branch 'sql-testsuite' into utf-8
ttobrien Jun 24, 2020
64acf63
No Overlong
ttobrien Jun 26, 2020
cd3d0a2
Foreign Lang Tests Started
ttobrien Jul 1, 2020
01b0e32
Reverse works
ttobrien Jul 1, 2020
73aa991
Overlong Testing
ttobrien Jul 2, 2020
080d1ab
End of sprint
ttobrien Jul 2, 2020
ce5195c
EXP FUNCTIONS
ttobrien Jul 7, 2020
456e89f
Beginning modifications to database connection
ttobrien Jul 8, 2020
15b5cb9
UTF-8 upper and lower
ttobrien Jul 8, 2020
552d250
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 8, 2020
2e81f86
Progress on loading Kardia; done expfn_utf8substring
ttobrien Jul 8, 2020
a37f7ea
Kardia works normally
ttobrien Jul 8, 2020
c4b7813
All 12 expfn utf8 fcns pass
ttobrien Jul 10, 2020
402e8ec
Minor Fixes
ttobrien Jul 10, 2020
1ff8024
Move charsets.c to centrallix-lib
ttobrien Jul 10, 2020
3b30f3a
Move charsets.h to centrallix-lib
ttobrien Jul 10, 2020
c6fb280
Undidrefactoring and added chrCharLength to xstring module
ttobrien Jul 10, 2020
ec09943
switched howto.htm to UTF-8
ttobrien Jul 11, 2020
34fe00d
database charset conversion
ttobrien Jul 13, 2020
b7cad9c
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 13, 2020
0b5fa7e
Bug fix
ttobrien Jul 13, 2020
713fa15
Data wont save
ttobrien Jul 13, 2020
5d95c6e
Merged objdrv_mysql.c
ttobrien Jul 13, 2020
c66a972
UTF-8 work with search and adding partnergit add --all
ttobrien Jul 13, 2020
b912ded
minor edits
ttobrien Jul 14, 2020
848e62d
xsFind(WithCharOffest)
ttobrien Jul 16, 2020
47e6ab4
xsFindRev(WIthCharOffset)
ttobrien Jul 16, 2020
0cfa311
Wrote 3 untested xs*WithCharOffset functions
ttobrien Jul 17, 2020
3c93fc7
Charset declared and xstring progress
ttobrien Jul 20, 2020
a2a491e
End of Day
ttobrien Jul 20, 2020
c24c03d
NoOverlong() Testing in Centrallix-lib
ttobrien Jul 22, 2020
fb1f287
xsSubst works with UTF-8
ttobrien Jul 28, 2020
6731e92
xstring tests for insert after char offset & replace after char offset
ttobrien Jul 28, 2020
e480098
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 28, 2020
d0af389
xsSubstWithCharOffset
ttobrien Jul 28, 2020
519cfc7
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 28, 2020
19c323b
xstring tests for xsreplace & xs insertafter
ttobrien Jul 28, 2020
600c8ec
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 28, 2020
fb0735c
xsFind hard tests
ttobrien Jul 29, 2020
6a03c1c
Null tests on Subst, Find, FindRev
ttobrien Jul 29, 2020
5033d6a
fixed hardcoded file path
ttobrien Jul 29, 2020
f37b020
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 29, 2020
803a46a
xstring replace testing
ttobrien Jul 29, 2020
331c999
Additional tests for xsInsertAfter & xsInsertAfterWithCharOffset
ttobrien Jul 30, 2020
e497118
Updated xsInsertAfter* tests
ttobrien Jul 31, 2020
457fc5f
Removed unnecessary xsinsertafter tests
ttobrien Jul 31, 2020
d8b544c
Merge branch 'master' into utf-8
gbeeley Apr 18, 2022
7fd5a30
Merge branch 'master' into utf-8
gbeeley Jun 10, 2022
59dbc32
Added utf8 version of mixed to charsets.c. Note the disabling of a ca…
Jun 29, 2022
e018259
Added utf8 version of mixed to expression. WARNING: it is still untested
Jun 29, 2022
9d240ac
Adding recnet changes to enable working from a fresh set of files. Th…
ttobrien Jul 14, 2022
ab8c3e8
Adding recnet changes to enable working from a fresh set of files. Th…
nboard Jul 14, 2022
88bd602
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
ttobrien Jul 14, 2022
1745ef3
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
nboard Jul 14, 2022
87254a8
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
nboard Jul 14, 2022
ce3fc43
Added tests for chrNoOverlong. Edited chrNoOverlong to catch otherwis…
nboard Jul 15, 2022
2b73dd4
added some utf8 versions of tests for qprintf
nboard Jul 20, 2022
de92832
utf8 friendly cosine compare
captain-nemo-10994 Jul 20, 2022
54d7ab2
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
captain-nemo-10994 Jul 20, 2022
7fc45d5
Changed chrNoOverlong to return a sucess/error code rather than a poi…
nboard Jul 21, 2022
8b7b6e5
updated misc. documents to account for changes to chrNoOverlong
nboard Jul 21, 2022
e173f48
-Added utf-8 validate and character split check to LxSession class. \…
nboard Aug 1, 2022
bf76540
Added C locale tests to older tests (previously were all changed to u…
nboard Aug 1, 2022
b4755dc
Added several tests for utf-8 compatability for qprintf. Started edit…
nboard Aug 8, 2022
edd3b23
added global struct to track qprintf print mode, as well as flags to …
nboard Aug 18, 2022
d1cb099
added and updated qprintf tests
nboard Aug 18, 2022
100f981
updated libmime_DecodeBase64 to correctly handle invalid null bytes
nboard Aug 19, 2022
7d954b4
made sure HEX encoding in qprintf cannot split utf-8 characters. Upda…
nboard Aug 19, 2022
1e607e8
updated FILE and PATH to validate UTF-8 strings. Added UTF-8 cases to…
nboard Aug 19, 2022
182276f
Finished updating tests to check for UTF-8 character splits. Made bas…
nboard Aug 22, 2022
c9e390b
Added checks to B64 and hex decode to enforce UTF-8 when limiting the…
nboard Aug 23, 2022
f41bd33
Prevent net_cgi.c's B64 convert considering NULL a valid char
nboard Aug 24, 2022
69d3e45
Added filename validation to mtlexer. Also added checks to prevent NU…
nboard Aug 24, 2022
28cfc57
Updated some incorrect line spacing. Removed some redundant NULL chec…
nboard Aug 25, 2022
c1f6e3b
fixed a bug in stparse where {{ and }} were not handled properly
nboard Aug 25, 2022
e928c7b
Added tests for stparse to verify that UTF-8 verification from mtlexe…
nboard Aug 25, 2022
4dea720
Added tests to verify the struct driver (objdrv_struct) does not allo…
nboard Aug 31, 2022
a222ee9
added tests to confirm UTF-8 functionality in the report_v3 driver. N…
nboard Sep 1, 2022
bc9e99d
missed some file with previous commit
nboard Sep 1, 2022
dcc8863
Updated various tests. Added back the call to mssLog that previously …
nboard Sep 8, 2022
b58e987
changed chrNoOverlong to verifyUTF8, and moved from centrallix/util/c…
nboard Sep 20, 2022
0ea2574
Updated verifyUTF8 (formerly chrNoOverlong) to return the index of th…
nboard Oct 5, 2022
83f6b98
NOTE: the ~/centrallix folder is UNTESTED. Changed mtlexer to use fla…
nboard Oct 6, 2022
27526e5
Updated use of mtlexer to reflect new flags for enforcing UTF-8 or AS…
nboard Oct 10, 2022
4e9e61d
finished off mtlexer tests, and fixed a minor bug in stparse
nboard Oct 11, 2022
3155d00
added UTF-8 check to JSON. Needs simplified
nboard Oct 12, 2022
37875e1
improved and bug fixed json parser's UTF-8 verification
nboard Oct 13, 2022
6cf3eda
Moved function for number of bytes in a character from qprintf to uti…
nboard Oct 19, 2022
e02bb44
Prevent JSON read from splitting UTF-8 chars. Added/updated json test…
nboard Oct 20, 2022
e346ed7
forgot objdrv_json.c changes in last commit
nboard Oct 20, 2022
6538ccc
Updated link driver to be UTF-8 compatable
nboard Oct 24, 2022
7c9dd8b
updated datafile driver and tests to be UTF-8 aware. Also updated ove…
nboard Nov 4, 2022
9077311
Added UTF-8 and ASCII verification to the UX driver. Added tests for …
nboard Nov 7, 2022
b9f5245
Removed comment from Centrallix.c. Added check to password entries fo…
nboard Nov 8, 2022
209d3a3
Updated mysql driver to be more stable and added tests. Changed confi…
nboard Nov 18, 2022
b42fdb6
Updated gzip driver to be UTF-8 freindly. Removed check for gzip comm…
nboard Nov 23, 2022
a474438
Ensured that xml parser validates incomming UTF-8 and UTF-16. Removed…
nboard Dec 9, 2022
a983cd1
fixed incorrect database name in a test file
nboard Jan 5, 2023
dfda2dc
Updated shell driver to be UTF-8 safe. Updated tests for gzip, xml, a…
nboard Jan 18, 2023
8d4abbe
Deleted broken similarity tests
nboard Jan 31, 2023
8be2c17
Updated net_http.c to be UTF-8 friendly and fixed a double free error…
nboard Feb 21, 2023
3f6050b
Updated http driver to declare character set in post requests
nboard Feb 22, 2023
1015a8b
Updated shell tests to use a more universal path. Updated overflong f…
nboard Mar 7, 2023
87413a9
added back a file that was removed by mistake
nboard Mar 7, 2023
bc70579
fixed small mistake in ux driver test
nboard Mar 7, 2023
4bb16ec
Cleaned up various errors
nboard Mar 10, 2023
92fa3b7
Fixed typo in a comment in objdrv_datafile.c
nboard Mar 27, 2023
3e223b2
updated sybase os driver to be utf-8 compatible. Added tests for syba…
nboard Mar 27, 2023
ac0aa6d
Merge branch 'utf-8' of https://github.com/LightSys/centrallix into u…
nboard Mar 27, 2023
a32cb95
merged master into branch UTF-8
nboard Mar 27, 2023
cce393b
updated sybase readme, and removed completed fixme from sybase.c
nboard Mar 27, 2023
ab95f72
Updated qprintf to properly choose to enforce UTF-8 based on locale w…
nboard Mar 28, 2023
ad3ee20
updated documentation. Fixed bug in how stparse removed children from…
nboard Mar 30, 2023
d7e10f0
Merge branch 'master' into utf-8
nboard Mar 30, 2023
98b5a25
Added missing \ to Makefile.in
nboard Mar 30, 2023
4 changes: 4 additions & 0 deletions centrallix-lib/include/mtlexer.h
@@ -55,6 +55,8 @@ typedef struct _LX
char** ReservedWords; /* reserved words */
int LineNumber;
int (*ReadFn)();
int (*ValidateFn)(char *); /* pointer to a validate function */
int (*CharFit)(char, int, int); /* pointer to see if a multibyte char will be split */
void* ReadArg;
}
LxSession, *pLxSession;
@@ -143,6 +145,8 @@ int mlxSetOffset(pLxSession this, unsigned long new_offset);
#define MLX_F_SSTRING (1<<22) /* Differentiate "" and '' strings */
#define MLX_F_PROCLINE (1<<23) /* (int) Line has been processed. */
#define MLX_F_ALLOWNUL (1<<24) /* Allow nul (\0) bytes in input. */
#define MLX_F_ENFORCEUTF8 (1<<25) /* validate all strings as UTF-8. Incompatible with MLX_F_ENFORCEASCII */
#define MLX_F_ENFORCEASCII (1<<26) /* validate all strings as ASCII. Incompatible with MLX_F_ENFORCEUTF8 */

#define MLX_F_PUBLIC (MLX_F_ICASE | MLX_F_POUNDCOMM | MLX_F_CCOMM | MLX_F_CPPCOMM | MLX_F_SEMICOMM | MLX_F_DASHCOMM | MLX_F_EOL | MLX_F_EOF | MLX_F_IFSONLY | MLX_F_DASHKW | MLX_F_NODISCARD | MLX_F_FILENAMES | MLX_F_DBLBRACE | MLX_F_LINEONLY | MLX_F_SYMBOLMODE | MLX_F_TOKCOMM | MLX_F_NOUNESC | MLX_F_SSTRING | MLX_F_ALLOWNUL)
#define MLX_F_PRIVATE (MLX_F_INSTRING | MLX_F_NOFILE | MLX_F_FOUNDEOL | MLX_F_INCOMM | MLX_F_PROCLINE)
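The two new flags are documented as incompatible with each other, but the hunk does not show where that is enforced. A minimal sketch of such a guard, assuming only the two flag values above; `check_enforce_flags` is a hypothetical helper for illustration and is not part of mtlexer:

```c
/* Flag values copied from the mtlexer.h hunk above. */
#define MLX_F_ENFORCEUTF8   (1<<25)
#define MLX_F_ENFORCEASCII  (1<<26)

/* Hypothetical helper (not in mtlexer): returns 0 when the flag set is
 * usable, or -1 when both enforcement modes are requested at once. */
static int check_enforce_flags(int flags)
    {
    const int both = MLX_F_ENFORCEUTF8 | MLX_F_ENFORCEASCII;

    return ((flags & both) == both) ? -1 : 0;
    }
```

A caller could run this check on the flags before opening a lexer session and reject the request on -1, rather than letting one mode silently win.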
8 changes: 8 additions & 0 deletions centrallix-lib/include/qprintf.h
@@ -37,9 +37,13 @@ typedef int (*qpf_grow_fn_t)(char**, size_t*, size_t, void*, size_t);
typedef struct _QPS
{
unsigned int Errors; /* QPF_ERR_T_xxx */
unsigned int Flags;
}
QPSession, *pQPSession;


#define QPF_F_ENFORCE_UTF8 1 /* use UTF-8 validation */

#define QPF_ERR_T_NOTIMPL 1 /* unimplemented feature */
#define QPF_ERR_T_BUFOVERFLOW 2 /* dest buffer too small */
#define QPF_ERR_T_INSOVERFLOW 4 /* NLEN or *LEN restriction occurred */
@@ -54,11 +58,15 @@ typedef struct _QPS
#define QPF_ERR_T_BADFILE 2048 /* Bad filename for &FILE filter */
#define QPF_ERR_T_BADPATH 4096 /* Bad pathname for &PATH filter */
#define QPF_ERR_T_BADCHAR 8192 /* Bad character for filter (e.g. an octothorpe for &DB64) */
#define QPF_ERR_T_TRUNC 16384 /* To avoid splitting a utf-8 char, not all of the space was used */

#define QPERR(x) (s->Errors |= (x))


/*** QPrintf methods ***/
void qpfInitializeDefaultFlags(int isUTF8);
pQPSession qpfOpenSession();
pQPSession qpfOpenSessionFlags(unsigned int flags);
int qpfCloseSession(pQPSession s);
int qpfClearErrors(pQPSession s);
unsigned int qpfErrors(pQPSession s);
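The `QPF_ERR_T_TRUNC` comment describes leaving some destination space unused so that a multibyte character is not split. A minimal sketch of that idea, assuming the input is already valid UTF-8; `utf8_safe_prefix_len` is a hypothetical name and is not the qprintf implementation:

```c
#include <stddef.h>

/* Hypothetical helper: given len bytes of valid UTF-8 in s and a limit
 * of max bytes, return the longest prefix length <= max that ends on a
 * character boundary. */
static size_t utf8_safe_prefix_len(const char* s, size_t len, size_t max)
    {
    size_t n;

    if (len <= max) return len;    /* everything fits */
    n = max;

    /* If the byte at the cut point is a continuation byte (10xxxxxx),
     * the cut would split a character, so back up to its lead byte. */
    while (n > 0 && ((unsigned char)s[n] & 0xC0) == 0x80) n--;

    return n;
    }
```

When the returned length is smaller than `max`, a caller would be in the situation the new error bit describes: the output was shortened to keep the last character whole.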
39 changes: 39 additions & 0 deletions centrallix-lib/include/util.h
@@ -32,3 +32,42 @@ extern "C" {

#endif /* UTILITY_H */

#define UTIL_VALID_CHAR (size_t)-1 /** for use with verifyUTF8 **/
#define UTIL_INVALID_CHAR (size_t)-2
#define UTIL_INVALID_ARGUMENT (size_t)-3
/** states for validate **/
#define UTIL_STATE_START 0
#define UTIL_STATE_3_BYTE 1 /** 3 bytes left; was 4 total **/
#define UTIL_STATE_2_BYTE 2 /** 2 bytes left; was at least 3 total **/
#define UTIL_STATE_1_BYTE 3 /** 1 byte left; was at least 2 total **/
#define UTIL_STATE_ERROR 4
#define UTIL_STATE_E_SURROGATE 5
#define UTIL_STATE_E_OVERLONG 6 /** starts with E0, check for overlong **/
#define UTIL_STATE_F_OVERLONG 7 /** starts with F0, check for overlong **/
#define UTIL_STATE_TOO_LARGE 8 /** starts with F4, check for too large (above U+10FFFF) **/

/** \brief This function ensures that a string contains valid UTF-8.
\param str The string to verify.
\return The index of the first byte of the first invalid char, or a
code if not applicable */
int verifyUTF8(char* str);

/** \brief This function ensures that all of the bytes of a string are
valid ASCII.
\param str The string to verify.
\return The index of the first invalid byte, or a code if not
applicable */
int verifyASCII(char * str);

/** \brief This function ensures that a string contains valid UTF-8.
Does not require a NULL terminator.
\param str The string to verify.
\param len The length of the supplied string.
\return The index of the first byte of the first invalid char, or a
code if not applicable */
int nVerifyUTF8(char* str, int len);

/** \brief Computes the number of bytes in a utf-8 char based on the first byte.
* \param byte the character to be checked
* \return the number of bytes in the character, or -1 on error. */
int numBytesInChar(char byte);
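The state names above (overlong checks after E0 and F0, a surrogate check, a too-large check after F4) correspond to the second-byte restrictions of well-formed UTF-8. The sketch below applies those same restrictions with a plain loop instead of the state machine the header implies, so it is a different technique and not the code in this PR. `lead_byte_len` and `first_invalid_utf8` are hypothetical names, and the -1 "valid" return is an assumption that differs from the `UTIL_VALID_CHAR` code:

```c
#include <stddef.h>

/* Hypothetical stand-in for numBytesInChar(): sequence length implied
 * by the lead byte, or -1 if the byte cannot start a sequence. */
static int lead_byte_len(unsigned char b)
    {
    if (b < 0x80) return 1;
    if (b >= 0xC2 && b <= 0xDF) return 2;
    if (b >= 0xE0 && b <= 0xEF) return 3;
    if (b >= 0xF0 && b <= 0xF4) return 4;
    return -1;    /* continuation byte, C0/C1 (overlong), or F5..FF */
    }

/* Hypothetical validator: returns -1 if str[0..len) is valid UTF-8,
 * else the index of the first byte of the first invalid character. */
static long first_invalid_utf8(const char* str, size_t len)
    {
    const unsigned char* s = (const unsigned char*)str;
    size_t i = 0;

    while (i < len)
        {
        int n = lead_byte_len(s[i]);
        int k;
        unsigned char lo = 0x80, hi = 0xBF;

        /* bad lead byte, or sequence runs past the end of the buffer */
        if (n < 0 || (size_t)n > len - i) return (long)i;

        /* the second byte has a narrower range after some lead bytes */
        if (s[i] == 0xE0) lo = 0xA0;         /* overlong 3-byte form */
        else if (s[i] == 0xED) hi = 0x9F;    /* UTF-16 surrogates */
        else if (s[i] == 0xF0) lo = 0x90;    /* overlong 4-byte form */
        else if (s[i] == 0xF4) hi = 0x8F;    /* above U+10FFFF */

        for (k = 1; k < n; k++)
            {
            if (s[i + k] < lo || s[i + k] > hi) return (long)i;
            lo = 0x80;    /* later bytes use the full continuation range */
            hi = 0xBF;
            }
        i += (size_t)n;
        }
    return -1;
    }
```

Taking an explicit length, as `nVerifyUTF8` does, lets the same routine check buffers that are not terminated.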
10 changes: 9 additions & 1 deletion centrallix-lib/include/xstring.h
@@ -26,7 +26,7 @@
/* realloc'ing string data structure. */
/************************************************************************/


#include <stdlib.h>
#include <stdarg.h>

#define XS_BLK_SIZ 256
@@ -61,10 +61,15 @@ int xsRTrim(pXString this);
int xsLTrim(pXString this);
int xsTrim(pXString this);
int xsFind(pXString this,char* find,int findlen, int offset);
int xsFindWithCharOffset(pXString this, char* find, int findlen, int offset);
int xsFindRev(pXString this,char* find,int findlen, int offset);
int xsFindRevWithCharOffset(pXString this, char* find, int findlen, int offset);
int xsSubst(pXString this, int offset, int len, char* rep, int replen);
int xsSubstWithCharOffset(pXString this, int offset, int len, char* rep, int replen);
int xsReplace(pXString this, char* find, int findlen, int offset, char* rep, int replen);
int xsReplaceWithCharOffset(pXString this, char* find, int findlen, int offset, char* rep, int replen);
int xsInsertAfter(pXString this, char* ins, int inslen, int offset);
int xsInsertAfterWithCharOffset(pXString this, char* ins, int inslen, int offset);
int xsGenPrintf(int (*write_fn)(), void* write_arg, char** buf, int* buf_size, const char* fmt, ...);
int xsGenPrintf_va(int (*write_fn)(), void* write_arg, char** buf, int* buf_size, const char* fmt, va_list va);
int xsQPrintf(pXString this, char* fmt, ...);
@@ -73,6 +78,9 @@ int xsConcatQPrintf(pXString this, char* fmt, ...);
pXString xsNew();
void xsFree(pXString this);

/** Needed utility function **/
size_t chrCharLength(char* string);

#define XS_U_SEEK 2

#endif /* _XSTRING_H */
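The `*WithCharOffset` variants take an offset counted in characters, while the original functions count bytes. The header does not show how the conversion is done; the sketch below is one way to do it, assuming NUL-terminated, already validated UTF-8. `utf8_char_count` and `char_to_byte_offset` are hypothetical names and are not the xstring or `chrCharLength` implementations:

```c
#include <stddef.h>

/* Hypothetical helper: number of characters in a NUL-terminated, valid
 * UTF-8 string. Every byte that is not a continuation byte (10xxxxxx)
 * starts a new character. */
static size_t utf8_char_count(const char* s)
    {
    size_t n = 0;

    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80) n++;
    return n;
    }

/* Hypothetical helper: byte offset at which character number char_off
 * (0-based) starts. An offset equal to the character count maps to the
 * end of the string; anything larger yields -1. */
static long char_to_byte_offset(const char* s, size_t char_off)
    {
    size_t i = 0, chars = 0;

    while (s[i])
        {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            {
            if (chars == char_off) return (long)i;
            chars++;
            }
        i++;
        }
    return (chars == char_off) ? (long)i : -1;
    }
```

With a conversion like this, a character-offset wrapper could translate its offset and then delegate to the byte-offset function it shadows.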