path: root/src/utf8.c
Commit message | Author | Age | Files | Lines
* Rename utf8_ord() to utf8_decode() | Lars Henriksen | 2018-06-03 | 1 | -2/+2
    Purely for readability and in preparation for the counterpart utf8_encode().

    Signed-off-by: Lars Henriksen <LarsHenriksen@get2net.dk>
    Signed-off-by: Lukas Fleischer <lfleischer@calcurse.org>
* Update UTF-8 base code | Lars Henriksen | 2017-12-07 | 1 | -17/+8
    UTF-8 encodes characters in one to four bytes (since 2003). Because 0 is a valid code point, the decode function utf8_ord() should return -1, not 0, on error. As a consequence utf8_width() should return 0 for a continuation byte (as it did previously).

    Signed-off-by: Lukas Fleischer <lfleischer@calcurse.org>
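The error convention described in this commit can be sketched as follows. This is only an illustration, not the code from src/utf8.c: the names sketch_utf8_length() and sketch_utf8_decode() are hypothetical, and the sketch does not reject every ill-formed sequence (overlong three- and four-byte forms and surrogates are not checked).

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes implied by a lead byte, or 0 if
 * the byte is a continuation byte or cannot start a sequence at all.
 */
int sketch_utf8_length(unsigned char c)
{
	if (c < 0x80)
		return 1;
	if (c >= 0xC2 && c <= 0xDF)
		return 2;
	if (c >= 0xE0 && c <= 0xEF)
		return 3;
	if (c >= 0xF0 && c <= 0xF4)
		return 4;
	return 0;
}

/*
 * Hypothetical decoder: returns the code point at the start of s, or -1
 * on error. 0 cannot serve as the error value because U+0000 is a valid
 * code point.
 */
int sketch_utf8_decode(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;
	int len = sketch_utf8_length(p[0]);
	int cp, i;

	switch (len) {
	case 1:
		return p[0];
	case 2:
		cp = p[0] & 0x1F;
		break;
	case 3:
		cp = p[0] & 0x0F;
		break;
	case 4:
		cp = p[0] & 0x07;
		break;
	default:
		return -1;
	}

	/* Each following byte must look like 10xxxxxx. */
	for (i = 1; i < len; i++) {
		if ((p[i] & 0xC0) != 0x80)
			return -1;
		cp = (cp << 6) | (p[i] & 0x3F);
	}
	return cp;
}
```

With this convention a caller can tell the empty string (decodes to 0) apart from a stray continuation byte (decodes to -1), which a width function might then map to a width of 0.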
* Factor out UTF-8 code point decoding | Lukas Fleischer | 2017-08-30 | 1 | -16/+15
    Signed-off-by: Lukas Fleischer <lfleischer@calcurse.org>
* Update copyright ranges | Lukas Fleischer | 2017-01-12 | 1 | -1/+1
    Signed-off-by: Lukas Fleischer <lfleischer@calcurse.org>
* Refactor UTF-8 chopping | Lukas Fleischer | 2016-02-26 | 1 | -0/+23
    Add a function that makes sure a string does not exceed a given display size. If the string is too long, dots ("...") are appended.

    Signed-off-by: Lukas Fleischer <lfleischer@calcurse.org>
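A rough sketch of such a chopping function is shown below. It is not the function added by this commit: the names sketch_strwidth() and sketch_chop() are hypothetical, and the sketch assumes that every code point is exactly one column wide, which ignores double-width and combining characters. It also assumes that the three dots count towards the display size.

```c
#include <string.h>

/*
 * Hypothetical width function. Assumption: every code point occupies
 * one column, so counting non-continuation bytes is enough.
 */
int sketch_strwidth(const char *s)
{
	int w = 0;

	for (; *s; s++) {
		if (((unsigned char)*s & 0xC0) != 0x80)
			w++;
	}
	return w;
}

/*
 * Hypothetical chop: shorten s in place so that it fits into max_width
 * columns, ending in "..." if anything was cut. Strings that already
 * fit, and limits too small to hold the dots, are left untouched.
 */
void sketch_chop(char *s, int max_width)
{
	int keep, w = 0;
	char *p = s;

	if (max_width < 3 || sketch_strwidth(s) <= max_width)
		return;

	/* Stop at the lead byte of the first code point to drop. */
	keep = max_width - 3;
	while (*p) {
		if (((unsigned char)*p & 0xC0) != 0x80) {
			if (w == keep)
				break;
			w++;
		}
		p++;
	}

	/* At least four code points are dropped, so the dots fit. */
	strcpy(p, "...");
}
```

Cutting only at lead bytes keeps the result valid UTF-8, since no multi-byte sequence is split in the middle.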
* Update copyright ranges | Lukas Fleischer | 2016-01-30 | 1 | -1/+1
    Signed-off-by: Lukas Fleischer <lfleischer@calcurse.org>
* Update copyright ranges | Lukas Fleischer | 2015-02-07 | 1 | -1/+1
    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
* Use a macro to determine the size of arrays | Lukas Fleischer | 2013-05-04 | 1 | -1/+1
    Use the following macro instead of "sizeof(x) / sizeof(x[0])" everywhere:

        #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
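A short usage sketch for the macro follows. The table and the function name are hypothetical and exist only for illustration. Note that the macro gives the element count only for true arrays; applied to a pointer, it silently yields a meaningless value.

```c
#include <stddef.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical table, for illustration only. */
const int sketch_primes[] = { 2, 3, 5, 7, 11 };

/* Returns the number of elements without hard-coding the count. */
size_t sketch_primes_len(void)
{
	return ARRAY_SIZE(sketch_primes);
}
```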
* Use tabs instead of spaces for indentation | Lukas Fleischer | 2013-04-14 | 1 | -277/+279
    This completes our switch to the Linux kernel coding style. Note that we still use deeply nested constructs in some places, which need to be fixed up later.

    Converted using the `Lindent` script from the Linux kernel code base, along with some manual fixes.

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
* Fix braces in if-else statements | Lukas Fleischer | 2013-02-17 | 1 | -1/+2
    From the Linux kernel coding guidelines:

        Do not unnecessarily use braces where a single statement will do. [...] This does not apply if one branch of a conditional statement is a single statement. Use braces in both branches.

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
* Update copyright ranges | Lukas Fleischer | 2013-02-04 | 1 | -1/+1
    Add 2013 to the copyright range for all source and documentation files.

    Reported-by: Frederic Culot <frederic@culot.org>
    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
* Switch to Linux kernel coding style | Lukas Fleischer | 2012-05-21 | 1 | -277/+268
    Convert our code base to adhere to Linux kernel coding style using Lindent, with the following exceptions:

    * Use spaces, instead of tabs, for indentation.
    * Use 2-character indentations (instead of 8 characters).

    Rationale: We currently have too many levels of indentation. Using 8-character tabs would make huge code parts unreadable. These need to be cleaned up before we can switch to 8 characters.

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
* Update copyright ranges | Lukas Fleischer | 2012-03-26 | 1 | -1/+1
    Add 2012 to the copyright range for all source and documentation files.

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
* utf8_width() performance improvements | Lukas Fleischer | 2011-07-02 | 1 | -39/+50
    * Sort character width lookup table by character ranges.
    * Use binary search instead of linear search for UTF-8 character width lookups which will speed up utf8_width() (O(log n) instead of O(n)).

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>
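The lookup strategy named in this commit can be sketched as a binary search over a table of code point ranges sorted by range. The struct, the table contents, and the function name below are assumptions made for illustration and are not taken from src/utf8.c; the real table is far larger.

```c
#include <stddef.h>

/* Hypothetical table entry: an inclusive code point range and its width. */
struct sketch_range {
	int min;
	int max;
	int width;
};

/*
 * Illustrative entries only. They must be sorted by range and must not
 * overlap, otherwise the binary search below does not work.
 */
const struct sketch_range sketch_widths[] = {
	{ 0x0300, 0x036F, 0 },	/* combining diacritical marks */
	{ 0x1100, 0x115F, 2 },	/* Hangul Jamo initial consonants */
	{ 0x3041, 0x3096, 2 },	/* Hiragana letters */
	{ 0x4E00, 0x9FFF, 2 },	/* CJK unified ideographs */
};

/*
 * Hypothetical width lookup: O(log n) in the number of table entries.
 * Code points not covered by the table are assumed to be one column wide.
 */
int sketch_width(int cp)
{
	size_t lo = 0;
	size_t hi = sizeof(sketch_widths) / sizeof(sketch_widths[0]);

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (cp < sketch_widths[mid].min)
			hi = mid;
		else if (cp > sketch_widths[mid].max)
			lo = mid + 1;
		else
			return sketch_widths[mid].width;
	}
	return 1;
}
```

Sorting the table is what makes the search possible: each comparison discards half of the remaining ranges, whereas an unsorted table would have to be scanned entry by entry.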
* Add basic UTF-8 helper functions | Lukas Fleischer | 2011-06-29 | 1 | -0/+333
    Add utf8_width() and utf8_strwidth() which can be used to calculate the display width of a single character or a string, respectively. A lookup table is used to spot double width characters, as well as composing characters. There currently isn't any code to deal with ambiguous characters.

    Signed-off-by: Lukas Fleischer <calcurse@cryptocrack.de>