Overview
Comment: | Minor tweaks to the hebrew transliteration tables. |
---|---|
SHA1: | 7b6de5c35d1c2e141b1eb666c8dd5ef6 |
User & Date: | drh 2012-05-04 13:22:42.928 |
Context
2012-05-04
13:22 | Minor tweaks to the hebrew transliteration tables. (Leaf check-in: 7b6de5c35d user: drh tags: translit-tokenizer) |
02:58 | Add an experimental tokenizer to FTS3/4: one that transliterates latin, greek, cyrillic, and hebrew characters into pure ascii. (check-in: 930115693a user: drh tags: translit-tokenizer) |
Changes
Changes to ext/fts3/fts3_tokenizer2.c.
︙ | ︙ | |||
    0, /* 891 */
    0, /* 892 */
    0, /* 893 */
    0, /* 894 */
    0, /* 895 */
    0, /* 896 */
    0, /* 897 */
    ( 1*4 + 0), /* u05D0 (א) -> */ /* 898 */
    (53*4 + 1), /* u05D1 (ב) -> b */ /* 899 */
    (32*4 + 1), /* u05D2 (ג) -> g */ /* 900 */
    (20*4 + 1), /* u05D3 (ד) -> d */ /* 901 */
    ( 3*4 + 1), /* u05D4 (ה) -> h */ /* 902 */
    ( 7*4 + 1), /* u05D5 (ו) -> v */ /* 903 */
    (21*4 + 1), /* u05D6 (ז) -> z */ /* 904 */
    ( 3*4 + 1), /* u05D7 (ח) -> h */ /* 905 */
    (13*4 + 1), /* u05D8 (ט) -> t */ /* 906 */
    ( 9*4 + 1), /* u05D9 (י) -> y */ /* 907 */
    (54*4 + 1), /* u05DA (ך) -> k */ /* 908 */
    (54*4 + 1), /* u05DB (כ) -> k */ /* 909 */
    (11*4 + 1), /* u05DC (ל) -> l */ /* 910 */
    (55*4 + 1), /* u05DD (ם) -> m */ /* 911 */
    (55*4 + 1), /* u05DE (מ) -> m */ /* 912 */
    (31*4 + 1), /* u05DF (ן) -> n */ /* 913 */
    (31*4 + 1), /* u05E0 (נ) -> n */ /* 914 */
    ( 1*4 + 1), /* u05E1 (ס) -> s */ /* 915 */
    ( 1*4 + 0), /* u05E2 (ע) -> */ /* 916 */
    ( 0*4 + 1), /* u05E3 (ף) -> p */ /* 917 */
    ( 0*4 + 1), /* u05E4 (פ) -> p */ /* 918 */
    (42*4 + 2), /* u05E5 (ץ) -> ts */ /* 919 */
    (42*4 + 2), /* u05E6 (צ) -> ts */ /* 920 */
    (56*4 + 1), /* u05E7 (ק) -> q */ /* 921 */
    (57*4 + 1), /* u05E8 (ר) -> r */ /* 922 */
    ( 2*4 + 2), /* u05E9 (ש) -> sh */ /* 923 */
︙ | ︙ |
Changes to ext/fts3/translit01.tcl.
︙ | ︙ | |||
    05BE 0000 {} {HEBREW PUNCTUATION MAQAF}
    05BF 0000 e {HEBREW POINT RAFE}
    05C0 0000 * {HEBREW PUNCTUATION PASEQ}
    05C1 0000 sh {HEBREW POINT SHIN DOT}
    05C2 0000 s {HEBREW POINT SIN DOT}
    05C3 0000 * {HEBREW PUNCTUATION SOF PASUQ}
    05C4 0000 {} {HEBREW MARK UPPER DOT}
    05D0 0000 {} {HEBREW LETTER ALEF}
    05D1 0000 b {HEBREW LETTER BET}
    05D2 0000 g {HEBREW LETTER GIMEL}
    05D3 0000 d {HEBREW LETTER DALET}
    05D4 0000 h {HEBREW LETTER HE}
    05D5 0000 v {HEBREW LETTER VAV}
    05D6 0000 z {HEBREW LETTER ZAYIN}
    05D7 0000 h {HEBREW LETTER HET}
    05D8 0000 t {HEBREW LETTER TET}
    05D9 0000 y {HEBREW LETTER YOD}
    05DA 0000 k {HEBREW LETTER FINAL KAF}
    05DB 0000 k {HEBREW LETTER KAF}
    05DC 0000 l {HEBREW LETTER LAMED}
    05DD 0000 m {HEBREW LETTER FINAL MEM}
    05DE 0000 m {HEBREW LETTER MEM}
    05DF 0000 n {HEBREW LETTER FINAL NUN}
    05E0 0000 n {HEBREW LETTER NUN}
    05E1 0000 s {HEBREW LETTER SAMEKH}
    05E2 0000 {} {HEBREW LETTER AYIN}
    05E3 0000 p {HEBREW LETTER FINAL PE}
    05E4 0000 p {HEBREW LETTER PE}
    05E5 0000 ts {HEBREW LETTER FINAL TSADI}
    05E6 0000 ts {HEBREW LETTER TSADI}
    05E7 0000 q {HEBREW LETTER QOF}
    05E8 0000 r {HEBREW LETTER RESH}
    05E9 0000 sh {HEBREW LETTER SHIN}
︙ | ︙ |
Changes to test/fts3translit01.test.
︙ | ︙ | |||
    \u0427\u0430\u0439\u043a\u043e\u0301\u0432\u0441\u043a\u0438\u0439
    chaikovskii
    \u0391\u1f30\u03c3\u03c7\u03cd\u03bb\u03bf\u03c2
    aschylos
    \u03a3\u03c9\u03ba\u03c1\u03ac\u03c4\u03b7\u03c2
    sokratis
    \u05d1\u05b5\u05bc\u05d9\u05ea\u05dc\u05b6\u05d7\u05b6\u05dd
    beaytlehem
    \u05d9\u05b0\u05e8\u05d5\u05bc\u05e9\u05b8\u05c1\u05dc\u05b7\u05d9\u05b4\u05dd
    yervashashlayim
    }

    # Create a full-text index to use for testing the stemmer.
    #
    db close
︙ | ︙ |