<!--- Comparison tab content --->
<div id="comparison" class="tab-content">
<p>
This section sets out a comparison between the systems in Sheikh Yahya's manuscripts, the Omar/Frankl paper, and <strong>Andika!</strong>. The comparison is summarised in the table below.
</p>
<table class="compare">
<tr>
<th class="leftcompare">Feature</th>
<th>Manuscripts</th>
<th>Paper</th>
<th>Andika!</th>
</tr>
<tr>
<td class="leftcompare"><em>Sakani</em> is marked on long vowels</td>
<td><span class="icon small blue" data-icon="c"></span></td>
<td><span class="icon small blue" data-icon="x"></span></td>
<td><span class="icon small blue" data-icon="x"></span></td>
</tr>
<tr>
<td class="leftcompare">All short vowels are marked</td>
<td><span class="icon small blue" data-icon="c"></span></td>
<td><span class="icon small blue" data-icon="x"></span></td>
<td><span class="icon small blue" data-icon="c"></span></td>
</tr>
<tr>
<td class="leftcompare"><em>Sakani</em> on consonants denotes syllabicity only</td>
<td><span class="icon small blue" data-icon="x"></span></td>
<td><span class="icon small blue" data-icon="c"></span></td>
<td><span class="icon small blue" data-icon="x"></span></td>
</tr>
<tr>
<td class="leftcompare">Distinction between syllabicity and prenasalisation</td>
<td><span class="icon small blue" data-icon="c"></span></td>
<td><span class="icon small blue" data-icon="c"></span></td>
<td><span class="icon small blue" data-icon="x"></span></td>
</tr>
</table>
<h6><em>Sakani</em> on long vowels</h6>
<p>
In Sheikh Yahya's manuscripts, <span class="sm_swahili">ي و</span> carry a <em>sakani</em> when used to mark length/stress in the penultimate syllable, eg <span class="sm_swahili">مَزِيْوَ</span> (<strong>maziwa</strong>, <em>milk</em>). However, in the Omar/Frankl paper, <em>sakani</em> is not used here (eg <span class="sm_swahili">مَزيوَ</span>). The suggested spelling in <strong>Andika!</strong> follows the latter practice (though there is nothing to stop users marking <em>sakani</em> if they wish, and it is possible to choose this as an option in the Roman to Arabic converter).
</p>
<h6>Marking short vowels</h6>
<p>
In Sheikh Yahya's manuscripts, all short vowels are marked, but the Omar/Frankl paper proposed that marking these is unnecessary in certain situations:
<ul>
<li>
If the short (unstressed, non-penultimate) vowel they represent is identical to a preceding short vowel. For example, in <span class="sm_swahili">ثَمنين</span> (<strong>thamanini</strong>, <em>eighty</em>) the second <strong>a</strong> is omitted because it is preceded by an <strong>a</strong> (<em>fataha</em>).
</li>
<li>
If the short vowel they represent is identical to a preceding or following long/stressed (penultimate) vowel represented by <span class="sm_swahili">ي و ا</span>. For example, in <span class="sm_swahili">ثَمنين</span> (<strong>thamanini</strong>, <em>eighty</em>) the last <strong>i</strong> is omitted because it is preceded by <span class="sm_swahili">ي</span>, and in <span class="sm_swahili">ذهابُ</span> (<strong>dhahabu</strong>, <em>gold</em>) the first <strong>a</strong> is omitted because it is followed by <span class="sm_swahili">ا</span>.
</li>
<li>
Where all the vowels in a word are identical, except for length/stress. For example: <span class="sm_swahili">تپكاز</span> (<strong>tapakaza</strong>, <em>scatter</em>), <span class="sm_swahili">فكير</span> (<strong>fikiri</strong>, <em>think</em>), <span class="sm_swahili">شكور</span> (<strong>shukuru</strong>, <em>give thanks</em>).
</li>
</ul>
</p>
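<p>
The three omission rules above can be sketched in code. The following is only an illustrative reading of them, not the Omar/Frankl procedure itself: the function name <strong>unmarked_short_vowels</strong> is hypothetical, the input is assumed to be the vowels of a Romanised word (one per syllable, with the penultimate one treated as long/stressed), "preceding" and "following" are assumed to mean the immediately adjacent syllable, and the special handling of <strong>e, o</strong> is not modelled.
</p>

```python
def unmarked_short_vowels(vowels):
    """Return the indexes of short vowels whose marks could be omitted.

    `vowels` holds one vowel per syllable of a Romanised word, in order.
    Assumption: the penultimate vowel is the long/stressed one, and the
    rules only look at the immediately adjacent syllable.
    """
    if len(vowels) == 1:
        return set()  # a monosyllable has no penultimate vowel to compare with

    long_index = len(vowels) - 2
    short = [i for i in range(len(vowels)) if i != long_index]

    # Rule 3: every vowel in the word is identical except for length/stress.
    if len(set(vowels)) == 1:
        return set(short)

    omitted = set()
    for i in short:
        # Rule 1: identical to the immediately preceding short vowel.
        if (i - 1) in short and vowels[i - 1] == vowels[i]:
            omitted.add(i)
        # Rule 2: identical to an adjacent long/stressed vowel.
        if abs(i - long_index) == 1 and vowels[i] == vowels[long_index]:
            omitted.add(i)
    return omitted


# thamanini: the second a (index 1) and the final i (index 3) go unmarked.
print(sorted(unmarked_short_vowels(["a", "a", "i", "i"])))
# dhahabu: only the first a (index 0) goes unmarked.
print(sorted(unmarked_short_vowels(["a", "a", "u"])))
```

<p>
Under these assumptions the sketch reproduces the examples given in the list: only the first <strong>a</strong> of <strong>thamanini</strong> and the final <strong>u</strong> of <strong>dhahabu</strong> keep their marks, and words such as <strong>fikiri</strong> lose all short vowel marks.
</p>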
<p>
However, the suggested spelling convention in <strong>Andika!</strong>, as in Sheikh Yahya's own manuscripts, is that all short vowels are marked, thus: <span class="sm_swahili">شُكُورُ</span> - <span class="sm_swahili">فِكِيرِ</span> - <span class="sm_swahili">تَپَكَازَ</span> - <span class="sm_swahili">ذَهَابُ</span> - <span class="sm_swahili">ثَمَنِينِ</span>. There are a few practical reasons for this:
<ul>
<li>
Short <strong>e, o</strong> need to be marked in nearly all cases anyway, since the Arabic script has no way otherwise of distinguishing <span class="sm_swahili">ي</span> meaning <strong>i</strong> from <span class="sm_swahili">ي</span> meaning <strong>e</strong>, or <span class="sm_swahili">و</span> meaning <strong>o</strong> from <span class="sm_swahili">و</span> meaning <strong>u</strong>.
</li>
<li>
Omitting short vowel marks may conceivably save time when writing, once the rules above are mastered, but this is unlikely to apply when typing - it is probably faster simply to type more or less what would be typed when using the Roman script, including short vowels.
</li>
<li>
The omission of short vowels means that transliteration into Roman script would require post-editing to add vowels. Even if automating the application of the above rules to avoid this were possible, it is likely that the resulting system would be cumbersome.
</li>
</ul>
</p>
<h6><em>Sakani</em> on consonants</h6>
<p>
Arabic <em>sukun</em> marks the absence of a vowel after a consonant. In Sheikh Yahya's manuscripts, <em>sakani</em> is used consistently for this purpose (alongside its use on long vowels). Thus: <span class="sm_swahili">أُنَڤْيٗوٖيزَ</span> (<strong>unavyoweza</strong>, <em>how you can</em>), <span class="sm_swahili">كْوَ</span> (<strong>kwa</strong>, <em>to, by, for</em>). Its most common occurrence is on a nasal before another consonant: <span class="sm_swahili">أِنْڠَوَ</span> (<strong>ingawa</strong>, <em>although</em>), <span class="sm_swahili">نْجٖيمَ</span> (<strong>njema</strong>, <em>good</em>).
</p>
<p>
Its use on nasals means that <em>sakani</em> can also denote syllabicity, and in the Omar/Frankl paper, its function appears to be limited solely to that. The aim again was most likely to limit the number of diacritics in the text. The suggested convention in <strong>Andika!</strong>, however, is to follow the manuscript practice, and use <em>sakani</em> on the first consonant of multi-consonant clusters. In case some users feel that this leads to clutter, an option is added in the Roman to Arabic converter to turn it off.
</p>
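<p>
As a rough illustration of the convention just described, the sketch below finds the consonants that would carry <em>sakani</em>. It is not the converter's actual code: the function name <strong>sakani_positions</strong> and the <strong>mark_clusters</strong> switch are hypothetical, and the input is assumed to be a Romanised word already split into sound units (so that digraphs such as <strong>sh</strong> or <strong>th</strong> count as one unit).
</p>

```python
VOWELS = set("aeiou")


def sakani_positions(units, mark_clusters=True):
    """Return indexes of the units that would carry sakani.

    Only the first consonant of each multi-consonant cluster is marked,
    following the convention described in the text. `mark_clusters`
    stands in for the converter option that turns the marking off.
    """
    if not mark_clusters:
        return []

    positions = []
    for i in range(len(units) - 1):
        is_consonant = units[i] not in VOWELS
        starts_cluster = i == 0 or units[i - 1] in VOWELS
        next_is_consonant = units[i + 1] not in VOWELS
        if is_consonant and starts_cluster and next_is_consonant:
            positions.append(i)
    return positions


# kwa: sakani on k; ingawa: sakani on n; njema: sakani on n.
print(sakani_positions(["k", "w", "a"]))
print(sakani_positions(["i", "n", "g", "a", "w", "a"]))
print(sakani_positions(["n", "j", "e", "m", "a"]))
```

<p>
Passing <strong>mark_clusters=False</strong> returns no positions at all, which corresponds to switching the marking off for users who find it cluttered.
</p>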
<h6>Distinction between syllabicity and prenasalisation</h6>
<p>
Although the Roman orthography does not distinguish them, both Sheikh Yahya's manuscripts and the Omar/Frankl paper make a distinction between a syllabic nasal followed by a voiced plosive (eg <strong>m̩b</strong>) and a prenasalised voiced plosive (eg <strong><sup>n</sup>ɓ</strong>). The former is written with a preceding <span class="sm_swahili">مْ</span>, and the latter with a preceding <span class="sm_swahili">ن</span>, as in <span class="sm_swahili">مْبَيَ</span> (<strong>mbaya</strong>, <em>bad [Class 1]</em>) compared to <span class="sm_swahili">نبَايَ</span> (<strong>mbaya</strong>, <em>bad [Class 9]</em>).
</p>
<p>
<strong>Andika!</strong> will of course allow this distinction to be made in the Arabic script should a writer wish to do so. However, the Roman to Arabic converter cannot make it (since the distinction is not reflected in the standard Roman orthography), and will always convert <strong>mb</strong> to <span class="sm_swahili">مْب</span>, so automatically converted text will need post-editing wherever the prenasalised form is intended.
</p>
</div><!--- End Comparison tab content --->