Annotated Swadesh wordlists for the Koman group (Komuz family).

Languages included: Kwama [kom-kwm]; Begi Mao [kom-beg]; Opo [kom-opo]; Komo [kom-kmo]; Uduk [kom-udu]; Gule [kom-gul].

DATA SOURCES

I. General

Bender 1983 = Bender, Lionel M. Proto-Koman Phonology and Lexicon. In: Afrika und Übersee, 66, pp. 259-297. // A first attempt at reconstructing the phonology and select basic and cultural lexicon of Proto-Koman. The paper contains rich lexical material on most of the known Koman languages, collected by the author himself, although the quality of notation is far from ideal.

Corfield 1938 = Corfield, F. D. The Koma. In: Sudan Notes and Records, 21.1, pp. 123-165. // An ethnographic sketch of several populations of Koma people. Contains a small wordlist for two varieties of Opo (Kusgilo and Buldiit) and one variety of Komo (Madin).

Wedekind 2002 = Wedekind, Klaus; Wedekind, Charlotte. Sociolinguistic Survey Report of the Asosa-Begi-Komosha Area. Part II. SIL International. // A standard type SIL report on an area populated by Koman speakers. Contains representative wordlists for two dialects of Kwama (Northern Kwama, or Kwama proper; Begi Mao) and for one dialect of Komo.

II. Kwama.

Leyew 2005 = Leyew, Zelealem. Gwama, a little-known language of Ethiopia: a sketch of its grammar and lexicon. In: ELRC Working Papers, 1/1, pp. 1-52. // A short phonological and grammatical sketch, based on the author's own fieldwork with speakers in Addis Ababa. Contains a small vocabulary of the language.

III. Opo.

Silfhout 2013 = Silfhout, Marijke van. Opuo: Towards a phonology. Bachelor thesis, Leiden University. // A detailed description of the phonology of Opo based on the author's own fieldwork. Accompanied by a representative vocabulary of the language, currently the single largest collection of lexical data on Opo.

IV. Komo.

Krell 2011 = Krell, Amy. A Sociolinguistic Survey of the Ganza, Komo, and "Baruun be Magtole" Language Groups (Blue Nile Province, Sudan). SIL International. // Standard sociolinguistic survey of several Ethiopian languages, including a complete 200-item wordlist for each. Among the languages is a variety of Komo from the Gondolo village.

Otero 2014 = Otero, Manuel A. Notes from the Komo language "Discover your grammar" workshop. SIL International. // A very brief sketch of Komo grammar, accompanied by some textual examples.

Otero 2015 = Otero, Manuel A. Komo-Amharic-English Dictionary. Addis Ababa: SIL Ethiopia. // A representative dictionary of the Komo language (although containing some significant gaps, such as most auxiliary words, and without any prosodic markings).

V. Uduk.

Beam & Cridland 1970 = Beam, Mary S.; Cridland, A. Elizabeth. Uduk-English Dictionary. University of Khartoum. // A very detailed and informative dictionary of the Uduk language, with illustrative examples and thorough prosodic notation of each entry.

Killian 2015 = Killian, Don. Topics in Uduk Phonology and Morphosyntax. Ph.D. dissertation, University of Helsinki. // A detailed description of the phonology and grammar of Uduk, well illustrated by lexical and syntactic examples and containing several examples of glossed texts.

Thelwall 1983 = Thelwall, Robin. Twampa phonology. In: Nilo-Saharan Language Studies. Ed. by Lionel M. Bender. Michigan: East Lansing, pp. 323-335. // Brief sketch of Uduk (= Twampa) phonology, well illustrated by lexical examples from the author's own fieldwork; may be used as an additional control source.

VI. Gule.

Lejean 1865 = Lejean, Guillaume. Note sur les Fougn et leur idiome. In: Bulletin de la Société de Géographie IX, pp. 238-252. // Brief ethnographic and linguistic information (the latter in the form of short wordlists) on some idioms of the present-day Blue Nile State, including a variety of Gumuz and Gule/Anej. Transcription quality is predictably poor, but the source is important as the earliest available recording of Gule data.

Seligmann 1911 = Seligmann, Brenda Z. Note on Two Languages in the Sennar Province of Anglo-Egyptian Sudan. In: Zeitschrift für Kolonialsprachen II, pp. 297-308. // This paper provides some brief lexical and paradigmatic information on Gaahmg ("Jebel Tabi") and Gule/Anej ("Jebel Gule"); one of the few existing sources on Gule, particularly precious for bits of information on the verbal paradigm of this language.

NOTES

I. Kwama; Begi Mao.

1. General.

Kwama (Gwama) is currently spoken by several thousand people (census data show serious variation depending on the source) in the South Benishangul-Gumuz region of Ethiopia, where they are heavily interspersed with the Komo, as well as different Cushitic, Omotic, and Nilotic populations. For our main source, we have chosen [Leyew 2005], a grammatical sketch accompanied by a representative vocabulary from which it is rather easy to construct a Swadesh wordlist; it is also the most recent of all available data sources. For additional control, we list alternate data on Kwama from the comparative wordlists of M. L. Bender [Bender 1983] and from [Wedekind 2002].

The latter source actually contains a large amount of data on two varieties of Kwama: Kwama "proper", or Northern Kwama, and the so-called "Begi Mao" (Mao is an ethnic term applied to several distinct populations of the area, including both Omotic and Koman people), spoken in the Begi area. Although the close relationship of Kwama and Begi Mao is beyond doubt, the latter still shows enough differences in basic lexicon to deserve the construction of a separate wordlist; therefore, we have included the Begi Mao wordlist, selected from the data provided for the Wedekinds by Ato Harun Soso, as a separate entry.

2. Transcription.

Only minimal UTS-required transliterational changes have been necessary for the Kwama wordlist in [Leyew 2005], such as conversion of the doubled vowels that indicate vowel length (aa, uu, etc.) to the UTS notation for long vowels. All the wordlists in [Wedekind 2002] are transcribed in standard IPA and require only the usual small adjustments to UTS (e. g. re-transcription of affricates).

II. Opo.

1. General.

Until recently, the only acceptable source for Opo (Opuuo), a Koman language spoken by about 1,000 people in five villages in the Gambella region, was the comparative survey [Bender 1983], which yielded enough lexical data to allow for the construction of a Swadesh wordlist with minimal gaps; possible phonetic errors and semantic inaccuracies, frequent in Bender's data collections, had to be accepted as inevitable. Given the overall scarcity of the data, we were also able to include material from an additional, even less reliable, early source [Corfield 1938], which provides data on two sub-dialects of Opo: Buldiit and Kusgilo. These generally agree with Bender's data; occasional discrepancies, while not really usable to correct lexicostatistical entries, may still be important for etymological research and work on the reconstruction of a proto-wordlist for Koman.

Luckily, the situation has recently been significantly remedied by the appearance of an important piece of research by Marijke van Silfhout [Silfhout 2013], who not only provided an accurate phonetic description of the language but also accompanied it with a large vocabulary, which we have selected as our primary source of data. Discrepancies between Silfhout's and Bender's data largely concern issues of phonetics and phonology; a small number of lexical discrepancies may be ascribed to dialectal variation (which is acknowledged in Silfhout's thesis), but given Bender's tendency to err in his semantic glossing, they may equally well be caused by inaccuracies, so the situation does not call for the construction of two different wordlists.

2. Transcription.

All of Silfhout's data in the vocabulary are transcribed in the semi-official Opo alphabet as well as in IPA phonetic representation, which requires only the usual minimal reconversion to the UTS system.

III. Komo.

1. General.

Our main source for Komo is [Otero 2015], a mid-size modern dictionary based on fieldwork conducted with a large group of native speakers; some information on pronouns and negations has also been drawn from the earlier (and very brief) grammatical sketch in [Otero 2014]. Other sources of Komo data include [Bender 1983]; [Wedekind 2002]; and a large survey list in [Krell 2011]. There are some notable discrepancies (including lexical ones) between all these sources, which is hardly surprising, considering that the language is spoken by about 12,000 people in various localities in Sudan and Ethiopia, forming a broken-up continuum. However, as is the case with other Koman languages as well, we have refrained from preparing several different wordlists, since it remains unclear how many of the discrepancies are genuine and how many are simply caused by inaccuracy of semantic glossing; all attested discrepancies are indicated in the Notes section and should be considered specifically during the etymological analysis of the wordlists and the construction of the proto-wordlist.

2. Transcription.

Komo data in [Otero 2015] are transcribed in a somewhat idiosyncratic orthography. Below we list the Komo alphabet in its entirety, along with its UTS representation.

Otero 2015   UTS
a            a
b            b
bb           ɓ
d            d
dd           ɗ
e            ɛ
g            g
h            h
i            i
ɨ            ı
k            k
kk
l            l
m            m
n            n
o            ɔ
p            p
pp
r            r
s            s
sh           š (= IPA ʃ)
ss
t            t
tt
u            u
ʉ            ʋ
w            w
y            y (= IPA j)
z            z
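
The table above can be applied mechanically. The following is only a minimal sketch in Python, not the procedure actually used for the database: it assumes greedy left-to-right matching with digraphs taking priority over single letters, it uses only those rows for which a UTS value is given above, and it passes the digraphs kk, pp, ss, tt through unchanged. The function name and the test strings are hypothetical (the strings are made-up letter sequences, not Komo words).

```python
# Rows of the table above that have a UTS value (Otero 2015 orthography -> UTS).
KOMO_TO_UTS = {
    "a": "a", "b": "b", "bb": "ɓ", "d": "d", "dd": "ɗ", "e": "ɛ",
    "g": "g", "h": "h", "i": "i", "ɨ": "ı", "k": "k", "l": "l",
    "m": "m", "n": "n", "o": "ɔ", "p": "p", "r": "r", "s": "s",
    "sh": "š", "t": "t", "u": "u", "ʉ": "ʋ", "w": "w", "y": "y", "z": "z",
}

# Digraphs listed in the table without a UTS value; kept as they are (assumption).
UNRESOLVED = {"kk", "pp", "ss", "tt"}


def komo_to_uts(word):
    """Transliterate a string in Otero 2015 orthography, digraphs first."""
    out = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in UNRESOLVED:
            out.append(pair)               # no UTS value available: pass through
            i += 2
        elif len(pair) == 2 and pair in KOMO_TO_UTS:
            out.append(KOMO_TO_UTS[pair])  # attested digraph (bb, dd, sh)
            i += 2
        else:
            ch = word[i]
            out.append(KOMO_TO_UTS.get(ch, ch))  # unknown symbols are kept
            i += 1
    return "".join(out)


# Made-up letter sequences, used only to exercise the mapping.
print(komo_to_uts("bbasha"))
print(komo_to_uts("kko"))
```

Greedy matching cannot distinguish a true digraph from an accidental sequence of two single letters (e. g. s followed by h), so any real use would require checking such cases against the dictionary.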

IV. Uduk.

1. General.

Uduk (Twampa) is the first Koman language to have received extensive lexicographic coverage, in the form of [Beam & Cridland 1970], an exemplary dictionary that still remains the single best source of information on the lexicon of this language and serves as our default source for the primary lexicostatistical slot. Accurate grammatical information on Uduk used to be much harder to come by, but now a solid description is available in the form of [Killian 2015], where both grammatical information and details on the actual usage of certain basic words may be double-checked.

As additional control sources, we also use [Bender 1983] (mainly to confirm semantics, since Bender's phonetic notation is notoriously inaccurate) and [Thelwall 1983] (an independent description of the language's phonology, well illustrated with lexical examples).

Special gratitude goes to Don Killian in person, who was generous enough to look through the entire wordlist and suggest several important corrections and additions, based on his own experience of fieldwork with the Uduk.

2. Transcription.

Uduk has the most complicated phonological system of all Koman languages, and several of the sources, including the primary source [Beam & Cridland 1970], use highly idiosyncratic transcription systems. Below we list the Uduk alphabet as employed in the dictionary along with its UTS equivalents.

[Beam & Cridland 1970]   UTS transliteration
a                        a
b                        b
'b                       ɓ
c                        ɕ
c_
'c                       ɕʼ
d                        d
'd                       ɗ
dh
e                        e
g                        g
h                        h
i                        i
j                        ʓ
k                        k
k_
'k
l                        l
m                        m
n                        n
ŋ                        ŋ
ny                       ɲ
o                        o
p                        p
p_
'p
r                        r
s                        s
sh                       š
t                        t
t_
't
th
t_h_                     t̪ʰ
'th                      t̪ʼ
u                        u
w                        w
y                        y
/                        ʔ
VV

Beam and Cridland distinguish three tonal registers in Uduk: high, mid, and low, which (probably for typographic reasons) they mark in parentheses after the main word: e. g. (-'.) corresponds to the trisyllabic tonal structure V̄V́V̀.
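
The parenthesized tone patterns can be expanded into diacritics automatically. The sketch below, in Python, is an illustration under stated assumptions rather than a description of how the database was prepared: the symbol values (- for mid, ' for high, . for low) are inferred from the single example given above, and the function name and the placeholder vowel V are hypothetical.

```python
# Tone symbols, as inferred from the example (-'.) = V̄V́V̀ (assumption).
TONE_MARKS = {
    "'": "\u0301",  # high tone: combining acute accent
    "-": "\u0304",  # mid tone: combining macron
    ".": "\u0300",  # low tone: combining grave accent
}


def tone_pattern_to_marks(pattern, vowel="V"):
    """Expand a pattern such as (-'.) into one tone-marked vowel per syllable."""
    body = pattern.strip("()")
    # A KeyError on an unexpected symbol is intentional: it flags patterns
    # that this three-symbol inventory does not cover.
    return "".join(vowel + TONE_MARKS[symbol] for symbol in body)


print(tone_pattern_to_marks("(-'.)"))  # mid, high, low
```

Placing the marks on the actual vowels of a dictionary entry would additionally require syllabifying the entry, which is not attempted here.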

Uduk transcription in [Killian 2015] is largely IPA-compatible, so the discrepancies between it and UTS are typically "cosmetic" (Killian's ʃ = UTS š; c = UTS ɕ; also, Killian's j = UTS ʓ). Killian also postulates a large series of labialized consonants for Uduk (t̪ʷ, etc.), which in [Beam & Cridland 1970] are orthographically transcribed as clusters (pw, kw, etc.).

Note: The palatal series is specifically defined in [Killian 2015] as a series of palatal plosives, i. e. ɕ, ʓ are in reality ȶ, ȡ. However, for better phonological compatibility with the rest of the Koman material, including for purposes of automated phonetic analysis, we still prefer to "technically" mark them as affricates.
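
The three "cosmetic" substitutions named above for [Killian 2015] amount to a one-to-one character replacement. A minimal Python sketch follows; it covers only those three symbols, leaves everything else (including the labialized series) untouched, and uses hypothetical names and a made-up test string.

```python
# Killian 2015 symbol -> UTS symbol, limited to the three pairs named above.
KILLIAN_TO_UTS = str.maketrans({"ʃ": "š", "c": "ɕ", "j": "ʓ"})


def killian_to_uts(form):
    """Replace ʃ, c, j with their UTS counterparts; keep all other symbols."""
    return form.translate(KILLIAN_TO_UTS)


# Made-up sequence of symbols, not an Uduk word.
print(killian_to_uts("ʃacaj"))
```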

V. Gule.

1. General.

The Gule language, also called Anej in [Bender 1983] (= Hamaj in some earlier sources), seems to be extinct today and has been described only marginally. The single best source, permitting the construction of a more or less representative Swadesh wordlist, is [Bender 1983], based on field data collected by the author in 1978-79 from "a few old people" at Jebel Gule. Several much earlier works also give Gule data, which are hardly reliable on their own, but may, to a certain extent, serve as verification of Bender's entries. In the notes section, we list lexical data from [Lejean 1865], the earliest source to mention Gule (= Fougn), providing the equivalents in their orthographic form (the author uses a heavily Gallicized transcription system without any notes on phonetics), and from [Seligmann 1911], a somewhat more accurately transcribed source with examples of noun and verb phrases. Some discrepancies between Bender's data and earlier sources have been located, but, naturally, it is impossible to determine whether they reflect dialectal variation, diachronic evolution, or inaccuracy on the part of one of the researchers.

The issue of whether or not Gule should be included in the Koman language family remains officially undecided: thus, Glottolog, quoting Bender, states that the evidence is insufficient and positions Gule as a language isolate. However, lexicostatistical comparison, based on careful analysis of potential cognate distribution as well as fairly strong glimpses of regular phonetic correspondences between Koman and Gule, strongly suggests that the two taxa are more closely related to each other than to anything outside that immediate area (even Gumuz).

Technically speaking, within the GLD a separate database should have been set up for Gule, since even if it is a part of Koman, the split between them must have taken place on family rather than group level. However, we are making an exception here for the express purpose of showing the relationship between Gule and Koman and eradicating any possible doubts about it. At the same time, data on Gule remain so generally dubious that the wordlist is perhaps best assessed only within the context of "Narrow Koman" data.

2. Transcription.

The transcription in [Bender 1983] follows the same conventions as his transcription for all other Koman languages. All morphological segmentation is based on structural considerations (e. g. the frequent appearance of -n at the end of nouns suggests that it was a nominal suffix, perhaps connected with singular marking, etc.).

Database compiled and annotated by: G. Starostin (last update: January 2017).