The Global Lexicostatistical
Database: News and updates
26.02.2018. After a long break (caused by various technical reasons), the GLD is
finally back in business with plenty of updates!
1) Indo-European: A wordlist for Tokharian B, extensively illustrated by diagnostic contexts, has
been compiled and annotated by A. Kuritsyna based on published dictionaries and
text corpora of the language. Notes on Tokharian A basic lexicon have also been
added, although a separate wordlist for Tokharian A has not been compiled due
to insufficient data.
2) Indo-European: Wordlists for the standard literary
variants of Lithuanian and Latvian, compiled by M. Saenko on the
basis of several authoritative dictionaries, have been added to a separate Baltic
database.
3) Niger-Congo: Two wordlists have been added on the Ikaan and Ayegbe (Isheu) dialects of the Ukaan language, commonly thought of
as an isolated branch within the large Benue-Congo family. Compiled and
annotated by G. Starostin based on several published papers and PhD theses.
4) Kordofanian: Three wordlists have been added for Tagoi, Orig (Turjok), and Tagom
- three closely related languages of the Rashad group, whose genetic affiliation with the
other Kordofanian subgroups (such as Heiban and Talodi) remains contested.
Compiled and annotated by G. Starostin based on data that Th. Schadeberg either
collected directly or published from other researchers' field data. With these
wordlists, the GLD now has a complete set of data for all languages whose
membership in the Kordofanian family has been either ascertained or tentatively
proposed.
5) Nilo-Saharan: A wordlist for the Shilluk language (West Nilotic group)
has been compiled and annotated by G. Starostin based on comparative data from
two available dictionaries.
6) North America: A wordlist for the extinct Barbareño Chumash language,
based on available dictionaries and descriptions, has been compiled and
annotated by M. Zhivlov.
7) South America: A wordlist for the isolated and
nearly extinct Kipea (Kariri)
language of Brazil has been compiled and annotated by A. Nikulin, based on a
modern description as well as data from 17th century sources.
08.10.2017. Today's updates:
1) Africa: Two wordlists added for the Katla-Tima
group (potentially, but not conclusively, a member of the Kordofanian family):
a complete one for Tima, based on a
recent dictionary and grammar, and (largely for the sake of completeness) a
heavily gapped one for Katla, based
on a variety of scant sources. Compiled and annotated by G. Starostin.
2) South America: A wordlist for the Krenak language (Krenak group, Macro-Je
family) compiled and annotated by A. Nikulin, based on a variety of sources.
Some of the wordlists for the Tupari database have been updated as well.
17.09.2017. This week's update:
1) Africa: A completely revised and very significantly
changed wordlist for the Shabo
isolate of Ethiopia re-compiled by G. Starostin, based on several recent
sources, with multiple mistakes in the previous versions (due to imperfection
of older sources) corrected.
2) South America: (a) A wordlist for the Karo language of the Ramarama group
(Tupian family) compiled and annotated by A. Nikulin based on a variety of
sources; (b) A wordlist for the Maxakalí
language of the same-named group (Macro-Je family) compiled and annotated by A.
Nikulin based on a variety of sources.
10.09.2017. First significant update after a long summer break:
1) A wordlist on the Classical Armenian language has been compiled and annotated by Petr
Kocharov, based on a large selection of dictionaries and auxiliary sources and
cross-checked against the relevant text corpus. With notes on Indo-European
etymologies.
2) A wordlist on the Devinska Nova Ves dialect of Chakavian compiled and annotated by
Mikhail Saenko.
3) Four wordlists representing the Mabaan-Burun
subbranch of Western Nilotic compiled and annotated by G. Starostin: Mabaan, Jumjum, and the Kurmuk
and Mayak dialects of (Northern)
Burun.
4) Two wordlists for the Chiquitano group of the
Macro-Je family: Lomerío
Chiquitano and Porto
Esperidião Chiquitano. Compiled and annotated by A. Nikulin (more
wordlists on South American groupings soon to come).
18.06.2017. Today's updates:
1) Five more wordlists added to the Slavic database: Chakavian (Orlec dialect), Slovenian (Resia dialect), Moravian (Mistrice dialect), Slovak (Pilisszanto dialect), and Lesser Polish (Wieciorka dialect). As
before, all the wordlists have been compiled and annotated by M. Saenko based
on a variety of lexicographic sources as well as new data from informants.
2) A wordlist on the Pa Na
(Bana) language added to the Hmong database. Compiled and annotated by G.
Starostin on the basis of two lexicographic sources. With this wordlist, all
the major languages and dialects of the Hmongic group are finally represented
in the GLD.
3) Two more Karen wordlists added to the Karenic database: Kayan Lahta and Yinbaw, compiled and annotated by G. Starostin based on several
published and digital sources.
4) A wordlist on the Lafofa
(Tegem) language and another one on the (El) Amira language (closely related to Lafofa, but suffering from a
severe lack of data, so the wordlist contains numerous gaps) have been added to
a separate database in the Kordofanian subsection of the Niger-Kordofanian
section. Since these languages are no longer believed to be members of the
Talodi subgroup of Kordofanian, they have been assigned to their own database.
Compiled and annotated by G. Starostin based on data from Th. Schadeberg's
survey and digitally published data from R. Stevenson's archives.
01.05.2017. Several updates
uploaded on this day:
1) 5 new wordlists added to the Slavic database: Chakavian (Orbanici dialect), Kajkavian
(Burgenland dialect), Slovene
(Ljubljana dialect), Belarusian
(Turov dialect), Russian (Deulino
dialect). All the wordlists have been compiled and annotated, sometimes with
extensive examples on usage, by M. Saenko, based on fresh data from speakers as
well as previously published dialectological dictionaries. Even more Slavic
data expected in the upcoming months!
2) A wordlist for Thok Reel,
the somewhat poorly described third member of the Dinka-Nuer subgroup, added to
the Western Nilotic database (East Sudanic section). Compiled and annotated by
G. Starostin based on old survey data by Larry and Lisa Roettger, compared with
a more recent grammatical description of the language.
3) A wordlist for the Dorze
language added to the Gonga-Gimojan database (Omotic section). Compiled and
annotated by G. Starostin based on Alemayehu Abebe's survey of several Ometo
languages.
4) Two wordlists added for Fur
and Amdang (= Biltine, Mimi), the
only two languages of the small Fur-Amdang group / family with still uncertain
higher-level connections. Compiled and annotated by G. Starostin based on a large
variety of sources for Fur and a recent survey list for Amdang.
5) Four wordlists added to the Talodi database (Kordofanian section),
for Tocho, Acheron, Lumun, and Torona, which more or less completes
the Talodi database altogether. Compiled and annotated by G. Starostin based on
Th. Schadeberg's data for Tocho, and a recent survey paper for the three other
languages.
6) Two wordlists added to the Hmong database for the two main dialects
of the She language, Lianhua She and
Luofu She. Compiled and annotated by
G. Starostin based on Mao Zongwu's comparative description of these dialects
and an additional check source.
26.02.2017. Today's update:
1) A wordlist for Pekon Kayan
added to the Karen database (Sino-Tibetan section). Based essentially on Ken
Manson's fieldwork, re-compiled and annotated by G. Starostin.
2) A wordlist for the Dinka
language added to the West Nilotic database (East Sudanic section). The
wordlist is primarily based on Arthur Nebel's dictionary (Rek dialect), but
includes extensive notes on various other Dinka dialects as well, synthesized
from a variety of sources. Compiled and annotated by G. Starostin.
12.02.2017. Today's update:
1) A wordlist for Surui-Paiter
(Monde group) added to the Tupi section of the database. Compiled and annotated
by A. Nikulin.
2) A wordlist for the Bench
language added to the Gonga-Gimojan database (Omotic section). Compiled and
annotated by G. Starostin based on comparison of two different published
sources.
22.01.2017. The first, and
quite large, update for 2017 includes:
1) From A. Nikulin, two more
wordlists in the South American section: the Arawakan family gets its first
coverage with two wordlists for the Maritime group: Lokono and Añu.
Compiled and annotated based on a variety of recent sources.
2) From A. Trofimov, a wordlist for the extinct Avestan language (largely based on
Young Avestan data) added to the Iranian database; compiled based on
Bartholomae's classic dictionary and cross-checked with actual texts.
3) The Koman database (Komuz family) has been
completed with an incomplete, but testable wordlist for the extinct, distantly
related Gule (Anej) language,
compiled by G. Starostin based on M. L. Bender's data, as well as several much
earlier sources. Additionally, the information on Uduk (Twampa) has been
significantly updated, courtesy of Don Killian who was kind enough to provide
comments based on his own fieldwork.
4) The Talodi database (Kordofanian family) has been
updated with wordlists for the closely related Dagik (Masakin) and Ngile
(Daloka) languages. Compiled and annotated by G. Starostin based on Th.
Schadeberg's wordlists and some additional control sources.
5) In the Sino-Tibetan section, the Karen database has
been updated with wordlists for the Kayah
Monu, Brek Kayaw, and Yintale languages. Largely based on
data from a comparative study by Myar Doo Myar Reh from 2004, as well as some
additional control sources. Compiled and annotated by G. Starostin.
6) A proto-wordlist has been added to the Khoekhoe database (Central Khoisan
section), with detailed comments. Current reconstruction by G. Starostin, but
largely based on R. Vossen's research originally published in 1997.
05.12.2016. This week's
update:
1) More lists in the South American section: four varieties of Wichi (Eastern, Bazan, Nocten, Güisnay) and Iyojwa'ja Chorote, all belonging to the Matacoan group and family. Compiled and annotated by A. Nikulin
based on a variety of mostly recent sources.
2) The Koman database is nearly completed with a wordlist for the last
major living Koman language, Uduk.
Compiled by G. Starostin based on Beam & Cridland's classic dictionary from
1970, as well as a set of more recent control sources.
14.11.2016. Another lengthy
break, but here are more updates:
1) The Gondi-Kui (Central Dravidian) database updated with a wordlist
for the Konda language. Compiled and
annotated by G. Starostin based on the work of Bh. Krishnamurti, Th. Burrow and
S. Bhattacharya in the 1960s.
2) The Kordofanian section is expanded with the first two wordlists from
the Talodi group: Talodi proper (Jomang)
and Nding. Compiled and annotated
based on Th. Schadeberg's lists.
3) A new database added to the Omotic (Afro-Asiatic) section: the Gonga-Gimojan
database (Ometo languages and their closest relatives) now features lists for Yem (Yemsa) and Chara languages. Compiled and annotated by G. Starostin based on a
variety of recent sources.
10.09.2016. After a summer
break, the GLD updates are back with a bang!
1) From M. Saenko, three more wordlists to round out the Romance
database: one for the Asturian
language and two for two different stages of the Latin language - based on a detailed analysis of textual contexts
for Plautus (Archaic Latin) and Apuleius (Late Classical Latin), as well as
some additional texts as control sources.
2) From A. Trofimov, a detailed wordlist on Vedic Sanskrit, largely based on the Atharvaveda, but with strong
attention to the Rigveda as well; the annotations contain both references to
classic dictionaries (Grassmann, etc.) and to particular textual locations that
help select the optimal candidates for the Swadesh wordlist.
3) From A. Kassian, another update to the Athabaskan database: a
wordlist for Degexit'an, compiled
and annotated from existing dictionaries and auxiliary sources.
4) From A. Nikulin, five wordlists for the Tupari language group of
South America, including Tupari
proper, Akuntsu, Wayoro, Makurap, and Mekens.
Compiled based on existing dictionaries, wordlists, grammar sketches, and other
rare sources.
5) From G. Starostin, seven wordlists on the Heiban languages of the
Kordofanian language family: Utoro, Shirumba, Tiro, Moro, Ko, Warnang, and Logol.
Compiled and annotated based on Th. Schadeberg's lists, but also cross-checked
with other sources on these languages whenever they are available (esp. for
Tiro, Utoro, and Moro). This more or less completes the Heiban database.
6) Also from G. Starostin: the first wordlist of a Nilotic language - Nuer, belonging to the West Nilotic
group. Compiled and annotated based on J. Kiggen's classic dictionary and
cross-checked with two additional sources.
7) Finally, from Timothy Usher, four wordlists for two different
languages (more precisely, two subdialectal varieties of one language and two
very closely related dialects in another branch) of the Bulaka River family of
New Guinea: Maklew (two different
wordlists) and Jelmek (including Jabsch). More information on these languages
may be found on Timothy's own website, Newguineaworld:
https://sites.google.com/site/newguineaworld/families/bulaka-river.
06.06.2016. Two updates:
1) Three more wordlists in the Romance database: for Turinese Piemontese, Savoyard Franco-Provençal, and Walloon. As usual, compiled by M.
Saenko based on work with language speakers and comparison with previously
published sources where available.
2) A new
database has been added to the Nilo-Saharan section: wordlists for Koman
languages Kwama (with a separate
wordlist for its dialectal variety called Begi
Mao), Komo, and Opo. (The wordlist for the fourth Koman
language, Uduk/Twampa, is forthcoming.) Compiled and annotated by G. Starostin
based on a variety of old and recent sources.
22.05.2016. Two Indo-European
updates today:
1) The Romance database has been updated with two classical lists: Old Italian, based on the preserved
corpus of Dante Alighieri, and Old
French, based on the corpus of Chrétien de
Troyes. Both wordlists compiled and annotated by M. Saenko.
2) The Iranian database has been updated with lists for Yaghnobi and Parachi languages, compiled and annotated by our new contributor,
Artem Trofimov, based on a variety of published sources.
16.05.2016. Two updates:
1) The Gondi-Kui database (Dravidian section) has been updated with a
wordlist for Kuwi, compiled by G.
Starostin based on M. Israel's dictionary and additional control sources.
2) The Hmong database has been updated with a wordlist for Eastern Xiangxi, compiled by G.
Starostin based on a comparative monograph by Yang Zaibiao. Additionally, some
mistakes were corrected and extra notes written for Western Xiangxi as well.
11.04.2016. Updates:
1) The Karen group database (Sino-Tibetan family) has been expanded with
a list for Western Kayah Li,
compiled and annotated by G. Starostin based on several recent data sources.
Some changes have been made to the Eastern Kayah Li wordlist as well.
2) A wordlist for the Tapirape
language (Tupi-Guarani group) has been compiled and annotated by A. Nikulin,
based on Antonio Almeida's grammar and glossary as well as a more recent
control source.
3) Another two wordlists added to the Romance database: for the Foligno dialect of Italian and the Picard language. Compiled by M. Saenko
based on data collected from actual language speakers.
04.04.2016. Some African
updates:
1) The Dizoid group database (Omotic family) has been updated with two
wordlists for the Nayi and Sheko languages, compiled and annotated
by G. Starostin based on a variety of sources.
2) The Heiban database (Kordofanian family) has been updated with a
wordlist for the Laru language,
compiled and annotated by G. Starostin based on Th. Schadeberg's wordlist and
some additional control sources.
3) A new proto-wordlist, for the Proto-Taa
language, has been added to the Taa database (Peripheral Khoisan). Compiled by
G. Starostin based on his own preliminary reconstruction.
08.03.2016. This week's update
consists of three wordlists added to the Romance database: for Galician, (Standard) Portuguese, and (Standard) French. All three wordlists compiled by
M. Saenko based on existing dictionaries, grammars, and work with native
language speakers.
01.03.2016. This week's
update:
1) The Germanic database has been updated with a wordlist for the Faroese language, primarily based on
Young & Clewer's dictionary with some additional data. Compiled and
annotated by G. Starostin.
2) The Athapaskan database is expanded again, with a wordlist for the
nearly extinct Sarsi language. Compiled
by A. Kassian based on a variety of published sources.
22.02.2016. After another long
break, we are back with some important updates:
1) The Daju database (East Sudanic section) has been completed, with
five new wordlists for Sila, Nyala, Eref, Lagowa, and Nyalgulgule varieties of Western Daju.
Compiled by G. Starostin based on works by Robin Thelwall and a variety of
newer control sources.
2) In the South American section, a new database for Cahuapanan languages is made available,
courtesy of the compiler A. Nikulin, with wordlists for the Shiwilu and Shawi languages, based on the most recent sources, as well as a
partially reconstructed proto-list for Proto-Cahuapanan.
29.12.2015. Our last update
for this year, once again, comes in several parts:
1) More Athapaskan wordlists – this time, for the Central Carrier and Koyukon
languages. Compiled and extensively annotated by A. Kassian based on a large
variety of existing sources (dictionaries, grammars, texts, etc.).
2) More South American wordlists – for the Arikapú and
Djeoromitxí languages of the Jabuti group (Macro-Je family). Compiled
and annotated by A. Nikulin based on several recent dictionaries; also
accompanied by his own reconstruction of the Proto-Jabuti wordlist.
3) A couple of Kordofanian wordlists added to the Niger-Congo section, for
the closely related Ebang (= Heiban)
and Abul (= Abul-Heiban) dialects.
Compiled and annotated by G. Starostin based on Thilo Schadeberg's fieldwork
plus additional sources.
19.11.2015. We are alive and well, with a whole new bunch of updates:
1) In the North American section, a wordlist for the Lower Tanana language, based on James
Kari's dictionary and a variety of other sources, has been compiled/annotated
by A. Kassian and added to the Athapaskan database.
2) We finally have our first Australian wordlist
online, for the Gunwinyguan language Ngalakan,
compiled and annotated by M. Zhivlov based on F. Merlan's vocabulary and
grammar.
3) Coverage of the East Sudanic family goes on with
two wordlists for Daju languages Logorik
(= Lagori) and Caning (= Shatt),
compiled and annotated by G. Starostin on the basis of a variety of sources
(most notably Robin Thelwall's published fieldwork).
4) Another addition to the Nilo-Saharan section is a
set of wordlists for four dialectal varieties of the Gumuz language (Sai, Sese, Metemma, Gojjam), also compiled and
annotated by G. Starostin. The primary source is a comparative paper by M. L.
Bender, but many newer sources have been consulted as well for additional
verification.
5) Finally, the Karen database (Sino-Tibetan section)
has been expanded with a wordlist for Eastern
Kayah Li, based on D. Solnit's monographic description of this language.
Compiled and annotated by G. Starostin.
11.10.2015. Two updates:
1) In the North American section, a wordlist for the Ineseño (Samala) language of the
small Chumashan family has been compiled and annotated by M. Zhivlov, based on
the works of R. B. Applegate and the Santa Ynez Band of Chumash Indians.
2) In the African section, a wordlist for the Dizi language (Dizoid group, Omotic
family) has been compiled and annotated by G. Starostin based on recent work by
M. D. Beachy and its comparison with older Swadesh wordlists published by a
variety of researchers.
05.10.2015. Update in the East Sudanic section: three wordlists for the small
Temein group of Nuba Mountain languages (Temein
proper, Doni /= Keiga Jirru/, and Tese /= These, Teisei umm Danab/),
drawn largely from the manuscripts of Roland C. Stevenson and compared with
some additional sources, have been compiled, annotated, and uploaded by G.
Starostin.
25.09.2015. Three updates today:
1) A wordlist for Mesa
Grande 'Iipay has been uploaded to the Yuman database (Hokan family). Compiled
and annotated by Mikhail Zhivlov based on several sources from the 1970s.
2) Two wordlists for the Pharasa and the Cappadocian
(Aravan) dialects of the Greek language have been compiled and heavily
annotated by Alexei Kassian, based on a variety of old sources.
3) The up-to-now largely empty section on South
America gets a small, but significant boost from our most recent contributor,
André Nikulin: three wordlists for languages of the Nadahup group (Hup, Dâw, Nadëb) have been compiled and annotated by him,
based on largely recent sources. More to come on South America in the near
future!
05.09.2015. Two updates to kick off the fall season:
1) A wordlist for Southern
Tsakonian Greek, based on major comprehensive dictionaries as well as some fairly
recent sources, has been compiled and annotated by Alexei Kassian for the Greek
database.
2) The East Sudanic section has been expanded with a
database for two languages of the small Nyimang group in the Nuba Mountains: Ama (or Nyimang proper) and Afitti (Dinik). Both lists compiled and
annotated by G. Starostin based on a large variety of older and more recent
sources.
18.08.2015. Two updates in the African section:
1) Finally, a proto-wordlist has been uploaded for Proto-!Wi, the common ancestor of the
largest known (but mostly extinct) branch of South Khoisan (= !Wi-Taa).
Although the reconstructions are quite preliminary and approximate (largely due
to the unreliable nature of the data), they nevertheless illustrate some
interesting and important diachronic phonetic and semantic processes in !Wi
languages (thanks to the accompanying notes section) and may be used for external
lexicostatistical comparison on higher levels. All work on the reconstructions
performed by G. Starostin.
2) A wordlist for the Gaam (= Gaahmg, Ingassana) language of the Jebel group of Eastern
Sudanic has been compiled and annotated by G. Starostin based on a comparative
set of older and more recent data (the primary source is M. L. Bender and Malik
Ayre's dictionary of the language). This completes the Jebel database, except
for a proto-wordlist.
06.08.2015. A small, but important update: a wordlist for the deeply isolated Shabo language of Ethiopia has been
added to the Nilo-Saharan section (very schematically, since there is really no
strong evidence that would tie Shabo to «Nilo-Saharan») of the site. Compiled
and annotated by G. Starostin based on a variety of published sources.
03.08.2015. Another massive update of the Romance database today: three wordlists
for Venetian dialects (Venice, Primiero, Bellunese), four wordlists for Sicilian dialects (Palermitan, Messinese, Catanian, South-Eastern), six wordlists for
Catalan dialects (Central, North-Western, Minorcan, Castello de la
Plana, Valencian, Manises), one wordlist for Castilian Spanish, and one wordlist for
Provençal Occitan. All of the
wordlists, as usual, were compiled and annotated by M. Saenko based on
information from native speakers, except for the Provençal wordlist,
compiled from a 1995 dictionary.
01.07.2015. Two updates:
1) A wordlist for Standard Swedish added to the Germanic database. Compiled and annotated by
G. Starostin on the basis of several large dictionaries. All the major
«literary» Scandinavian languages are now included.
1) A wordlist for Grosseto
Tuscan (Italian), compiled and annotated by M. Saenko based on information
from native speakers, has been uploaded to the Romance database.
27.06.2015. More updates:
1) Still more Romance wordlists: this time, for
two dialects of Lombard (Bergamo and Plesio), the Neapolitan
dialect, and one more dialect of Sardinian (Campidanese). Original data collected from native speakers,
compiled and annotated by Mikhail Saenko.
2) Two Bantu wordlists, for the Mwetug and Elung
dialects of the Akoose language, added
to the newly created «Bantu A» database. Compiled and annotated by G. Starostin
based on publications by Robert Hedinger.
3) A new database for the «Jebel» or «East Jebel»
language group added to the East Sudanic section of the site. The database contains
four wordlists for the minor tribal languages Aka, Molo, Kelo, and Beni Sheko. Compiled and annotated by G. Starostin based mainly on
comparative data from works by M. L. Bender (also taking into consideration the
earlier and less reliable data by E. Evans-Pritchard).
17.06.2015. Another large update:
1) More Romance wordlists from Mikhail Saenko, this
time with a focus on Italy: Ravennate
Romagnol, 3 dialects of Emiliano
(Ferrarese, Carpigiano, Reggiano), 3
dialects of Ligurian (Genoese, Stella and Rapallo), a
wordlist for Literary Italian, and,
finally, a wordlist for the very distinct Logudorese
dialect of Sardinian. This brings the total number of lects in the Romance
database to 25 and makes it the single largest database on the site in terms of
the number of languages covered - good job! Big thanks also to all the native
informants (listed in the database description) who provided the raw materials
for M. Saenko.
2) The Surmic database has been more or less completed
(except for the proto-wordlist) with the inclusion of data for the single known
North Surmic language, Majang
(Masongo), compiled from a variety of sources (mostly notes by M. L. Bender,
but cross-checked with earlier data) by G. Starostin. Also, the notes sections
on several other Surmic languages have been expanded with annotated items from
M. L. Bender's wordlists in his comparative survey on the languages of Ethiopia
(1971) - a useful source, albeit suffering from a number of phonetic and
semantic inaccuracies (which makes it all the more important to comment on
these inaccuracies whenever they are detected).
3) The Gondi-Kui database has been expanded with a
wordlist for Kui, put together on
the basis of W. W. Winfield's classic dictionary for the Udayagiri dialect and cross-checked with data from two newer sources on the
Balliguda and Kuṭṭiya dialects. Compiled and annotated by G.
Starostin.
22.05.2015. HUGE update today:
1) No fewer than eight new wordlists, many of them
containing totally original data, added to the Romance database. The lists are
for the literary Rumantsch Grischun
language; three different colloquial dialects of Romansh (Sursilvan, Surmiran, and Vallader); and four colloquial dialects of the Piemontese language (Lanzo
Torinese, Barbania, Carmagnola, and Vercellese). Compiled and annotated by Mikhail Saenko largely on
the basis of surveys recently completed by native speakers, but also taking
into consideration previously available descriptions and dictionaries.
The other updates are in the African section:
2) Two wordlists (unfortunately, containing serious
gaps) for the Akunnu and Ekiromi (Ikorom) dialects of the Akpes language, constituting a separate
subbranch of the Benue-Congo family, compiled and annotated by G. Starostin
based on several brief wordlists and comparative studies.
3) Two wordlists added to the Nubian database, for the
Karko and Wali varieties of Hill Nubian. Compiled and annotated by G.
Starostin, based on the recent SIL survey of several Kordofanian languages by
Amy Krell.
4) A wordlist for the Kwegu (Koegu) language, compiled and annotated by G. Starostin
based on materials of M. Yigezu and O. Hieda, has been added to the Surmic
database. This completes the South Surmic subsection of the base.
11.05.2015. Two updates:
1) Another three wordlists added to the Romance
database by Mikhail Saenko: one for Friulian,
and two for dialects of Ladin: Gardenese and Fassano. Compiled and annotated based on existing dictionaries as
well as original consultations with native speakers.
2) Three more wordlists added to the Hmongic database:
Northern Pa-Hng, Southern Pa-Hng, and Hm-Nai (Wunai). Compiled and annotated
by G. Starostin based on Mao Zongwu and Li Yunbing's materials, originally
published in 1997.
27.04.2015. Two updates in the African section:
1) A wordlist on the Me'en language (Surmic group) has been compiled and annotated by G.
Starostin on the comparative basis of several different sources, with elements
of dialectal comparison.
2) A wordlist for the Rere dialect of the Koalib
language (Heiban group, Kordofanian family) has been compiled and annotated by
G. Starostin on the basis of Thilo Schadeberg's original Swadesh wordlist,
contrasted with later publications on Koalib phonology by Nicholas Quint. More
wordlists on Kordofanian languages should be expected within the year.
15.04.2015. Two updates:
1) A new Romance wordlist from Mikhail Saenko, this
time for the extinct Dalmatian
language, based on a comprehensive source that summarizes nearly all of the
recorded data on Dalmatian, extracted from the last known speaker, Antonio
Udina, in the late 19th century.
2) Another wordlist in the Sino-Tibetan section: Geba Karen, based on a recent
description of the grammar of this language accompanied by a list of basic
lexicon. Compiled and annotated by G. Starostin.
02.04.2015. The first wordlist for a Mande language has been added to the
Niger-Congo section of the site: Bobo
(Southern Bobo Madaré), culled from Le Bris & Prost's dictionary and
compared with the results of a more recent dialectal survey. Compiled and
annotated by G. Starostin.
23.03.2015. Two updates:
1) Two more wordlists compiled and annotated by
Mikhail Saenko: one on the Aromanian
language and another one on Standard
Romanian (the literary variety), allowing now for a comprehensive
lexicostatistical analysis of the Eastern Romance (Vlach) languages.
2) A wordlist on the Mursi language (Surmic group) compiled and annotated by G.
Starostin on the comparative basis of three different sources.
11.03.2015. Two Indo-European updates:
1) In the Romance database, a wordlist for the Istro Romanian language has been
compiled and annotated by Mikhail Saenko, based on a variety of old and new
sources.
2) In the Germanic database, a wordlist for the
standard Danish language has been
compiled and annotated by G. Starostin.
25.02.2015. Two updates:
1) A wordlist for the Megleno Romanian language has been compiled and annotated by
Mikhail Saenko, our most recent contributor, based on several available
sources. More wordlists for various forms of Romanian and other Romance
languages are to be expected within the year.
2) A wordlist for the Serer language (Fula-Serer group of the alleged North Atlantic
family) has been compiled and annotated by G. Starostin, based on L. Crétois's
enormous dictionary and an auxiliary control source.
14.02.2015. Another Surmic wordlist, for the Chai
dialect of the Suri language, has been compiled from three different sources,
with M. Yigezu's (2001) data supported by two earlier publications as control
sources. Compiled and annotated by G. Starostin.
31.01.2015. A list for the Pengo
language added to the Dravidian section of the site. Compiled and annotated by
G. Starostin on the basis of T. Burrow and S. Bhattacharya's description
(1970).
20.01.2015. Two updates:
1) A long-awaited revival of the Athapaskan database:
a new list added for the Upper Tanana
language, compiled and annotated by A. Kassian on the basis of the most recent
sources, with some aid from Paul Milanowski.
2) The Surmic database has been updated again with a
list for the Baale (Kacipo-Balesi)
language, compiled and annotated by G. Starostin on the basis of M. Yigezu's
and Gerrit J. Dimmendaal's data from 1998-2001.
23.12.2014. The Hmong database is updated with wordlists for the Xiaozhai and Huangluo (sub)dialects of the Younuo
language, compiled and annotated by G. Starostin, based on data published by
Mao Zongwu and Li Yunbing in 2007.
16.12.2014. One more list added to the Surmic database, for the Murle language, based on M. Yigezu's
list (2001) with R. Lyth's old description of the language used as an important
secondary source. Compiled and annotated by G. Starostin.
22.11.2014. A list has been added for the Bwe
Karen language, based on Eugenie Henderson's dictionary of that language.
More lists will be added to the Karen database (Sino-Tibetan subsection) in the
next year.
06.11.2014. Two updates in the African section:
1) A list added for the Tennet language of the Surmic group, compiled and annotated by G.
Starostin on the basis of M. Yigezu's list (2001) as well as independent
research by Scott Randal.
2) Two more lists added to the Krongo-Kadugli
database, for Keiga (Deiga) and Tumtum languages, compiled and
annotated by G. Starostin on the basis of Th. Schadeberg's comparative
wordlists and additional data by M. Reh. This completes the Krongo-Kadugli
database, pending the inclusion of a reconstructed proto-wordlist for this
rather unique group of languages in the Nuba Mountains.
25.10.2014. Another Surmic list added, one for the Didinga language, compiled and annotated by G. Starostin based on a
comparison of M. Yigezu's list (2001) with several older sources.
25.09.2014. The large Surmic group of languages (East Sudanic family) is now
represented with a wordlist for the Narim
(Longarim) language, compiled and annotated by G. Starostin based on several
available sources. More to follow.
16.09.2014. A small addition to the Dravidian section: The Gondi-Kui («Gondwan»)
database has been opened with the construction of a wordlist for the Manda language, based on a recently
published dictionary as well as older fieldnotes by Th. Burrow and S.
Bhattacharya. Compiled and annotated by G. Starostin.
20.08.2014. Two updates:
1) The (particularly obscure!) Slavic group of
languages is finally represented on the site with a wordlist for Macedonian (Dihovo dialect), compiled
and annotated by A. Kassian with supplementary data on other Macedonian
dialects.
2) The Hmong database is updated with wordlists for
the Longhua and Liuxiang dialects of the Jiongnai
language, compiled and annotated by G. Starostin, based on data published by
Mao Zongwu and Li Yunbing in 2002.
15.08.2014. Two updates:
1) The Bai database has been updated with a wordlist
for Bijiang (Northern) Bai, the most
divergent variety of this macrolanguage. All three main dialects are now
represented in the database. Compiled and annotated by G. Starostin based on
the same sources as older lists for Jianchuan and Dali Bai.
2) The Dargwa database (North Caucasian section) has
been initialized with some GLD-exclusive content, kindly supplied and annotated
in GLD format by Oleg Belyaev from his own field data on three poorly studied
dialects: Shiri, Amuzgi, Ashti Dargwa.
03.08.2014. The Tsezic database has been completed with a proto-wordlist for
Proto-Tsezic, reconstructed by A. Kassian in accordance with general GLD
methodology based on available lexical data and the original Proto-Tsezic
reconstruction by S. Nikolayev in the «North Caucasian Etymological
Dictionary», with a few modifications.
28.07.2014. Two updates:
1) The Krongo-Kadugli database has been expanded with
a wordlist for Krongo, compiled and
annotated by G. Starostin on the basis of M. Reh's description and Th.
Schadeberg's comparative wordlist.
2) The Sinitic database sees the arrival of the first
two wordlists on non-Mandarin Chinese "dialects": Jian'ou Min and Wenchang Hainanese (also Min), compiled and annotated by Elena
Kuzmina based on several available sources.
15.07.2014. A wordlist for Bokmål Norwegian (based on the current orthographic norm and general usage as indicated
in some of the most recent dictionaries) has been compiled and annotated by G.
Starostin and included in the Germanic database.
04.07.2014. Design update: We have now introduced nice icon links that take the
user directly from the name of the language in the database to the
corresponding info not only in the Ethnologue, but also in the Glottolog
language list. The system is now working in test mode on several databases
(Bantu-F, Bantu-L, Upper Sepik), but will gradually be expanded to the entire
system of databases. This will give the user easy access to
additional information, including Glottolog's large bibliographical lists.
26.06.2014. A new complete database in the East Sudanic section: six wordlists for the
Tama language group (Tama, Erenga, Sungor, Miisiirii, Ibiri, Abuu Shaarib) have been uploaded,
compiled from John Edgar's comparative vocabulary of Tama languages and an additional
source for Tama proper, and annotated by G. Starostin.
19.05.2014. Two updates:
1) The Greek database has been updated with a list for
Ancient Attic Greek, based on the
idiolect of Plato as reflected in the latter's collected works. Compiled and
annotated by A. Kassian.
2) The Northeastern Dravidian database has been
updated with a list for the Malto
language, compiled and annotated by G. Starostin based on one old and one
relatively recent dictionary.
26.04.2014. Another double update for April:
1) The Tsezic database has been updated with a
wordlist for Sagada Dido and two
dialects of Khwarshi (Khwarshi proper
and Inkhokwari), compiled and
annotated by A. Kassian based on recent fieldwork by A. Abdulaev and R.
Karimova.
2) Two wordlists for the Nara language have been added to the Eastern Sudanic subsection of
the African section of the site. Compiled by G. Starostin based on the material
of a 19th century source («Old Nara») and on M. L. Bender's data, collected in
the 1960s («Modern Nara»).
04.10.2014. After a month without updating, three new contributions, two of them
provided by new participants:
1) The Greek database is updated with a wordlist for Modern Demotic Greek, freshly collected
by Alexandra Evdokimova from native speakers, and converted into GLD format by
A. Kassian.
2) We are introducing a new database for Iranian
languages, with two wordlists for the principal dialects of Ossetic (Iron and Digor), also
freshly collected by Oleg Belyaev from native speakers, and converted into GLD format
(with additional etymological annotations) by A. Kassian.
3) The Hmong database has been updated with three
wordlists for different dialects of the Bunu
language (Bunu proper, Baonao, and Numao), compiled by G. Starostin based on
recent Chinese sources.
03.06.2014. Two updates:
1) The Miwokan database is completed (except for the
proto-list) with a wordlist for Lake
Miwok, compiled and annotated by M. Zhivlov.
2) The Nubian database is also completed (except for
the proto-list) with wordlists for Birgid
and Midob Nubian, compiled and
annotated by G. Starostin.
02.21.2014. A wordlist for Dali Bai has
been added to the Bai database, compiled by G. Starostin based on the same
sources as the ones earlier used for Jianchuan Bai.
02.11.2014. Wordlists for Kadaru and Debri, two small Hill Nubian languages
/ dialects, have been added to the site based on published selections from R.
C. Stevenson's materials. Unfortunately, the wordlists contain multiple gaps,
but nevertheless remain of some use.
02.01.2014. Two updates:
1) A wordlist for the Kidero dialect of the Dido
language, compiled and annotated by A. Kassian, (temporarily) completes the
Tsezic database.
2) Three more wordlists added for the Krongo-Kadugli
languages (Tulishi, Kanga, Tumtum), compiled and briefly annotated by G. Starostin based
exclusively on comparative wordlists by Thilo Schadeberg.
01.23.2014. The Nubian database is expanded with the first wordlist for a Hill
Nubian language, Dilling (Deleny), compiled
and annotated by G. Starostin based on D. Kauczor's grammatical description and
auxiliary sources.
01.09.2014. After some temporary trouble (relocation to a new server), the GLD is
back up and functioning as always, with the first two updates of the new year:
1) The Tsezic database is expanded with a list for the
Hinukh language, compiled and
annotated by A. Kassian;
2) A wordlist for Modern Icelandic added to the Germanic database by G. Starostin, based on
some bilingual dictionaries (and further tested on a random selection of
Internet sources for additional precision). In the process of comparing Modern
Icelandic forms with their Old Norse equivalents, a few mistakes have also been
corrected for the Old Norse list (some of them, with the assistance of I.
Sverdlov).
12.24.2013. A wordlist for the Kurux
language has been uploaded to the newly added database for the Northeastern
group of Dravidian languages. Compiled by G. Starostin on the basis of A.
Grignard's classic dictionary, compared with a recent SIL survey.
12.04.2013. Two updates:
1) The Hmong database has been expanded with a list
for Hmong Njua, compiled and
annotated by G. Starostin from Th. Lyman's dictionary.
2) The Tsezic database has been expanded with three
lists for Bezhta (Bezhta proper;
Khoshar-Khota; Tlyadal), compiled and annotated by A. Kassian based on a
variety of old and recent sources.
11.20.2013. Two updates:
1) A new database in the American section: the
Uto-Aztecan family (Takic group) is introduced with a wordlist for Cahuilla, compiled by M. Zhivlov on the
basis of a recent dictionary.
2) A new database in the Sino-Tibetan section: the Bai
cluster of dialects is introduced with a wordlist for Jianchuan Bai, compiled by G. Starostin on the basis of two
comparative sources.
10.30.2013. Two updates:
1) A new database in the North Caucasian section -
Tsezic, for now, with only one language (Hunzib),
soon to be expanded with more. Compiled and annotated by A. Kassian on the
basis of both recent and older lexicographic material.
2) The Krongo-Kadugli database has been expanded with
wordlists for the closely related Kadugli
(proper) and Miri languages.
Compiled and annotated by G. Starostin on the basis of field data published by
Thilo Schadeberg and other authors.
10.19.2013. A wordlist for Kenuzi Nubian
has been added to the Nubian database; this exhausts the list of all the
languages in the Nile-Nubian subgroup. Compiled by G. Starostin.
10.13.2013. Two updates:
1) The Athapaskan database has been expanded with a
new wordlist for the Tanacross
language, compiled by A. Kassian.
2) The Sino-Tibetan section of the site has been
expanded with a new database that contains five wordlists for three subdialects
of Northern Tujia (Tasha Tujia, Duogu Tujia,
Dianfang Tujia) and two subdialects
of Southern Tujia (Boluo Tujia, Tanxi Tujia), based on fieldwork
published in Chinese and European sources. Compiled by G. Starostin.
09.20.2013. Two updates:
1) Two more wordlists added to the Hmong database, for
the Chuanqiandian Hmong and Diandongbei Hmong dialects, spoken in
China. Compiled and annotated by G. Starostin on the basis of comparative
Hmong-Mien lexical data, published in 1987.
2) Three wordlists added for different varieties of
Miwok (Bodega Miwok, Central Sierra Miwok, Southern Sierra Miwok), based mostly on
dictionaries and grammatical descriptions from the 1960s-1970s. Compiled and
annotated by M. Zhivlov.
09.12.2013. A wordlist for the Khinalug
language, compiled by A. Kassian based on F. Ganieva's dictionary and older
sources, has been uploaded to the North Caucasian section of the site in its
own database.
08.14.2013. Two updates:
1) A wordlist for the Katcha language (Krongo-Kadugli group, of disputed affiliation) has
been added to the African section. Compiled and annotated by G. Starostin based
on the published fieldwork of Thilo Schadeberg and Roland Stevenson.
2) A wordlist for the Klon language (Bring dialect), based on a variety of new and old
sources, has been added by A. Kassian to the former Alor (now Alor-Pantar)
database in the small New Guinean section of the site.
07.31.2013. Six new wordlists for different varieties of the Pomoan languages (Kashaya; Northern, Central, Northeastern, Southeastern, and Southern)
have been added to the Pomo database by M. Zhivlov. The annotated lists are
mostly based on Robert L. Oswalt's publications, with some additional sources
also considered.
07.28.2013. It's been long in the making, but it's finally here: a reconstructed
Swadesh wordlist for Proto-Yeniseian,
compiled, annotated, and explained in detail by G. Starostin, based primarily
on S. A. Starostin's Proto-Yeniseian reconstruction, but with numerous
modifications through additional phonetic, semantic, and distributional
analysis of cognate forms.
07.25.2013. The Nubian database is expanded with a list for Dongolawi (Dongolese) Nubian, culled by G. Starostin from Charles
Armbruster's classic dictionary and cross-referenced with G. von Massenbach's
earlier data.
06.10.2013. Large update:
1) A wordlist for the Dogrib language, compiled by A. Kassian, has been added to the
Athapaskan database.
2) A wordlist for the Plains Miwok language, compiled by M. Zhivlov, has been added to
the Miwok database.
3) The Hmong database has been expanded with a new
wordlist for Qiandong Hmong,
compiled by G. Starostin.
4) The Dravidian language family makes its first
appearance in the GLD with a wordlist for Brahui,
compiled by G. Starostin based on Denis Bray's classic dictionary.
05.11.2013. Two updates:
1) The Athapaskan database has been expanded with new
wordlists for Central and Mentasta Ahtena dialects, compiled by
A. Kassian on the basis of James Kari's dictionaries and additional sources.
2) The Benue-Congo section of the site now has a
«Bantu-S» database with a wordlist for the Xhosa
language, re-edited by G. Starostin from an older version by Ye. Chekmeneva.
More Southern Bantu wordlists to be expected within the year.
04.28.2013. A wordlist for the moribund Konkow
language has been added to the Maidu database. Compiled by M. Zhivlov.
04.17.2013. The Burushaski database has been completed with a wordlist for Hunza Burushaski (the third Burushaski
dialect, Nagar, is not differentiated from Hunza on a lexicostatistical basis).
Compiled by G. Starostin, based on H. Berger's data. Some inaccuracies in the
Yasin wordlist corrected as well.
04.08.2013. A wordlist for modern Nobiin
(= Fadidja-Mahas) has been added to the Nubian database, compiled by G.
Starostin on the comparative basis of several recent and older sources.
03.28.2013. A wordlist for the nearly extinct Washo
language isolate (sometimes tentatively grouped with Hokan, but such an
affiliation is highly questionable) has been added to the American section of
the site. Compiled by M. Zhivlov, based on William H. Jacobsen's research.
03.27.2013. The recent conference on «Comparative-Historical Linguistics of the
XXIst Century», held at RSUH, Moscow, on March 20-22, featured presentations
from all the major contributors to the GLD project as well as numerous other
specialists in the field. The program, materials, and even videos of the
conference can be found on the «Meetings» page of the
«Tower of Babel» site.
03.26.2013. The Athapaskan database (renamed from the former Pacific Coast Athapaskan)
has been expanded to include wordlists for four different dialects of the Tanaina language, based on dictionaries
by J. Kari and other sources. Compiled by A. Kassian.
02.24.2013. Two more wordlists added to the Ekoid database, for the closely related
Ekparabong and Balep dialects (extracted from D. Crabb's comparative wordlist).
02.13.2013. The first list for a «Nilo-Saharan» language uploaded today: Old Nubian, with 75 Swadesh items
extracted from Gerald M. Browne's dictionary, opens the brand new Nubian
database. Compiled by G. Starostin.
01.29.2013. Two updates:
1) The Hokan section of the site has been expanded with
a database for the extinct Yana group, containing the wordlists for Northern Yana, Central Yana, and Yahi
dialects, documented by E. Sapir in the early 20th century. Compiled by M.
Zhivlov.
2) The Germanic database has been expanded with a
wordlist for Old Norse, compiled by
G. Starostin based on Cleasby's dictionary and cross-checked against earlier
lists.
01.03.2013. A wordlist uploaded for the Yasin
dialect of the Burushaski isolate, based on H. Berger's published materials
(compiled by G. Starostin).
12.19.2012. Two new lists uploaded today:
1) A wordlist for the extinct Shasta language (of the small and completely extinct Shastan
group), hypothetically belonging to the Hokan family; based on a selection of
sources dating mostly to the 1950s / 1960s. Compiled by M. Zhivlov.
2) A wordlist for the click language Hadza, an isolate of Tanzania
traditionally assigned to the "Khoisan" macrofamily, but without any
sufficient basis. Compiled by G. Starostin mostly on the basis of relatively
recent fieldwork by B. Sands, but adding data from numerous older sources as
well. With the addition of this wordlist, all of the "Khoisan"
languages / dialects for which sufficient amounts of data have been attested
are now properly represented in the GLD, without a single exception.
12.14.2012. A wordlist for the isolate Sandawe (possibly distantly related to the
Central Khoisan family), based on recent fieldwork publications and
supplemented with comparative data from several earlier sources, has been
uploaded.
12.12.2012. The Pacific Coast Athapaskan database has been expanded with a wordlist
for Taldash Galice, an extinct
dialect, data on which was collected by H. Hoijer and H. Landar from the last
living speaker in the 1960s-1970s.
12.04.2012. A small list for the extinct Kwadi
language uploaded in the Central Khoisan section. Unfortunately, only a little
over 50% of the entries could be filled in due to the extreme scarceness of
data; nevertheless, the list was still included due to the importance of this link
for Khoisan studies.
12.02.2012. Four lists altogether uploaded on this day — all of them, incidentally,
on languages that are no longer living:
1) The Pacific Coast Athapaskan database has been
expanded with a wordlist for the extinct Kato,
based mainly on P. E. Goddard's fieldwork.
2) A new database on the Chimariko isolate, with data mainly taken from E. Sapir's field
notes, added to the Hokan section.
3) The Yeniseian database is finally completed (except
for the proto-wordlist) with lists for the long-extinct Arin and Pumpokol (the
latter with some significant gaps due to scarceness of data), constructed from
available XVIIIth century sources.
11.17.2012. Two updates:
1) The Pacific Coast Athapaskan database has been
expanded with a wordlist for Mattole,
based primarily on Li Fang-kuei's description from the 1930s.
2) The Kalahari Khoe database has been expanded with a
wordlist for the Hiechware language,
based on S. S. Dornan's old description. This completes the database as far as
all attested dialects suitable for lexicostatistical analysis are concerned.
11.14.2012. The Lezgian database has finally been capped off with a wordlist for Proto-Lezgian, reconstructed based on
GLD standards by A. Kassian, with extensive notes justifying the details.
11.07.2012. Two updates:
1) The Yeniseian database is expanded with a wordlist
for the long-extinct Kott, compiled
mostly from M. Castrén's data, originally published in 1858, with the
addition of materials from even earlier sources.
2) The Yuman database has been expanded with wordlists
for Yavapai and Jamul Tiipay, compiled from several recent sources on these
languages.
10.11.2012. Large update to the former West Khoe database, now retitled
"Kalahari Khoe" and including seven more wordlists on minor Khoe
languages of Botswana: Cara, ǀXaise, Danisi, Ts'ixa, Deti, Kua, Tsua. All the data
have been extracted from publications based on fieldwork carried out by R.
Vossen in the 1980s.
09.26.2012. Another wordlist added
to the Ekoid Bantu database: this time, for the Ekajuk language, with limited information on dialectal variety.
09.25.2012. The Cocopa list has been added to the Yuman
database (Hokan family).
08.21.2012. Three more
wordlists added to the West Khoe database, for the ǂHaba, ǀGui,
and ǁGana languages of
Botswana.
08.09.2012. First two
wordlists added to the Coast Salish database: Upriver Halkomelem and Island
Halkomelem Salish, based on recent comprehensive dictionaries
for these dialects and personal communication with the authors. Both wordlists
have been compiled by Elena Barreiro, our newest contributor; more Salish data
are expected in the near future.
08.03.2012. After a month-long break, finally the next update: a wordlist for the
extinct Yugh dialect (closely
related to Ket), based on H. Werner's and earlier sources and including
comparative notes on Common Ket-Yugh. A few mistakes corrected in the proper
Ket section of the database as well.
06.30.2012. Last couple of updates for June:
1) A wordlist for the Lezgi language (Gyune dialect), along with comparative notes on
numerous other Lezgi dialects based on a variety of sources;
2) A wordlist for the Naro language (West Khoe subgroup), based on two dictionaries and
R. Vossen's comparative notes.
06.21.2012. Another Hmong-Mien update: a list for Hmong Daw (White Hmong) has been added, based on E. Heimbach's
detailed dictionary of this language.
06.09.2012. A new list added for the Tol
(Eastern Jicaque) language (Jicaquean group, possibly of the Hokan family).
06.07.2012. Another update in the Lezgian group database: this time, with two
wordlists for two different dialects of the Tabasaran language (Northern and Southern), based on a variety of
old and recent sources.
New feature: The «Language
Comparison» option on the main page of the site («Lists for specific
languages») has been significantly upgraded. It is now possible not only to
view any two different wordlists for any two languages side by side, but also
to highlight phonetically similar
forms between them (similarity is determined by the same algorithm that is used for
the «objectively generated tree of lexical similarity»). This is
particularly useful for determining the quantitative differences between the
numbers of accidental look-alikes on lexicostatistical lists and the average
numbers of true cognates that still preserve archaic phonological shapes.
06.01.2012. Another Khoisan update: new list uploaded for the Kxoe (Khwe) language (Central Khoisan family), based on a recent
dictionary by Christa Kilian-Hatz and older works by O. Köhler.
05.26.2012. A new list added for the Highland
Oaxaca Chontal language (Tequistlatecan group, possibly of the Hokan
family).
05.12.2012. Two new lists added to the Ekoid database: Nkum and Nnam.
05.06.2012. The Lezgian group database has been expanded with five wordlists for
five different dialects of the Aghul
language (Keren, Koshan, Gequn, Fite, and Aghul proper), based on a variety of
old and recent sources.
05.02.2012. Uploaded a list for the extinct Khoekhoe language !Ora, drawn from two short vocabularies published in 1920 and 1930.
This completes the Khoekhoe database, since not enough data are available on
the remaining extinct members of the group to perform proper lexicostatistics.
04.29.2012. A list for the Gothic
language, compiled with ample references not only to dictionaries, but to the
existing text corpus as well (Ulfilas' Bible), initiates the new database for
Germanic languages.
04.10.2012. A big day for updates: [1] The Lezgian group database has been
expanded with three wordlists for three different dialects of Rutul (Mukhad, Ixrek, and Luchek),
based, as usual, on a mix of old and recent sources. A few updates to other
Lezgian wordlists have also been made.
[2] A list for Maidu
(Maiduan group, Penutian family) has been compiled, based on F. Shipley's
dictionary (1963).
[3] The Khoekhoe lexicostatistical database is
initiated with a wordlist for Nama
(Khoekhoegowab), compiled from the recent highly informative dictionary by W.
Haacke & E. Eiseb (with references to the older Krönlein-Rust
dictionary as well).
[4] Finally, the Sinitic database has been expanded
with a list for Standard Chinese (=
Pǔtōnghuà or Standard Mandarin). With the aid of the accompanying
comments, it is now possible to trace in detail, by following the database,
the evolution of the Chinese basic lexicon from Early Zhou (XI-VIIIth
centuries BC) to modern times.
03.26.2012. Two updates: [1] The Taa database (Peripheral Khoisan family) is
completed (except for the proto-list) with a wordlist for Nǀuǁen (Nǀusan), extracted, like the list for Kakia,
from D. F. Bleek's semi-reliable materials. It is the third and last dialect of
Taa for which enough data exist to make it suitable for lexicostatistical
purposes.
[2] The Hmong-Mien languages make their first
appearance on the GLD site with a list for Xiangxi
Hmong, based on cross-examination of data from one general and one
comparative lexicographical source. New wordlists for other varieties of Hmong
may be expected before the year is out.
03.13.2012. The Taa database
(Peripheral Khoisan family) is expanded with a wordlist for Kakia (Masarwa), extracted from D. F.
Bleek's materials: not a very reliable source, but the only one that exists for
this presumably extinct dialect.
03.05.2012. Two new lists added to the Ekoid database: Nselle and Nta (dialects
of the Nde-Nselle-Nta cluster; lexicostatistics based on available data shows
practically no lexical discrepancies between the three).
New feature: The "Build a tree" procedure on the site now
includes the option "Show lexicostatistical matrix", which yields all
cognacy percentages between the languages in the database in the form of a
standard table (which can be easily copy-pasted into a document).
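For readers who want to see what such a matrix amounts to, here is a minimal sketch in Python. It is not the GLD's own code: the function names, the input layout (one cognate class number per Swadesh meaning), and the three toy languages are all assumptions made for illustration, and the way the site itself treats gaps, synonyms, or borrowings is not described in this note.

```python
from itertools import combinations


def cognacy_matrix(wordlists):
    """Pairwise cognacy percentages.

    wordlists: {language: {meaning: cognate_class_id}} (hypothetical layout).
    Only meanings attested in both lists are compared; a pair with no
    shared meanings is left as None.
    """
    langs = sorted(wordlists)
    matrix = {a: {b: (100.0 if a == b else None) for b in langs} for a in langs}
    for a, b in combinations(langs, 2):
        shared = [m for m in wordlists[a] if m in wordlists[b]]
        if not shared:
            continue
        matches = sum(wordlists[a][m] == wordlists[b][m] for m in shared)
        pct = round(100.0 * matches / len(shared), 1)
        matrix[a][b] = matrix[b][a] = pct
    return langs, matrix


def format_table(langs, matrix):
    """Render the matrix as tab-separated text, easy to paste into a document."""
    rows = ["\t".join([""] + langs)]
    for a in langs:
        cells = ["-" if matrix[a][b] is None else str(matrix[a][b]) for b in langs]
        rows.append("\t".join([a] + cells))
    return "\n".join(rows)


# Toy data: invented languages and cognate class numbers, not GLD material.
toy = {
    "LangA": {"eye": 1, "fire": 1, "water": 1, "stone": 1},
    "LangB": {"eye": 1, "fire": 1, "water": 2, "stone": 1},
    "LangC": {"eye": 2, "fire": 1, "water": 2},
}

langs, matrix = cognacy_matrix(toy)
print(format_table(langs, matrix))
```

Restricting each comparison to meanings present in both lists is one simple way to cope with incomplete wordlists; other conventions are possible.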
02.29.2012. The Lezgian group database has been expanded with three wordlists for
three different dialects of the Tsakhur
language (Mishlesh, Mikik, and Gelmets), based on a variety of old and recent
sources. A few updates to the Budukh wordlist as well.
02.28.2012. The database for
Taa (one of the two branches of South Khoisan, along with !Kwi) is initiated
with a wordlist for !Xóõ,
the only surviving member of this branch, extracted from the extensive
dictionary by Anthony Traill and properly annotated.
02.23.2012. Two new wordlists added to the West Caucasian database: one for the
«literary» Abzhuwa dialect of Abkhaz
and a different one for the Bzyb
dialect, although no lexical discrepancies in the 100-wordlist have been
elicited (there are, however, significant phonetic differences between the two
dialects).
02.21.2012. A new Hokan list uploaded, this time, for the moribund (if not already
extinct) Eastern Pomo language of
the Pomo group, based on data published by Sally McLendon.
02.13.2012. The !Kwi group (Peripheral Khoisan family) database has been expanded
with a wordlist for the extinct language ǀHaasi (based on one single known
recording, made by R. Story in 1937).
02.05.2012. New wordlist edited and uploaded for the Ket language, the only survivor of the Yeniseian family, based on
data from G. Werner's dictionaries, with references to M. Castrén's
earlier data from the XIXth century.
01.19.2012. The !Kwi group (Peripheral Khoisan family) database has been expanded
with a wordlist for the extinct language ǀʼAuni (based on data collected
by D. F. Bleek in 1911 and 1936).
01.17.2012. The Lezgian group database has been expanded with a wordlist for the Budukh language.
01.13.2012. A new wordlist edited and uploaded for the extinct Abipon (Guaicuruan group, presumably
Mataco-Guaicuruan family), based on two vocabularies compiled in the late
XVIIIth century.
01.07.2012. The !Kwi group (Peripheral Khoisan family) database has been expanded
with a wordlist for the extinct language ǁXegwi (based on data mostly
recorded in the 1950s).
01.02.2012. Two new databases
added: (1) A wordlist for Seri (Seri
group, presumably Hokan family); (2) Five new wordlists for the Lezgian group
(North Caucasian family): Udi (Nidzh and Vartashen dialects), along with additional notes on Common Udi; Archi; Kryts (two dialects - Kryts «proper» and Alyk).
12.31.2011. We now have a
Facebook group for The
Global Lexicostatistical Database. Please join for quicker updates!
12.31.2011. The Sinitic group
(Sino-Tibetan family) wordlists have been
expanded by a list constructed for Late
Middle Chinese on the basis of the (semi-)vernacular document, The Record
of Linji (≈ IX-X centuries A.D.).
11.10.2011. The Ekoid group (Benue-Congo family)
database has been expanded with lists from two more languages: Efutop and Nde.
10.31.2011. A list for the ancient extinct Hurrian language (Hurro-Urartian group
and family) has been uploaded (unfortunately, only 66 out of 110 Swadesh
meanings are recoverable from known sources).
10.24.2011. Lists for Abé and Abidji (Agneby group,
Kwa family) have been uploaded in all formats.
10.20.2011. After more than a
year in the making, the GLD finally goes public — with 67 different annotated Swadesh
wordlists and 2 reconstructed proto-wordlists from 29 language groups of
Eurasia, Africa, America, and Papua. New updates coming soon!
© 2011-2014 George Starostin (site design, data input coordination)
© 2011-2014 Phil Krylov (programming, technical support)