The Global Lexicostatistical Database: Plans
PLANS
The Global Lexicostatistical Database is a
long-term project with a limited number of specialists and resources at its
disposal; this makes it impossible to implement all the desired features
at the very start, or even to come up with a definitive list of such
features. Nevertheless, the GLD is currently updated on at least a bi-weekly
basis, and we hope that the project will gradually pick up speed as time goes
by. Besides what is already there, users should eventually expect the following
additions:
1. More lists — no dazzling interface or
complex analytical machinery is worth anything without lots and lots of
accompanying data! Swadesh wordlists are being added to the overall database
all the time, and, if you are willing to master the format, you can throw in
your own additions as well (see Collaboration
for further details).
It is also quite
possible that at least some databases will later be expanded to include
«complete» Swadesh 200-item lists, and perhaps even move beyond that limit.
In particular, work has recently begun on the construction and testing of an
expanded 400-item list that not only adds more data, but also makes it possible
to take into account «trivial» semantic shifts (empirically attested
as polysemies or reconstructed beyond a reasonable doubt in low-level families)
between items, so as to arrive at more complex models of lexicostatistical classification.
Some preliminary results of that research are expected to appear on the
website relatively soon.
2. User-defined versions — possibility for
registered users to build their own «unauthorised» copies of the uploaded
Swadesh wordlists online, adding their own notes and modifying cognation
indexes if they do not agree with the original etymological judgements or wish
to test different hypothetical configurations. This will make the GLD a valuable
working tool for all historical-comparative linguists who use
lexicostatistics. However, this particular feature will require quite some time
to implement. In the meantime, it is always possible to work with the databases
in offline mode, by using StarLing.
3. More tree-building options — even more
parameters will be introduced into the tree-building algorithm,
including different variants of the standard glottochronological formula, error
margin indications, and possible incorporation of character-based rather than
distance-based algorithms.
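For readers unfamiliar with the «standard glottochronological formula» mentioned above, here is a minimal sketch of its classic form, t = ln(c) / (2 ln(r)), where c is the share of cognate items between two wordlists and r is the assumed retention rate per millennium. This only illustrates the general idea and is not the GLD or StarLing implementation: the function names, the dictionary representation of cognation indexes, and the default r = 0.86 (the value conventionally cited for the 100-item list) are assumptions made for the example.

```python
import math


def shared_cognate_fraction(list_a, list_b):
    """Share of meanings whose cognation indexes match in both wordlists.

    Each wordlist is a hypothetical dict mapping a Swadesh meaning to an
    integer cognation index; only meanings present in both are compared.
    """
    common = [meaning for meaning in list_a if meaning in list_b]
    if not common:
        raise ValueError("the two wordlists share no meanings")
    matches = sum(1 for meaning in common if list_a[meaning] == list_b[meaning])
    return matches / len(common)


def divergence_time(shared_fraction, retention_rate=0.86):
    """Classic estimate t = ln(c) / (2 * ln(r)), in millennia.

    retention_rate is an assumed constant, not a value taken from the GLD.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention_rate < 1.0:
        raise ValueError("retention_rate must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention_rate))


if __name__ == "__main__":
    # Toy data: three meanings, two of which are marked as cognate.
    lang_a = {"hand": 1, "eye": 2, "water": 3}
    lang_b = {"hand": 1, "eye": 5, "water": 3}
    c = shared_cognate_fraction(lang_a, lang_b)
    print(f"shared fraction: {c:.3f}")
    print(f"estimated separation: {divergence_time(c):.2f} millennia")
```

The variants mentioned in the list above would presumably differ in how they treat the retention rate or which items they count, so a real implementation would need more parameters than this sketch exposes.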
© 2011-2016 George Starostin (site design,
data input coordination)
© 2011-2016 Phil Krylov (programming,
technical support)