Humanist Discussion Group

Humanist Archives: March 26, 2020, 9:37 a.m. Humanist 33.695 - new tools for (Greek) corpus linguistics

                  Humanist Discussion Group, Vol. 33, No. 695.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                Submit to: humanist@dhhumanist.org

        Date: 2020-03-25 11:39:31+00:00
        From: Alek Keersmaekers 
        Subject: New computational tools for Greek corpus linguistics

Dear members of this list,

I'm excited to announce some new computational tools for Ancient Greek
corpus linguistics:

- First of all, the Duke papyrus texts
(https://github.com/alekkeersmaekers/duke-nlp) are now automatically
annotated not only for lemmas and morphology but also for syntax and
semantic roles, making this the largest diachronic treebank for
Ancient Greek so far (about 4.5 million tokens). The accuracy for syntax
and semantics (about 85-90% and 81% respectively for letters) is lower
than for morphology and lemmatization, but still decent enough to be
used in linguistic research.

- DendroSearch (https://github.com/alekkeersmaekers/dendrosearch), a
user-friendly query tool for Greek treebanks, including all treebank
material that is available to date (if your treebank is still missing,
please let me know!)

- An automatic semantic role labeler
(https://github.com/alekkeersmaekers/PRL), using the roles of the
Pedalion grammar created at the University of Leuven
(http://en.pedalion.org/). It also includes an animacy lexicon, partly
based on the animacy lexicon of the PROIEL project (many thanks to Dag
Haug!), and distributional word vectors for Greek lemmas.

None of this would be possible without the painstaking work of the
Ancient Greek treebanking community, so many thanks to the people of the
PROIEL, AGDT and Sematia projects, Vanessa Gorman, J.M. Harrington and
his team, Polina Yordanova, and the job students involved in the
Pedalion treebanks!

All the best,

Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php

Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)