
Humanist Discussion Group



Humanist Archives: March 4, 2019, 8:01 a.m. Humanist 32.507 - a text editor for "entexutualizing" oral discourse

                  Humanist Discussion Group, Vol. 32, No. 507.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org




        Date: 2019-03-03 11:32:26+00:00
        From: Iian Neill 
        Subject: Re: [Humanist] 32.502: a text editor for "entexutualizing" oral discourse?

Dear Catharine,

I don't know if this will be of interest to you, but I have been working
with Andreas Kuczera (University of Mainz) on a web-based solution, called
'Codex', for managing texts and standoff property annotations within a
graph database. A real-time standoff property editor gives the user the
ability to create overlapping annotations within a text editor-style
interface. The user can also link to, or create in an ad hoc manner, any
entities mentioned in the text and encode the references as annotations.
The Google NLP API has also been integrated into the system to generate
syntax annotations (Parts of Speech, sentences, sentiments, morphology)
which can be saved with the other, human-edited annotations. Given your
requirements it doesn't sound like the language services offered by the
Google API would be useful to you, but in principle any service can be used
whose output can be converted into standoff properties (annotations with
character offsets).
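To make the idea of standoff properties concrete, here is a minimal sketch in Python. It is not the Codex data model: the class name, field names, and the token format fed to the converter are all assumptions made for illustration. It shows two annotations that overlap without nesting, and how the output of some language service could be turned into annotations with character offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandoffProperty:
    """An annotation stored apart from the text (hypothetical shape)."""
    type: str
    start: int        # offset of the first character covered
    end: int          # offset one past the last character covered
    value: str = ""


def covered_text(text, prop):
    """Return the stretch of text that an annotation points at."""
    return text[prop.start:prop.end]


def overlaps(a, b):
    """True if two annotations share at least one character."""
    return a.start < b.end and b.start < a.end


def from_tokens(tokens):
    """Convert service output into standoff properties.

    Assumes each token is a dict with 'text', 'offset' and 'tag' keys.
    That shape is invented here; a real service will need its own mapping.
    """
    return [
        StandoffProperty("pos", t["offset"], t["offset"] + len(t["text"]), t["tag"])
        for t in tokens
    ]


text = "The old bard sang of the sea"

# Two annotations that overlap but do not nest: the text itself is untouched.
line = StandoffProperty("line", 0, 12)          # "The old bard"
emphasis = StandoffProperty("emphasis", 8, 17)  # "bard sang"

pos = from_tokens([
    {"text": "The", "offset": 0, "tag": "DET"},
    {"text": "old", "offset": 4, "tag": "ADJ"},
])
```

Because the annotations only refer to the text by offset, machine-generated and human-edited annotations can be kept in the same list and saved together.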

Another characteristic that may interest you is that the editor supports
not only overlapping annotations, but single-character annotations and even
inter-character (zero-width) annotations, along with 'meta-data'
annotations that apply to the whole text (in the manner of a TEI-XML
header). Speaking of TEI, although 'Codex' uses standoff properties and not
XML, many TEI elements are supported by the system as annotation types.
After all, if you remove the 'XML' from 'TEI-XML' you essentially have a
collection of annotation ontologies. (I understand this is not true of all
the TEI elements, like 'choice' and 'subst', but I think it is probably
true of most.)
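The different annotation extents can be sketched the same way. Again, this is only an illustration under assumed conventions, not how Codex stores anything: here a zero-width annotation has equal start and end offsets, and a whole-text 'meta-data' annotation has no offsets at all. The type names are examples only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record; offsets are None for metadata."""
    type: str
    start: Optional[int] = None
    end: Optional[int] = None
    value: str = ""


def kind(ann):
    """Classify an annotation by the extent of text it covers."""
    if ann.start is None or ann.end is None:
        return "metadata"            # applies to the whole text
    length = ann.end - ann.start
    if length == 0:
        return "zero-width"          # sits between two characters
    if length == 1:
        return "single-character"
    return "span"


text = "and then he sang"

annotations = [
    Annotation("title", value="Sample transcription"),  # whole text
    Annotation("pause", 8, 8, value="short"),           # after "then"
    Annotation("lengthened", 13, 14),                   # the "a" in "sang"
    Annotation("emphasis", 4, 8),                       # "then"
]

kinds = [kind(a) for a in annotations]
```

A zero-width annotation such as the pause above marks a point in the discourse without claiming any character of the transcription.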

If you're interested I can send you a pre-press article on the system, or
we could touch base over video for a demo.

The project is entirely open source and hosted on GitHub and BitBucket.

Warm regards,
Iian Neill

Visiting Researcher, Digital Academy of Mainz


On Sun, 3 Mar 2019 at 17:58, Humanist wrote:

>                   Humanist Discussion Group, Vol. 32, No. 502.
>             Department of Digital Humanities, King's College London
>                    Hosted by King's Digital Lab
>                        www.dhhumanist.org
>                 Submit to: humanist@dhhumanist.org
>
>
>
>
>         Date: 2019-03-02 15:45:46+00:00
>         From: Catharine Mason 
>         Subject: Creating visual representation of stylized oral discourse
>
> Dear Willard and all Members of the Humanist Discussion Group,
>
> I am seeking any and all advice in the creation of a text editor for
> "entextualizing" oral discourse. I am especially interested in questions
> of formatting and the markup of stylistic features of textual form and
> poetic functions and the like, most of which would be integrated in
> annotations.
>
> Working with a group of scholars in various cross disciplines such as
> sociolinguistics and anthropological linguistics, I have studied a wide
> body of encoded and annotated transcriptions by specialists in fields such
> as ethnopoetics, discourse analysis, and folklore and oral tradition more
> broadly. My purpose has been to identify standards in the study and visual
> representation of vocal and verbal arts, and to systematize inquiries as
> well as variables in formatting choices. Much of what we do focuses on
> relationships between text segments and also social contextual indexing of
> meanings, so annotations ("metadata") are a central part of what we are
> gathering.
>
> Many of these questions arose in the late 1950s, and were formulated and
> debated throughout the 1990s. But despite the rise in interest in
> indigenous languages, very little funding is made available to anything
> other than documenting lexical, grammatical and morpho-syntactic data. The
> focus of my study is on the stylistics of social practices of language. I
> have been doing this for 10 years via non-profits founded in France and in
> the US (some info may be found on vovarts.org).
>
> We had always assumed that we would program our text editor using TEI, but
> I am afraid that this might greatly limit the number of people that could
> potentially provide very valuable data and metadata, namely insights into
> the deeper cultural meanings of spoken word performance.
>
> I am next to ignorant in questions about metalanguages and templates and
> would be deeply grateful to any guidance that specialists might provide. We
> are also looking to expand our team and clearly need someone versed either
> in TEI, if we take that leap, or in another solution that might allow our potentially
> very large user base to participate in the collection process. For the
> moment, we are operating with zero funding, but we are also exploring the
> possibility of a startup as a parallel for-profit venture to finance the
> archive. If our startup succeeds, the project could expand into something
> quite beyond our original purpose and we will want to invest in sustainable,
> human-centered technology, of course.
>
> Thanks for any and all consideration!
> --
> Catharine Mason, PhD
> Research Professor English and Linguistic Ethnography
> Université de Caen Normandie
> UFR LVE (Modern Languages Department)
> 14032 CAEN Cedex
> France
>
> Laboratoire CRISCO
>
> http://crisco.unicaen.fr/membres/catharine-mason-919741.kjsp?RH=1536071353594
> https://unicaen.academia.edu/CatharineMason
>
> President,
> Association VOVA France
> VOVA, Inc.
> www.vovarts.org
> https://www.facebook.com/CrossingLanguageBorders
>
> Board Member,
> Pays des Miroirs Productions
> http://www.paysdesmiroirs.com/
> Civ.Works
> https://civ.works



_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php


Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)

This site is maintained under a service level agreement by King's Digital Lab.