
Humanist Discussion Group



Humanist Archives: Sept. 26, 2019, 3:58 a.m. Humanist 33.275 - randomising: examples

                  Humanist Discussion Group, Vol. 33, No. 275.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org




        Date: 2019-09-25 13:17:49+00:00
        From: David Hoover 
        Subject: Re: [Humanist] 33.272: randomising: examples

Dear Willard,

Geoffrey's mention of word clouds leads me to add some additional thoughts
on randomization. First, I think the randomization Geoffrey mentions may be
a bug rather than a feature of word clouds. The fact that words with the
same frequency can occur with different orientations and different colors
means that it is unlikely that anyone looking at a word cloud can tell
which words occur at the same frequency. In word clouds, beauty (which has
its value) has been prioritized over interpretability.

A very early use of randomization can be found in the work of the great
progenitor, John F. Burrows: "Not Unless You Ask Nicely: The Interpretative
Nexus Between Analysis and Information," Literary and Linguistic Computing,
Vol. 7, No. 2, 1992, where he uses random selection of sections of texts to
equalize the influence of authors.
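
As a rough illustration of the general idea of random selection of
sections (not a reconstruction of Burrows's actual procedure), the sketch
below draws the same number of sections at random from each author, so
that no author dominates the sample merely by having more text. The
function name, the data layout, and the seed parameter are all
assumptions made for the example.

```python
import random

def sample_sections(sections_by_author, per_author, seed=None):
    """Pick `per_author` sections at random from every author.

    sections_by_author: dict mapping an author name to a list of text
    sections (hypothetical layout, chosen for this example).
    seed: optional integer; passing the same seed repeats the same draw.
    """
    rng = random.Random(seed)  # private generator, leaves global state alone
    return {
        author: rng.sample(sections, per_author)  # sampling without replacement
        for author, sections in sections_by_author.items()
    }

# Invented placeholder data: one author has far more sections than the other.
corpus = {
    "Author A": [f"A section {i}" for i in range(50)],
    "Author B": [f"B section {i}" for i in range(8)],
}
balanced = sample_sections(corpus, per_author=5, seed=42)
```

Each author now contributes five sections. Note that per_author must not
exceed the number of sections available for the smallest author, and that
omitting the seed gives a different selection on each run.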

Stylo/R implements randomization in many places, for example in Rolling
Classification. See Eder, M., Rybicki, J. and Kestemont, M. (2016).
Stylometry with R: a package for computational text analysis. "R
Journal", 8(1): 107-121.

Finally, one of the most common uses of randomization these days is in
topic modeling. Mallet uses a random seed in text sampling, which results
in the variability in topic models that often disconcerts new users of
Mallet. My experience is that running Mallet twice without setting a random
seed results in two models in which it is normal for the average match
between the most similar topics in the two models to be 30% or less. That
is, on average, 30% or fewer of the words in each pair of most similar
topics in the two models are the same. (A relatively crude comparison
function is implemented in my Prepare and Visualize Mallet Topics
spreadsheet, available at https://wp.nyu.edu/exceltextanalysis/.)
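
For readers who want to try this kind of comparison outside a spreadsheet,
here is a minimal sketch of one way to compute an average match between
the most similar topics of two models. It is not the function in the
spreadsheet mentioned above; it assumes each topic is simply a list of its
top words, and the function names and the two tiny example models are
invented for illustration.

```python
def overlap(topic_a, topic_b):
    """Share of the words in topic_a that also occur in topic_b."""
    words_a, words_b = set(topic_a), set(topic_b)
    return len(words_a & words_b) / len(words_a)

def average_best_match(model_a, model_b):
    """For each topic in model_a, find its most similar topic in model_b
    (highest word overlap), then average those best overlaps."""
    best = [max(overlap(t, u) for u in model_b) for t in model_a]
    return sum(best) / len(best)

# Invented example: two "runs", each with two topics of four top words.
run_1 = [["whale", "ship", "sea", "captain"],
         ["love", "heart", "letter", "marriage"]]
run_2 = [["sea", "ship", "storm", "boat"],
         ["heart", "love", "marriage", "house"]]

score = average_best_match(run_1, run_2)
print(f"{score:.1%}")  # prints 62.5%
```

In the example, the first pair of topics shares 2 of 4 words and the
second shares 3 of 4, so the average match is 62.5%. The measure is
asymmetric (it matches each topic of the first model against the second),
which is one of several reasonable choices. To remove the run-to-run
variability altogether, Mallet's train-topics command can be given a fixed
value through its --random-seed option.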

Best,
David
--
David L. Hoover, Professor of English  NYU Eng. Dept. 212-998-8832
https://wp.nyu.edu/davidlhoover/

Adolph slid back into the thicket and lay down behind a fallen log to
see what would happen. Not much ever happened to him but weather.
--Willa Cather


On Tue, Sep 24, 2019 at 10:33 PM Humanist  wrote:

>                   Humanist Discussion Group, Vol. 33, No. 272.
>             Department of Digital Humanities, King's College London
>                    Hosted by King's Digital Lab
>
>                 Submit to: humanist@dhhumanist.org
>
>
>     [1]    From: David Hoover 
>            Subject: Re: [Humanist] 33.270: randomising? (56)
>
>     [2]    From: Geoffrey Rockwell 
>            Subject: Re: [Humanist] 33.270: randomising? (13)
>
>
>
> --[1]------------------------------------------------------------------------
>         Date: 2019-09-24 12:43:48+00:00
>         From: David Hoover 
>         Subject: Re: [Humanist] 33.270: randomising?
>
> Willard,
>
> If you don't mind a relatively micro-example, see my "The Microanalysis of
> Style Variation," Digital Scholarship in the Humanities, Volume 32, Issue
> suppl_2, December 2017, Pages ii17-ii30
>
>
> https://doi.org/10.1093/llc/fqx022
>
> In that article, I argue for the possible uses and dangers of randomizing
> parts of literary texts as a way of smoothing out unwanted effects of
> variations in style.
>
> Best,
> David Hoover
> --
>  David L. Hoover, Professor of English  NYU Eng. Dept. 212-998-8832
>  https://wp.nyu.edu/davidlhoover/
>
> Adolph slid back into the thicket and lay down behind a fallen log to
> see what would happen. Not much ever happened to him but weather.
> --Willa Cather
>
>
> On Mon, Sep 23, 2019 at 11:48 PM Humanist  wrote:
>
> >                   Humanist Discussion Group, Vol. 33, No. 270.
> >             Department of Digital Humanities, King's College London
> >                    Hosted by King's Digital Lab
> >                 Submit to: humanist@dhhumanist.org
> >
> >
> >
> >
> >         Date: 2019-09-24 03:32:55+00:00
> >         From: Willard McCarty 
> >         Subject: randomising
> >
> > I would be very grateful for examples of computing work in the
> > humanities or human sciences that makes use of the machine's
> > potential for randomisation, for generating results with a significant
> > degree of unpredictability -- 'chaotic' results, if you will. This
> > potential was designed in from the beginning, insofar as conditional
> > branching and overwriting of instructions cannot be foreseen because
> > they may depend on the results of previous calculations, or esp on
> > inputs from the world.
> >
> > Many thanks for any suggestions.
> >
> > Yours,
> > WM
> > --
> > Willard McCarty
> > Professor emeritus, Department of Digital Humanities, King's College
> > London; Editor, Interdisciplinary Science Reviews
> > and Humanist
>
>
>
> --[2]------------------------------------------------------------------------
>         Date: 2019-09-24 04:13:07+00:00
>         From: Geoffrey Rockwell 
>         Subject: Re: [Humanist] 33.270: randomising?
>
> Dear Willard,
>
> I don't think this is what you had in mind, but many word cloud tools will
> randomly assign colours from a palette to the words rendered and randomly
> alter the orientation of the words (horizontal or vertical) when generating
> the visualization. Typically it is only the size of the word and location
> in the cloud that is based on a measurement of the text.
>
> Yours,
>
> Geoffrey Rockwell


_______________________________________________
Unsubscribe at: http://dhhumanist.org/Restricted
List posts to: humanist@dhhumanist.org
List info and archives at: http://dhhumanist.org
Listmember interface at: http://dhhumanist.org/Restricted/
Subscribe at: http://dhhumanist.org/membership_form.php


Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)

This site is maintained under a service level agreement by King's Digital Lab.