Humanist Discussion Group, Vol. 33, No. 275.
Department of Digital Humanities, King's College London
Hosted by King's Digital Lab
www.dhhumanist.org
Submit to: humanist@dhhumanist.org

Date: 2019-09-25 13:17:49+00:00
From: David Hoover
Subject: Re: [Humanist] 33.272: randomising: examples

Dear Willard,

Geoffrey's mention of word clouds leads me to add some further thoughts on randomization.

First, I think the randomization Geoffrey mentions may be a bug rather than a feature of word clouds. Because words with the same frequency can appear in different orientations and different colors, it is unlikely that anyone looking at a word cloud can tell which words occur at the same frequency. In word clouds, beauty (which has its value) has been prioritized over interpretability.

A very early use of randomization can be found in the great progenitor John F. Burrows's "Not Unless You Ask Nicely: The Interpretative Nexus Between Analysis and Information," Literary and Linguistic Computing, Vol. 7, No. 2, 1992, where he uses random selection of sections of texts to equalize the influence of authors.

Stylo/R implements randomization in many places, for example in Rolling Classification. See Eder, M., Rybicki, J. and Kestemont, M. (2016). Stylometry with R: a package for computational text analysis. "R Journal", 8(1): 107-121.

Finally, one of the most common uses of randomization these days is in topic modeling. Mallet uses a random seed in text sampling, which results in the variability in topic models that often disconcerts new users of Mallet. My experience is that running Mallet twice without setting a random seed produces two models in which the average match between the most similar topics is normally 30% or less. That is, on average, 30% or fewer of the words in each pair of most similar topics in the two models are the same.
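One way to make the "average match" figure concrete is the minimal Python sketch below. It is an illustration under stated assumptions, not the function in the spreadsheet mentioned in this message: it assumes each topic is represented by a list of its top words, and the names `overlap` and `average_best_match` and the two toy "runs" are hypothetical.

```python
from typing import Sequence


def overlap(topic_a: Sequence[str], topic_b: Sequence[str]) -> float:
    """Fraction of topic_a's distinct words that also appear in topic_b."""
    words_a, words_b = set(topic_a), set(topic_b)
    if not words_a:
        return 0.0
    return len(words_a & words_b) / len(words_a)


def average_best_match(
    model_a: Sequence[Sequence[str]], model_b: Sequence[Sequence[str]]
) -> float:
    """For each topic in model_a, take its best overlap with any topic in
    model_b, then average those best scores (an assumed, crude measure)."""
    if not model_a or not model_b:
        return 0.0
    best_scores = [
        max(overlap(topic_a, topic_b) for topic_b in model_b)
        for topic_a in model_a
    ]
    return sum(best_scores) / len(best_scores)


# Hypothetical top-word lists standing in for two runs on the same corpus.
run_1 = [
    ["whale", "ship", "sea", "captain"],
    ["love", "heart", "letter", "marriage"],
]
run_2 = [
    ["sea", "ship", "storm", "island"],
    ["heart", "love", "death", "night"],
]

print(average_best_match(run_1, run_2))  # 0.5
```

Note that this greedy pairing lets two topics in the first run match the same topic in the second, and the score depends on how many top words per topic are compared, so it should be read as a rough indicator only.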
(A relatively crude comparison function is implemented in my Prepare and Visualize Mallet Topics spreadsheet, available at https://wp.nyu.edu/exceltextanalysis/.)

Best,
David

--
David L. Hoover, Professor of English
NYU Eng. Dept. 212-998-8832
https://wp.nyu.edu/davidlhoover/

Adolph slid back into the thicket and lay down behind a fallen log to
see what would happen. Not much ever happened to him but weather.
--Willa Cather

On Tue, Sep 24, 2019 at 10:33 PM Humanist wrote:

> Humanist Discussion Group, Vol. 33, No. 272.
> Department of Digital Humanities, King's College London
> Hosted by King's Digital Lab
>
> Submit to: humanist@dhhumanist.org
>
>
> [1] From: David Hoover
>     Subject: Re: [Humanist] 33.270: randomising? (56)
>
> [2] From: Geoffrey Rockwell
>     Subject: Re: [Humanist] 33.270: randomising? (13)
>
>
> --[1]------------------------------------------------------------------------
> Date: 2019-09-24 12:43:48+00:00
> From: David Hoover
> Subject: Re: [Humanist] 33.270: randomising?
>
> Willard,
>
> If you don't mind a relatively micro-example, see my "The Microanalysis of
> Style Variation," Digital Scholarship in the Humanities, Volume 32, Issue
> suppl_2, December 2017, Pages ii17-ii30
>
> https://doi.org/10.1093/llc/fqx022
>
> In that article, I argue for the possible uses and dangers of randomizing
> parts of literary texts as a way of smoothing out unwanted effects of
> variations in style.
>
> Best,
> David Hoover
> --
> David L. Hoover, Professor of English
> NYU Eng. Dept. 212-998-8832
> https://wp.nyu.edu/davidlhoover/
>
> Adolph slid back into the thicket and lay down behind a fallen log to
> see what would happen. Not much ever happened to him but weather.
> --Willa Cather
>
>
> On Mon, Sep 23, 2019 at 11:48 PM Humanist wrote:
>
> > Humanist Discussion Group, Vol. 33, No. 270.
> > Department of Digital Humanities, King's College London
> > Hosted by King's Digital Lab
> > Submit to: humanist@dhhumanist.org
> >
> >
> > Date: 2019-09-24 03:32:55+00:00
> > From: Willard McCarty
> > Subject: randomising
> >
> > I would be very grateful for examples of computing work in the
> > humanities or human sciences that makes use of the machine's
> > potential for randomisation, for generating results with a significant
> > degree of unpredictability -- 'chaotic' results, if you will. This
> > potential was designed in from the beginning, insofar as conditional
> > branching and overwriting of instructions cannot be foreseen because
> > they may depend on the results of previous calculations, or esp on
> > inputs from the world.
> >
> > Many thanks for any suggestions.
> >
> > Yours,
> > WM
> > --
> > Willard McCarty
> > Professor emeritus, Department of Digital Humanities, King's College
> > London; Editor, Interdisciplinary Science Reviews
> > and Humanist
> >
>
> --[2]------------------------------------------------------------------------
> Date: 2019-09-24 04:13:07+00:00
> From: Geoffrey Rockwell
> Subject: Re: [Humanist] 33.270: randomising?
>
> Dear Willard,
>
> I don't think this is what you had in mind, but many word cloud tools will
> randomly assign colours from a palette to the words rendered and randomly
> alter the orientation of the words (horizontal or vertical) when generating
> the visualization. Typically it is only the size of the word and location in
> the cloud that is based on a measurement of the text.
>
> Yours,
>
> Geoffrey Rockwell
Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)
This site is maintained under a service level agreement by King's Digital Lab.