
Humanist Discussion Group



Humanist Archives: Jan. 16, 2019, 9:10 a.m. Humanist 32.344 - My Thoughts on Wikipedia, part 1/2

                  Humanist Discussion Group, Vol. 32, No. 344.
            Department of Digital Humanities, King's College London
                   Hosted by King's Digital Lab
                       www.dhhumanist.org
                Submit to: humanist@dhhumanist.org




        Date: 2019-01-15 20:42:34+00:00
        From: Ken Friedman 
        Subject: My Thoughts on Wikipedia, part 1/2

[Due to an unforeseen limitation in the software, Ken Friedman's long 
conclusion on this thread proved too long for Humanist to handle. So 
I am sending it out in two parts. This is the first. --WM]

Dear Colleagues,

This memo will explain my interest in Wikipedia. To get a broader view on
Wikipedia than my own view, I have been sending out queries. I very much
appreciate the replies and information that people have contributed. At
this point, I want to summarise my concerns and my current perspective.

To open, I should note that I often read Wikipedia to answer my own queries.
While I don't always find the articles helpful and informative, they are often
a good place to begin. The articles often contain useful links or pointers to
external sources of information. Some articles are carefully structured,
responsible, and judicious. Others at least demonstrate what passes for common
knowledge. No matter how reasonable a Wikipedia article seems to be, if I really
want to know something on the selected topic, I dig deeper using search tools,
Google Scholar, and the resources available to me in the digital resources of my
university library.

My concern with Wikipedia involves a relatively simple question with an
apparently complex answer. It seems to me that Wikipedia articles are often
unreliable as information sources. What's worse, the Wikipedia system does not
seem to have a way to make articles more reliable. While all articles are open
to continual editing and revision, nothing ensures the reliability of any
revision. The editing and revision system is a relatively evolutionary process.
Random mutation plays a frequent role in the development of any given article.
Even though many articles become better fitted to their environment, each
evolving article is structured to fit a niche in the social ecology of the
Wikipedia system rather than fitting a niche in a larger information ecology of
what Jimmy Wales's web site describes as 'the sum of all human knowledge.'

For all its successes -- and there are many -- Wikipedia fails to map the sum
of all human knowledge. Much of what it does map is wrong.

While I will discuss the issue of reliability further, it is useful to get a
widely-circulated urban myth out of the way. In December 2005, a journalist at
Nature undertook an investigation in which he compared 42 selected articles in
Wikipedia with 42 articles in Encyclopaedia Britannica. (Giles, Jim. 2005.
'Internet Encyclopaedias Go Head to Head.' Nature, Vol. 438, pp. 900-901, 15
December.)

https://www.nature.com/articles/438900a

The comparison demonstrated that the selected Wikipedia articles were generally
as accurate as the Britannica articles. In the rush to headlines, however, a key
fact was usually lost: the survey compared only 42 articles out of what was then
a corpus of 3.7 million articles in 200 languages. As I wrote this
summary, Wikipedia had 5,783,375 articles in English alone. (This number grew by several
hundred even as I wrote and revised my notes.) All told, the entire Wikipedia
system contained more than 49,387,381 articles in 303 different languages.

While Britannica objected to some of the findings of the Nature article, I'm
willing to give Nature the benefit of the doubt on those 42 articles. The
problem is far greater.

First, this was not a peer reviewed study published IN Nature. It was a
speculative foray by a journalist working for Nature. It's interesting, as a
sprightly science article designed to capture attention can often be. Many
of these can be delightful. Science runs an annual Dance Your Thesis competition
for recent PhD grads and current PhD students. And I still refer to John
Bohannon's sting project targeting predatory journals. But all of these differ
from the peer reviewed articles for which Nature and Science are famed. Many of
the people who refer to the Wikipedia study seem, mistakenly, to believe that it
was a peer reviewed study published in Nature. It was not.

Second, and even more important, the sample size was so small that the article
gave no valid information on the reliability of Wikipedia as a whole. The
articles in the study were selected among the larger, more complete and
comprehensive Wikipedia articles. These could not have been a representative
sample chosen from amongst all of the 3.7 million articles then available.
It's as if a journalist were to select 42 high performing amateur athletes
from the top Olympic teams, and compare them against professional athletes in
the same sports. These athletes would compare well against professionals. A
representative selection of the world's millions of amateur athletes would not
compare well, not even against a selection of professionals from the minor
leagues of any sport.

It is not possible to give details on the Nature piece -- Britannica offered a
reply, Nature responded, and then Nature offered the details on their data, but
the link within Nature to that detailed discussion is broken.

The key point is that no article has ever suggested that Wikipedia as a whole
compares favourably to any professionally edited encyclopaedia as a whole.

Wikipedia is a success nonetheless, in part because of wider and more massive
coverage than any other source, and -- in great part -- because it is free.

Wikipedia is regularly ranked among the top 5 or 10 Internet sites. It has
become a major news source for millions of people. It is even more important as
a source for people seeking background information on issues and topics in all
fields, and on people. It is linked to Google searches and to many other
sources. Journalists use it in their background research for the news stories
that influence public opinion.

As a public source of information, Wikipedia ranks with such leading global
newspapers as The New York Times -- but the NYT uses a paywall while Wikipedia
does not. In comparison, the only major global open access newspaper is The
Guardian.

Wikipedia has far greater influence than any smaller regional newspaper in North
America or the national newspapers of many nations elsewhere, for example
Svenska Dagbladet or Göteborgs Posten in Sweden and Politiken or Berlingske in
Denmark. The difference is that one must subscribe to read them.

Wikipedia is more visible and more widely used than nearly any television or
radio station, exerting an influence comparable to that of a major television
network. At the same time that the internet gives Wikipedia the global reach and
power of a major broadcaster or publishing firm, its status as a not-for-profit
reference work places it outside the boundaries of the laws that govern
broadcasting companies and publishing firms. Like other major Internet
companies, Wikipedia has become a massive public utility.

While I believe Wikipedia to be a generally benevolent force in the world, it is
definitely a force. The structure of the Wikipedia system is that of a
self-governing anarchy, partially controlled by a small autarchy of administrators
who answer only to one another.

Many Wikipedia editors and administrators seem to view changes to Wikipedia
content as disruptions without respect to the factual content of those changes.
This makes the content of Wikipedia nearly impervious to corrections or
improvements that in any way disturb the equilibrium of the Wikipedia system.

There has been some careful, empirical work to test the issue of bias. This
research also sheds light on reliability. The published results suggest that
most articles remain impervious to change once they take a stable form. The
abstract of a 2012 study from the National Bureau of Economic Research explains why:

'We examine whether collective intelligence helps achieve a neutral point of
view using data from a decade of Wikipedia's articles on US politics. Our null
hypothesis builds on Linus' Law often expressed as 'Given enough eyeballs
all bugs are shallow.' Our findings are consistent with a narrow
interpretation of Linus' Law namely a greater number of contributors to an
article makes an article more neutral. No evidence supports a broad
interpretation of Linus' Law. Moreover several empirical facts suggest the law
does not shape many articles. The majority of articles receive little attention
and most articles change only mildly from their initial slant.'

Greenstein, Shane and Feng Zhu. 2012. 'Collective Intelligence and Neutral
Point of View: The Case of Wikipedia.' NBER Working Paper No. 18167, June
2012, JEL No. L17, L3, L86. Cambridge, Massachusetts: National Bureau of Economic
Research.

https://www.nber.org/papers/w18167.pdf 

Four useful articles present a sympathetic yet reasoned and skeptical overview
of the situation:

1) Postrel, Virginia. 2014. 'Who Killed Wikipedia?' Pacific Standard, Nov.
17, 2014.

https://psmag.com/social-justice/killed-wikipedia-93777

2) Simonite, Tom. 2013. 'The Decline of Wikipedia.' The MIT Technology
Review. October 22, 2013.

https://www.technologyreview.com/s/520446/the-decline-of-wikipedia/

3) Metz, Cade. 'At 15, Wikipedia is Finally Finding Its Way To The Truth.'
Wired Business, 2016 January 15.

https://www.wired.com/2016/01/at-15-wikipedia-is-finally-finding-its-way-to-the-truth/

4) O'Neil, Mathieu. 2010. 'Shirky and Sanger or the costs of
crowdsourcing.' JCOM: Journal of Science Communication. DOI:
https://doi.org/10.22323/2.09010304

https://jcom.sissa.it/archive/09/01/Jcom0901(2010)C01/Jcom0901(2010)C04

While there has been a great deal of journalistic inquiry, there have been
relatively few careful, empirical studies on the reliability of Wikipedia for
specific fields. It makes little difference whether a Wikipedia biography on a
rap star is accurate. In contrast, those who use Wikipedia for medical
information have the right to expect a high standard. Wikipedia does not always
meet this standard. Consider the conclusion of an article on pharmaceutical
education:

'Wikipedia does not provide consistently accurate, complete, and referenced
medication information. Pharmacy faculty should actively recommend against our
students' use of Wikipedia for medication information and urge them to consult
more credible drug information resources.'

Lavsa, Stacey M., Shelby L. Corman, Colleen M. Culley, and Tara L. Pummer. 2011.
'Reliability of Wikipedia as a medication information source for pharmacy
students.' Currents in Pharmacy Teaching and Learning, Volume 3, Issue 2,
April 2011, Pages 154-158

https://www.sciencedirect.com/science/article/abs/pii/S1877129711000086

Content volatility -- the problem of changing content as contrasted with
stable information -- is a serious issue.

Wilson, Adam M. and Gene E. Likens. 2015. 'Content Volatility of Scientific
Topics in Wikipedia: A Cautionary Tale.' PLOS ONE, August 14, 2015. DOI:
https://doi.org/10.1371/journal.pone.0134454

https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0134454

The peer-reviewed journal PLOS ONE has an extensive collection of open access
articles on different aspects of Wikipedia.

There has been more work on the sociology and workings of Wikipedia in the
scholarly literature. For example,

Okoli, Chitu, Mohamad Mehdi, Mostafa Mesgari, Finn Årup Nielsen and Arto
Lanamäki (2014). 'Wikipedia in the eyes of its beholders: A systematic review
of scholarly research on Wikipedia readers and readership.' Journal of the
American Society for Information Science and Technology, Vol. 65, No. 12, pp.
2381-2403, DOI: https://doi.org/10.1002/asi.23162

http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6785/pdf/imm6785.pdf

Wikipedia itself has lengthy and useful discussions on some of these issues.
This includes a long article on reliability:

https://en.wikipedia.org/wiki/Reliability_of_Wikipedia

It also includes an essay discussing the challenge of recruiting and retaining
expert editors:

https://en.wikipedia.org/wiki/Wikipedia:Expert_retention

My concerns involve the problem of reliability in articles, and the challenges
that anyone with expert knowledge faces in contributing to Wikipedia.

My personal involvement with Wikipedia took place when I was a professor at the
Norwegian School of Management. The New York Times hosted a debate on Wikipedia
in the wake of a libellous biographical entry on John Seigenthaler. The debate
took place around 2005. I can't locate the relevant articles in the New York
Times, so I rely on memory. In the debate, I was skeptical toward Wikipedia. I
stated that I had forbidden my students to use it as a reference source. After
some interesting correspondence with Wikipedia founder Jimmy Wales, I changed my
mind. I told students that Wikipedia was a good place to begin their research
-- while asking that they use stable, unchanging sources as references for term
papers and theses.

Later, intrigued by my experience, I tried to contribute to Wikipedia in areas
on which I am expert. My contributions relied on references to published, peer
reviewed sources, government reports, and books from respected publishers. My
experience as a Wikipedia editor was difficult.

A few biographical facts may place my experience in perspective. Until 2008, I
was a tenured full professor in leadership and strategic design at a major
European business school. Next, I took up a dean's post in design at a leading
university of technology in Australia. I am now a chair professor at Tongji
University in China and editor-in-chief of an academic journal published by
Tongji University Press and Elsevier. Inspired by Wikipedia and similar
projects, we secured the funding to publish the journal as a full open access
journal on a Creative Commons license and we charge no publication fees. (You
can see the journal at URL:
http://www.journals.elsevier.com/she-ji-the-journal-of-design-economics-and-innovation/ )

Along with many colleagues, I have had roles as a journal author, journal
editor, and journal reviewer, and I have been a conference chair. In addition, I
have had significant experience in reference book publishing. I have been an
editor, consultant, and contributor to a dozen paper-based encyclopaedias and
reference works. I edit a book series for MIT Press, and I have worked as an
editor, reviewer, and consultant to several university presses and academic
publishers.

My sympathy for the Wikipedia mission and my understanding of the challenges it
faces is based on my experience in publishing and working with reference works.

After corresponding with Jimmy Wales, I contributed a few pieces to Wikipedia. I
used my real name. Someone challenged me on nearly every edit I made. The
challenges seemed to be based on the fact that I was a professor who had written
in the field of my contributions. These contributions were not original research
published for the first time in Wikipedia. I did sometimes quote my own writings
after they became verifiable secondary sources published in peer reviewed
journals and academic books. Citing one's own work is permitted under
Wikipedia policy once the material enters the literature of the field to become
a secondary source. The editors who reverted and challenged me were apparently
unaware of this policy, and some seemed to assume that because I was a visible
figure working in the subject field of my edits, I must therefore be prejudiced
or biased in some way.

After receiving this kind of critique from editors adamantly convinced of their
own views, I withdrew from Wikipedia and gave up trying to contribute.

A slightly different version of this problem came up when a colleague recently
made a series of contributions. He tried to create external links to two open
access versions of peer reviewed publications for a group of biography subjects.
Each subject was covered at length in the two documents. Until then, he had
simply added to a few individual biographies. The idea occurred to him when a
scientist he knew pointed to a related problem in the biographies of many people
who have won a Nobel Prize. The Nobel Foundation maintains authoritative
biographies and documents on every winner of the Nobel Prize. Despite this
source of high-level, verified information, relatively few Wikipedia Nobel
biographies have an external link to the Nobel Foundation. Even more peculiar,
some of the links from Nobelist biographies to the Nobel Foundation are dead,
broken, or refer to pages that no longer exist. When my colleague attempted to
create valid external links to a common series of previously published, peer
reviewed documents, an administrator observed the pattern and blocked him,
stating that building a series of references to a common published source for
many articles constitutes advertising or spam.

When my colleague appealed the block, the Wikipedia administrator denied the
appeal. My colleague has now given up on Wikipedia. Over the past decade, I have
spoken with several dozen established scholars and professors who have had
similar experiences. Back at the time of the debate in the New York Times, a
number of my colleagues told me that my view of Wikipedia was ill-informed. I
studied the situation and changed my mind. But I see no way to encourage
established scholars, scientists, and researchers to contribute to Wikipedia
with a system that allows any editor to delete anyone else's work while giving
administrators massive and often arbitrary power over the system. I don't have
an answer, but I can describe the systemic design problem.

The Wikipedia system has two conflicting aspects. On the primary level, it
welcomes all opinions and contributions. On the next level, however, this
doesn't work because anyone is free to overrule anyone else at will. An author
must appeal or argue about each revert or administrator decision every time, and
even when the process is done, any of thousands of Wikipedians may initiate a
new revert or institute a new block.

Persistence and a willingness to debate each article and defend every
contribution seem to be the cultural norm among people who identify themselves
as Wikipedians. This makes sense to those who identify heavily with
Wikipedia. It doesn't make sense to people who simply wish to contribute to a
few articles on subjects they know well.

When subject field experts take the time to contribute to Wikipedia entries,
they are usually willing to explain the facts and sources of their
contributions, as I was. Working scholars don't generally have the time for
repeating the explanation again and again when this kind of repeated engagement
is necessary to preserve every contribution. The Wikipedia system easily
discards work. Since people contribute to articles without author credit, there
is no normal incentive to contribute other than good will. Contribution takes
work. Explaining each contribution is an added burden. Debating the explanations
becomes an added burden. It is impossible when editors or administrators refuse
to engage in a debate. While refusing to engage in reasoned debate is against
Wikipedia policy, it is common behaviour. When the burden of interacting with
deeply engaged members of the Wikipedia culture grows too heavy for contributors
who are not themselves equally committed, many contributors leave. (Albert O.
Hirschman discussed this problem in the 1970 study, Exit, Voice, and Loyalty:
Responses to Decline in Firms, Organizations, and States.)

To function as an expanding, online encyclopaedia, Wikipedia relies on a
relatively small, stable core of dedicated editors and administrators. As I
understand it, Wikipedia has around 132,000 amateur editors. According to some
studies, a mere 1% of these editors -- fewer than 1,500 people among 132,000
-- have written around 77% of the Wikipedia contents. Given the frequent
turnover of editors -- including discouraged contributors who simply give up
-- the long-term contents probably include more authors. Even so, there cannot
be enough expert authors to account for the more than 5,700,000 articles in the
English language Wikipedia.
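
The arithmetic behind this paragraph can be checked with a minimal Python sketch. The function and constant names are hypothetical, chosen only for illustration. The figures are the estimates quoted above. Dividing articles evenly among core editors is a simplifying assumption, since the 77% figure refers to content rather than to whole articles.

```python
# Figures quoted in the memo; these are the memo's estimates, not measured data.
TOTAL_EDITORS = 132_000
CORE_SHARE_PERCENT = 1
ENGLISH_ARTICLES = 5_700_000


def core_editor_count(total_editors: int, share_percent: int) -> int:
    """Number of editors in the most active share (integer arithmetic)."""
    return total_editors * share_percent // 100


def articles_per_core_editor(articles: int, core_editors: int) -> float:
    """Average number of articles each core editor would need to cover,
    under the simplifying assumption of an even split."""
    return articles / core_editors


if __name__ == "__main__":
    core = core_editor_count(TOTAL_EDITORS, CORE_SHARE_PERCENT)
    per_editor = articles_per_core_editor(ENGLISH_ARTICLES, core)
    print(f"core editors: {core}")
    print(f"articles per core editor: {per_editor:.0f}")
```

On these figures, 1% of 132,000 editors is 1,320 people, which is consistent with 'fewer than 1,500', and an even split would leave each of them with several thousand articles.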

While Wikipedia is massive in range and coverage, the quality of many articles
is spotty. The tendency to revert to prior states means that correcting content
errors or adding verifiable documentation is often difficult. Instead, the
Wikipedia system seems to favour new articles, generating wider coverage without
generating authoritative articles. The ability of any editor to delete content
means that there is no way to create and preserve the kinds of improvements that
lead to an authoritative reference work such as The Stanford Encyclopedia of
Philosophy or Encyclopaedia Britannica.

Wikipedia is designed to function as a network of dedicated amateurs. Anyone may
remove or change anything that he doesn't like. Around 84% of Wikipedia
editors are male. A small group of administrators have the power to block
writers and to delete their contributions with software commands that remove
everything they have contributed. If someone has done something
in one article or a series of linked articles to stir the disapproval of an
administrator, the administrator can delete everything that the person has
written. The ease of doing this on a bureaucratic basis means that there is no
need to read the actual content of the contributions as the editor of a paper-
based reference work would be required to do in preparing a new edition.

The thousand or so Wikipedia administrators -- 500 are active -- constitute a
final court of appeal for practical purposes. There is a process for higher
appeals, but appealing an administrator decision involves complicated
arbitration proceedings. Most people cannot master the process. As a result, an
administrator decision resembles an administrative edict in the Ottoman Empire.
Those with the authority to issue an edict are not required to explain
themselves, even when they fail to adhere to the formal policies and processes
that should -- in principle -- safeguard the participation rights of all
participants. Even when an author can manage the arbitration process, there is
no formal policy that requires involving a qualified expert in the discipline
subject of the entry to review the substantive changes to state whether they
did, in fact, constitute an improvement. Finally, whatever the outcome of any
process, an article can be revised, changed, or reverted again by anyone. While
any decision can undo serious contributions, no decision to retain a serious
contribution to an article is final. (I gather that some controversial articles
may gain protected status, but this is unusual. It involves the nature of the
debate and controversy surrounding the article, with the issue being consensus
on facts rather than the facts themselves.)

Wikipedia offers no way to engage with subject discipline experts on the issue
of quality or relevance. As a result, it is difficult for anyone with specific
expertise to contribute to the field in which they are expert. The good quality
of many entries in highly technical but relatively non-controversial fields such
as mathematics or natural science suggests that Wikipedia attracts experts for
some entries. Other fields are subject to the opinions and judgments of amateur
scholars who feel themselves qualified. This is apparently common in the arts
and humanities and the non-quantitative social sciences. It also covers
biographical entries. This works against the quality of many entries.

This also affects the choice of sources. Many articles don't seem to recognise
the difference between sources and the relative credibility of source choices.
On one hand, we see references to peer-reviewed journals and well-edited
newspapers such as The Guardian. On the other, we see amateur publications or
fanzines. In some cases, of course, fanzines are appropriate sources. This is
rarely the case for articles on history, and especially not for articles on
contentious political issues.

Articles often accord the same level of credibility to amateur sites as to
serious scholarly sites hosted by universities, research centres, or museums.
For an article on the US Civil War, the United States Civil War Center at
Louisiana State University or the Michigan State University archives are more
authoritative than the web site of a Civil War reenactment group or a political
group dedicated to the memory of the Confederate army.

While many articles draw on credible sources for secondary documentation, many
do not. It is often verifiable that the cited sources exist. The reliability of
these sources may be questionable.

Wikipedia editors and administrators mostly work behind a screen of pseudonyms,
so there is no way to know who they are or what they know. Only 1% of Wikipedia
editors are responsible for 77% of the content. That's fewer than 1,500 people.
It seems unlikely to me that so few people can maintain subject expertise across
the wide range of more than 5,700,000 articles in English Wikipedia, and
48,217,247 articles across all languages. I have the feeling that many Wikipedia
editors are amateurs or graduate students attempting to influence the subject
disciplines by shaping what people read on the world's most used reference
site.

As I see Wikipedia, the system is designed to yield a good amateur encyclopaedia
with numerous gaps, flaws, and problems. The problems include incorrect data and
poor writing. The system encourages numerous contributions while making it
impossible to develop an online encyclopaedia at the level of The Stanford
Encyclopedia of Philosophy or Encyclopaedia Britannica. There is no way to
ensure the stable data of a paper encyclopaedia or the reference works common to
any field -- including stable online sources such as the Stanford Encyclopedia
of Philosophy.

Wikipedia editors and administrators pursue the goal of making a high quality
professional encyclopaedia on the order of Britannica or World Book, while
ensuring free access rather than requiring paid subscriptions. Most of the
editors and administrators are sincere, decent people. At the same time,
they've developed a highly internalised culture. Many of these people seem
unable to communicate across the boundaries of their own culture to people with
subject discipline expertise. This is visible when editors or administrators
overrule genuine experts who attempt to explain the logic of their contributions
or to query the deletion of their work.

The power of 132,000 editors to change any article discourages experts who can
see several days of work undone in minutes. The 500 or so active administrators
have even higher authority, with the right to block people and universally undo
their work. This happened to my colleague. As a result, someone with a PhD who
has contributed to peer-reviewed journals, standard reference works, and books
from major publishers will not write for Wikipedia again. In his case -- and
that of the scholars I know who tried to contribute and stopped -- the problem
may not even be a content problem. It may simply involve catching the attention
of an editor or an administrator by violating Wikipedia's cultural norms in
attempting to post good content in ways that the individual editor or
administrator sees as disruptive.

Writing a good reference book entry is hard work. In the early 2000s, I
contributed a dozen articles to an encyclopaedia from Sage. Each article ran
between 500 and 1,250 words. It took three months of work to write twelve
articles. Each article took roughly a week of intense work. Because of those
articles, I was invited to contribute articles to other reference works. The
time involved was roughly the same. That's comparable to the work I put into
the articles I worked on for Wikipedia. These vanished when an editor decided
that he did not like them. When I checked the editor's page, I found a massive
group of articles to which he had contributed, several hundred in all. All were
in the broad subject field of the deleted articles, but across a far wider area
than would be possible for a true expert. A brief dip into the articles showed
numerous gaps and flaws. At the same time, I was never able to get an
explanation of why my contributions were deleted.

It's easy to understand why scholars become discouraged when they try to
contribute to Wikipedia.

Given the current Wikipedia system, I do not see how Wikipedia can attract and
hold expert editors and writers who are willing to invest time and work that can
vanish without explanation. The Wikipedia culture has created something
brilliant -- but rather like the life forms of the Cambrian Explosion, many
Wikipedia articles arise, flower briefly and then go extinct. They reach a
static point beyond which they do not improve. For a reference work to have
lasting impact, it requires a system that enables it to retain the verifiable
material and high quality content that Wikipedia seeks.

There was a predecessor to Wikipedia called Nupedia. As I understand it, Nupedia
didn't work because it was top-heavy. Passing peer review was too difficult.
Wikipedia actually has a similar problem -- except that peer review is not
based on expert opinion by senior scholars. It involves review by amateurs who
are as likely to reject an article because of their ignorance as they are to
reject an article for factual flaws. Many rejections don't even involve
content. They arise because of minor format flaws or writing practices that make
sense to scholars in specific areas even though they don't make sense to the
Wikipedia editor whose attention they attract.

Despite gloomy predictions, I doubt that Wikipedia will die. It is a huge
success as it is today. Wikipedia is one of the most widely used web sites in
the world and the content is good enough for many purposes. The problem is that
Wikipedia doesn't represent the sum of all human knowledge. Too many entries
are deficient.

[End of part 1/2]




Editor: Willard McCarty (King's College London, U.K.; Western Sydney University, Australia)
Software designer: Malgosia Askanas (Mind-Crafts)
