17.095 anti-spam

From: Humanist Discussion Group (by way of Willard McCarty <willard.mccarty@kcl.ac.uk>)
Date: Tue Jun 17 2003 - 01:46:45 EDT

                    Humanist Discussion Group, Vol. 17, No. 95.
           Centre for Computing in the Humanities, King's College London
                       www.kcl.ac.uk/humanities/cch/humanist/
                         Submit to: humanist@princeton.edu

             Date: Tue, 17 Jun 2003 06:40:08 +0100
             From: Virginia Knight <Virginia.Knight@bristol.ac.uk>
             Subject: Re: 17.086 anti-spam

    --On 14 June 2003 08:51 +0100 "Humanist Discussion Group (by way of Willard
    McCarty <willard.mccarty@kcl.ac.uk>)" <willard@lists.village.virginia.edu>
    wrote:
    >SpamAssassin appears to be working very well for me. So far in ca 2 weeks
    >of using it, spam has been reduced by at least 90% and no message I have
    >wanted to read has been wrongly classified. I can certainly live with the
    >10% while the mechanism for filtering is improved. It will be interesting
    >to see if spammers continue to learn from the techniques used against
    >them. Here, for example, is a typical analysis of a spamming message:

    I was given a peep at some of the code for just one of the SpamAssassin
    rules and it proved to be pretty complicated. This was in the course of
    reporting a bug (since fixed). I was finding that reviews from an
    e-journal were being flagged as spam because the abbreviation 'pp' was
    occurring near words such as 'longer'!
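
    To illustrate how a false positive of this kind can arise, here is a
    minimal Python sketch. It is not the actual SpamAssassin rule, which is
    not reproduced in this message; the patterns, the five-word gap and the
    names proximity, naive_hit and careful_hit are all assumptions made for
    the example. The second pattern skips 'pp' when a page number follows.

```python
import re


def proximity(a: str, b: str, gap: int = 5) -> re.Pattern:
    """Match pattern a within `gap` intervening words of pattern b, in either order."""
    between = rf"(?:\W+\w+){{0,{gap}}}?\W+"
    return re.compile(rf"{a}{between}{b}|{b}{between}{a}", re.IGNORECASE)


# Hypothetical naive rule: 'pp' anywhere near 'longer'.
NAIVE = proximity(r"\bpp\b", r"\blonger\b")

# Hypothetical refinement: ignore 'pp' when it introduces page numbers ("pp. 45").
CAREFUL = proximity(r"\bpp\b(?!\.?\s*\d)", r"\blonger\b")


def naive_hit(text: str) -> bool:
    return NAIVE.search(text) is not None


def careful_hit(text: str) -> bool:
    return CAREFUL.search(text) is not None


# A sentence of the sort a book review might contain (invented for the example).
review = "The longer second edition (pp. 45-90) adds two chapters."
print(naive_hit(review), careful_hit(review))
```

    The naive pattern flags the review sentence, while the refined one
    does not, because its 'pp' is followed by a page range.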

    If you run a SpamAssassin installation you can tweak it to reweight the
    various tests it performs. Failing that, you can still customise it by
    altering the points threshold for spam, or (as I do) combining SpamAssassin
    with a procmail filter that automatically files any message testing
    positive on certain SpamAssassin tests in a spam folder. I do this because
    spam often scores below even a low points threshold. I find for example
    that treating messages with a large amount of HTML in them as spam catches
    some spam that SpamAssassin misses, at the price of treating as spam some
    genuine messages which come from people who aren't regular correspondents
    of mine. I know others have a similar rule for filtering email, so maybe
    people who compose email in HTML should note that it is risky to use a
    format so associated with spammers.
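
    The combination described above (a points threshold plus a rule that
    files a message as spam whenever particular tests fire) can be sketched
    in Python rather than procmail. This is only an illustration under
    stated assumptions: it assumes SpamAssassin has already added an
    X-Spam-Status header containing "score=" (or "hits=" in older versions)
    and "tests=" fields, and the function name route, the default threshold
    of 5.0 and the test name MIME_HTML_ONLY are chosen for the example, not
    taken from the setup described here.

```python
import re
from email import message_from_string

SCORE = re.compile(r"(?:score|hits)=(-?\d+(?:\.\d+)?)")
TESTS = re.compile(r"tests=(\S*)")


def route(raw_message: str,
          threshold: float = 5.0,
          always_spam: frozenset = frozenset({"MIME_HTML_ONLY"})) -> str:
    """Return 'spam' or 'inbox' for a message already scanned by SpamAssassin."""
    msg = message_from_string(raw_message)
    # Unfold the header and close up "A, B" so the test list is one token.
    status = " ".join((msg.get("X-Spam-Status") or "").split()).replace(", ", ",")

    score_match = SCORE.search(status)
    score = float(score_match.group(1)) if score_match else 0.0

    tests_match = TESTS.search(status)
    tests = set(filter(None, tests_match.group(1).split(","))) if tests_match else set()

    # Spam if the score reaches the threshold OR any listed test fired,
    # which catches messages that score below the threshold.
    if score >= threshold or tests & always_spam:
        return "spam"
    return "inbox"


# Invented sample message: low score, but an HTML-only test has fired.
sample = (
    "From: someone@example.org\n"
    "X-Spam-Status: No, score=2.1 required=5.0 tests=MIME_HTML_ONLY,\n"
    "\tHTML_MESSAGE autolearn=no\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)
print(route(sample))
```

    As the message notes, a rule of this kind trades some misfiled genuine
    mail for catching spam that scores below the threshold.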

    Virginia Knight
    ----------------------
    Virginia Knight, Institute for Learning and Research Technology
    Tel: +44 (0)117 928 7154 Fax: +44 (0)117 928 7112
    University of Bristol, 8-10 Berkeley Square, Bristol BS8 1HH
    Virginia.Knight@bristol.ac.uk
    Official homepage: http://www.ilrt.bris.ac.uk/aboutus/staff?search=cmvhk
    Personal homepage: http://www.ilrt.bris.ac.uk/~ggvhk/virginia.html
    ILRT homepage: http://www.ilrt.bristol.ac.uk


