Article 3738 of comp.lang.perl:
Xref: feenix.metronet.com comp.lang.perl:3738
Path: feenix.metronet.com!news.utdallas.edu!hermes.chpc.utexas.edu!cs.utexas.edu!uunet!pipex!uknet!cardamom.compnews.co.uk!phil
From: phil@compnews.co.uk (Phil Male)
Newsgroups: comp.lang.perl
Subject: Re: Script to manage a mailbox (SunOS)
Date: 27 Jun 1993 10:26:26 GMT
Organization: Computer Newspaper Services, Howden, UK.
Lines: 525
Message-ID: <20jskiINNiit@cardamom.compnews.co.uk>
References: <REPAYNE.93Jun25151442@iguana.syr.EDU>
Reply-To: phil@compnews.co.uk
NNTP-Posting-Host: cumin.compnews.co.uk
X-Newsreader: Tin 1.1 PL4

repayne@iguana.syr.EDU (Rob) writes:
: 
: I have the need for a utility, preferably perl, which will sort
: through a user's mailbox, on a distributed sun system, and remove
: messages which are older than a certain number of days.  Has anyone
: written such a utility?  Do you know where I would be able to find
: such a thing written in anything?  Thank you.
: 
: 				 -Rob

This is what we use, I knocked this up a while ago from expire_mail by Steve
Mitchell (steve_mitchell@csufresno.edu). We install it in /usr/local/sbin on
all our Suns then get cron to tidy up everyone's mailbox, removing messages
older than 2 months. It leaves a message in the users mailbox detailing the
subject lines of the messages expired. Options, as far as possible, are
compatible with Steve's expire_mail software.

Change the definitions at the start to say who your postmaster is etc.
Run like this in cron:

0 23 * * 3 /usr/local/sbin/expire_mail -l -M -a 60 /var/spool/mail/*

The options select the policy for the expiry, in our case 60 day old messages
whatever their status.

I havn't sent this out before, but it's been working here for a year or so.
I guess if anyone has any changes they want to fold in, mail me with them.


#!/usr/local/bin/perl --					     -*-perl-*-
#
# Copyright (c) Information Systems, The Press Association Limited 1993
# Portions Copyright (c) Computer Newspaper Services Limited 1993
# All rights reserved.
# 
# License to use, copy, modify, and distribute this work and its
# documentation for any purpose and without fee is hereby granted,
# provided that you also ensure modified files carry prominent notices
# stating that you changed the files and the date of any change, ensure
# that the above copyright notice appear in all copies, that both the
# copyright notice and this permission notice appear in supporting
# documentation, and that the name of Computer Newspaper Services not
# be used in advertising or publicity pertaining to distribution or use
# of the work without specific, written prior permission from Computer
# Newspaper Services.
# 
# By copying, distributing or modifying this work (or any derived work)
# you indicate your acceptance of this license and all its terms and
# conditions.
# 
# THIS SOFTWARE IS PROVIDED "AS IS", WITHOUT ANY WARRANTIES OF ANY KIND,
# EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO ANY IMPLIED
# WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, AND
# NONINFRINGEMENT OF THIRD PARTY RIGHTS.  THE ENTIRE RISK AS TO THE QUALITY
# AND PERFORMANCE OF THE SOFTWARE, INCLUDING ANY DUTY TO SUPPORT OR
# MAINTAIN, BELONGS TO THE LICENSEE.  SHOULD ANY PORTION OF THE SOFTWARE
# PROVE DEFECTIVE, THE LICENSEE (NOT THE COPYRIGHT OWNER) ASSUMES THE
# ENTIRE COST OF ALL SERVICING, REPAIR AND CORRECTION.  IN NO EVENT SHALL
# THE COPYRIGHT OWNER BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL
# DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR
# PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS
# ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
# THIS SOFTWARE.
#
#
# $Id: expire_mail,v 1.1 1993/06/03 10:43:26 phil Exp $
#

#
# Information Systems Engineering Group
# Phil Male
#

local($_rcsid) = '$Id: expire_mail,v 1.1 1993/06/03 10:43:26 phil Exp $';
local($_copyright) = 'Copyright (c) Information Systems, The Press Association Limited 1993';

require "getopts.pl";			# option handling
require "timelocal.pl";			# time conversion
require "ctime.pl";			# ctime for pseudo-mailing
require "stat.pl";			# file status

# Perl mail expire.
# This program removes old messages from system mailboxes.
# It assumes the format of mailboxes to be standard
# sendmail format mail with a blank line followed by a `From ' line
# starting each and every message. Mailbox locking is via flock.
# Works under SunOS.
#
# Options as follows:
# -v 			verbose output
# -V			display version information and quit
# -d 			debug mode (no change to mailbox)
# -l			display messages for crontab output
# -z			do not delete zero length mailboxes
# -t			do not reset access and modification times on mailbox
# -o 			always open mailbox, never just test modification date
# -M			append a message detailing deleted messages for the user
# -T			do not record delivery of mail summary on mailbox date
# -a days		messages whose age is greater than days are expired
# -O days		messages whose age is greater than days are expired
# -u user		only consider messages from user (regexp)
# -S read|old		only consider messages with status `old' or `read'
# -s subject		only consider messages with subject (regexp)
#
# Based on expire_mail by Steve Mitchell (steve_mitchell@csufresno.edu)
#

#####
#
# Definitions
#
#####

# site postmaster - XXX change this as required
$postmaster = "postmaster@compnews.co.uk";

# current user
$me = getlogin || (getpwuid($<))[0] || "unknown";
$home = $ENV{'HOME'};

# default mailbox for a user - XXX change this as required
$default_mailbox = $ENV{'MAILBOX'} || "/var/spool/mail/$me";

# notice to append to list of deleted messages
$notice = "
Please read your mail on a regular basis. Old mail should be deleted,
or be filed in your personal mail folders. If you do not know how to
use mail folders, please refer to the `Guide to EMail@compnews'
available from Information Systems department, or Administration.

If you have any other queries regarding the mail system, please send
mail to $postmaster.

Processed by $_expire_mail_rcsid";

# set the umask for temp files
umask( 0700 );

# make stdout unbuffered
select(STDOUT); $| = 1;

$LOCK_EX = 2;				# lock
$LOCK_UN = 8;				# unlock
$START_TIME = time;			# time right now
$SEC_PER_DAY = 24 * 60 * 60;		# seconds in a day
$line_buffer = "";			# empty line buffer

# month numbers
$mon_num{'Jan'} = 0;
$mon_num{'Feb'} = 1;
$mon_num{'Mar'} = 2;
$mon_num{'Apr'} = 3;
$mon_num{'May'} = 4;
$mon_num{'Jun'} = 5;
$mon_num{'Jul'} = 6;
$mon_num{'Aug'} = 7;
$mon_num{'Sep'} = 8;
$mon_num{'Oct'} = 9;
$mon_num{'Nov'} = 10;
$mon_num{'Dec'} = 11;

#####
#
# Support
#
#####

# line buffer for look-ahead

sub get_line
{
  local( $line ) = "";			# line to return

  if( ! ($line_buffer eq "") ) {
    $line = $line_buffer;
    $line_buffer = "";
  } else {
    $line = <MBOX>;
  }
  return $line;
}

# read message from mailbox

sub read_message
{
  local( $msg ) = "";			# message to send back
  local( $prev_blank ) = 1;		# assume previous line blank
  local( $seen_from ) = 0;		# seen a from line
  local( $line ) = "";			# current line

  # reset some globals
  $msg_status = "";
  $msg_subject = "";
  $msg_date = "";

  while( $line = &get_line ) {
    
    if( $line =~ /^From\s+([^\s]+)\s+(.*)$/ ) {
      # if previous line was blank, then legal from line
      if( $prev_blank ) {
        # if already seen a legal from line, then this is next message
	if( $seen_from ) {
	  # pushback this from line
	  $line_buffer = $line;
	  return $msg;
	}
	$seen_from++;
        # From line found, extract information
        ( $msg_from, $msg_date ) = ( $1, $2 );
	$msg_stamp = &rctime( $msg_date );
        $msg_age = &days_old( $msg_stamp );
      }
    } elsif( $line =~ /^[Ss]tatus: ([A-Za-z]+)/ ) {
      ( $msg_status ) = ( $1 );
    } elsif( $line =~ /^[Ss]ubject: (.*)$/ ) {
      ( $msg_subject ) = ( $1 );
    }

    # set previous line
    if( $line =~ /^$/ ) {
      $prev_blank = 1;
    } else {
      $prev_blank = 0;
    }

    $msg .= $line;
  }

  return $msg;
}

# write a message into a mailbox

sub write_message
{
  print TMPF "@_";
}

# parse the ctime string into a time value
# From line contains local time

sub rctime
{
  local( $pt ) = @_;			# time to convert
  local( $ct ) = -1;			# converted time

  if( $pt =~ /^([A-Za-z]+)\s+([A-Za-z]+)\s+([0-9]+)\s+([0-9:]+)\s+([0-9]+)/ ) {
    ( $day, $mon, $mday, $time, $year ) = ( $1, $2, $3, $4, $5 );
    ( $hour, $min, $sec ) = split( ':', $time );
    if( $year > 1900 ) { $year -= 1900; }
    $ct = &timelocal($sec,$min,$hour,$mday,$mon_num{$mon},$year);
  }
  return $ct;
}

# age in days

sub days_old
{
  local( $agev ) = @_;			# time to convert

  return( ( $START_TIME - $agev ) / $SEC_PER_DAY );
}

# basename

sub basename
{
  local( $path ) = @_;			# path to find the base of
  local( $base ) = rindex( $path, "/" );

  if( $base < 0 ) {
	$base = $path;
  } else {
	$base = substr($path, $base + 1);
  }

  return $base;
}

# usage message

sub usage
{
  print STDERR "usage: expire_mail [-vlV] [-zotTM] [-d] { [-O days] [-u user] [-S read|old] [-s subject] } mailbox...\n";
  exit 0;
}

#####
#
# Main
#
#####

&Getopts( 'VvO:a:ou:zdS:s:MtTl' ) || &usage;

# compat
$opt_a = $opt_O if ($opt_O && !$opt_a);

# check version
if( $opt_V ) {
  print "expire_mail: mail expiry agent\n";
  print "expire_mail: $_expire_mail_rcsid\n";
  &usage;
}

# use default mailbox if non supplied
if( $#ARGV < $[ ) {
  $ARGV[0] = "$default_mailbox";
}

# decode status option
if( $opt_S ) {
  if( $opt_S eq "old" ) {
    $opt_S = "O";
  } elsif( $opt_S eq "read" ) {
    $opt_S = "R";
  } else {
    print STDERR "expire_mail: status may only be one of `old' or `unread'\n";
    &usage;
  }
}

# check we are actually doing some processing
if( !$opt_a && !$opt_u && !$opt_S && !$opt_s ) {
  print STDERR "expire_mail: must specify at least one of -O, -u, -S or -s\n";
  &usage;
}

# debug mode implies verbose mode
if( $opt_d ) { $opt_v = 1; }

# foreach mailbox...
while( $mailbox = shift ) {

  if( $opt_v ) { print STDOUT "Checking mailbox $mailbox\n"; }

  # does mailbox exist
  if( ! -f $mailbox ) { next; }

  # stat the mailbox
  @sb = &Stat($mailbox);

  # can it be deleted now?
  if( !$opt_o && $opt_a ) {
    # check the modification date
    $age = &days_old(@sb[$ST_MTIME]);
    if( $age > $opt_a ) {
      if( $opt_v ) { print STDOUT "Expiring mailbox $mailbox\n"; }
      if( !$opt_d ) {
        if( $opt_z ) {
          open( MBOX, ">$mailbox" ) || 
	    print STDERR "expire_mail: failed to truncate $mailbox\n";
	  close( MBOX );
        } else {
          unlink( $mailbox ) ||
	    print STDERR "expire_mail: failed to remove $mailbox\n";
        }
      }
      next;
    }
  }

  # open the mailbox
  if( !open( MBOX, "+<$mailbox" ) ) {
    print STDERR "expire_mail: unable to open $mailbox\n";
    next;
  }

  # lock the mailbox
  if( !flock( MBOX, $LOCK_EX ) ) {
    print STDERR "expire_mail: unable to lock $mailbox\n";
    close( MBOX );
    next;
  }

  # open the temporary file
  $tmpname = "$mailbox.exp$$";
  if( !open( TMPF, "+>$tmpname" ) ) {
    print STDERR "expire_mail: unable to create temporary file for $mailbox\n";
    close( MBOX );
    next;
  }
  unlink( $tmpname );

  # init counters
  $count = 0;
  $exp = 0;

  # read each message in turn
  while( $msg = &read_message ) {

    $count++;

    # looking for specific from users
    if( $opt_u ) {
      if( ! ($msg_from =~ /$opt_u/) ) {
        if( $opt_v ) {
	  print STDOUT "\tMsg #$count: from   \r";
	}
	&write_message( $msg );
	next;
      }
    }

    # check message status
    if( $opt_S ) {
      if( !($msg_status =~ /$opt_S/) ) {
	if( $opt_v ) {
	  print STDOUT "\tMsg #$count: status   \r";
	}
	&write_message( $msg );
	next;
      }
    }

    # check message subject
    if( $opt_s ) {
      if( ! ($msg_subject =~ /$opt_s/) ) {
        if( $opt_v ) {
	  print STDOUT "\tMsg #$count: subject   \r";
	}
        &write_message( $msg );
        next;
      }
    }

    # only other thing to check is message age
    if( $opt_a ) {
      if( $msg_age <= $opt_a ) {
        if( $opt_v ) {
	  print STDOUT "\tMsg #$count: newer   \r";
	}
        &write_message( $msg );
        next;
      }
    }

    # log the expiry
    if( $opt_v ) {
      print STDOUT "\tMsg #$count: expired   \r";
    }

    # copy message accross if in debug
    if( $opt_d ) {
      &write_message( $msg );
    } else {
      # record the mail message from and subject line
      $pad = ' ' x (25 - length($msg_from) );
      $npad = ' ' x ( 4 - length($count) );
      $subjects[$exp] = "$npad$count $msg_from$pad $msg_date\n     $msg_subject\n";
    }

    # increment the expired message count
    $exp++;
  }

  if( !$opt_d ) {

    # if sending mail to the owner of the mailbox, append message on the end

    if( $opt_M && $exp > 0 ) {
      chop( $ct = &ctime(time) );
      $to = &basename( $mailbox );
      print TMPF "From mail_expire $ct\n";
      print TMPF "From: mail_expire (Mail Expiry Agent)\n";
      print TMPF "Reply-To: $postmaster\n";
      print TMPF "To: $to\n";
      print TMPF "Subject: Expired Mail Summary\n\n";
      print TMPF "The following messages have been automatically removed from your\n";
      print TMPF "mailbox by the mail expiry agent.\n\n";
      # fitted to $subjects layout
      print TMPF " Msg From & Subject            Dated\n\n";
      foreach $msg ( @subjects ) {
        print TMPF "$msg\n";
      }
      print TMPF "$notice\n\n";

      if( !$opt_T ) {
        # set the modification time for the mailbox to be now
        @sb[$ST_MTIME] = time;
      }
    }

    # copy data back into mailbox to preserve permissions, creation time
    # and user and group id

    # zero length the mailbox
    truncate( MBOX, 0 );
    # *** START Critical
    # any data to copy?
    if( $exp < $count ) {
      # restart both files
      seek(MBOX, 0, 0);
      seek(TMPF, 0, 0);
      # copy file into mailbox, better with sysread/syswrite?
      while( <TMPF> ) {
	print MBOX $_;
      }
    } elsif( !$opt_z ) {
      unlink( $mailbox );
    }
    # *** END Critical

  }

  # unlock mailbox
  flock( MBOX, $LOCK_UN );

  # close files
  close( MBOX );
  close( TMPF );

  # reset access and modification dates
  # if we have sent mail, then the modification time is the time of the mail
  if( !$opt_t ) {
    utime( @sb[$ST_ATIME], @sb[$ST_MTIME], $mailbox );
  }

  # show counters
  if( $opt_v || ( $opt_l && $exp ) ) {
    print "$mailbox contained $count messages, expired $exp messages\n";
  }
}


