 1. DISCLAIMER

 Copyright (c) 2003		Granch Ltd. (tm).  		All rights reserved.

 Redistribution and use in source and binary forms, with or without  modification,
 are permitted provided that the following conditions  are met:

 1. Redistributions of source code must retain the above copyright notice, this list
 of conditions and the following disclaimer.

 2. Redistributions in binary form must reproduce the above copyright notice,
 this list of conditions and the following disclaimer in the documentation and/or
 other materials provided with the distribution.

 3. All advertising materials mentioning features or use of this software must
 display the following acknowledgment: This product includes software developed
 by the Granch Ltd.

 4. Neither the name of the Granch Ltd. nor the names of its contributors may be
 used to endorse or promote products derived from this software without specific
 prior written permission.

 THIS SOFTWARE IS PROVIDED BY THE GRANCH LTD. AND THEIR CONTRIBUTORS ``AS IS'' AND
 ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE  ARE DISCLAIMED.
 IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE  FOR ANY DIRECT, INDIRECT,
 INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED
 TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
 OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
 CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 SUCH DAMAGE.

 	$Id: README,v 0.94 2005/10/04 12:11:54 shelton Exp $

 2. What is it.

This is a sendmail filter, based on the milter API, that controls and cancels the spam
floods which lately overwhelm electronic mail systems more and more often. This Spam
Control and Canceling (SCC) filter does the following:

 - checks the codepage of mail and discards mail in an unwanted codepage (you
   probably do not want Korean, Chinese or Japanese spam written in hieroglyphs :-) )

 - allows mail from a selected sender only when it is in the codepage selected for
   that sender (when you receive spam through wide-open mailing lists, you can
   isolate this kind)

 - isolates HTML messages. HTML messages are nothing but a headache, and it is 90%
   certain that such messages are spam. Incoming and outgoing HTML mail can be
   isolated separately.

 - has some 'enterprise violation' checks: use of PGP (or other strong
   cryptography), permission to receive attachments, permission to receive
   executable attachments.

 - has fine-grained per-user permissions: one user can be allowed to receive HTML
   messages, another to send and receive them, a third to receive attachments, etc.

 - can save isolated messages for later investigation or simply discard them

 3. Requirements

The mail filter requires sendmail with libmilter support (8.12.x has this support by
default; for 8.11.x it must be compiled in, see libmilter/README in the sendmail
source directory for how to compile with libmilter support).

On the 4.x branch the mail filter requires libgnugetopt to process long option names.
It can be installed from /usr/ports/devel/libgnugetopt. On 5.x and later branches
libgnugetopt is part of the base system and does not need to be installed.

Of course, you need a working sendmail. So, Postfix users can go back to sleep...

 4. Compiling and installing

Unpack the distribution (of course, you need the bzip2 archiver to do it :-) )

tar -xyvf sccmilter.tar.bz2

Run the configure script. To see the list of configure parameters, run configure -help.
A short explanation of these parameters follows.

--prefix, --exec-prefix and all other standard configure parameters mean exactly the
same as in other programs: they set the paths to the directories where the binary will
be installed.

--with-homedir - This is a developer parameter. It specifies the directory where the
backup is saved when 'make dist' is run. Ordinary users do not need to specify this
parameter.

--with-extra-libraries - This parameter specifies an additional path to be added to the
library search path. It is needed when your libgnugetopt is installed in a
non-standard path. You do not need this parameter for the /usr/local/lib directory,
which is included automatically.

--with-isolator - This parameter specifies the directory where spam and
enterprise-violating messages are kept. When you specify it without a path
(or not at all), the default path (/var/spool/isolated-spam) is used.

--with-configdir - This parameter specifies the directory that holds sccmilter's
configuration files. When you specify it without a path (or not at all), the default
path PREFIX/etc/sccmilter is used.

--with-scc-users-file - This parameter specifies the name of the file that holds users'
permissions for sending and receiving mail. For more information about user
permissions see man sccusers(5).

--with-scc-hosts-file - This parameter specifies the name of the file that holds hosts'
permissions for sending and receiving mail. For more information about host
permissions see man scchosts(5).

--with-domain - This parameter specifies the domain name that is appended automatically
when a username without an '@' character is found in the SCC users table. When
omitted, this parameter is filled from the output of `hostname`.

--with-debug - This parameter compiles LOTS of debug messages, and the code that
generates them, into the built binary. This is a debug parameter; you should never set
it on solid production servers where the filter works properly. It allows you to
change the debug level with the -D command-line parameter (which otherwise has no
effect). BEWARE! You must understand what you are doing: this parameter is FOR DEBUG
ONLY, because on solid production servers the mail log can grow extremely quickly! You
can switch debug output off with -D 0, but you CANNOT strip the debug code from the
binary without recompiling! Be EXTREMELY careful with -D values above 100! They
generate a HUGE (really HUGE, you should believe me) amount of debug info! (Sample: a
system with 3000 letters will have a log file of about 45 MBytes with -D 111)
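
The --with-domain rule described above can be pictured with a small sketch. It is
written in Python purely for illustration (the filter itself is built on the C
libmilter API); the function name qualify and the domain example.org are assumptions
and are not taken from the sccmilter sources.

```python
import socket


def qualify(username, domain=None):
    """Append a domain to a users-table entry that has no '@' in it."""
    if "@" in username:
        return username  # already a full address, leave it alone
    if domain is None:
        # Assumption: mimic the configure default, which uses `hostname` output.
        domain = socket.gethostname()
    return username + "@" + domain


print(qualify("alice", "example.org"))
print(qualify("bob@domain.com", "example.org"))
```

In the real filter the domain is fixed at configure time (or given with -o), so the
run-time hostname lookup here only stands in for that default.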

After configure finishes successfully, run make.

After make finishes successfully, run make install.

After a successful install, you must review (and edit where appropriate) the
sccmilter.conf, sccusers and scchosts files. If you do not want to grant your users
permissions they do not need, keep the defaults (by default no hosts have permission
to send HTML messages, no users have permission to send or receive HTML messages,
and root has permission to send/receive PGP and other strong cryptography messages
and attachments (including executable ones)). For more info about the sccusers file,
see man sccusers(5); for more info about the scchosts file, see man scchosts(5); for
more info about the configuration file sccmilter.conf, see man sccmilter(5).

Go to the /usr/local/etc/rc.d directory and start sccmilter with ./sccmilter.sh start

Modify /etc/mail/sendmail.mc so that sendmail uses the mail filter:

INPUT_MAIL_FILTER(`sccmilter',`S=unix:/var/run/sccmilter,F=T')

Rebuild the sendmail configuration and restart sendmail:

cd /etc/mail
make cf install restart

5. Features
5.1. Explanation

Why did I decide to develop my own spam filter? Because all existing filters either
require monstrous components like Perl or lack the features I need. This filter has a
unique set of features: spam checking is merged with checking for violations of
enterprise standards. Large enterprises (works, factories, research institutes) usually
set various limitations on Internet and electronic mail usage. This filter integrates
checks of whether these limitations are followed, and treats violating mail exactly as
spam. Of course, it also checks for ordinary spam :-).

Basic features:

 - HTML isolation - the filter can isolate HTML messages completely or partially, for
some remote hosts and some local users, and can treat sending and receiving of HTML
separately. HTML mail is suspicious by default, and to receive it the server MUST be
allowed in the scchosts file, OR the recipient MUST be allowed in the sccusers file (see
the appropriate man pages for more info)

 - Codepage/charset isolation. I think you (unless you are Chinese, Japanese or
Korean) do not need mail in Japanese, Chinese or Korean :-) This filter can isolate
mail that is not in a predefined set of codepages. The filter has a global set of
allowed codepages, which is used when the sender/recipient is not described in the
sccusers file or has no codepage description. When the mail codepage fits the sender's
allowed list (global or personal), recipients are not checked; when it DOES NOT fit,
they are checked. Mail is not delivered if even one recipient is not allowed to receive
this message.

 - Personal codepage specification. You can set up allowed charsets for individual
users (MAIL FROM:), and mail from these users will be delivered only when it is in
these codepages.

 - PGP (and other strong cryptography) prohibition - this filter can prohibit the use
of PGP or other strong cryptography for sending mail. This permission is tuned
personally for each user. This is an "enterprise violation" feature (not implemented
in this release). It is switched off by default; to switch it on see man
sccmilter.conf(5) or the embedded help

 - Attachment receiving prohibition - this filter can prohibit receiving attachments
completely, or executable attachments only, selectively for each user. This is an
"enterprise violation" feature (not implemented in this release). It is switched off
by default; to switch it on see man sccmilter.conf(5) or the embedded help

 - Imitation - for testing purposes, this filter can merely imitate the various mail
actions. All messages will be issued and all logs will be filled, but mail will always
be delivered.

5.2 Processing mode and enterprise violation processing mode

This filter takes separate actions for spam and for enterprise-violating messages. This
was done because enterprise-violating messages can be clean from the 'spam point of
view'.

You can specify the following modes:

- discard. As the name says, the received message is silently discarded. The sender is
           told that the message was delivered :-) BEWARE! For local mail (going from
           inside) this mode is ignored and replaced with 'reject', to avoid the
           situation where a local user sends mail, the filter discards it but reports
           that the mail was delivered, and the user waits for an answer while the mail
           has been 'eaten' by the filter...

- reject.  As the name says, the received message is rejected with the 554 5.7.1 error
           code and a message that explains the reason for the rejection. This mode is
           always activated for local senders so that they see the error message (they,
           at least, can call the sysadmin to explain the 'misunderstood message', he-he)

- isolate. This is a combination of the two modes explained above with one addition:
           the mail is saved into the --isolator= directory for examination by the
           admins or another person. The user receives the reject message and should
           correct the situation and resend the message, while the violating (spam)
           message is kept.
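
The way the three modes interact with local senders can be summarised in a short
sketch. This is Python for illustration only, not code from the filter; the names
effective_mode and outcome, and the keys of the returned dictionary, are assumptions.

```python
SPAM_MODES = ("discard", "reject", "isolate")


def effective_mode(mode, sender_is_local):
    """Return the mode that is really applied to one message."""
    if mode not in SPAM_MODES:
        raise ValueError("unknown mode: " + mode)
    # 'discard' is replaced with 'reject' for mail going from inside, so a local
    # user is never told that a discarded message was delivered.
    if mode == "discard" and sender_is_local:
        return "reject"
    return mode


def outcome(mode, sender_is_local):
    """Describe what the sender sees and whether a copy is kept."""
    applied = effective_mode(mode, sender_is_local)
    return {
        "sender_sees_error": applied in ("reject", "isolate"),  # 554 5.7.1
        "message_saved": applied == "isolate",
    }


print(effective_mode("discard", sender_is_local=True))
print(outcome("isolate", sender_is_local=False))
```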

5.3. Gory details

The filter can be started in the following modes: simple, isolate and enterprise. Each
mode includes all the previous ones.

5.3.1 Simple mode

In simple mode (no -l and -n keys on the command line and no corresponding parameters
in the configuration file) the filter only does charset checking (of the mail charset,
including against personal user charsets, see man sccusers(5)) and performs the
spamaction (-m key).

In this mode all HTML mail is considered correct; permissions for sending and receiving
HTML mail in the sccusers and scchosts files are ignored. Also, in this mode all PGP
mail is allowed and all attachments are allowed (including executable ones).

When charsets are checked, mail is qualified as spam by the following rules. When the
sender is described in the sccusers file (only sccusers, NOT scchosts! Charsets for
hosts will be added in a future release), the mail charset is checked against the
charsets specified there, otherwise against the global ones (AllowedCharsets=...) from
the configuration file. When the charset is found in the sender's allowed charsets (if
any), the mail is qualified as normal and delivered. Otherwise it is checked against
the charsets of all recipients: the mail is delivered ONLY when the check against ALL
recipients is OK. If even one recipient is not allowed to receive mail with this
charset, the mail is qualified as spam and the spamaction is performed.

When all recipients have been checked and the mail was qualified as spam, the spamaction
is issued; otherwise no action is taken. Mail is qualified as spam if the charsets
specified for even one of the recipients do not contain the mail charset!

To disable charset checking completely, you must NOT specify charsets for individual
users and must set AllowedCharsets = * in the configuration file.

When the arriving message is a multipart message, the filter searches for boundaries.
Each part delimited by a boundary is checked for a charset specification, and each
charset specification found is checked against the allowed charsets.
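
To see what "a charset specification per part" means in practice, the sketch below
lists the charset of every part of a multipart message. It uses Python's standard
email package purely for illustration; the filter does its own parsing, and the name
part_charsets and the sample message are assumptions.

```python
from email import message_from_string


def part_charsets(raw_message):
    """Return (content type, charset or None) for every non-container part."""
    msg = message_from_string(raw_message)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # a container only holds the boundary, not text
        found.append((part.get_content_type(), part.get_content_charset()))
    return found


RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "hello\n"
    "--XX\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>hello</p>\n"
    "--XX--\n"
)

for content_type, charset in part_charsets(RAW):
    print(content_type, charset)
```

The second part carries no charset at all, which is the "without charset" case.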

When mail is qualified as "without charset", the NoCharsetSpecify action is issued. When
this parameter is set to 'spam', the mail is qualified as spam; otherwise (any other
value) the missing charset is ignored and the mail is treated as (still) normal mail.

Samples: in the sccusers file local user alice has the charset iso-8859-1 specified,
local user john has no charsets specified, the global charsets list is
us-ascii,iso-8859-2, remote user bob@domain.com has the charset iso-8859-4, and remote
user bill@foo.com has no charsets specified. They all start to mail each other.

bob@domain.com -> alice: Mail will be delivered ONLY when it is in the iso-8859-4
codepage, else it will be refused with the diagnosis: This charset does not allowed for
sender. Reason: bob@domain.com has the personal charset iso-8859-4, and only this
charset is valid for him.

bill@foo.com -> alice: Mail will be delivered ONLY when it is in the iso-8859-1
codepage, else it will be refused with the diagnosis: This charset does not allowed for
recipient. Reason: alice has the personal charset iso-8859-1, and only this charset is
valid for her.

bill@foo.com -> john (and john -> bill@foo.com): Mail will be delivered ONLY when it is
in the us-ascii or iso-8859-2 charsets, else it will be refused with the same diagnosis
as in the first sample. Reason: these users have no personal codepages, so the global
codepages are used.
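
The charset rules of this section can be written down as a short sketch, using the
users from the samples. It is Python for illustration only and follows the rule
paragraph literally (sender first, then all recipients); where the worded samples read
more strictly, the real behaviour should be taken from the filter's sources, not from
this sketch. The names allowed_for, permits and is_spam are assumptions.

```python
GLOBAL_CHARSETS = {"us-ascii", "iso-8859-2"}  # AllowedCharsets= from the samples
PERSONAL_CHARSETS = {                         # sccusers entries from the samples
    "alice": {"iso-8859-1"},
    "bob@domain.com": {"iso-8859-4"},
}


def allowed_for(address):
    """Personal charset list if the user has one, else the global list."""
    return PERSONAL_CHARSETS.get(address, GLOBAL_CHARSETS)


def permits(allowed, charset):
    """'*' in a list switches the check off for that list."""
    return "*" in allowed or charset.lower() in allowed


def is_spam(sender, recipients, charset):
    # A match on the sender's list ends the check; recipients are not consulted.
    if permits(allowed_for(sender), charset):
        return False
    # Otherwise ALL recipients must allow the charset.
    return not all(permits(allowed_for(r), charset) for r in recipients)


print(is_spam("bill@foo.com", ["john"], "us-ascii"))
print(is_spam("bill@foo.com", ["alice", "john"], "iso-8859-1"))
```

The second call is qualified as spam because john, who falls back to the global
list, is not allowed to receive iso-8859-1.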


5.3.2 Isolate mode

In isolate mode (-l key with any parameter except no on the command line, or
IsolateHTML="any, except no" in the configuration file) the filter does all the checks
of simple mode, and additionally checks permissions to receive HTML messages from
various sites and permissions of various users to send and receive HTML messages.

Permissions are set up in the sccusers and scchosts files (see the appropriate man
pages). They work as follows: when a host is allowed to send HTML mail to us, this
permission takes precedence over the user's permission to send HTML messages. So, when
the host www.baobab.com is allowed to send HTML, you do not need to list all of its
users separately to allow them to send HTML messages. When a host is not allowed to
send HTML, the permissions of individual users are checked: when the host
www.dereva.net is not allowed to send HTML but the user pila@dereva.net is, HTML mail
from pila@dereva.net will be processed. After the sender has been checked successfully,
the recipient is checked. Keep in mind that in the scchosts file you should specify the
mail gateway that connects to your server, not simply the domain name.

Also, -l has four values: no, incoming, outgoing and full. They mean, respectively: no
isolation, isolate incoming messages, isolate outgoing messages, isolate all messages.
When incoming or outgoing mode is active, the other direction is not checked at all, so
you can receive HTML messages freely but prohibit sending HTML messages into this big,
big pile of junk. When full mode is active, checking is performed in both directions.

Checking is performed in this order: global parameter, server parameter (when
receiving), user parameter. Each later one takes precedence, so when the active mode is
incoming, you can allow HTML for some server and receive all HTML mail from that
server, or you can allow HTML for some user and that user will receive all HTML mail
from all servers.
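
The four -l values only decide which direction of HTML mail is checked at all. A
minimal sketch of that choice, in Python for illustration only (the names
HTML_MODES and html_checked are assumptions, and the per-host and per-user
permissions are deliberately left out):

```python
# Directions that are subject to HTML checks for each -l value.
HTML_MODES = {
    "no": set(),
    "incoming": {"incoming"},
    "outgoing": {"outgoing"},
    "full": {"incoming", "outgoing"},
}


def html_checked(mode, direction):
    """True when HTML mail going in this direction is checked in this mode."""
    if direction not in ("incoming", "outgoing"):
        raise ValueError("unknown direction: " + direction)
    return direction in HTML_MODES[mode]


print(html_checked("outgoing", "incoming"))
print(html_checked("full", "incoming"))
```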

When mail arrives for several recipients at once, each (keep in mind, EACH!) recipient
must have permission to receive HTML messages. When even one of the recipients has no
permission to receive HTML messages, the incoming mail is qualified as spam and the
spamaction is issued.

So, to allow HTML mail to pass from pila@dereva.net to topor@derevo.com you should:
include dereva.net in the scchosts file with the mark html (see man scchosts(5) for
details) or include pila@dereva.net in the sccusers file with the mark htmlout, and
include topor@derevo.com in the sccusers file with the mark htmlin.

ONCE AGAIN: To receive HTML messages you must include the source server in the scchosts
file OR all of the recipients in the sccusers file! And do not forget: the scchosts
file MUST contain the mail gateway address, NOT the source domain!

5.3.3 Enterprise mode

In enterprise mode (-n among the command-line keys or EnterpriseActivate=yes in the
configuration file) the filter does all the checks of isolate mode, and additionally
checks permissions to send/receive PGP-encrypted messages (this implies other strong
cryptography), to receive mail with executable attachments, and to receive mail with
any attachments. Beware: permission for executable attachments implies permission for
any attachments, but permission for any attachments DOES NOT imply permission for
executable attachments! Setting permissions is described above. For more info about
permissions see man sccusers(5) and man scchosts(5). These features are not implemented
in this release, but will be implemented shortly. Which attachments will be qualified
as "executable", I do not know yet. Possibly it will become a parameter in the near
future...

6. Commandline parameters

-h (--help) - prints short help about each parameter and exits. Also prints version
              information and the compilation date.

-v (--version) - prints only version information and the compilation date.

-c (--sendmail) - specifies the filename of the sendmail communication socket.
                  Default is /var/run/sccmilter.

-d (--daemon) - run the filter in daemon mode (BEWARE! This is NOT the default!)

-D (--debug) - sets the debug level. BEWARE! Debug levels have an effect only when you
               have configured the filter with the --with-debug parameter, otherwise
               the debug code is absent from the filter.

                0 - no debug messages
                21 - messages about mail tracing events
                51 - messages about detailed mail tracing
                101 - messages with full technical log and verbose mail tracing. Also
                      this level implies the 'keep-mail-always' feature.
                121 - messages with extremely technical details (this level
                      additionally traces EACH line in the mail with its start and
                      end pointers. Be smart with this level, your maillog will
                      grow to a HUGE size very quickly!)

-p (--tmpdir) - temporary directory path. This path is used to keep isolated
                messages.

-t (--sendmail-timeout) - timeout for connecting to sendmail (seconds)

-m (--mode) - action mode for spam messages: discard simply discards the message,
              reject rejects it with the 554 5.7.1 error code, isolate combines these
              actions and additionally keeps the message in --tmpdir for further
              analysis.

-n (--enterprise) - switch on enterprise violation checks
-e (--emode) - enterprise violation mode: all modes are the same as the spam message
               modes. The reason for separating spam mode and enterprise mode is that
               enterprise violations can be committed by messages that are fully
               legitimate as far as spam is concerned

-l (--nohtml) - switch on HTML isolation checks. Has four parameters: no, full,
                incoming, outgoing. All these parameters are explained in the
                sccmilter(8) manual page and in the sccmilter.conf(5) manual page.

   (--imitate) - all actions are only imitated. BEWARE! This parameter has no short
                 key, because it is VERY DANGEROUS on production sites!

   (--no-delete) - never delete mail.

   (--spam-no-delete) - never delete mail classified as spam.

-g (--codepage) - set these codepages as the global codepages.

-o (--domain) - set this value as the domain part

-P (--pidfile) - use this file as the PID file.
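
The -D levels listed above look like thresholds rather than exact values (the
configure notes mention -D 111, which is not in the list). Under that assumption, which
this README does not state explicitly, a level enables every category whose threshold
it reaches. A sketch in Python, for illustration only; the names are hypothetical:

```python
# (threshold, category) pairs taken from the -D description.
DEBUG_THRESHOLDS = [
    (21, "mail tracing events"),
    (51, "detailed mail tracing"),
    (101, "full technical log and verbose mail tracing"),
    (121, "per-line tracing with start and end pointers"),
]


def enabled_categories(level):
    """Assumption: every category with threshold <= level is switched on."""
    return [name for threshold, name in DEBUG_THRESHOLDS if level >= threshold]


def keeps_mail_always(level):
    """Level 101 and above implies the 'keep-mail-always' feature."""
    return level >= 101


for name in enabled_categories(111):
    print(name)
```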

7. Thanks

Special thanks to:

Damir Bikmuhametov (boco@ufanet.ru)     - for tracking the kavmilter sources and leading
                                          me to some critical checks in the sccmilter code
Kirill Ponomarew (ponomarew@oberon.net) - for maintaining the FreeBSD port and pointing
                                          out errors

Of course, very special thanks to Sendmail Inc. for the libmilter API :-)

8. Links

I have used only RFCs as sources on the structure of mail messages. No other programs or
documentation were studied. I think this is the clean way: the RFC is the standard, the
others are mirrors of it. I used these RFCs while developing the filter:

2045 Multipurpose Internet Mail Extensions (MIME) Part One: Format of
     Internet Message Bodies. N. Freed, N. Borenstein. November 1996.
     (Format: TXT=72932 bytes) (Obsoletes RFC1521, RFC1522, RFC1590)
     (Updated by RFC2184, RFC2231) (Status: DRAFT STANDARD)

2046 Multipurpose Internet Mail Extensions (MIME) Part Two: Media
     Types. N. Freed, N. Borenstein. November 1996. (Format: TXT=105854
     bytes) (Obsoletes RFC1521, RFC1522, RFC1590) (Updated by RFC2646)
     (Status: DRAFT STANDARD)

2231 MIME Parameter Value and Encoded Word Extensions: Character Sets,
     Languages, and Continuations. N. Freed, K. Moore. November 1997.
     (Format: TXT=19280 bytes) (Obsoletes RFC2184) (Updates RFC2045,
     RFC2047, RFC2183) (Status: PROPOSED STANDARD)

2387 The MIME Multipart/Related Content-type. E. Levinson. August
     1998. (Format: TXT=18864 bytes) (Obsoletes RFC2112) (Status: PROPOSED
     STANDARD)

2557 MIME Encapsulation of Aggregate Documents, such as HTML (MHTML).
     J. Palme, A. Hopmann, N. Shelness. March 1999. (Format: TXT=61854
     bytes) (Obsoletes RFC2110) (Status: PROPOSED STANDARD)

9. Contribution and availability

This program is based on public-domain example source code:
sample.c from the libmilter API documentation

In the contrib directory you can find accompanying materials. For now it contains only a
script that collects spam statistics on a daily basis. To run this script, place it in
the daily subdirectory of your local periodic scripts directory (for me it is
/usr/local/etc/periodic) and add daily_status_mail_spammed_enable="YES" to your
periodic.conf file. It also contains some spam mail examples which I used to test the
filter in HTML isolation mode. Keep in mind that these files are provided ONLY for
testing purposes!

spamtest1 - multipart message, one plain-text part, one HTML part, without charset
            specification, Content-Type on the last line of the description.

spamtest2 - multipart message, one 'nested multipart' part (HTML hidden this way still
            cannot be caught by the filter) with one empty part and one HTML part, one
            image part, without charset specification, Content-Type on the first line
            of the description.

spamtest3 - multipart message, one HTML part with charset specification, Content-Type
            on the first line of the description, two image parts.

testmail1 - multipart message, two text parts, each with charset specification, multiline
            Content-Type on the first line of the description

spamtest5 - most "lovely" sample :-) multipart message, with charset in header, one empty
	    text part, one HTML part (all without charset specification)

spamtest6 - multipart message, one text part with charset specification, one HTML part,
	    also with charset specification.

spamtest7 - one-part message in plain text with some unusual charset

spamtest8 - multipart message, one text part with charset specification, one HTML part,
	    also with charset specification.

testmail2 - multipart message with spaces in boundary specification

This program has been written and contributed by Rashid N. Achilov (shelton@granch.ru).
Bug reports, wishes and comments are always welcome (but smart comments only; dumb
comments will be forwarded to the 'devnull' user :-) )

Fresh versions can be downloaded from http://granch.ru/~shelton/fileZ/sccmilter.tar.bz2

On my homepage you can also find the latest news about new features, changes and
bug fixes in sccmilter. And, of course, other (probably important) news.

To be continued...

Rashid N. Achilov.
E-mail shelton@granch.ru  Web: http://granch.ru/~shelton
PGP: 83 CD E2 A7 37 4A D5 81 D6 D6 52 BF C9 2F 85 AF 97 BE CB 0A
