mod_clamav: an Apache virus scanning filter

Apache 2 introduces filters, which allow one module to modify content generated by another. mod_clamav is an Apache 2 filter that scans the content delivered by the proxy module (mod_proxy) for viruses, using the Clamav virus scanning engine.

mod_clamav was written and is currently maintained by Andreas Müller. It is distributed under the GNU General Public License; see the file COPYING in the distribution for details.

This document describes Version 0.12 of mod_clamav. This version can be downloaded from http://software.othello.ch/mod_clamav/mod_clamav-0.12.tar.gz. The most current version will always be available at http://software.othello.ch/mod_clamav/.

Installation

Before installing mod_clamav, make sure you have Clamav properly installed. The module is of only limited use without the proxy module (mod_proxy), which Apache does not build by default. You may therefore need to go back to your Apache build and adjust the options passed to configure so that the proxy module is built.

The only configuration option necessary for mod_clamav is --with-apache=/your/apache2/directory, so installing the module usually takes the familiar steps:

# ./configure --with-apache=/usr/local/apache2
# make
# make install

mod_clamav has so far been tested on Linux, Solaris and Mac OS X (the latter supports only local mode). If you succeed in installing the module on some other platform, please let the maintainer know.

Configuration

The module comes with reasonable defaults for most options. Depending on your platform, however, and in particular on its implementation of shared memory and mutexes (which are used for the statistics page), some directives may be required that are not necessary on other platforms.

Here is a configuration for an Apache proxy that scans everything except some image types for viruses, using the database files in /usr/local/share/clamav. While a file is being downloaded, mod_clamav writes a copy to /tmp/clamav, where it is later scanned for viruses.

ClamavTmpdir    /tmp/clamav
ClamavDbdir     /usr/local/share/clamav
ClamavSafetypes image/gif image/jpeg image/png
<Proxy *>
    SetOutputFilter CLAMAV
</Proxy>

The status page can be enabled with a Location block:

<Location /clamav>
        SetHandler clamav
</Location>

Please note that leaving access to this location unrestricted may reveal sensitive information.
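
A minimal sketch of restricting the status page to requests from the local machine. It assumes the host-based access control directives that ship with Apache 2.0 and 2.2 (Order, Deny, Allow); Apache 2.4 and later express the same restriction with "Require local". The address 127.0.0.1 is only an illustration, not something mod_clamav prescribes.

```apache
<Location /clamav>
    SetHandler clamav
    # Apache 2.0/2.2 syntax: refuse everyone, then allow the local host only.
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```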

The contents of the status page depend on the configuration: in daemon mode there is no way to measure the CPU time spent checking for viruses, so no CPU time is displayed.

How it works

mod_clamav is an Apache 2 filter, so there is no hope that it will ever be usable with Apache 1. Filters were introduced in Apache 2 to inspect and modify content delivered by some other module.

mod_clamav takes the output of the proxy module and scans it for viruses using the Clamav library (local mode) or the Clamav daemon (daemon mode). In local mode, the virus scanning engine is therefore part of the Apache process, so virus scanning does not take an extra round trip to a separate virus scanning proxy, as it does with many other virus scanning products.

The Clamav library could work entirely in main memory, but this would cause a problem for large downloads: they could eat up all memory, starving the machine in the process. Hence mod_clamav writes the data to a file whose location is configurable with the ClamavTmpdir directive. If file I/O is a problem, the temporary files can be placed on a ramdisk.
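
As an illustration of the ramdisk suggestion, the following sketch assumes a Linux system where /dev/shm is a memory-backed tmpfs mount, and a hypothetical subdirectory /dev/shm/clamav that has been created and made writable for the user Apache runs as. Neither path comes from mod_clamav itself.

```apache
# Hypothetical directory on a memory-backed file system
# (assumption: Linux tmpfs mounted at /dev/shm).
ClamavTmpdir /dev/shm/clamav
```

Temporary files on a ramdisk consume main memory again, so it is worth limiting the size of the ramdisk to keep large downloads from starving the machine.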

Long downloads create a special dilemma for a virus scanning proxy: the proxy should not send anything to the browser before it has made sure the object is virus free, but the browser may think the server has a problem if no data is transmitted for a long time. mod_clamav therefore sends one byte of the file being downloaded to the browser every minute (or at a shorter interval if you prefer). This is usually enough to keep the browser happy.

Some platforms do not support daemon mode because the Clamav daemon, which uses pthreads, is not available for them. One example is Mac OS X, on which mod_clamav can be used only in local mode.

One problem with browsers is that they time out if the proxy does not send any data to them. So mod_clamav sends a single byte every minute, even before anything has been checked for viruses. This has the side effect that no HTML error message can be displayed to the client once a byte has been sent. If the transfer from the server completes within the first minute, i.e. before the first trickle byte is sent to the browser, mod_clamav can still send an HTML error message (new in 0.9).

Debugging

mod_clamav provides very verbose logging if it is enabled at compile time. If the preprocessor flag CLAMAV_DEBUG is set to 1 instead of the default 0, additional messages are generated at run time. If you run into a problem with mod_clamav, please compile with debugging enabled and run the server with LogLevel set to debug.

Reference

All available directives are described below.

ClamavMode

Syntax: ClamavMode local | daemon
Default: ClamavMode local
Context: server config, virtual host, directory

If the module is supposed to use the Clamav library directly, use local mode. In daemon mode, the module queries a separate clamd process (running on the same machine) for virus checking. The connection to the daemon must be configured using the ClamavSocket or ClamavPort directive.


ClamavSocket

Syntax: ClamavSocket unix-domain-socket
Default: none
Context: server config, virtual host, directory

Specifies the path of the Unix domain socket on which the Clamav daemon clamd is listening. If this directive is not set, the daemon mode of the module assumes a TCP connection to the Clamav daemon.


ClamavPort

Syntax: ClamavPort port
Default: none
Context: server config, virtual host, directory

Specifies the port number on which the Clamav daemon is listening. Note that this directive has an effect only if ClamavSocket is not specified.
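
Putting the daemon-mode directives together, a configuration might look like the sketch below. The socket path is hypothetical, and the port shown is only the port clamd conventionally uses for TCP; in both cases use whatever your clamd configuration actually specifies.

```apache
# Variant 1: talk to clamd over a Unix domain socket.
ClamavMode   daemon
# Hypothetical path; it must match the socket clamd listens on.
ClamavSocket /var/run/clamav/clamd.sock

# Variant 2: talk to clamd over TCP instead (leave ClamavSocket unset).
# ClamavMode daemon
# ClamavPort 3310
```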


ClamavTmpdir

Syntax: ClamavTmpdir tmp-dir
Default: ClamavTmpdir /tmp
Context: server config, virtual host, directory

This directive defines the directory where temporary files are stored until they can be scanned for viruses.


ClamavDbdir

Syntax: ClamavDbdir virus-pattern-dir
Default: same as that of your clamav installation
Context: server config, virtual host, directory

This directive defines the directory from which virus patterns are loaded.


ClamavReloadInterval

Syntax: ClamavReloadInterval interval
Default: 0
Context: server config, virtual host, directory

The pattern database is reloaded if the last request is more than interval seconds in the past. A value of 0 means that the pattern database is never reloaded; to update patterns, the server must then be gracefully restarted. Reloading is only necessary in local mode; in daemon mode it is the daemon's business to keep the pattern matching engine up to date.
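
A minimal sketch of a local-mode setup that picks up new patterns without a graceful restart. The value of one hour is only an example and is not a recommendation made by the module.

```apache
ClamavMode           local
# Reload the pattern database when the previous request is more than
# 3600 seconds (one hour) in the past.
ClamavReloadInterval 3600
```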


ClamavTrickleInterval

Syntax: ClamavTrickleInterval interval
Default: ClamavTrickleInterval 60
Context: server config, virtual host, directory

This directive sets the interval, in seconds, at which a block (normally one byte, but configurable with the ClamavTrickleSize directive) of the incoming data is sent to the browser to keep it happy. If your browsers are tolerant of long delays, this value can be increased.

Note that the trickle interval has a side effect that can affect your link load considerably: only when a trickle block is sent to the client can the module detect that the client has aborted the connection. A long trickle interval means that the server will continue downloading the file although the client is no longer interested. This can fill up your link with ongoing downloads that no user is interested in any more.

Browsers behave quite differently with respect to timeouts. For some browsers, a single byte is not good enough, so you will want to increase the trickle size to a larger value. Download speeds below 1 byte/sec seem to be a problem for browsers. Apple's Safari browser times out after 60 seconds (Mozilla seems to be more patient), so you will have to make the trickle interval smaller than 60. Note also that the trickle interval is a minimum value: a trickle block is sent to the browser only when a packet arrives from the remote server after that interval has elapsed. If no packets arrive from the remote server, no trickle blocks are sent to the client either.


ClamavTrickleSize

Syntax: ClamavTrickleSize size
Default: ClamavTrickleSize 1
Context: server config, virtual host, directory

This directive sets the size of the block sent after each trickle interval. See the description of the ClamavTrickleInterval directive for details.
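
The two trickle directives are usually tuned together. The sketch below uses illustrative values chosen so that the interval stays under the 60 second Safari timeout and the average rate stays above 1 byte/sec; they are assumptions to adapt, not tested recommendations.

```apache
# Send a trickle block every 30 seconds instead of every 60.
ClamavTrickleInterval 30
# Send 64 bytes per trickle block instead of the default single byte
# (64 bytes / 30 seconds is roughly 2 bytes/sec).
ClamavTrickleSize     64
```

Keep in mind that trickle blocks are sent before the file has been checked, so a larger trickle size also means more unscanned data reaches the client.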


ClamavMaxfiles

Syntax: ClamavMaxfiles number-of-files
Default: none
Context: server config, virtual host, directory

This directive sets the maxfiles limit variable in Clamav. Please read the Clamav documentation for the exact implications of this.


ClamavMaxfilesize

Syntax: ClamavMaxfilesize filesize
Default: none
Context: server config, virtual host, directory

This directive sets the maxfilesize limit variable in Clamav. Please read the Clamav documentation for the exact implications of this.


ClamavRecursion

Syntax: ClamavRecursion depth
Default: none
Context: server config, virtual host, directory

This directive sets the recursion depth limit variable in Clamav. Please read the Clamav documentation for the exact implications of this.
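
The three limit directives can be combined as in the sketch below. All values are purely illustrative, and the unit of ClamavMaxfilesize is assumed here to be bytes; the actual units and semantics are those of the corresponding Clamav limits, so check the Clamav documentation before copying any of them.

```apache
# Illustrative values only; units and semantics are defined by Clamav.
ClamavMaxfiles    1000
# Assumption: the value is given in bytes (10485760 = 10 MiB).
ClamavMaxfilesize 10485760
ClamavRecursion   8
```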


ClamavSafetypes

Syntax: ClamavSafetypes safe-mime-type ...
Default: none
Context: server config, virtual host, directory

Use this directive to specify a list of MIME types that are considered safe and can therefore bypass virus scanning.

ClamavSizelimit

Syntax: ClamavSizelimit size
Default: ClamavSizelimit 0
Context: server config, virtual host, directory

This directive sets the size of the largest part of a file that will be checked. By default, its value is 0, meaning that a file is scanned in its entirety. For a positive value, a chunk of at least size bytes is downloaded and checked for viruses. If nothing is found, the rest of the file is downloaded without checking.
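
A minimal sketch of a size limit; the value is an arbitrary example.

```apache
# Check at least the first 1048576 bytes (1 MiB) of each download;
# the remainder of a larger file is passed through without checking.
ClamavSizelimit 1048576
```

The trade-off follows directly from the description above: a virus located beyond the checked chunk will not be detected, so a positive limit buys shorter delays on large downloads at the cost of coverage.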



© 2003 Dr. Andreas Müller, Beratung und Entwicklung
$Id: mod_clamav.html.in,v 1.6 2003/11/20 15:04:02 afm Exp $