To: users, dev, announce
Subject: ANNOUNCE: Apache SpamAssassin 3.4.0 available

Release Notes -- Apache SpamAssassin -- Version 3.4.0

Introduction
------------

This is a major release. It introduces over two years of bug fixes and features since the release of SpamAssassin 3.3.2 on June 16, 2011.

3.4.0 includes the Bayes Redis (http://redis.io/) back-end (bug 6879), EDNS0 changes (bug 6910), native IPv6 support, numerous URIBL.pm changes or features and a small API change in libspamc (bug 6562) with many other subtle changes.

SpamAssassin was tested on perl 5.18.2, and (out of curiosity) also on a Raspberry Pi (ARM6, Raspbian / Debian 7.2 Wheezy, perl 5.14.2) ... yes, it is 20 times slower compared to i7-960 CPU, but all tests pass!

Overall, this release has been tested on many production-level environments for nearly a year, including testing on an IPv6-only host. It is highly recommended and stable.

NOTE: Complete changes are available at
http://svn.apache.org/repos/asf/spamassassin/branches/3.4/Changes

Notable Sendmail Bug
--------------------

Sendmail 8.14.5 and below contain a canonicalization misfeature / bug that can cause DKIM failures. See https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6462.

Compatibility with version 3.3.2
--------------------------------

* DNS queries generated by SpamAssassin now enable option EDNS0 in query packets and specify a buffer size of 4096 bytes by default. This allows DNS replies larger than 512 bytes to be returned in one UDP datagram, avoiding a need for re-issuing a failed query over a TCP protocol. This default setting is well suited if a DNS resolver (i.e. a recursive DNS server) is located on the same LAN as a host running SpamAssassin, which is the usual setup for all but perhaps some home uses of SpamAssassin.
  The option should be disabled (by 'dns_options noedns0') when a recursive DNS server is only reachable through some old-fashioned firewall or through some picky router with deep packet inspection which bans DNS UDP messages larger than 512 bytes, or blocks fragmented UDP datagrams. The 'dns_options' setting is documented in the Mail::SpamAssassin::Conf POD or man page; more details are in bug 6910 and bug 6862.

* A default setting for option 'dns_available' was changed from 'test' to 'yes' (bug 6770, bug 6769), so SpamAssassin now assumes by default that it is running on a host with an internet connection and a working DNS resolver. If this is not the case, please configure this option explicitly. The change avoids surprises on an otherwise well connected host which may experience a temporary DNS unavailability at system startup time or a temporary network outage while spamd is starting, in which case the initial failed test would disable DNS queries permanently. The option is documented in the Mail::SpamAssassin::Conf POD or man page.

* When Bayes classification is in use and messages are 'learned' as spam or ham and stored in a database, the Bayes plugin generates internal message IDs of learned messages and stores them in a 'seen' database to avoid re-learning duplicates and accidental un-learning of messages that were not previously learned.

  With changes in bug 5185, the calculation of message IDs in a bayes 'seen' database has changed, so new code can no longer associate new messages with those learned before the change. Note that this change does not affect recognition of old tokens and the classification algorithm; only duplicate detection and unlearning of old messages are affected.

  Because of this change, if you use Bayes and you are upgrading from a version prior to 3.4.0, you may consider wiping your Bayes database and starting fresh. However, this is not mandatory.
  If you choose to keep your current database tokens, these are the ramifications:

  1 - If you re-process emails that have already been learned before, it will create duplicate entries because of the new msg_id format. The duplicates will expire, eventually, and should cause minimal impact unless it occurs frequently.

  2 - If you try to unlearn or reclassify an email processed prior to the upgrade, the system will be unable to do so because of the new msg_id format. If unlearning a message (that was learned before the change) is important, consider just clearing your Bayes store and starting from scratch.

Dependency changes since version 3.3.2
--------------------------------------

Dependencies on the following Perl modules were dropped: Net::Ident, IP::Country::Fast and IP::Country.

Dependency on a perl module LWP::UserAgent as used by sa-update is now made optional if any of the programs curl, wget, or fetch is available.

New optional dependencies on the following Perl modules were introduced:

- new optional dependency on Geo::IP in a RelayCountry plugin (bug 6599); for backward compatibility IP::Country::Fast is used if Geo::IP is not installed
- new optional dependency on IO::Socket::IP for a cleaner IP support regardless of a protocol family (IPv4 and IPv6)
- new optional dependency on Net::Patricia to speed up lookups on internal_networks, trusted_networks or msa_networks when these lists contain a larger number of entries
- new optional dependency on programs curl, wget, or a FreeBSD fetch. sa-update will use any of these external programs to download rule updates, either over IPv6 or over IPv4.
  Any of these three programs suffices. The installation procedure is currently unclear on this: its warning may be read as if all three programs were needed, which is not the case
- minimal required version of NetAddr::IP was bumped to 4.010

Internal changes potentially affecting third party software using
Mail::SpamAssassin library
-----------------------------------------------------------

A caller is now given a choice of calling srand() by itself (e.g. before forking) or letting the SpamAssassin library do it as before. Avoiding redundant initialization of perl's random number generator can prevent unnecessary entropy loss. It is controlled by option skip_prng_reseeding in a call to Mail::SpamAssassin::new(). The change was documented in bug 6690.

The Mail::SpamAssassin::parser can now accept a message also as a string reference, avoiding one copy in memory. Documented in bug 6686.

A caller may pass the original mail body size to Mail::SpamAssassin::parse through the suppl_attrib argument's field 'body_size'. This mail body size is accessible to the eval rule check_body_length. It can be useful when a caller only passes a truncated message to SpamAssassin. Documented in bug 6830.

A new plugin callback "prefork_init" was introduced, which should be called by a master process (e.g. spamd) before forking multiple child processes. For compatibility this call is currently optional, but recommended for new versions. Currently only a Redis backend for Bayes checks will benefit from being notified before a fork. Documented in bug 6942.

Notable bug fixes
-----------------

The sa-update program now avoids repeatedly downloading the same rules if subsequent unpacking of rules and updating fails. Documented in bug 6655.

Several incompatibilities with newer versions of a perl module Net::DNS as used by sa-update and by the SpamAssassin library were fixed. See Net::DNS problem [rt.cpan.org #83451].

The Razor agent perl module clobbers the entropy of the random number generator by re-initializing the generator on every call. The SpamAssassin Razor plugin now provides a workaround, preserving entropy across calls to the Razor2 agent.

A workaround in BayesStore/MySQL.pm was added for a MySQL server bug, see http://bugs.mysql.com/bug.php?id=46675 .

Documentation was fixed: trailing dots in DNSBL zone names are not required since version 3.1.0 of Mail::SpamAssassin (September 2005).

Notable features:
=================

Redis database backend for a Bayes database
-------------------------------------------

In addition to existing backends, version 3.4.0 introduces support for keeping a Bayes database on a Redis server, either running locally, or accessed over network. Similar to SQL backends, the database may be concurrently used by several hosts running SpamAssassin.

The current implementation only supports a global Bayes database, i.e. per-recipient sub-databases are not supported. The Redis 2.6.* server supports access over IPv4 or over a Unix socket; starting with version 2.8.0, IPv6 is also supported. Bear in mind that a Redis server only offers limited access controls, so it is advisable to let the Redis server bind to a loopback interface only, or to use other mechanisms to limit access, such as local firewall rules.

The Redis backend for Bayes can put Lua scripting support in a Redis server to good use, improving performance. Lua support is available in the Redis server since version 2.6. In the absence of Lua support, the Redis backend uses batched (pipelined) traditional Redis commands, so it should work with a Redis server version 2.4 (untested), although this is not recommended for busy sites.

Expiration of token and 'seen' message id entries is left to the Redis server. There is no provision for manually expiring a database, so it is highly recommended to leave the setting bayes_auto_expire to its default value 1 (i.e. enabled).

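The version notes above can be turned into a quick check before pointing Bayes at a Redis server. The following shell sketch is only an illustration: the function name redis_features is hypothetical (it is not part of SpamAssassin or Redis), and the thresholds are simply the ones stated above (Lua scripting from 2.6, IPv6 access from 2.8.0).

```shell
#!/bin/sh
# Hypothetical helper, not shipped with SpamAssassin or Redis.
# Given a Redis version string such as "2.6.17", report which of the
# capabilities mentioned in these notes that server version offers.
redis_features() {
    ver=$1
    major=${ver%%.*}        # text before the first dot
    rest=${ver#*.}          # text after the first dot
    minor=${rest%%.*}       # second version component

    lua=no                  # Lua scripting: Redis 2.6 and later
    ipv6=no                 # IPv6 access: Redis 2.8.0 and later
    if [ "$major" -gt 2 ]; then
        lua=yes
        ipv6=yes
    elif [ "$major" -eq 2 ]; then
        if [ "$minor" -ge 6 ]; then lua=yes; fi
        if [ "$minor" -ge 8 ]; then ipv6=yes; fi
    fi
    printf 'lua=%s ipv6=%s\n' "$lua" "$ipv6"
}
```

With a running server, the version string could presumably be obtained with something like redis-cli INFO | sed -n 's/^redis_version://p' | tr -d '\r' and passed to the function as its only argument. A result of lua=no would mean the backend falls back to pipelined commands, as described above.
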
Example configuration:

  bayes_store_module Mail::SpamAssassin::BayesStore::Redis
  bayes_sql_dsn      server=127.0.0.1:6379;password=foo;database=2
  bayes_token_ttl    21d
  bayes_seen_ttl     8d
  bayes_auto_expire  1

Improved support for IPv6
-------------------------

The rules-updating program sa-update and its infrastructure are now usable over either IPv4 or IPv6, including from an IPv6-only host (bug 6654).

SpamAssassin is now usable on an IPv6-only host: this affects installation, self-tests, rule updates, client, server, and a command-line spamassassin.

Command line options -4 and -6 were added to prefer/choose/force IPv4 or IPv6 in programs spamassassin, spamd, spamc, and sa-update.

Command line options --listen and --allowed-ips in spamd can now accept IPv6 addresses.

Preferably a perl module IO::Socket::IP is used (if it is available) for network communication regardless of a protocol family - for DNS queries, by spamd server side, and by a client code in Mail::SpamAssassin::Client. As a fallback when the module IO::Socket::IP is unavailable, an older module IO::Socket::INET6 is used, or eventually IO::Socket::INET is used as a last resort.

If spamd fails to start with an 'Address already in use' message, please install perl module IO::Socket::IP, or deinstall IO::Socket::INET6, or specify a socket bind address explicitly with a spamd --listen option. See bug 6953 for details.

The spamd server can now simultaneously listen on multiple sockets, possibly in different protocol domains (Unix sockets, INET or INET6 protocol families).

DnsResolver was updated allowing it to work on an IPv6-only host (bug 6653).

A plugin RelayCountry now uses module Geo::IP and its database of IPv6 addresses GEOIP_COUNTRY_EDITION_V6 when available.

The following configuration options were extended to accept IPv6 addresses: dns_server, trusted_networks, internal_networks, msa_networks, (but not yet the whitelist_from_rcvd), and their defaults were adjusted accordingly.

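The module preference order described earlier in this section (IO::Socket::IP, then IO::Socket::INET6, then IO::Socket::INET) can be checked from a shell before starting spamd. The sketch below is an illustration under stated assumptions, not the selection logic SpamAssassin itself runs: the function name pick_socket_module is hypothetical, and it only tests whether each module loads in whichever perl is found on PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the first socket module, in the preference
# order given in these notes, that the local perl can load.
pick_socket_module() {
    for mod in IO::Socket::IP IO::Socket::INET6 IO::Socket::INET; do
        # "perl -MModule -e 1" exits 0 only if the module loads cleanly
        if perl -M"$mod" -e 1 >/dev/null 2>&1; then
            printf '%s\n' "$mod"
            return 0
        fi
    done
    return 1    # perl missing, or none of the modules could be loaded
}
```

If this prints IO::Socket::INET6 on a host where spamd reports 'Address already in use', the advice given above applies: install IO::Socket::IP, or pass an explicit bind address with --listen.
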
The parser code of Received header fields can now deal with IPv6 addresses in a mail header section.

The AutoWhitelist plugin was updated and can now deal with IPv6 addresses.

Installation unit tests were updated to prevent them from failing on an IPv6-only host.

New command-line options
------------------------

New command-line option for spamd: added an option --listen (or -i), which can be specified multiple times and allows spamd to accept requests over multiple INET (IPv4) or INET6 (IPv6) or UNIX sockets. See bug 6841, and see also option --port.

New command-line option for spamc: -X (or --unavailable-tempfail) allows spamc to return EX_TEMPFAIL instead of EX_UNAVAILABLE when using option -x.

As already noted in the 'Improved support for IPv6' section, options -4 and -6 were added to programs spamassassin, spamd, spamc, and sa-update.

The sa-update utility can now take multiple -v or --verbose options to increase verbosity.

The sa-learn command has a new option --max-size.

New configuration options
-------------------------

Plugin/URIDNSBL: new tflags options 'a' and 'ns' were introduced. They are documented in the Mail::SpamAssassin::Plugin::URIDNSBL POD or man page.

Plugin/AutoLearnThreshold: new option autolearn_force was added. It is documented in the Mail::SpamAssassin::Plugin::AutoLearnThreshold POD or man page.

Plugin/ASN: new options asn_prefix and clear_asn_lookups were added. They are documented in the Mail::SpamAssassin::Plugin::ASN POD or man page.

The following new options, as implemented by various plugins or by other modules, are all documented in the Mail::SpamAssassin::Conf POD or man page:

- Plugin/WLBLEval: new configuration options were added: enlist_uri_host, delist_uri_host, with shorthands blacklist_uri_host and whitelist_uri_host and an associated eval rule check_uri_host_listed.
- Configuration options dns_query_restriction (allow|deny) and clear_dns_query_restriction were added (bug 6884).
- A 'dns_options' setting accepts new sub-options 'dns0x20' and 'edns'.
- Added option 'dns_server' which specifies an IP address of a recursive DNS server (i.e. DNS resolver) and optionally its port number.
- Added options dns_local_ports_permit, dns_local_ports_avoid and dns_local_ports_none to control the local source port ranges available to DNS queries.
- Added the following sub-options to the tflags setting: autolearn_force, maxhits=N, ips_only, domains_only, a, ns.
- The option whitelist_from_rcvd can now take an IP address as its second argument (instead of a domain name), which can be useful for whitelisting a sending mailer which has no reverse DNS mapping.

ArchiveIterator has new options opt_max_size and opt_from_regex. They are documented in the Mail::SpamAssassin::ArchiveIterator POD or man page.

A new tag (macro) _RULESVERSION_ was added. It expands to a comma-separated list of rules versions, retrieved from an '# UPDATE version' comment in rules files and can be used in an 'add_header' configuration setting.

New plugins
-----------

A new plugin AskDNS was introduced. Using a DNS query template as specified in a parameter of an askdns rule, the plugin replaces tag names as found in the template with their values and launches DNS queries as soon as tag values become available. When DNS responses trickle in, it filters them according to the requested DNS resource record type and an optional subrule filtering expression, yielding a rule hit if a response meets filtering conditions.

Optimizations
-------------

Several smaller performance optimizations were introduced, among others: bug 6508 (uses Net::Patricia if available), bug 6854 (base64 attachments), bug 6915 (get_tag speedup).

The DNS client code module now caches queries and replies for the duration of processing one mail message. Duplicate DNS queries by different rules which happen to query the same DNS resource are now avoided.

Downloading and availability
----------------------------

Downloads are available from:

http://spamassassin.apache.org/downloads.cgi

md5sum of archive files:

  46e99adc0affebbe5f3524b4834e0345  Mail-SpamAssassin-3.4.0.tar.bz2
  5d0b50cee3bfa905cca35c33296c8c2a  Mail-SpamAssassin-3.4.0.tar.gz
  088a9b9bf7f3d93350f8c8920cbd2fe6  Mail-SpamAssassin-3.4.0.zip
  9c15df55e9ec2a3c8376f3e15e448a2e  Mail-SpamAssassin-rules-3.4.0.r1565117.tgz

sha1sum of archive files:

  5bc66cd599cbe6a38a127d7813d4abc8af03b667  Mail-SpamAssassin-3.4.0.tar.bz2
  4dac1384282b6201f7d80cea8295933ef08e7e28  Mail-SpamAssassin-3.4.0.tar.gz
  3fa7715fb4c8b558b5fbc2e5a1288a751d8d12e3  Mail-SpamAssassin-3.4.0.zip
  d71a64cab9f5454d3b164e44d3649bff9cb87f87  Mail-SpamAssassin-rules-3.4.0.r1565117.tgz

Note that the *-rules-*.tgz files are only necessary if you cannot, or do not wish to, run "sa-update" after install to download the latest fresh rules.

See the INSTALL and UPGRADE files in the distribution for important installation notes.

GPG Verification Procedure
--------------------------

Each release file is accompanied by a .asc file, which serves as an external GPG signature for that release file.

The signing key is available via the wwwkeys.pgp.net key server, as well as http://www.apache.org/dist/spamassassin/KEYS

The key information is:

  pub   4096R/F7D39814 2009-12-02
        Key fingerprint = D809 9BC7 9E17 D7E4 9BC2 1E31 FDE5 2F40 F7D3 9814
  uid   SpamAssassin Project Management Committee
  uid   SpamAssassin Signing Key (Code Signing Key, replacement for 1024D/265FA05B)
  sub   4096R/7B3265A5 2009-12-02

To verify a release file, download the file with the accompanying .asc file and run the following commands:

  gpg -v --keyserver wwwkeys.pgp.net --recv-key F7D39814
  gpg --verify Mail-SpamAssassin-3.4.0.tar.bz2.asc
  gpg --fingerprint F7D39814

Then verify that the key matches the signature.

Note that older versions of gnupg may not be able to complete the steps above.
Specifically, GnuPG v1.0.6, 1.0.7 & 1.2.6 failed while v1.4.11 worked flawlessly.

See http://www.apache.org/info/verification.html for more information on verifying Apache releases.

About Apache SpamAssassin
-------------------------

Apache SpamAssassin is a mature, widely-deployed open source project that serves as a mail filter to identify spam. SpamAssassin uses a variety of mechanisms including mail header and text analysis, Bayesian filtering, DNS blocklists, and collaborative filtering databases. In addition, Apache SpamAssassin has a modular architecture that allows other technologies to be quickly incorporated as an addition or as a replacement for existing methods.

Apache SpamAssassin typically runs on a server, where it classifies and labels spam before it reaches your mailbox, while allowing other components of a mail system to act on its results.

Most of Apache SpamAssassin is written in Perl, with heavily traversed code paths carefully optimized. Benefits are portability, robustness and facilitated maintenance. It can run on a wide variety of POSIX platforms.

The server and the Perl library feel at home on Unix and Linux platforms, and reportedly also work on MS Windows systems under ActivePerl.

For more information, visit http://spamassassin.apache.org/

About The Apache Software Foundation
------------------------------------

Established in 1999, The Apache Software Foundation provides organizational, legal, and financial support for more than 100 freely-available, collaboratively-developed Open Source projects. The pragmatic Apache License enables individual and commercial users to easily deploy Apache software; the Foundation's intellectual property framework limits the legal exposure of its 2,500+ contributors.

For more information, visit http://www.apache.org/