NAME
    Mail::SpamAssassin::Pyzor::Digest::Pieces - Pyzor backend logic module

DESCRIPTION
    This module houses backend logic for Mail::SpamAssassin::Pyzor::Digest.

    It reimplements logic found in pyzor's digest.py module.

FUNCTIONS
  $strings_ar = digest_payloads( $EMAIL_MIME )
    This imitates the corresponding object method in digest.py. It returns a
    reference to an array of strings. Each string can be either a byte
    string or a character string (e.g., UTF-8 decoded).

    NB: RFC 2822 stipulates that message bodies should use CRLF line breaks,
    not plain LF (nor plain CR). We will thus convert any plain CRs in a
    quoted-printable message body into CRLF. Python, though, doesn't do
    this, so the output of our implementation of "digest_payloads()"
    diverges from that of the Python original. It doesn't ultimately make a
    difference since the line-ending whitespace gets trimmed regardless, but
    it's necessary to factor in when comparing the output of our
    implementation with the Python output.

  normalize( $STRING )
    This imitates the corresponding object method in digest.py. It modifies
    $STRING in-place.

    As with the original implementation, if $STRING contains (decoded)
    Unicode characters, those characters will be parsed accordingly. So:

        $str = "123\xc2\xa0";   # [ c2 a0 ] == \u00a0, non-breaking space

        normalize($str);

    The above will leave $str alone, but this:

        utf8::decode($str);

        normalize($str);

    ... will trim off the trailing non-breaking space, i.e., the last two
    bytes of the original $str.

  $yn = should_handle_line( $STRING )
    This imitates the corresponding object method in digest.py. It returns a
    boolean.

  $sr = assemble_lines( \@LINES )
    This assembles a string buffer out of @LINES. The string is the buffer
    of octets that will be hashed to produce the message digest.

    Each member of @LINES is expected to be an octet string, not a character
    string.

  ($main, $sub, $encoding, $checkval) = parse_content_type( $CONTENT_TYPE )

  @lines = splitlines( $TEXT )
    Imitates "str.splitlines()". (cf. "pydoc str")

    In list context, returns a plain list of lines. In scalar context,
    returns the number of lines that would be returned in list context.
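
The list/scalar behaviour described for "splitlines()" can be illustrated with a minimal, self-contained sketch. The function name "sketch_splitlines" is hypothetical and is not part of this module. It is assumed here to break only on CRLF, lone CR, and lone LF, which is narrower than Python's "str.splitlines()" and may differ from the module's actual implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for splitlines(); NOT the module's implementation.
# Assumption: only CRLF, lone CR, and lone LF count as line breaks.
sub sketch_splitlines {
    my ($text) = @_;

    # split() is evaluated in the caller's context: it yields the list of
    # fields in list context and the number of fields in scalar context.
    return split m<\r\n|\r|\n>, $text;
}

my $text = "first\r\nsecond\nthird\rfourth";

my @lines = sketch_splitlines($text);    # list context: the lines
my $count = sketch_splitlines($text);    # scalar context: how many lines

print "$_\n" for @lines;
print "count: $count\n";
```

Note that Perl's "split" drops trailing empty fields, so a sketch like this can disagree with Python on input that ends in several consecutive line breaks.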