NAME
    Mail::SpamAssassin::PerMsgStatus - per-message status (spam or not-spam)

SYNOPSIS
      my $spamtest = Mail::SpamAssassin->new({
        'rules_filename'      => '/etc/spamassassin.rules',
        'userprefs_filename'  => $ENV{HOME}.'/.spamassassin/user_prefs'
      });
      my $mail = $spamtest->parse();

      my $status = $spamtest->check ($mail);

      my $rewritten_mail;
      if ($status->is_spam()) {
        $rewritten_mail = $status->rewrite_mail ();
      }
      ...

DESCRIPTION
    The Mail::SpamAssassin "check()" method returns an object of this class.
    This object encapsulates all the per-message state.

METHODS
    $status->check ()
        Runs the SpamAssassin rules against the message pointed to by the
        object.

    $status->learn()
        After a mail message has been checked, this method can be called. If
        the score is outside a certain range around the threshold, i.e. if
        the message is judged more-or-less definitely spam or definitely
        non-spam, it will be fed into SpamAssassin's learning systems
        (currently the naive Bayesian classifier), so that future similar
        mails will be caught.

    $score = $status->get_autolearn_points()
        Return the message's score as computed for auto-learning. Certain
        tests are ignored:

          - rules with tflags set to 'learn' (the Bayesian rules)

          - rules with tflags set to 'userconf' (user white/black-listing rules, etc.)

          - rules with tflags set to 'noautolearn'

        Also note that auto-learning occurs using scores from either
        scoreset 0 or 1, depending on what scoreset is used during message
        check. It is likely that the message check and auto-learn scores
        will be different.
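
        The exclusion list above can be illustrated with a small
        stand-alone sketch. This is not SpamAssassin's implementation; the
        helper name autolearn_points_sketch and its three arguments (a
        list of hit rule names, a rule-to-score hash and a rule-to-tflags
        hash) are assumptions made only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the scores of the rules that hit, skipping any
# rule whose tflags include 'learn', 'userconf' or 'noautolearn'.
#   $hits   - array ref of rule names that hit
#   $scores - hash ref, rule name => score
#   $tflags - hash ref, rule name => space-separated tflags string
sub autolearn_points_sketch {
    my ($hits, $scores, $tflags) = @_;
    my $points = 0;
    for my $rule (@$hits) {
        my $flags = $tflags->{$rule} // '';
        next if $flags =~ /\b(?:learn|userconf|noautolearn)\b/;
        $points += $scores->{$rule} // 0;
    }
    return $points;
}
```

        The real method works on the message's own hit list and on the
        scoreset selected for auto-learning, so its result may differ from
        a plain sum like this one.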

    $score = $status->get_head_only_points()
        Return the message's score as computed for auto-learning, ignoring
        all rules except for header-based ones.

    $score = $status->get_learned_points()
        Return the message's score as computed for auto-learning, ignoring
        all rules except for learning-based ones.

    $score = $status->get_body_only_points()
        Return the message's score as computed for auto-learning, ignoring
        all rules except for body-based ones.

    $score = $status->get_autolearn_force_status()
        Return whether a message's score included any rules that are flagged
        as autolearn_force.

    $rule_names = $status->get_autolearn_force_names()
        Return a comma-separated list of rule names if a message's score
        included any rules that are flagged as autolearn_force.

    $isspam = $status->is_spam ()
        After a mail message has been checked, this method can be called. It
        will return 1 for mail determined likely to be spam, 0 if it does
        not seem spam-like.

    $list = $status->get_names_of_tests_hit ()
        After a mail message has been checked, this method can be called. It
        will return a comma-separated string, listing all the symbolic test
        names of the tests which were triggered by the mail.

    $list = $status->get_names_of_tests_hit_with_scores_hash ()
        After a mail message has been checked, this method can be called. It
        will return a reference to a hash of rule & score pairs for all the
        symbolic test names and individual scores of the tests which were
        triggered by the mail.

    $list = $status->get_names_of_tests_hit_with_scores ()
        After a mail message has been checked, this method can be called. It
        will return a comma-separated string of rule=score pairs for all the
        symbolic test names and individual scores of the tests which were
        triggered by the mail.
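
        When only the comma-separated form is at hand (for example after
        it has been written to a log), it can be turned back into a hash.
        The sketch below assumes the string consists solely of rule=score
        pairs as described above; parse_tests_with_scores is a
        hypothetical helper name, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "RULE_A=1.2,RULE_B=-0.5" into
# { RULE_A => 1.2, RULE_B => -0.5 }.
sub parse_tests_with_scores {
    my ($list) = @_;
    my %score_for;
    for my $pair (split /,/, $list // '') {
        my ($rule, $score) = split /=/, $pair, 2;
        # Skip anything that is not a well-formed rule=score pair.
        next unless defined $rule && length $rule && defined $score;
        $score_for{$rule} = $score + 0;
    }
    return \%score_for;
}

# Possible use after a check:
#   my $scores = parse_tests_with_scores(
#       $status->get_names_of_tests_hit_with_scores());
```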

    $list = $status->get_names_of_subtests_hit ()
        After a mail message has been checked, this method can be called. It
        will return a comma-separated string, listing all the symbolic test
        names of the meta-rule sub-tests which were triggered by the mail.
        Sub-tests are the normally-hidden rules, which score 0 and have
        names beginning with two underscores, used in meta rules.

        If a parameter of collapsed or dbg is passed, the output will be a
        condensed array of sub-tests with multiple hits reduced to one
        entry.

        If the parameter of dbg is passed, the output will be a condensed
        string of sub-tests with multiple hits reduced to one entry with the
        number of hits in parentheses. Some information is also added at the
        end regarding the multiple hits.

    $num = $status->get_score ()
        After a mail message has been checked, this method can be called. It
        will return the message's score.

    $num = $status->get_required_score ()
        After a mail message has been checked, this method can be called. It
        will return the score required for a mail to be considered spam.
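
        The accessors is_spam(), get_score(), get_required_score() and
        get_names_of_tests_hit() are often used together to produce a
        one-line summary. A minimal sketch, assuming $status has already
        been checked; the helper name summarize_status and the output
        layout are choices made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line summary from a checked status
# object using only the accessors documented above.
sub summarize_status {
    my ($status) = @_;
    return sprintf(
        '%s score=%.1f required=%.1f tests=%s',
        $status->is_spam() ? 'spam' : 'ham',
        $status->get_score(),
        $status->get_required_score(),
        $status->get_names_of_tests_hit(),
    );
}
```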

    $num = $status->get_autolearn_status ()
        After a mail message has been checked, this method can be called. It
        will return one of the following strings depending on whether the
        mail was auto-learned or not: "ham", "no", "spam", "disabled",
        "failed", "unavailable".

        If the message is flagged with autolearn_force, the returned
        string also includes that status and the rules hit. For example:
        "autolearn_force=yes (AUTOLEARNTEST_BODY)"

    $report = $status->get_report ()
        Deliver a "spam report" on the checked mail message. This contains
        details of how many spam detection rules it triggered.

        The report is returned as a multi-line string, with the lines
        separated by "\n" characters.

    $preview = $status->get_content_preview ()
        Give a "preview" of the content.

        This is returned as a multi-line string, with the lines separated by
        "\n" characters, containing a fully-decoded, safe, plain-text sample
        of the first few lines of the message body.

    $msg = $status->get_message()
        Return the object representing the message being scanned.

    $status->rewrite_mail ()
        Rewrite the mail message. This will at minimum add headers, and at
        maximum MIME-encapsulate the message text, to reflect its spam or
        not-spam status. The function will return a scalar of the rewritten
        message.

        The actual modifications depend on the configuration (see
        "Mail::SpamAssassin::Conf" for more information).

        The possible modifications are as follows:

        To:, From: and Subject: modification on spam mails
            Depending on the configuration, the To: and From: lines can have
            a user-defined RFC 2822 comment appended for spam mail. The
            subject line may have a user-defined string prepended to it for
            spam mail.

        X-Spam-* headers for all mails
            Depending on the configuration, zero or more headers with names
            beginning with "X-Spam-" will be added to mail depending on
            whether it is spam or ham.

        spam message with report_safe
            If report_safe is set to true (1), then spam messages are
            encapsulated into their own message/rfc822 MIME attachment
            without any modifications being made.

            If report_safe is set to false (0), then the message will only
            have the above headers added/modified.

    $status->action_depends_on_tags($tags, $code, @args)
        Enqueue the supplied subroutine reference $code, to become runnable
        when all the specified tags become available. The $tags may be a
        simple scalar - a tag name, or a listref of tag names. The
        subroutine &$code when called will be passed a "permessagestatus"
        object as its first argument, followed by the supplied (optional)
        list @args.
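
        A minimal sketch of a call, assuming two hypothetical tag names
        (TAG_ONE and TAG_TWO) that some other code will set later. The
        helper name log_when_tags_ready and the $log array reference are
        also assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: queue a callback that records a line in @$log once
# both (made-up) tags are available.
sub log_when_tags_ready {
    my ($status, $log) = @_;
    $status->action_depends_on_tags(
        [ 'TAG_ONE', 'TAG_TWO' ],        # listref of tag names
        sub {
            my ($pms, @args) = @_;       # status object first, then @args
            push @$log, join(' ', 'ready:', @args);
        },
        'extra', 'arguments',            # passed through as @args
    );
    return;
}
```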

    $status->set_tag($tagname, $value)
        Set a template tag, as used in "add_header", report templates, etc.
        This API is intended for use by plugins. Tag names will be converted
        to an all-uppercase representation internally.

        $value can be a simple scalar (string or number), or a reference to
        an array, in which case the public method get_tag will join array
        elements using a space as a separator, returning a single string for
        backward compatibility.

        $value can also be a subroutine reference, which will be evaluated
        each time the template is expanded. The first argument passed by
        get_tag to a called subroutine will be a PerMsgStatus object (this
        module's object), followed by optional arguments provided by a
        caller to get_tag.

        Note that perl supports closures, which means that variables set in
        the caller's scope can be accessed inside this "sub". For example:

            my $text = "hello world!";
            $status->set_tag("FOO", sub {
                      my $pms = shift;
                      return $text;
                    });

        See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
        details on how template tags are used.

        "undef" will be returned if a tag by that name has not been defined.

    $string = $status->get_tag($tagname)
        Get the current value of a template tag, as used in "add_header",
        report templates, etc. This API is intended for use by plugins. Tag
        names will be converted to an all-uppercase representation
        internally. See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section
        for more details on tags.

        "undef" will be returned if a tag by that name has not been defined.

    $string = $status->get_tag_raw($tagname, @args)
        Similar to "get_tag", but keeps a tag name unchanged (does not
        uppercase it), and does not convert arrayref tag values into a
        single string.
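
        The difference between the two getters can be sketched as
        follows. MYLIST is a hypothetical tag name and tag_views a
        hypothetical helper; the tag name is written in upper case because
        set_tag stores names upper-cased while get_tag_raw does not
        convert the name it is given.

```perl
use strict;
use warnings;

# Hypothetical helper: store an array-valued tag, then read it back
# through both getters.
sub tag_views {
    my ($status) = @_;
    $status->set_tag('MYLIST', [ 'alpha', 'beta' ]);

    # get_tag joins array elements with a space, per the set_tag notes.
    my $joined = $status->get_tag('MYLIST');

    # get_tag_raw leaves the stored value unconverted.
    my $raw = $status->get_tag_raw('MYLIST');

    return ($joined, $raw);
}
```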

    $status->set_spamd_result_item($subref)
        Set an entry for the spamd result log line. $subref should be a code
        reference for a subroutine which will return a string in
        'name=VALUE' format, similar to the other entries in the spamd
        result line:

          Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
          DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
          TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
          TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
          uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
          rport=33153,mid=<9PS291LhupY>,autolearn=spam

        "name" and "VALUE" must not contain "=" or "," characters, as it is
        important that these log lines are easy to parse.

        The code reference will be called by spamd after the message has
        been scanned, and the "PerMsgStatus::check()" method has returned.
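
        Because "=" and "," are not allowed in either part, a plugin may
        want to sanitize its values before returning them. The sketch
        below is one possible approach; make_result_item is a hypothetical
        helper, and replacing the forbidden characters with "_" is a
        choice made for this example, not something spamd requires.

```perl
use strict;
use warnings;

# Hypothetical helper: return a code reference producing "name=VALUE",
# with any "=" or "," in either part replaced by "_".
sub make_result_item {
    my ($name, $value) = @_;
    s/[=,]/_/g for ($name, $value);   # edits the local copies only
    return sub { return "$name=$value" };
}

# Possible use inside a plugin:
#   $status->set_spamd_result_item(make_result_item('mykey', $some_value));
```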

    $status->finish ()
        Indicate that this $status object is finished with, and can be
        destroyed.

        If you are using SpamAssassin in a persistent environment, or
        checking many mail messages from one "Mail::SpamAssassin" factory,
        this method should be called to ensure Perl's garbage collection
        will clean up old status objects.
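
        A minimal sketch of such a loop, assuming $spamtest is an already
        configured "Mail::SpamAssassin" object and each element of
        @raw_messages holds one complete message. The helper name scan_all
        is hypothetical, and the call to finish() on the parsed message
        object belongs to the message class rather than to this one.

```perl
use strict;
use warnings;

# Hypothetical helper: check several messages with one factory object and
# release per-message state after each one.
sub scan_all {
    my ($spamtest, @raw_messages) = @_;
    my @verdicts;
    for my $raw (@raw_messages) {
        my $mail   = $spamtest->parse($raw);
        my $status = $spamtest->check($mail);
        push @verdicts, $status->is_spam() ? 1 : 0;
        $status->finish();   # release the status object
        $mail->finish();     # release the parsed message as well
    }
    return @verdicts;
}
```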

    $name = $status->get_current_eval_rule_name()
        Return the name of the currently-running eval rule. "undef" is
        returned if no eval rule is currently being run. Useful for plugins
        to determine the current rule name while inside an eval test
        function call.

    $status->get_decoded_body_text_array ()
        Returns the message body, with base64 or quoted-printable encodings
        decoded, and non-text parts or non-inline attachments stripped.

        This is the same result text as used in 'rawbody' rules.

        It is returned as an array of strings, with each string being a
        2-4kB chunk of the body, split from boundaries if possible.

    $status->get_decoded_stripped_body_text_array ()
        Returns the message body, decoded (as described in
        get_decoded_body_text_array()), with HTML rendered, and with
        whitespace normalized.

        This is the same result text as used in 'body' rules.

        It will always render text/html.

        It is returned as an array of strings, with each string representing
        one 'paragraph'. Paragraphs, in plain-text mails, are
        double-newline-separated blocks of multi-line text.
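
        Once the paragraphs are available they can be searched one at a
        time. The sketch below uses a hypothetical helper,
        paragraphs_matching, that accepts the paragraphs either as a
        plain list or as a single array reference, since which of the two
        the method hands back is not assumed here.

```perl
use strict;
use warnings;

# Hypothetical helper: return the paragraphs that match a pattern.
# Accepts either a list of strings or one array reference of strings.
sub paragraphs_matching {
    my ($pattern, @paragraphs) = @_;
    if (@paragraphs == 1 && ref($paragraphs[0]) eq 'ARRAY') {
        @paragraphs = @{ $paragraphs[0] };
    }
    return grep { $_ =~ $pattern } @paragraphs;
}

# Possible use:
#   my @hits = paragraphs_matching(qr/unsubscribe/i,
#       $status->get_decoded_stripped_body_text_array());
```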

    $status->get (header_name [, default_value])
        Returns a message header, pseudo-header, real name or address.
        "header_name" is the name of a mail header, such as 'Subject', 'To',
        etc. If "default_value" is given, it will be used if the requested
        "header_name" does not exist.

        Appending ":raw" to the header name will inhibit decoding of
        quoted-printable or base-64 encoded strings.

        Appending a modifier ":addr" to a header field name will cause
        everything except the first email address to be removed from the
        header field. It is mainly applicable to header fields 'From',
        'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
        the 'Return-Path'. For example, all of the following will result in
        "example@foo":

        example@foo
        example@foo (Foo Blah)
        example@foo, example@bar
        display: example@foo (Foo Blah), example@bar ;
        Foo Blah <example@foo>
        "Foo Blah" <example@foo>
        "'Foo Blah'" <example@foo>

        Appending a modifier ":name" to a header field name will cause
        everything except the first display name to be removed from the
        header field. It is mainly applicable to header fields containing a
        single mail address: 'From', 'Sender', along with their
        'Resent-From' and 'Resent-Sender' counterparts. For example, all of
        the following will result in "Foo Blah". One level of single quotes
        is stripped too, as it is often seen.

        example@foo (Foo Blah)
        example@foo (Foo Blah), example@bar
        display: example@foo (Foo Blah), example@bar ;
        Foo Blah <example@foo>
        "Foo Blah" <example@foo>
        "'Foo Blah'" <example@foo>

        There are several special pseudo-headers that can be specified:

        "ALL" can be used to mean the text of all the message's headers.
        Each header is decoded and unfolded to a single line, unless called
        with :raw.
        "ALL-TRUSTED" can be used to mean the text of all the message's
        headers that could only have been added by trusted relays.
        "ALL-INTERNAL" can be used to mean the text of all the message's
        headers that could only have been added by internal relays.
        "ALL-UNTRUSTED" can be used to mean the text of all the message's
        headers that may have been added by untrusted relays. To make this
        pseudo-header more useful for header rules the 'Received' header
        that was added by the last trusted relay is included, even though it
        can be trusted.
        "ALL-EXTERNAL" can be used to mean the text of all the message's
        headers that may have been added by external relays. Like
        "ALL-UNTRUSTED" the 'Received' header added by the last internal
        relay is included.
        "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
        headers.
        "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
        SMTP transaction that delivered this message, if this data has been
        made available by the SMTP server.
        "MESSAGEID" is a symbol meaning all Message-Id's found in the
        message; some mailing list software moves the real 'Message-Id' to
        'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
        'Message-Id' header. The value returned for this symbol is the text
        from all 3 headers, separated by newlines.
        "X-Spam-Relays-Untrusted" is the generated metadata of untrusted
        relays the message has passed through.
        "X-Spam-Relays-Trusted" is the generated metadata of trusted relays
        the message has passed through.
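
        The modifiers and the default value can be combined in a small
        wrapper. A minimal sketch, assuming $status has been checked; the
        helper name sender_summary and the fallback strings are choices
        made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: collect a few header-derived values, using the
# ":addr" and ":name" modifiers and the optional default value.
sub sender_summary {
    my ($status) = @_;

    my $subject = $status->get('Subject', '(no subject)');
    chomp $subject;   # drop a trailing newline if one is present

    return {
        addr    => scalar $status->get('From:addr', ''),
        name    => scalar $status->get('From:name', ''),
        subject => $subject,
    };
}
```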

    $status->get_uri_list ()
        Returns an array of all unique URIs found in the message. It takes a
        combination of the URIs found in the rendered (decoded and HTML
        stripped) body and the URIs found when parsing the HTML in the
        message. Will also set $status->{uri_list} (the array as returned by
        this function).

        The returned array will include the "raw" URI as well as "slightly
        cooked" versions. For example, the single URI
        'http://%77&#00119;%77.example.com/' will get turned into: (
        'http://%77&#00119;%77.example.com/', 'http://www.example.com/' )

    $status->get_uri_detail_list ()
        Returns a hash reference of all unique URIs found in the message and
        various data about where the URIs were found in the message. It
        takes a combination of the URIs found in the rendered (decoded and
        HTML stripped) body and the URIs found when parsing the HTML in the
        message. Will also set $status->{uri_detail_list} (the hash
        reference as returned by this function).

        The hash format looks something like this:

          raw_uri => {
            types => { a => 1, img => 1, parsed => 1, domainkeys => 1,
                       unlinked => 1, schemeless => 1 },
            cleaned => [ canonicalized_uri ],
            anchor_text => [ "click here", "no click here" ],
            domains => { domain1 => 1, domain2 => 1 },
            hosts => { host1 => domain1, host2 => domain2 },
          }

        "raw_uri" is whatever the URI was in the message itself
        (http://spamassassin.apache%2Eorg/). URIs parsed from text will be
        prefixed with a scheme if missing (http://, mailto:, etc.). HTML
        URIs are as found.

        "types" is a hash of the HTML tags (lowercase) which referenced the
        raw_uri. *parsed* is a faked type which specifies that the raw_uri
        was seen in the rendered text. *domainkeys* is defined when the
        raw_uri was found in a DK/DKIM d= field. *unlinked* is defined when
        it is assumed that the MUA will not linkify the URI (found in the
        body without a scheme or www. prefix). *schemeless* is always added
        for URIs without a scheme, regardless of linkifying (e.g. an email
        address found in the body without mailto:).

        "cleaned" is an array of the raw and canonicalized version of the
        raw_uri (http://spamassassin.apache%2Eorg/,
        https://spamassassin.apache.org/).

        "anchor_text" is an array of the anchor text (text between <a> and
        </a>), if any, which linked to the URI.

        "domains" is a hash of the domains found in the canonicalized URIs.

        "hosts" is a hash of unstripped hostnames found in the canonicalized
        URIs as hash keys, with their domain part stored as a value of each
        hash entry.
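
        Walking this structure is straightforward. The sketch below
        collects the domains of every URI that was referenced by a given
        type; domains_by_type is a hypothetical helper and relies only on
        the "types" and "domains" keys shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: given a hash ref in the format shown above, return
# the sorted, de-duplicated domains of all URIs having the given type.
sub domains_by_type {
    my ($detail, $type) = @_;
    my %seen;
    for my $raw_uri (keys %$detail) {
        my $info = $detail->{$raw_uri};
        next unless $info->{types} && $info->{types}{$type};
        $seen{$_} = 1 for keys %{ $info->{domains} || {} };
    }
    return sort keys %seen;
}

# Possible use:
#   my @linked = domains_by_type($status->get_uri_detail_list(), 'a');
```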

    $status->add_uri_detail_list ($raw_uri, $types, $source, $valid_domain)
        Adds values to the internal uri_detail_list. When used from
        plugins, it is recommended to call this from parsed_metadata (along
        with register_method_priority, -10) so that other plugins calling
        get_uri_detail_list() will see it.

        "raw_uri" is the URI to be added. The only required parameter.

        "types" is an optional hash reference, contents are added to
        uri_detail_list->{types} (see get_uri_detail_list for known keys).
        *parsed* is the default if no hash is given. *nocanon* does not run
        uri_list_canonicalize (no redirector, uri fixing). *noclean* skips
        adding uri_detail_list->{cleaned}, so it would not be used in "uri"
        rule checks, but domain/hosts would still be used for URIBL/RBL
        purposes.

        "source" is an optional simple string, only used for debug logging
        purposes to identify where uri originates from (default: "parsed").

        "valid_domain" is an optional boolean (0/1). If true, uri will not
        be added unless hostname/domain is in valid format and contains a
        valid TLD. (default: 0)
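
        A minimal sketch of a call showing the argument order. The helper
        name add_plugin_uri and the source label 'MyPlugin' are
        hypothetical; the types hash simply restates the documented
        default.

```perl
use strict;
use warnings;

# Hypothetical helper: register one extra URI, requiring a valid domain.
sub add_plugin_uri {
    my ($status, $uri) = @_;
    $status->add_uri_detail_list(
        $uri,               # raw_uri (required)
        { parsed => 1 },    # types
        'MyPlugin',         # source, used only in debug logging
        1,                  # valid_domain: skip URIs without a valid TLD
    );
    return;
}
```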

    $status->clear_test_state()
        Clear test state, including test log messages from
        "$status->test_log()".

    $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
        Register a hit against a rule in the ruleset.

        There are two mandatory arguments. These are $rulename, the name of
        the rule that fired, and $desc_prepend, which is a short string that
        will be prepended to the rule's "describe" string in output reports.

        In addition, callers can supplement that with the following optional
        data:

        score => $num
            Optional: the score to use for the rule hit. If unspecified, the
            value from the "Mail::SpamAssassin::Conf" object's "{scores}"
            hash will be used (a configured score), and in its absence the
            "defscore" option value.

        defscore => $num
            Optional: the score to use for the rule hit if neither the
            option "score" is provided, nor a configured score value is
            provided.

        value => $num
            Optional: the value to assign to the rule; the default value is
            1. *tflags multiple* rules use values of greater than 1 to
            indicate multiple hits. This value is accessible to meta rules.

        ruletype => $type
            Optional, but recommended: the rule type string. This is used in
            the "hit_rule" plugin call, called by this method. If unset,
            *'unknown'* is used.

        tflags => $string
            Optional: a string, i.e. a space-separated list of additional
            tflags to be appended to an existing list of flags in
            $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
            No syntax checks are performed.

        description => $string
            Optional: a custom rule description string. This is used in the
            "hit_rule" plugin call, called by this method. If unset, the
            static description is used.

        Backward compatibility: the two mandatory arguments have been part
        of this API since SpamAssassin 2.x. The optional *name=>value*
        pairs, however, are a new addition in SpamAssassin 3.2.0.
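
        A minimal sketch of a call from plugin code, using only the
        options listed above. The helper name report_custom_hit, the
        fallback score of 0.1 and the description text are assumptions
        made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: register a hit for $rulename that matched $count
# times, with a fallback score in case none is configured.
sub report_custom_hit {
    my ($status, $rulename, $count) = @_;
    $status->got_hit(
        $rulename,
        '',                                   # nothing to prepend
        defscore    => 0.1,                   # used if no score is configured
        value       => $count,                # visible to meta rules
        ruletype    => 'eval',
        description => "matched $count time(s)",
    );
    return;
}
```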

    $status->create_fulltext_tmpfile (fulltext_ref)
        This function creates a temporary file containing the passed scalar
        reference data (typically the full/pristine text of the message).
        This is typically used by external programs like pyzor and dccproc,
        to avoid hangs due to buffering issues. Methods that need this
        should call $self->create_fulltext_tmpfile($fulltext) to retrieve
        the temporary filename; it will be created if it has not already
        been.

        Note: This can only be called once until
        $status->delete_fulltext_tmpfile() is called.

    $status->delete_fulltext_tmpfile ()
        Cleans up after a $status->create_fulltext_tmpfile() call.
        Deletes the temporary file and uncaches the filename.
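
        Since the file must be deleted even when the code using it fails,
        the two calls are naturally paired. A minimal sketch, following
        the signatures documented here; the helper name
        with_fulltext_tmpfile and its callback argument are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: create the temporary file, pass its name to
# $callback, and always delete the file afterwards, re-throwing any error.
sub with_fulltext_tmpfile {
    my ($status, $fulltext_ref, $callback) = @_;
    my $tmpfile = $status->create_fulltext_tmpfile($fulltext_ref);

    my $result = eval { $callback->($tmpfile) };
    my $error  = $@;

    $status->delete_fulltext_tmpfile();
    die $error if $error;
    return $result;
}
```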

    all_from_addrs_domains
        This function returns all the various from addresses in a message
        using all_from_addrs() and then returns only the domain names.

SEE ALSO
    Mail::SpamAssassin(3) spamassassin(1)