NeoMutt: filtering email with Sieve

Focus is the most precious resource we have, and this is particularly true of email, where the sender bears less cost than the receiver. To reclaim some focus, this post covers filtering email using GNU Mailutils' Sieve. The nice part of using Sieve is that it's a standard, and by using free software on our own machines we won't have to recreate filters if we move to another mail provider. The weakness is that there are only a couple of implementations, but I've found Mailutils' Sieve meets all my needs and it's extensible.

There are a few different approaches to filtering and tagging, but for my straightforward requirements Sieve is both lightweight and easy to use. As I'm using mbsync to download email to my local maildir folders, I need a solution that can be run from the command line as part of my email checking script. The best-known implementation of Sieve is Dovecot's Pigeonhole, but it requires a full Dovecot IMAP server install - which seems over the top for simply filtering some email into a folder! Eventually, I discovered that GNU Mailutils has a Sieve implementation.

In this post I'm going to introduce Sieve (RFC 3028, since superseded by RFC 5228) and how it fits into the rest of my email system. For a deeper dive into the Sieve language I've put together a Sieve resources page which goes through the tests, flow control, match types and actions that the Sieve standard provides, along with examples.

This is part 6 of a series of posts on NeoMutt, which has become a series on command line email and supporting tools! Check out the rest of the series on the NeoMutt page.

Installing Mailutils

Mailutils is widely available in Linux distributions. For Guix, install it:

guix package --install mailutils

Lower down in this post we'll cover how to use it within a mail checking script, but for testing we'll call it manually.

Running Sieve

GNU Mailutils is a collection of mail servers and utilities that the project page calls "a Swiss army knife of electronic mail handling". Sieve is just one utility in the set.

Sieve can be run from the command line like this:

$ guix shell mailutils
[env]$ sieve --mbox-url=maildir:///home/<user>/.mail/INBOX --dry-run --verbose ~/bin/sieve-script1.sv

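This will run the Sieve script ~/bin/sieve-script1.sv against the mailbox specified in --mbox-url. With --dry-run Sieve only reports what it would do without carrying out any actions, which makes this a safe way to try out a script.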

The URL format maildir:// is one of the best features of Mailutils. It means we can convert between mbox and maildir very easily. It's particularly important for playing with Sieve as it's easy to create a temporary mailbox:

movemail --verbose --max-messages=20 mbox:///home/<user>/.mail/some.mbox \
      maildir:///home/<user>/.mail/<test-account>/INBOX

The movemail utility is part of Mailutils. This example moves messages from an mbox file to a maildir directory. Bear in mind this is moving not copying!

Sieve Introduction

The Sieve standard describes a mini-language for defining rules that take action on emails.

To use Sieve we create a test file which consists of a set of tests. The sieve tool checks the test file against the email in a mailbox. Each email is checked against the tests in the test file. If a test passes then the actions in that particular test are undertaken. In some cases the action section of a test will tell Sieve to cease checking further tests, but the default is that Sieve will continue to check the email against all tests in the test file. Generally the test file is called a "sieve script", but for clarity it's just a list of tests not a shell script or a program script.

The basic structure of a sieve test is:

if <test>
{
  # some comment
  <action_1>;
  [action_2];
}

For example, to check for email from a particular address:

if address :is "from" "bsmith@example.com"
{
  keep;
  stop;
}

In this example the if is a command, which runs the test. The address is a test that looks at an address in the email: here it's told to look at the "From:" header. The :is is a match type which specifies how the From line will be checked: it tells the test that the address has to match "bsmith@example.com" exactly. There are other match types, such as finding a substring in the field.

If it finds the address in the From field then the test passes and the action section of the rule is applied. The action section is surrounded by braces. The first action is to keep the email; this is the default action (it leaves the email in the mailbox). Notice that the action ends with a semicolon. The second action is stop, which stops Sieve checking further rules - if this wasn't there then Sieve would process the email with the next test, and all subsequent tests, all the way to the end of the sieve script.

It's also worth knowing that header names in tests are case insensitive - where we specified the header it could have been "from", "From", "FROM" or "FrOM" - they are all the same.

Often it's useful to have a list of things to test or use, for example:

if address :localpart :is "to" ["bob+newsletter1", "bob+newsletter2"]
{
    fileinto "newsletters";
    stop;
}

This uses the if command again (we'll see it a lot!). The test is the address test, which looks for email addresses in the structured headers. The match-type of :is precisely matches. In this case the address test has a parameter of :localpart which is the part in front of the at symbol - so the test is looking precisely for the part in front of the at symbol. It's provided with a string list which is the part inside square brackets. A list is always in square brackets with commas between items, and they can be used in tests and other places.

The result is that if someone sends us an email addressed to "bob+newsletter1" or "bob+newsletter2" this test will return true. Notice that because it's a precise match, if the email was to "bob+newsletter-something" or any other variation that's not one of the two tested for, then the test will be false. The action block uses fileinto which, as you might imagine, files an email into a folder.

As GNU Mailutils is designed to be extensible we have to tell it which extensions to load. This is done (most commonly) at the top of the Sieve script using the require keyword. The require directive can accept either a string, or a list (square brackets with commas between each item):

require ["fileinto", "variables"];

if address "to" "shop@mydomain.com"
{
    fileinto "shop-folder";
    stop;
}

This example loads the fileinto and variables extensions. It uses fileinto in the action section, and we'll learn about variables in a moment.

This test used the address test again, and it's generally the best one to start with. But, there are a variety of different tests and it's possible to test using an external script. The most common ones are address which looks for email addresses, and header which looks at all fields in the email headers.

As we've seen, the default action is to keep an email, so if all the tests in a sieve script return false then nothing will happen to the email. The most common action is fileinto, for filing into a folder. But there are lots of other ones, such as reject and addheader.
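
The implicit keep is easiest to see in a small chain of tests. What follows is a minimal sketch rather than a rule set I run: the addresses and folder names are invented for illustration, and it assumes the if / elsif / else flow control behaves in Mailutils as the Sieve standard describes.

```sieve
require "fileinto";

# Hypothetical sender: file shop receipts away.
if address :is "from" "receipts@shop.example.com"
{
    fileinto "receipts";
}
# Only checked when the first test was false.
elsif address :is "from" "updates@example.com"
{
    fileinto "updates";
}
else
{
    # Nothing matched. Writing keep here is optional: with no other
    # action taken, the implicit keep leaves the email in the mailbox.
    keep;
}
```

Because the three branches are mutually exclusive, no stop is needed inside the chain; any tests that come after it would still run.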

The variables extension is useful if you file email into multiple different folders: it lets you create variables which can be used in tests or actions.

require ["fileinto", "variables"];

set "folder" "maildir:///home/<user>/.mail/example-account";
set "trash" "${folder}/trash";

if address "from" ["list@mailinglist.com", "updates@updates.com"]
{
  fileinto "${folder}/mailing-lists";
  stop;
}

In this example first we load the extensions using a list and the require keyword. Then we create two variables using set. The first is called folder which points to the base path for the example account. The second one (called trash) uses the previously set folder (${folder}) variable to create the longer path. Finally, the action in the test also uses the folder variable, so when it runs it will expand to the full path.

Generally, we want to make a test as precise as possible: the more precise it is, the less likely it is to catch the wrong email. The most precise match type is :is, but there are others to use if matching is more complicated (such as :contains, :matches and :regex).
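
As a rough illustration of the looser match types, the sketch below applies :matches and :contains to hypothetical addresses and folder names (none of them come from my real configuration). In the standard, :contains looks for a substring, while :matches takes a wildcard pattern where * stands for any run of characters and ? for a single character.

```sieve
require "fileinto";

# :matches with a wildcard: catches bob+newsletter1,
# bob+newsletter-something and so on.
if address :localpart :matches "to" "bob+newsletter*"
{
    fileinto "newsletters";
    stop;
}

# :contains is a plain substring match on the Subject header.
if header :contains "subject" "invoice"
{
    fileinto "invoices";
    stop;
}
```

The wildcard rule also matches plain bob+newsletter, since * can match nothing at all, which is exactly the kind of broadening to watch out for.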

Another way to make tests precise is to look at more than one field, for example a test could look at both the "To" field and a special header. For this there is test logic which enables multiple tests:

if allof ( address :is "from" "myboss@example.com",
           header :contains "subject" "URGENT" )
{
    fileinto "urgent-mail";
    stop;
}

In this example, the test logic is allof, which means that all tests must pass. There are two tests defined inside parentheses, and each test is separated by a comma. The first test is whether the address in the From field exactly matches "myboss@example.com". The second is a header test that looks at the Subject and tests whether the word "URGENT" is somewhere within it. If both of these tests return true then the action block happens - which files the email into the folder urgent-mail.
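
The other two pieces of test logic, anyof and not, follow the same shape. This is a sketch with made-up addresses and folder names, assuming they behave as the standard describes: anyof passes when at least one of its tests passes, and not inverts the test that follows it.

```sieve
require "fileinto";

# anyof: either sender is enough to file the email.
if anyof ( address :is "from" "alerts@example.com",
           address :is "from" "monitoring@example.com" )
{
    fileinto "alerts";
    stop;
}

# not inside allof: from the boss, but not one of the weekly digests.
if allof ( address :is "from" "myboss@example.com",
           not header :contains "subject" "weekly digest" )
{
    fileinto "boss";
    stop;
}
```

The first rule could equally be written as a single address test with a string list; anyof earns its keep when the tests look at different headers.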

For a quick test the Fastmail Sieve Test is useful - just cut-n-paste in an email you want to filter and create a test for it.

There's lots more you can do with Sieve, but hopefully this quick tour should be enough to get you started!

Tips

As I've been playing with Sieve, the main tips I've discovered are:

  • Test on a test account
    I created a test folder and then copied email into it to test my ideas.
  • Save all email to a backup folder (rsync!)
    Even when certain about a script, saving all email to an all-mail folder is useful. I'm using rsync for that.
  • Use the most precise test possible
    To avoid accidentally catching the wrong emails be as precise with matches as possible. This can be a mixture of choosing the right test, the right match type and using test logic.
  • Use address rather than header
    In a lot of cases tests are checking some form of address. For this, address is more precise than header. The address test will look at From, To, CC, Bcc, Sender, Resent-From and Resent-To.
  • Use :is over others
    The :is match type is the most precise as it's an exact match. The next best option is :matches or :contains, but the shorter the string it's looking for the more likely there will be a false match.
  • Use test logic for multiple tests
    The allof test logic is a really useful way to make matches more precise.
  • Use the stop action to prevent further processing
    The stop action tells Sieve that it shouldn't evaluate the email against further tests. This is really useful as it's clear why an email was dealt with in a particular way. The only time I haven't done this is when I need to copy an email rather than move it to a folder.
  • Use variables to make tests and actions easier
    The ability to use a variable in tests and actions can make it much easier to keep everything organised!
  • The pipe test and action provides a lot of flexibility
    GNU Mailutils has a pipe test and action which can be used to process email in other ways. If all else fails this might be a way. The main constraint is that it can only process a single email.
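
The stop tip and its one exception can be sketched as follows. The folder names and addresses are hypothetical, and the copy behaviour is my reading of how fileinto followed by an explicit keep works, so try it with --dry-run on a test mailbox before relying on it.

```sieve
require "fileinto";

# Copy rather than move: file a copy, then explicitly keep the original
# in the mailbox. No stop, so later rules still see this email.
if address :is "from" "accounts@example.com"
{
    fileinto "finance-archive";
    keep;
}

# Move: file the email and stop, so no later rule can touch it.
if address :is "from" "list@mailinglist.com"
{
    fileinto "mailing-lists";
    stop;
}
```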

More Sieve

The Sieve resources page covers the majority of the Sieve language options, split into:

Sieve Tests:

  • True & False: automatic true or false
  • Exists: whether a header exists
  • Address: any structured address header
  • Header: any field in the header
  • Envelope: the smtp envelope
  • Size: test the size of an email
  • Timestamp: test a date
  • Pipe: the output of an external command
  • Spamd: the result from running the spamd command
  • List: test more than one field
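
Two of these tests, exists and size, have not appeared in the examples so far. A brief sketch of them, using hypothetical folder names and an arbitrary size threshold:

```sieve
require "fileinto";

# exists: true when the header is present, whatever its value.
if exists "list-id"
{
    fileinto "mailing-lists";
    stop;
}

# size: :over or :under, with an optional K, M or G suffix on the number.
if size :over 5M
{
    fileinto "large-mail";
    stop;
}
```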

Sieve test logic and flow control:

  • allof - all tests must pass
  • anyof - one test must pass
  • not - reverse the meaning of the test
  • if / elsif / else - test flow control
  • stop - don't check this email against further tests

Sieve Actions:

  • stop: don't process this email through any further rules
  • keep: keep the email that's being processed (the implicit action)
  • fileinto: file the email into a specified folder
  • discard: delete the email
  • reject: reject the email and send back a message to the sender
  • redirect: redirect (bounce) the email to another address
  • pipe: run a command or script
  • variables: use variables within the sieve script
  • vacation: send an email to the sender informing them that we're not reading email

Automating Sieve

Having defined a Sieve script that works it's time to integrate it with the rest of the mail checking script. The way I did this was to extend the email-checker.sh file from last time to add a Sieve function.

#!/usr/bin/env bash
exec guix shell guile isync rsync mailutils mblaze goimapnotify guile-readline guile-colorized \
    -- guile -e main -L /home/<user>/.config/guix/current/share/guile/site/3.0/ -s "$0" "$@"
!#

(use-modules (ice-9 exceptions)
             (ice-9 format) ;for the '~y' format directive
             (srfi srfi-19) ;for 'date->string'
             (srfi srfi-34) ;for 'guard'
             (guix build utils)) ;for 'invoke'

;; CHANGE: set to a log file location
;; Guile does not expand "~", so build the path from $HOME
(define log-port
  (open-file (string-append (getenv "HOME") "/.msmtp/log/email-checker.log") "a"))

(define* (start-logger)
 (write-log "Starting logging")
 (set-current-output-port log-port)
 (set-current-error-port log-port))

(define* (write-log msg)
  (format log-port "~a: ~a \n" (date->string (current-date) "~5") msg)
  (force-output log-port))

(define* (end-logger)
 (write-log "Stopping logging")
 (close-port log-port))

(define (mbsync_f)
  ;; CHANGE: set to your own mbsync channel
  (let* [ (cmd_mbsync "mbsync <channel>")
          (exit_status (system cmd_mbsync)) ]
    (cond
        ( (= exit_status 0)
            (write-log
              (format #f "Successfully ran mbsync. Command: ~s returned: ~y~%"
                 cmd_mbsync exit_status)))
        ( (>= exit_status 1)
            (write-log
              (format #f "Error! Command: ~s. Result returned: ~y~%"
                 cmd_mbsync exit_status))
            (error "Error executing mbsync: " exit_status)) ;exit with error
     ) ;end of cond
   )) ;end of func

(define (rsync_f)
  ;; CHANGE: set to the location of email to be backed-up
  (let* [ (from_f "/home/<user>/.mail/<account>/INBOX")
          (to_f "/home/<user>/.mail/<account>/backup-all/")
        ] ;let definitions

      (define exit_status_e (file-exists? (dirname to_f)))
      (cond
        ( (equal? exit_status_e #t)
              (write-log "Rsync: directory to rsync into exists"))
        ( (equal? exit_status_e #f)
            (write-log "Error! Directory to rsync into doesn't exist!")
            (error "Error directory for back-up doesn't exist!"))
      ) ;end of cond

      (guard (e
               ((exception? e)
                 (format (current-error-port) "Error backing up email. Exception: ~y~%"  e)
                 (error "Error backing up email" e)
                 )
            ) ; end of guard clause
        (invoke "rsync" "-avvz" "--delete" "--progress" "--itemize-changes" from_f to_f)
        (write-log "Successfully backed-up all mail.")
      ) ; end of guard
    )) ; end of rsync_f func

(define (sieve_f)
  ;; CHANGE: set mail folder location and sieve script location
  (let* [ (cmd_sieve "sieve")
          (mail_folder "/home/<user>/.mail/test-account")
          (mbox_loc (string-append "--mbox-url=maildir://" mail_folder "/INBOX"))
          (verbose "--verbose")
          (sieve_script "/home/<user>/.config/neomutt/sieve-test.sieve")
        ]

    (guard (exception
             ((exception? exception)
                (format (current-error-port) "Error with guix folder. Exception ~y~%" exception)
                (error "Error creating guix folder" exception)))
      (invoke "mmkdir" (string-append mail_folder "/guix"))
      (write-log "Guix folder exists for filtering mail into")
    ) ;end of guard

    (invoke cmd_sieve mbox_loc verbose sieve_script)
    (write-log (string-append "Successfully ran sieve script on " mail_folder))
  )) ; end of sieve_f

(define (main args)
    (start-logger)
    (write-log "Starting e-mail checker")

    (mbsync_f)
    (rsync_f)
    (sieve_f)

    (write-log "Starting goimapnotify")
    (invoke "goimapnotify")
    (write-log "Successfully ran goimapnotify")
    (exit EXIT_SUCCESS)
) ;end of main

The script is labelled with "CHANGE" in a few locations where configuration details, such as the locations of mail folders, are required. As in the previous version, it runs some initial functions to sync email (using mbsync) and then starts goimapnotify. The new function rsync_f is used to back up all mail that's in INBOX; it was much easier to do this using rsync than Sieve. The second new function is sieve_f and, as we can see, it runs a specified Sieve script over the INBOX which contains new email. The result is that mbsync will pull down new email, rsync will back it up, and Sieve will filter it.

When Goimapnotify starts it watches for new emails in the remote INBOX, and when it sees some it runs mbsync to download them. So we also need to run sieve when it does that download. The goimapnotify.yaml commands look like this:

boxes:
  -
    mailbox: INBOX
    onNewMail: 'guix shell isync mailutils bash-minimal -- bash -c "mbsync <mychannel>"'
    onNewMailPost: 'guix shell isync mailutils bash-minimal -- bash -c \
      "sieve --mbox-url=/home/<user>/.mail/<account>/INBOX/ --verbose ~/.config/neomutt/<account-sieve-script>.sieve"'
    onDeletedMail: 'guix shell isync mailutils bash-minimal -- bash -c "mbsync <mychannel>"'
    onDeletedMailPost: 'guix shell isync mailutils bash-minimal -- bash -c \
      "sieve --mbox-url=/home/<user>/.mail/<account>/INBOX/ --verbose ~/.config/neomutt/<account-sieve-script>.sieve"'

The onNewMail command runs when new email is detected. There's no change from previously: it's just an mbsync of the channel. The onNewMailPost command runs after the new mail is downloaded, and this is where we run the sieve script.

The onDeletedMail is triggered if an email is deleted on the remote server (say from the user dealing with email on their phone), and again the onDeletedMailPost is called after any change to the mailbox has been synchronised.

Panning for Gold

As I said at the start, attention, focus and time are the most precious commodities - perhaps more so than gold. Email has a justified reputation for being a time-sink, yet it's also the most widely used and 'federated' electronic communication we have. Figuring out how to filter email using Sieve has taken a lot more time (and effort, and focus) than I expected! But now I have everything needed to filter out the dross and focus on the real 'gold' of emails from people!

This post was designed to be a quick summary of Sieve and how to fit it into the email system that the rest of the NeoMutt series has covered. For more detail, don't forget there's my deep dive page GNU Mailutils Sieve.

And, if you found this post useful, or perhaps there's something that was a bit confusing, or you'd just like to give feedback on the series - feel free to email me or contact me on Mastodon (@futurile).


Posted in Tech Saturday 07 June 2025
Tagged with tech ubuntu guix email neomutt isync mbsync goimapnotify mailutils sieve