I’ve seen many examples of how to implement greylisting in Exim. Many use a MySQL or even PostgreSQL database, and most of them take too complex an approach to what is a simple problem.
See the Exim greylisting GitHub repo for an example.
Grey-listing is a practice of “trust no-one”, so any address that is neither white-listed nor black-listed is considered “grey” rather than the default “white” of a standard email server.
Spammers do not usually send their spammy content through a real mail server, because a real, correctly configured mail server would reveal its real hostname, only allow genuine email addresses, and so on. Instead, spammers use spamming scripts or tools which parse large lists of email addresses and connect to mail servers in the hope that the mail server’s configuration won’t spot that the mail has been spoofed.
Because spammers rarely use real mail servers, their scripts don’t often speak “real mail server lingo”. A genuine mail server can be told by a recipient server that it is currently too busy to take delivery of the mail (a temporary SMTP error) and be asked to retry delivery in a few minutes. This is how greylisting works. By saying “I’m too busy right now, try back in a few minutes” you will cause a spammer’s script to move on to the next email address. Rarely will a spammer’s script understand why you turned it away, whilst a real email server will try again later.
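To make the “too busy, try later” idea concrete: in SMTP, reply codes starting with 4 are temporary failures and codes starting with 5 are permanent ones. The sketch below is only an illustration of that difference, not code from any real mail server. The function names are made up for this example, and the “spam script never retries” behaviour is an assumption based on the description above.

```python
def classify_reply(code: int) -> str:
    """Classify an SMTP reply code by its first digit."""
    first_digit = code // 100
    if first_digit == 2:
        return "accepted"
    if first_digit == 3:
        return "intermediate"  # e.g. 354 after DATA
    if first_digit == 4:
        return "temporary"     # what a greylisting server sends on first contact
    if first_digit == 5:
        return "permanent"
    raise ValueError(f"not an SMTP reply code: {code}")


def real_server_next_step(code: int) -> str:
    """A genuine mail server keeps the message queued and retries on a 4xx."""
    steps = {
        "accepted": "done",
        "intermediate": "continue",
        "temporary": "queue and retry later",
        "permanent": "bounce to sender",
    }
    return steps[classify_reply(code)]


def spam_script_next_step(code: int) -> str:
    """Assumed behaviour: a fire-and-forget spam script never retries."""
    if classify_reply(code) in ("accepted", "intermediate"):
        return "continue"
    return "move on to next address"


print(real_server_next_step(451))  # queue and retry later
print(spam_script_next_step(451))  # move on to next address
```

The only thing greylisting relies on is that second line: the genuine server comes back, the script usually does not.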
To implement greylisting you need to store the information from a mail server when it first talks to you. This is why people sometimes use databases, but that’s just unnecessarily complex.
There are just three things that you need so that you can recognise a re-delivery. Only if your mail server has seen this combination before will it let the mail through. Otherwise, the mail server should be told that you are too busy.
The three items you need to store and recognise are:
1) The IP address (or network address) of the originating mail server
2) The email address of the sender
3) The email address of the recipient.
I’ve seen algorithms that do not use item 2, but that’s a bad idea. If a spammer runs his script twice and it generates random sender addresses, the second run would match on items 1 and 3 alone, and you would receive spam that you would otherwise have filtered. With the sender address included, the second attempt is a combination you have never seen before, so it is grey-listed and told that you are too busy once again.
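A small simulation may make this clearer. It is a minimal sketch that ignores the retry delay and keeps the seen combinations in a Python set rather than on disk. The addresses and the helper name `passes_greylist` are invented for the example.

```python
def passes_greylist(seen: set, key: tuple) -> bool:
    """Return True if this key was seen before; otherwise record it and defer."""
    if key in seen:
        return True
    seen.add(key)
    return False


# Hypothetical spam run: same source network and recipient,
# but a freshly generated random sender on each attempt.
attempts = [
    ("192.0.2", "x1f9@spam.example", "me@example.org"),
    ("192.0.2", "q7zk@spam.example", "me@example.org"),
]

# Without item 2 the key is only (network, recipient).
seen_without_sender = set()
results_without = [
    passes_greylist(seen_without_sender, (network, recipient))
    for network, _sender, recipient in attempts
]

# With item 2 the key is the full combination of all three items.
seen_with_sender = set()
results_with = [passes_greylist(seen_with_sender, attempt) for attempt in attempts]

print(results_without)  # [False, True]  -> second spam gets through
print(results_with)     # [False, False] -> both attempts are deferred
```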
So I use a directory and create a filename which is made up of the three parts shown above.
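As a rough Python illustration of how such a filename can be put together, the sketch below mirrors the string expansion used in the Exim configuration in this post: the last octet of the IP address is dropped, unsafe characters are removed, and the name is cut to 255 characters. The function name `greylist_filename` is hypothetical, and this is not a drop-in replacement for the Exim expression.

```python
import re

GREYLIST_DIR = "/etc/virtual/greylist"  # directory used in this post


def greylist_filename(host_ip: str, sender: str, recipient: str) -> str:
    """Build the marker filename from network, sender and recipient."""
    # Drop the last octet so hosts in the same IPv4 /24 share one entry.
    network = re.sub(r"\.\d+$", "", host_ip)
    # Keep only characters that are harmless in a filename. Removing "/"
    # also stops an address from pointing outside the directory.
    addresses = re.sub(r"[^\w.,=@-]", "", f"{sender},{recipient}", flags=re.ASCII)
    # Truncate the name to 255 characters, as the Exim expression does.
    name = f"{network},{addresses}"[:255]
    return f"{GREYLIST_DIR}/{name}"


print(greylist_filename("192.0.2.25", "alice@example.com", "bob@example.org"))
# /etc/virtual/greylist/192.0.2,alice@example.com,bob@example.org
```

Using the network rather than the full IP address means a retry from a neighbouring machine in the same sending pool still matches the stored entry.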
Each time we get a connection, we build the filename from the three items and check whether it already exists in the directory. If it does, we accept the mail; otherwise we create the file and defer the mail. This happens forever.
# Grey Listing
warn set acl_m_greyfile = /etc/virtual/greylist/${length_255:\
       ${sg{$sender_host_address}{\N\.\d+$\N}{}},\
       ${sg{$sender_address,$local_part@$domain}{\N[^\w.,=@-]\N}{} }}

defer message = Deferred: Temporary error, please try again later
      log_message = greylisted
      sender_domains = !+local_domains
      sender_domains = !+relay_domains
      domains = !+skip_rbl_domains
      !authenticated = *
      condition = ${if exists{$acl_m_greyfile}\
                  {${if >{${eval:$tod_epoch-\
                  ${extract{mtime}{${stat:$acl_m_greyfile}} }}\
                  }{180}{0}{1}}\
                  }{${if eq{${run{/bin/touch $acl_m_greyfile} }}{}{1}{1} }} }
Of course, eventually you will have a lot of files, so you might want to write a cron job to clean up files older than, say, 7 days.
Something like:
find /etc/virtual/greylist -type f -mtime +7 -delete
In the example above, written for the DirectAdmin control panel, we don’t greylist mail from authenticated senders, mail that belongs to local domains or to domains that we agree to relay from, nor mail for domains that we don’t want to check against Real-time Blackhole Lists. Of course, you could change the ACL conditions for your own use.
Here we will not accept a redelivery within the first 180 seconds, and the file for each combination is written to the /etc/virtual/greylist/ directory.
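For readers who find the nested Exim condition hard to follow, here is the same decision written as a Python sketch. It assumes the marker path has already been built. The function name `should_defer` and the `now` parameter are inventions for this example, and the demo runs in a temporary directory rather than /etc/virtual/greylist.

```python
import os
import tempfile
import time

MIN_DELAY_SECONDS = 180  # same delay as in the ACL above


def should_defer(path, now=None):
    """Return True to defer the mail, False to accept it."""
    if now is None:
        now = time.time()
    try:
        age = now - os.stat(path).st_mtime
    except FileNotFoundError:
        # First sighting: create the empty marker file and defer.
        with open(path, "a"):
            pass
        return True
    # Seen before: accept only once the marker is more than 180 seconds old.
    return age <= MIN_DELAY_SECONDS


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as demo_dir:
    marker = os.path.join(demo_dir, "192.0.2,alice@example.com,bob@example.org")
    first = should_defer(marker)                        # file created, deferred
    created = os.stat(marker).st_mtime
    too_soon = should_defer(marker, now=created + 60)   # still inside 180 seconds
    later = should_defer(marker, now=created + 600)     # old enough, accepted

print(first, too_soon, later)  # True True False
```

Note that the marker is only written on the first sighting, so the 180 seconds are counted from the first attempt, not from the most recent one.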
In our tests, we found that reading and writing filenames in this way was, in fact, faster than using a database, but most of all it’s a lot easier.