Saturday, January 05, 2013

Setting up MySQL + Postfix + Dovecot to do Gmail-like 'youremail+whatever@domain.com' plus multiple delimiters

When I set up my Postfix + Dovecot + MySQL installation, I wanted GMail-like filtering for my domain. GMail allows you to do 'youremail+whatever@gmail.com' and it will automatically be delivered to 'youremail@gmail.com' with the label 'whatever'. From there, it can be filtered into the folder of your choice. I figured something similar would be very useful when registering on websites where I'm not necessarily wanting to use a mailinator address but do want to track whether they sell my e-mail address or not. Well, I thought doing the same thing would be useful, but more on the difficulties I've encountered with special characters like '+' in a bit.

In MySQL, I have 'virtual_aliases' and 'virtual_users' tables. If I remember correctly, Postfix first attempts to find an e-mail address in users, then aliases, then tries again without the extension specified by 'recipient_delimiter'.

To set up Postfix with a 'recipient_delimiter', open up '/etc/postfix/main.cf' and add:

recipient_delimiter = +

Save the file. I seem to recall that 'recipient_delimiter' only works with aliases, not mailboxes, so that might explain why I had to add the 'virtual_mailbox_maps' value to the 'virtual_alias_maps' line. Or maybe I'm confused. I set all of this part up almost two years ago, so the memory's a bit rusty because I've been busy working on my software products.

Now open up '/etc/postfix/master.cf' and change the dovecot delivery line from '-d ${recipient}' to '-d ${user}@${nexthop}'. Here are the full lines:

dovecot unix - n n - - pipe
flags=ODRhu user=vmail:vmail null_sender= argv=/usr/lib/dovecot/deliver -c /etc/dovecot/dovecot.conf -f ${sender} -d ${user}@${nexthop}

Save the file. Run the whole 'postfix reload' thing. At this point, Postfix and Dovecot will handle e-mail just like GMail does. However, as I said earlier, the '+' symbol doesn't work very well. You'll quickly discover, as I have, that there are a lot of broken web forms out there that won't accept it because web developers use regular expressions to "validate" e-mail addresses. E-mail addresses are so complex that regular expressions don't cut it. Since there are a billion different regexes out there for doing broken "validation", I wanted to use "multiple recipient delimiters" with my Postfix + MySQL setup. Scouring Google, I found exactly one post that started to deal with this issue, but only barely. Their solution was to create a new MySQL-based 'virtual_alias_maps' entry that runs the query:

select concat(replace(left('%s', length('%s') - instr(reverse('%s'), '@')), '_', '+'), '@', reverse(substring_index(reverse('%s'), '@', 1))) "goto" from domain where name = reverse(substring_index(reverse('%s'), '@', 1)) and active = 1 and instr(left('%s', length('%s') - instr(reverse('%s'), '@')), '_') > 0

Yeah, my eyes bleed too. It is an interesting approach but they are replacing every '_' character with a '+' character such that 'my_user_name_extension@domain.com' becomes 'my+user+name+extension@domain.com' and Postfix will then look up 'my@domain.com' and not find anything when it should have looked up 'my_user_name@domain.com'. It also risks potential delivery issues to random addresses depending on how Dovecot is set up. It seems that a better approach would be to first ignore any address that already contains a '+' symbol, since Postfix handles that natively. Then, make sure the address part contains the current symbol. Finally, locate the first instance of the current symbol and construct a new string excluding everything after it up to the domain part and look THAT address up in the appropriate table. This approach will work with small and large databases:

SELECT dest FROM virtual_aliases WHERE INSTR('%s', '+') = 0 AND INSTR('%s', '@') > 0 AND INSTR(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@')), '_') > 0 AND source = CONCAT(SUBSTRING_INDEX(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@')), '_', 1), '@', REVERSE(SUBSTRING_INDEX(REVERSE('%s'), '@', 1))) AND active = 1

MY EYES! THEY BLEEEEED! I feel like I've done an injustice to the Internets after writing that. Let's break it down. Basically, it returns a 'dest' (this is the aliases table after all) if it finds a matching 'source' based on string manipulation that strips the first match of a specific symbol plus the extension. In this case, it is looking for the '_' character. But, before it even does a lookup, it verifies that the string being searched for doesn't contain a '+' and does contain both a '@' and a '_' for sanity checking purposes. MySQL will resolve the INSTR() and CONCAT() mess first before doing anything else. Then it will start looking at the database table. If there is a MySQL index on the 'source' column, then this will be a super-fast, index-based lookup regardless of database size. But good grief, look at that awful mess again! I don't think Postfix will let me have that on multiple lines to make it potentially readable.

Okay, now let's assume we want some additional flexibility with '_'. Let's say that we want to remove everything after the last underscore instead of the first underscore (e.g. so 'my_user_name_extension' becomes 'my_user_name' instead of just 'my'). Postfix will execute things in order and stop when it gets a non-empty response, so the above query could be run first and then this can be run second to catch this new special case:

SELECT dest FROM virtual_aliases WHERE INSTR('%s', '+') = 0 AND INSTR('%s', '@') > 0 AND INSTR(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@')), '_') > 0 AND source = CONCAT(REVERSE(SUBSTRING_INDEX(REVERSE(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@'))), '_', 1)), '@', REVERSE(SUBSTRING_INDEX(REVERSE('%s'), '@', 1))) AND active = 1

If you don't see the difference, look for the additional REVERSE() calls. Okay, the above might not actually work since I didn't try it out and I think the SUBSTRING_INDEX() call will pull off the part I want to remove instead of getting rid of it. I'm using the first query, but not this one. My head hurts from looking at this mess. If you fix the second query, let me know. It might just be better for 'my_user_name' to use a different delimiter such as a hyphen and then have a '-' based query because doing this could potentially introduce some weird issues.

Each query has to be in its own file and you'll want to check the 'users' table as well. So, to do both queries above for each character for both tables, you'll need four distinct files on the file system. This gets to be rather messy if you want to do a combination of characters such as '+', '_', '-', and '.' which results in a lot of distinct files (and potentially lots of distinct MySQL connections?). I suppose you could make one file and use OR with the query but you risk losing index functionality (the OR keyword tends to have the effect of doing full table scans) and the SQL query would become a vomitastic monstrosity. Actually, if each file means a separate database connection, it might be faster to not worry about the index. I honestly don't know which will be faster. I assume indexes will be faster and that Postfix intelligently maintains connections to MySQL, but I don't know and I don't really want to go digging around source code to find out. Postfix runs a ton of SQL queries per e-mail already but it is preferable to run a lot of fast queries instead of a few slow ones. At any rate, make it work first with semi-readability, worry about performance issues later. Don't forget to 'postfix reload' and be sure to check file permissions so that Postfix can read the file.

If you do use this query, please document the file(s) so someone (including you) viewing the query later doesn't have a "what in the..." head-scratching moment.

Disclaimers: I have never applied this many MySQL functions to a single query before. My knowledge of MySQL functions is fairly limited. I call CHAR_LENGTH() because it seems more Unicode-friendly than just plain LENGTH(). I also use the specific table for aliases instead of the domain as in the original example because, looking elsewhere on this topic, I've learned that Postfix likes 'specific' rather than 'general' definitions and not creating a potential exploitable delivery vulnerability in the system is important. I blame the authors of Postfix for this major oversight in their product that it requires a hack in MySQL to fix a problem that is clearly within their domain and capability to fix by simply allowing multiple delimiters. I'm also not responsible for any heart attack, stroke, stress, panic, heart palpitations, sweating, shock, bleeding from the eyes, weight gain, weight loss, hair loss, mutation, rage quit, and other medical disorders as a result of this blog post.

1 comment:

  1. In case anyone else came searching for multiple demilters, postfix now (at Oct 2016) supports them out of the box.
    Set
    recipient_delimiter = +_
    or
    recipient_delimiter = +-

    whatever you need. Works for me anyway.

    ReplyDelete