Sunday, April 07, 2013

Setting up WPA2-Enterprise + AES with Ubuntu 12.04.2 + FreeRADIUS with EAP-TLS only (The Definitive Guide)

After spending a LOT of time researching, waiting, more researching, and almost giving up on WiFi, I've finally figured out a secure enough WiFi setup that I think has a pretty good chance of standing up to scrutiny. It is called EAP-TLS and it is serious Kung Fu (aka REAL security). Unfortunately, to implement said Kung Fu, the protocol requires a RADIUS server. And, to get said RADIUS server at an affordable price, FreeRADIUS is needed and therefore Linux is necessary. And the easiest Linux to use is (usually) Ubuntu. But, after scouring the Internets, I've also determined that there are NO good tutorials on setting up a basic FreeRADIUS EAP-TLS system at home under Ubuntu 12.04.2 using the apt-get packages for FreeRADIUS. This, therefore, is the definitive guide mostly cobbled together from a number of different sources.

I'm assuming a half-decent understanding of Linux command-line editing, a fresh installation of Ubuntu Server 12.04.2 LTS on a computer on the network where it will reside (usually plugged into the upstream router), an el-cheapo router with WPA2-Enterprise AES support (some "consumer grade" routers have this baked in), and a willingness to follow directions. If you really want this to work without hassle, I recommend getting an el-cheapo DD-WRT capable router from Newegg and loading DD-WRT onto it (you'll only be out $25). DD-WRT implements RADIUS correctly and the authors actually test it, which is kind of critical. Without further ado...

Give the Ubuntu box a static IP address instead of letting DHCP assign one. Run 'ifconfig' and 'route -n' to obtain the current network setup. Then edit '/etc/network/interfaces':

# The primary network interface
auto eth0
iface eth0 inet dhcp

Change it to something like:

# The primary network interface
auto eth0
iface eth0 inet static
address 192.168.1.15
netmask 255.255.255.0
broadcast 192.168.1.255
gateway 192.168.1.1
dns-nameservers 192.168.1.1

Your addresses will be different; base them on your 'ifconfig' and 'route -n' output. Be sure to fire up the router admin and locate where to reserve the Ubuntu box's IP address so DHCP doesn't hand it out to some other device on the network in the future and cause havoc. (Or just change the server IP to something outside the DHCP address range but still in the same subnet - less fuss.)
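If you want to sanity-check the numbers before restarting networking, here is a small sketch using Python's standard 'ipaddress' module. The function name 'check_static_config' is made up for this example, and the values are just the sample ones from above.

```python
import ipaddress

def check_static_config(address, netmask, broadcast, gateway):
    """Return a list of problems found in a static 'interfaces' stanza."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    network = iface.network
    problems = []
    # The broadcast address is fully determined by address + netmask.
    if ipaddress.ip_address(broadcast) != network.broadcast_address:
        problems.append(f"broadcast should be {network.broadcast_address}")
    # The gateway has to live in the same subnet as the server.
    if ipaddress.ip_address(gateway) not in network:
        problems.append(f"gateway {gateway} is outside {network}")
    if ipaddress.ip_address(gateway) == iface.ip:
        problems.append("address and gateway are identical")
    return problems

# Sample values from the stanza above; an empty list means nothing looked wrong.
print(check_static_config("192.168.1.15", "255.255.255.0",
                          "192.168.1.255", "192.168.1.1"))
```

This only checks that the four values agree with each other. It cannot tell whether the address collides with the router's DHCP range.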

Next, install FreeRADIUS on the Ubuntu box:

apt-get install freeradius

Next, back up all the files in the /etc/freeradius/ directory in case something goes wrong (makes it easy to restore stuff). The next steps have to more or less be completed before the server will be ready for use, so the ability to completely revert to the original config files is nice.

Next, disable proxies in '/etc/freeradius/radiusd.conf':

proxy_requests  = no
#$INCLUDE proxy.conf

And optionally comment out the "accounting port" in the same file ('/etc/freeradius/radiusd.conf'):

#  This second "listen" section is for listening on the accounting
#  port, too.
#
#listen {
#       ipaddr = *
##      ipv6addr = ::
#       port = 0
#       type = acct
##      interface = eth0
##      clients = per_socket_clients
#}

That closes at least one open port on the FreeRADIUS server that isn't usually necessary, especially for smaller networks.

Next, disable the default client in '/etc/freeradius/clients.conf':

#client localhost {
#       ipaddr = 127.0.0.1
#       secret          = testing123
#       require_message_authenticator = no
#       nastype     = other     # localhost isn't usually a NAS...
#}

Set up a new client in '/etc/freeradius/clients.conf':

client your_router_name {
       ipaddr = your_routers_ip_address
       secret = random_string
       require_message_authenticator = yes
}

The client entry is specific to your environment and you can have more than one. Within the average home network, the router's IP address is the default gateway of the Ubuntu box (the 'gateway' value from earlier). See the default client that was just commented out for details on various options. For 'secret' and other strings that should be random, use a good tool or the Random.org random string generator. Note that your router might have hard length limits (e.g. 32 bytes) while FreeRADIUS will accept just about anything.
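One "good tool" option is Python's standard 'secrets' module. This is only a sketch: 'make_shared_secret' is a name I made up, the 32 character default simply mirrors the router limit mentioned above, and sticking to letters and digits is my assumption to keep picky router admin pages happy.

```python
import secrets
import string

def make_shared_secret(length=32):
    """Build a random shared secret from ASCII letters and digits."""
    alphabet = string.ascii_letters + string.digits
    # secrets.choice() draws from the OS's cryptographic random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(make_shared_secret())
```

Paste the result into both 'clients.conf' and the router. Because the alphabet is plain ASCII, the length in characters is also the length in bytes.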

Next, edit '/etc/freeradius/eap.conf'. Comment out all subsections inside the 'eap' block except for 'tls'. The following lines need to be changed from this:

eap {
  default_eap_type = md5
  # Supported EAP-types
  md5 {
  }
  leap {
  }
  gtc {
  }
  ttls {
  ...
  }
  peap {
  }
  tls {
    cipher_list = "DEFAULT"
    make_cert_command = "${certdir}/bootstrap"
    verify {
#      tmpdir = /tmp/radiusd
#      client = "/usr/bin/openssl verify -CApath ${..CA_path} %{TLS-Client-Cert-Filename}"
    }
  }
  mschapv2 {
  }
}

Into something more like this:

eap {
  default_eap_type = tls
  # Supported EAP-types
#  md5 {
#  }
#  leap {
#  }
#  gtc {
#  }
#  ttls {
#  ...
#  }
#  peap {
#  }
  tls {
    cipher_list = "HIGH"
#    make_cert_command = "${certdir}/bootstrap"
    verify {
      tmpdir = /var/tmp/radiusd
      client = "/usr/bin/openssl verify -CApath ${..CA_path} %{TLS-Client-Cert-Filename}"
    }
  }
#  mschapv2 {
#  }
}

Options not mentioned are to be left alone (except for any newer protocol subsections, which should be commented out as well). The 'private_key_password' option gets set later, once the certificates exist. The above mostly just disables EAP-MD5, LEAP, EAP-GTC, EAP-TTLS, PEAP, and EAP-MSCHAPv2 (all of those are broken, weak, a bad idea, or a combination of those words), comments out the bootstrap line (as documented in the config), enables the 'verify' options, and sets the allowed ciphers to something rational/sane. Only EAP-TLS is truly secure and all major OSes (including Windows) support it.

Next, create '/var/tmp/radiusd' as root:

mkdir /var/tmp/radiusd
chown freerad /var/tmp/radiusd
chgrp freerad /var/tmp/radiusd
chmod 700 /var/tmp/radiusd

Next, disable the default virtual servers:

cd /etc/freeradius/sites-enabled/
rm *

Next, copy the 'default' virtual server and edit the new file (change 'your_ap_name' to the router's SSID or something similar):

cd /etc/freeradius/sites-available/
cp default your_ap_name_default
vim your_ap_name_default

In the new file, comment out everything except 'preprocess', 'eap', 'expiration', and 'logintime' from the 'authorize' section. Comment out everything but 'eap' from the 'authenticate' section. Comment out or delete the 'accounting' sections if you commented out the accounting port earlier.

Next, enable the new default file:

cd /etc/freeradius/sites-enabled/
ln -s ../sites-available/your_ap_name_default your_ap_name_default

Next, stop the currently running FreeRADIUS server and run FreeRADIUS in debug mode:

/etc/init.d/freeradius stop
freeradius -X

If all goes well, you should have an EAP-TLS only FreeRADIUS installation with just port 1812 open, a client that is your router (access point), and the sentence "Ready to process requests" on the screen. If you get an error message, try to fix it (pasting error messages into Google searches helps). Once it works, press Ctrl-C to exit the server. At this point, you can breathe a sigh of relief because the only thing left to do on the server is generate SSL certificates.

Next, if you don't have it installed, install 'make':

apt-get install make

Next, remove the default certificates:

cd /etc/freeradius/certs/
rm *.pem
rm *.key

Next, copy the tools to set up FreeRADIUS certificates to a reasonable directory:

mkdir /var/certs
mkdir /var/certs/freeradius
chgrp ssl-cert /var/certs/freeradius
chmod 710 /var/certs/freeradius
cp /usr/share/doc/freeradius/examples/certs/* /var/certs/freeradius/
cd /var/certs/freeradius/
rm bootstrap
chmod 600 *

Next, clean up any previous attempts and initialize the required baseline files:

make destroycerts
make index.txt
make serial

Next, edit 'ca.cnf' and set various options to your heart's content. At the very least: change 'md5' to 'sha1' (or, better, 'sha256' where all of your clients support it), raise 'default_bits' from 2048 to at most 4096, and set 'default_days' to something longer than 365 days (unless you want to replace all certificates every year). (I tried 8192 for 'default_bits' but, even though the year is 2013, I ran into TLS handshake issues. Rebuilding all the certs with 4096 bits fixed the handshake problems.) Keep the various text strings anonymous. Change 'output_password' to a random string ('input_password' does not appear to be used by the Makefile, but it doesn't hurt to make both passwords the same). You can always recreate the CA later if you don't like some setting.

Next, generate 'ca.pem':

make ca.pem
make ca.der
make printca

If you get an error while generating 'ca.pem', any number of things could have gone wrong. It is generally best to start over from the 'make destroycerts' command, edit 'ca.cnf' after searching Google for hints, and try again. The last command ('make printca') lets you confirm that the CA cert contains the information you want at the strength you want.

Next, edit 'server.cnf' and repeat the editing process.

Next, generate 'server.pem':

make server.pem

Next, edit 'client.cnf' and repeat the editing process. This time, only issue the client a cert for the default 365 days, and set both emailAddress and commonName to the e-mail address of the user who will use it. (If you don't want to use e-mail addresses, edit 'Makefile', locate the 'USER_NAME' field, and change " | grep '@'" to " | grep -v optional". Otherwise all files will end up named '.pem'.) The official README says to use an e-mail address, but it can be anything that is also a valid filename. Note that the user is going to have to manually enter the password used in 'output_password' to add the certificate to their certificate store. Best to keep it simple.

Actually, edit the 'Makefile' anyway (vim Makefile). Locate:

client.p12: client.crt
  openssl pkcs12 -export -in client.crt -inkey client.key -out client.p12  -passin pass:$(PASSWORD_CLIENT) -passout pass:$(PASSWORD_CLIENT)

client.pem: client.p12
  openssl pkcs12 -in client.p12 -out client.pem -passin pass:$(PASSWORD_CLIENT) -passout pass:$(PASSWORD_CLIENT)
  cp client.pem $(USER_NAME).pem

Change it to:

client.p12: client.crt
  openssl pkcs12 -export -in client.crt -inkey client.key -out client.p12  -passin pass:$(PASSWORD_CLIENT) -passout pass:$(PASSWORD_CLIENT)
  cp client.p12 $(USER_NAME).p12

client.pem: client.p12
  openssl pkcs12 -in client.p12 -out client.pem -passin pass:$(PASSWORD_CLIENT) -passout pass:$(PASSWORD_CLIENT)
  cp client.pem $(USER_NAME).pem

client_android.p12: client.crt
  openssl pkcs12 -export -in client.crt -inkey client.key -certfile ca.pem -name "$(USER_NAME)" -out client_android.p12  -passin pass:$(PASSWORD_CLIENT) -passout pass:$(PASSWORD_CLIENT)
  cp client_android.p12 $(USER_NAME)_android.p12

Make sure the indented lines start with tabs and not spaces. This copies the PKCS#12 file too (why should PEM files have all the fun?), which is necessary for some OSes, and adds a special target for Android that bundles the CA certificate into the PKCS#12 file and gives it a "friendly name". Next, generate the client certificate, private key, and everything else needed:

make client.pem
make client_android.p12

You can generate as many client certificates as you need. Just open 'client.cnf' and edit a few things before running the above 'make' commands again. If you plan on generating lots of certs, you might want to write a script to do the editing and generating.
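Here is a rough sketch of what such a script could look like. It assumes the identity fields live in a '[client]' section of 'client.cnf' as plain 'emailAddress = ...' and 'commonName = ...' lines (check your copy, I'm going from memory of the stock file), and that the certs live in '/var/certs/freeradius' as above. The function names are made up. It deliberately leaves other sections alone, since they can contain lines such as 'emailAddress = optional' that must not be touched.

```python
import os
import re
import subprocess

def set_client_identity(cnf_text, identity, section="client"):
    """Rewrite emailAddress/commonName inside one section of an OpenSSL cnf."""
    out = []
    current = None
    for line in cnf_text.splitlines():
        header = re.match(r"\s*\[\s*([^\]]+?)\s*\]", line)
        if header:
            current = header.group(1)          # entering a new [section]
        elif current == section:
            field = re.match(r"(\s*(?:emailAddress|commonName)\s*=\s*)", line)
            if field:
                line = field.group(1) + identity
        out.append(line)
    return "\n".join(out) + "\n"

def issue_client_cert(identity, cert_dir="/var/certs/freeradius"):
    """Edit client.cnf for one user, then run the two make targets from above."""
    path = os.path.join(cert_dir, "client.cnf")
    with open(path) as handle:
        text = handle.read()
    with open(path, "w") as handle:
        handle.write(set_client_identity(text, identity))
    for target in ("client.pem", "client_android.p12"):
        subprocess.run(["make", target], cwd=cert_dir, check=True)

# Example (run as root on the server): issue_client_cert("alice@example.org")
```

The script does not change 'output_password', so every cert it produces shares whatever password is already in 'client.cnf'. Extend it if you want a different password per user.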

Assuming everything has gone well to this point, you now have Ubuntu 12.04.2 + FreeRADIUS + EAP-TLS only (with "HIGH" ciphers) and a brand new certificate authority, a server certificate, and a client certificate. But FreeRADIUS isn't using any of the certificates (yet). Time to put on the finishing touches:

chmod 600 *
chmod 640 ca.pem
chmod 640 server.pem
chmod 640 server.key
chgrp ssl-cert ca.pem
chgrp ssl-cert server.pem
chgrp ssl-cert server.key
cd /etc/freeradius/certs/
ln -s /var/certs/freeradius/ca.pem ca.pem
ln -s /var/certs/freeradius/server.pem server.pem
ln -s /var/certs/freeradius/server.key server.key
cd /etc/freeradius/
vim eap.conf

Note that you should rerun the first four lines (from within '/var/certs/freeradius') after generating each client SSL certificate. OpenSSL likes to 'chmod' files as 644.
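Since it is easy to forget the re-chmod step, a quick audit helps. The sketch below flags any file whose mode is wider than the targets used above (640 for the three files FreeRADIUS reads, 600 for everything else). 'loose_files' is a hypothetical helper name, not part of FreeRADIUS or OpenSSL.

```python
import os
import stat

def loose_files(directory, shared=("ca.pem", "server.pem", "server.key")):
    """List (name, mode) for files with more permission bits than intended."""
    findings = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        mode = stat.S_IMODE(os.stat(path).st_mode)
        limit = 0o640 if name in shared else 0o600
        if mode & ~limit:                      # any bit outside the limit
            findings.append((name, oct(mode)))
    return findings

# Example (on the server): print(loose_files("/var/certs/freeradius"))
```

An empty list means nothing is more open than it should be. It does not check group ownership, so the 'chgrp ssl-cert' lines still matter.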

Locate the line in 'eap.conf' containing 'private_key_password'. This should be changed from "whatever" to the value of 'output_password' from '/var/certs/freeradius/server.cnf'. Save and close. Run 'freeradius -X' and make sure everything still works. Again, Google is your friend for resolving issues.

At this point, the Ubuntu FreeRADIUS server is set up. Come on, it wasn't THAT bad (plus I did all the digging around the configs for you). The only missing piece is your router configuration. Each router is different, but dig around the settings for a while to locate a way to switch to WPA2-Enterprise + AES (don't use Wifi Protected Setup or WPS or whatever "automatic configuration" setup your router offers). A few extra fields will show up. Plug in your Ubuntu server's static IP address and the shared secret from '/etc/freeradius/clients.conf'.

Hooray! The router and RADIUS parts are done. Time to move on to the first client. Fire up 'freeradius -X' during testing so you can debug issues more easily. Copy the client certificate (the .p12 and .pem files) off the server along with the 'ca.der' and 'ca.pem' certificates. Install the CA cert on the client. Then install the client certificate. This might take some trial and error depending on all sorts of factors - you might have to rename the file extension to get the OS to recognize the certificates. Searching Google for "[OS name] install certificate" usually turns up relevant results. However, there are "recipes" below that you might want to try (and if you come up with a good client recipe, write it down in the comments). After that, try to connect to the WiFi router using the newly installed certificate. Both errors and successful connections will show up in the FreeRADIUS output. Once it works, you've got the most secure WiFi on the block!

Once you are satisfied with the setup, you can terminate the command-line debug mode and do the '/etc/init.d/freeradius start' thing so that FreeRADIUS runs as a background service.

Recipes

For Windows: Use 'ca.der' and '(name).p12'. Double-click the 'ca.der' file. Click "Install Certificate..." Click "Next". Select "Place all certificates in the following store" and select "Trusted Root Certification Authorities" from the dialog. Click "Next". Click "Finish". A dialog will appear warning about the certificate, click "Yes". A dialog saying "The import was successful" will appear. Click "OK". Double-click on the '(name).p12' file. Click "Next". Click "Next". Enter the password. Click "Next". Click "Next". Click "Finish". A dialog saying "The import was successful" will appear. Next, add the WiFi network manually for WPA2-Enterprise + AES and then somewhere in the settings for the connection is an option for "Network authentication method" that defaults to PEAP. Change it to "Smart Card or other certificate". Open the "Settings" dialog. Use the "Use a certificate on this computer" option and the "Use simple certificate selection (Recommended)" option. Check the imported CA cert in the "Trusted Root Certification Authorities" box. Click "OK". Now it is possible to try to connect normally. If all goes well, the connection will be established.

For Linux (Ubuntu GUI): Use 'ca.pem' and '(name).p12'. The GUI acts a little weird because it first asks for the client public key under the WPA2 Enterprise + TLS setup. Skip that field. The 'ca.pem' file goes into the CA certificate field. Put the '(name).p12' into the private key field and type in the password. The 'Connect' button becomes available when the correct information is in the right fields.

For Android: Use '(name)_android.p12'. First get the certificate on the device by putting the certificate in the root directory of the SD card. Then, go into Settings -> Security -> Credential Storage -> Install from storage. Enter the password. A dialog will appear offering to install three things (CA public cert, client public cert, and client private key). Next go into WiFi settings and connect to the network using 802.1x EAP, the EAP method as TLS, no Phase 2 auth, use the same certificate for both CA and user certificates (will probably be the only option in the dropdown), supply the same string for the Identity and Anonymous Identity you want the device to use, and don't use a password.

Troubleshooting

If you need help, offer a geek/nerd a free dinner in exchange for their help.

Connecting just two devices together flawlessly is apparently too difficult for the software/hardware industry to figure out most of the time (and they like to abuse the "finger of blame" rather than fix the problems that exist). So imagine the complexity of three devices: Your computer/tablet/phone (multiple OSes) + your router (multiple OS flavors of varying levels of quality of feature support) + your Ubuntu box (with FreeRADIUS having a zillion configuration options). When something goes wrong, it will go spectacularly wrong and be nearly impossible to diagnose. A connection that constantly drops could be bad hardware, too many WiFi devices in range on the same channel, and a zillion other factors. Or, in my case, my first router simply couldn't handle RADIUS packets correctly - out of hundreds and hundreds of attempts by the client, it only passed along packets to the Ubuntu box maybe three times which sent FreeRADIUS into fits and spasms of "got connection, dropping connection, got connection, dropping connection, error, got connection, error, error, error". I have no idea what it was doing wrong. I found a cheap $25 router that supported DD-WRT, put DD-WRT on it, and FreeRADIUS was MUCH more reliable. Only then was I able to diagnose that the issue was with using 8192 bit certificates.

The point is that this stuff is (unnecessarily) complicated to debug. Should you run into issues, you need to have friends who are serious nerds and/or be one yourself. And the fact is, even though this guide exists to get a specific setup working, there's a pretty good chance it won't work at all. So you have to be serious about getting this to work. If you need a nerd's help, make them a meal. Free food is decent compensation, especially if they are the slightest bit interested in a crazy EAP-TLS installation.

References:

http://www.area536.com/projects/the-toughest-wifi-on-the-block/
http://linuxtechtutorials.blogspot.com/2011/10/installing-freeradius-on-ubuntu-1110.html
http://blog.wains.be/2009/09/13/wpa2-freeradius-eap-tls/
http://www.privacywonk.net/2010/10/security-how-to-wpa2-enterprise-on-your-home-network.php
http://www.ibm.com/developerworks/library/l-wifiencrypthostapd/
http://technet.microsoft.com/en-us/network/cc917480
http://support.google.com/android/bin/answer.py?hl=en&answer=1649774
http://blog.wains.be/2011/03/13/importing-certificates-on-android-ca-and-client/
(Plus tons of digging around the configuration files for myself.)

Other thoughts: This makes for a nice multi-weekend project. Some people do woodworking on weekends. Others wrestle with Linux.

Sunday, March 31, 2013

Security is a moooooving target

Earlier today, I had a "free heart attack" when a new thread showed up on the PHP Internals list:

[RFC] more secure unserialize()

I love serialize()/unserialize() because it is nice and easy to use. Unfortunately, with ease-of-use comes greater responsibility. In this case, it is important that users can't submit their own serialized data structures to the server. When the server calls unserialize(), it expands out any data type, including objects of whatever class the data names. When such an object is destroyed, PHP automagically calls its __destruct(), which then executes whatever code is in there against properties the user supplied. The "free heart attack" I mentioned earlier came from thinking that I send serialized data to the SSO client in the encrypted cookie. Fortunately, a look at the encrypted cookie code revealed I had been using json_encode() and json_decode(), which allowed me to breathe a sigh of relief. For now.
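The same class of problem is not unique to PHP. As a loose analogy only (this is Python, not the PHP code discussed above), Python's 'pickle' lets the serialized data pick a callable to run at load time, while JSON can only ever produce plain data. The 'Payload' class is made up for the demonstration and calls something harmless.

```python
import json
import pickle

class Payload:
    """Stand-in for hostile input: the data decides what gets called on load."""
    def __reduce__(self):
        # pickle stores "call int('42')" instead of a Payload object.
        return (int, ("42",))

blob = pickle.dumps(Payload())
restored = pickle.loads(blob)      # runs int("42"); no Payload comes back
print(type(restored).__name__)

# JSON has no way to name a class or a callable, so decoding yields only
# dicts, lists, strings, numbers, booleans, and None.
cookie = json.loads(json.dumps({"user": "alice", "id": 7}))
print(cookie)
```

Swap 'int' for something nastier and the danger is obvious, which is why data that a user can tamper with should go through a plain-data format such as JSON.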

This just goes to show that security is a moving target. Or, if you are a cow, it is a mooooooving target. Failure to stay on top of the latest changes on the security front makes systems less secure.

Saturday, March 09, 2013

Just use grep! Don't use find + grep (+ xargs + whatever)...

I've been looking for an alternative to find + grep under Linux so that I can do queries similar to 'findstr' under Windows such as:

findstr /sic:"my_function" *.php

Recursive search for PHP files containing the string "my_function". The typical response to "how to search for some type of file containing some text" under Linux is usually along the lines of:

find . -name "*.php" -exec grep -H "my_function" {} \;

Not only is that cryptic and longer and more difficult to type, it fires off a new, separate process (grep) for every PHP file it finds. I've seen a zillion incarnations of the above. Every time I run that, I end up waiting ten times longer than I would have waited for 'findstr' for the same operation. find + grep performance is terrible. So I said to myself today, "Hmmm...maybe grep has the solution already?" Lo-and-behold, it does:

grep -nir --include="*.php" "my_function" .

-nir is a mashup of the -n (line number), -i (case insensitive pattern search), and -r (recursive) flags. Depending on the system, the -H flag might be necessary as well to get filenames to display. The '--include' thing works around shell limitations that typically cause people to use 'find'. Also, the performance is comparable to 'findstr'. If you don't care about the file type and don't mind scanning images and other binary data:

grep -nir "my_function" .

But the above probably isn't what you stopped by for. I'm mostly posting it as a reminder to myself in the future. The FUTUUUUUUURRRRRE!

Monday, February 25, 2013

Need a good book to read? Read this technical novel!

If you are in the mood for a not-so-boring "technical novel" (there apparently is now such a thing), may I recommend reading this lovely 730 page book:

Engineering Security by Peter "Long-Winded" Gutmann

Reading it does require a level of technical expertise and understanding of how SSL/TLS, SSH, IPSec and a number of rather boring protocols work to truly appreciate what he has to say. For those who don't have the time to read 730 pages, I'm going to summarize:

Security, or at least the average programmer's understanding of it, is...severely lacking. We've had two decades to figure out how to not screw up security and, yet, we still find new, extraordinarily stupid ways to do so. The real problems are a lack of accountability in software development and that anyone can own a computing device without any training whatsoever.

The book then proceeds to attempt to describe fixes for the problems, but I'd wager that around page 50 or so, most readers will get bored and move on, never getting to the author's recommendations. "Since I'm not accountable as a software developer, reading this is a waste of my time. I also feel mildly insulted somehow." That will be the general consensus of the technically-literate (especially programmers).

Personally, I got to about page 40. In general, having been steeped in the security realm for as long as I have, I see where the book is going so I'm not particularly interested in finishing despite being promised moments of humor from the person who recommended reading it (haven't found anything funny yet) but it certainly isn't as dry as I was expecting (it isn't a technical manual). The main problem with this book is that it is too long and not particularly sure of who the audience is. A good author addresses the target audience directly in a manner that makes sense. Not having a target audience causes meandering.

I do recommend reading at least the first 40-ish pages. The only thing I got out of this so far is the phrase "DV certificates". Since I don't do acronyms, I had to look that up on Google and, of course, ran into SSL cert vendors with posts on their blogs saying how they don't do "Domain Validated" certs and how they aren't any better than self-signed (that last part is a bald-faced lie). The real reason is that they would have to offer DV certs for free, which would eliminate a lucrative source of income that is based on a lie. Reading the novel to this point, I've come to the singular conclusion that the DV SSL certificates from StartCom StartSSL (only vendor I know of providing them) are more secure than anything you might purchase. The specified argument against Comodo has pretty significant implications against any type of paid-for SSL cert (i.e. nearly all CAs have similar issues that should raise significant questions of trust). Domain Validated certs are impossible to spoof for a specific domain if the CA does it right. CAs that charge money for a SSL cert are interested in the money, not the validation. While StartCom's policies are too strict and their SSL signing interface rather clunky, the most important aspect of a SSL cert is the trusted path to the root, which they maintain through specific procedures that they stick to. It is also a lie that DV certs are less secure than any other type of certificate. The underlying communication protocol is what matters. If it goes over SSL/TLS and the certificate is within its validation period and that certificate traces to a valid trusted root and the certificate is validated for the specified domain being connected to, then the communication that follows can be trusted as unmodified and unheard by a third-party up to the signing level of the certificate (e.g. SHA256 fingerprinting and a 4096 bit or greater public key as of 2013 is pretty darn secure). 
Everything else, including organization validation, is icing on the cake. (Firefox lies when it says that a cert that is only DV validated can be eavesdropped upon! Humans are the greatest source of error whereas DV certs can be issued without any human intervention.) The only time the icing matters is if you need to validate something other than a public domain (e.g. a private domain such as an internal network). If you need that, you maintain your own CA, your own trusted root, and your own certificates. For everything else, DV certs are sufficient. If someone obtains a valid SSL/TLS certificate for one of my public domains, how are they going to spoof that owner-locked domain? They would have to spoof DNS on the same subnet as the attack. That's hard enough to do in the first place and too isolated for me to care. Plus, the FBI has SSL cert appliances, which means they have obtained the trusted root certificate private keys for at least one major CA and can generate SSL certs for any domain they so choose and probably can generate EV certs as well (i.e. browser-based PKI, including green bar, is fundamentally broken). If a hacker manages to modify my official DNS entries at the source, then I'm going to notice really fast because only an idiot doesn't have monitoring software in place to watch for network changes (but an attacker is more likely to steal my domain first rather than switch DNS, which I'm for sure going to notice). The takeaway is that DV certs are more secure because CAs will sign organization validated certs for domains that the organization doesn't even own, which is less secure than total automation, which at least guarantees someone has access to the domain itself who is requesting the cert (probably the authorized party). The lesson we should learn from this is that we need to scrap everything but DV certs and reboot every public CA from the ground-up. 
If you need something more secure, make your own CA and have users install the root cert if they trust you - or just do it yourself since you probably own the computer the user wants to access the resource from.

Other thoughts: Longest paragraph ever.

Monday, February 11, 2013

Extending the block size of any symmetric block cipher

Before I begin, I need to preface this with the fact that I don't consider myself to be a cryptanalyst. Coming up with a new cryptographic algorithm that is deemed strong is hard to do and really takes a team of people. I know enough to be dangerous. Therefore, what is presented here is to be viewed as merely a theory to extend the block size of any trusted symmetric block cipher without modifying the core algorithm of the cipher itself. What follows should not be assumed to be more secure than what is available today until this theory has undergone appropriate cryptanalysis.

About 20 years ago, before the Internet had entered my life, I was toying around with "encryption" and what I came up with involved some of the XOR logic stuff we see today involving block ciphers, but I also did multi-bit rotations (e.g. rcl/rcr). After the Internet entered my life, I encountered my first block cipher. I was a bit surprised to see little tiny chunks of data (8 to 32 bytes - aka "block size") with dinky keys (16 to 56 bytes) being encrypted purely with XOR, addition, and byte swapping and subsequently declared secure.

While I've generally accepted the current state of cryptography, there is a nagging problem with symmetric block ciphers and it has to do with the fact that they are primarily based on the whole XOR with byte swapping thing and further isolated into fixed-length chunks. People working with block ciphers discovered early on that what is known today as the Electronic Code Book (ECB) "mode" has problems with patterns. Initialization Vectors (IVs) and other modes (CBC, CTR, OFB, etc.) were introduced to keep patterns in the data from appearing, but IVs and block modes are not, in themselves, considered to be a source of security. They don't ultimately increase the strength of the underlying algorithm, and they actually cause end-user confusion in this regard.

Using an IV and a non-ECB mode just makes it harder to determine the underlying data based on simple observation. For example, in CBC mode, if someone correctly guesses the encryption key but gets the IV wrong, all of the data will be correctly decrypted except for the first block. Because CBC mode is quite popular, this means that the IV can be thrown away and the key focused on - after all, the total data loss is 8 to 32 bytes, which will likely just exclude some minor header or introduction information that can be accurately guessed later if it is needed. What I'm trying to say is that IVs and modes don't matter - longer keys and larger block sizes do. Do keep in mind that cryptographically strong algorithms have been chosen very carefully, so someone can't just go in and directly alter any algorithm to accept bigger keys or block sizes, because doing so would likely introduce weaknesses.
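The CBC behavior described above is easy to see for yourself. The sketch below uses a throwaway 16-byte Feistel cipher built on SHA-256 purely as a stand-in (it is NOT a trusted cipher and every name in it is made up), wraps it in textbook CBC, and then decrypts with the right key but the wrong IV.

```python
import hashlib

BLOCK = 16  # toy block size in bytes (an assumption of this sketch)

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key, i, half):
    # Round function of the toy Feistel network.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def encrypt_block(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _f(key, i, right))
    return left + right

def decrypt_block(key, block):
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in reversed(range(4)):
        left, right = _xor(right, _f(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, data):
    if len(data) % BLOCK:
        raise ValueError("data must be a multiple of the block size")
    previous, out = iv, []
    for i in range(0, len(data), BLOCK):
        previous = encrypt_block(key, _xor(data[i:i + BLOCK], previous))
        out.append(previous)
    return b"".join(out)

def cbc_decrypt(key, iv, data):
    previous, out = iv, []
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        out.append(_xor(decrypt_block(key, block), previous))
        previous = block   # later blocks chain off ciphertext, not the IV
    return b"".join(out)

key = b"correct key"
iv = bytes(BLOCK)
wrong_iv = b"\xff" * BLOCK
message = b"HEADER-BLOCK-001" + b"payload block 2." + b"payload block 3."

ciphertext = cbc_encrypt(key, iv, message)
guess = cbc_decrypt(key, wrong_iv, ciphertext)
print(guess[:BLOCK] == message[:BLOCK])   # first block is garbage
print(guess[BLOCK:] == message[BLOCK:])   # everything after it is intact
```

Only the first block ever gets XORed with the IV during decryption. Every later block is XORed with the previous ciphertext block, which the attacker already has.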

This is the problem that I see: Symmetric block ciphers are intended to work on streams of data and do their operations as quickly as possible. Those reasons are why the amount of data worked on at any given time is only a few bytes and also why the number of rounds - that is, the number of times the core algorithm is executed per block - is generally limited to 20 or fewer. However, long-term, these same benefits will introduce weaknesses into these algorithms, and finding new algorithms is a complex task that can take years of effort. The AES competition, for example, took approximately 4 years to complete before gaining approval from NIST. We're already seeing how algorithmic speed is a significant problem in other realms (e.g. password storage). It really is just a matter of time before someone starts to find weaknesses in small block sizes and cipher speed in the realm of data encryption (assuming it hasn't been done already). Losing trust in the data encryption algorithms we rely upon results in major setbacks in data security and, as far as I know, there are no contingencies available for dealing with the potential losses - cryptanalysts tend to aim to just break things rather than come up with a strategy in advance such as, "okay, if we break this, what are the options available for those using this algorithm?" as part of a comprehensive and responsible analysis.

As such, I propose a simple solution that operates on today's proven algorithms without any changes to the underlying algorithms that MAY vastly improve their long-term strength. The steps for extending the block size of any trusted symmetric block cipher are as follows:

  1. Encrypt data as we do now with a trusted block cipher.
  2. After collecting some encrypted output, move the last byte before the start of the first byte.
  3. Encrypt the result from step 2 using the same algorithm but with a different key and IV.

As an example, and to clarify the above steps, let's say I encrypt a data stream. Instead of letting the data go through after encrypting only a few bytes as is done now, I accumulate enough encrypted output to occupy 4KB. The size doesn't actually matter - it could be 1KB or 1MB, but the size should be larger than and a multiple of the algorithm's default block size but still small enough to fit into a processor cache to avoid cache misses. 4KB is a good size for today's hardware.

Next, I move the last byte to the start of the string. For example, the bytes 1234xyz5678 would become 81234xyz567. This is technically rotating all the bits of the data by 8 if you visualize all the bytes as a stream of bits. For lack of a better term, this is what I call an "8-bit rotation across the bytes". For this example, the 8-bit rotation only requires 4KB + 1 byte of RAM if the first encryption result is stored starting at the second position in the buffer. Moving a single byte from the end of a buffer to the beginning is a trivial operation. My original idea for this step called for a one-bit rotation across the bytes, but then I realized that every byte of data would have to be modified - it was still a trivial operation, but kind of pointless despite breathing some life into the under-utilized rcr and rcl (rotate with carry right/left) features of many processors.

Obviously, the second step is easily reversed. The last step is to encrypt the data again. The same algorithm may be used but should utilize a different key and IV.
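The three steps can be sketched in a few lines of Python. This is only an illustration of the data flow, not a vetted construction. All function names are hypothetical, and the XOR "cipher" used to exercise it is a placeholder with no security value. In practice the two callables would be a trusted block cipher set up with two different keys and IVs.

```python
from typing import Callable

# A cipher here is just "bytes in, bytes out" with key and IV already bound.
Cipher = Callable[[bytes], bytes]

def rotate_last_byte_to_front(buf: bytes) -> bytes:
    """Step 2: the 8-bit rotation across the bytes."""
    return buf[-1:] + buf[:-1]

def rotate_first_byte_to_back(buf: bytes) -> bytes:
    """Inverse of step 2."""
    return buf[1:] + buf[:1]

def extended_encrypt(chunk: bytes, encrypt_inner: Cipher, encrypt_outer: Cipher) -> bytes:
    first_pass = encrypt_inner(chunk)                  # step 1
    rotated = rotate_last_byte_to_front(first_pass)    # step 2
    return encrypt_outer(rotated)                      # step 3

def extended_decrypt(chunk: bytes, decrypt_inner: Cipher, decrypt_outer: Cipher) -> bytes:
    rotated = decrypt_outer(chunk)
    first_pass = rotate_first_byte_to_back(rotated)
    return decrypt_inner(first_pass)

def placeholder_xor_cipher(key: bytes) -> Cipher:
    # NOT a real cipher: repeating-key XOR, used only so the sketch runs.
    # XOR is its own inverse, so the same callable encrypts and decrypts.
    return lambda data: bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

inner = placeholder_xor_cipher(b"first key")
outer = placeholder_xor_cipher(b"a different key")

chunk = bytes(range(256)) * 16                         # one 4KB chunk
sealed = extended_encrypt(chunk, inner, outer)
opened = extended_decrypt(sealed, inner, outer)

print(rotate_last_byte_to_front(b"1234xyz5678"))       # b'81234xyz567'
print(opened == chunk)
```

Because the rotation shifts everything by one byte, each block seen by the second pass straddles two blocks from the first pass, which is where the "glue" effect is supposed to come from.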

The theoretical resulting effect here is that, instead of just an 8 to 32 byte block size encryption algorithm, it has been stretched to be a 4KB block size encryption algorithm with an effective doubling of the rounds and possible doubling of key size. This is made possible because of the bit rotation, which acts like digital glue across the algorithm's smaller blocks while only taking twice as long. Rotating eight bits instead of one bit is more efficient, but if there are weaknesses in the algorithm along byte boundaries, then rotating one bit might mitigate (or it could exacerbate) such weaknesses whereas eight bits will do nothing for or against such weaknesses by comparison. Of course, such weaknesses would be cause to not trust the algorithm in the first place and only trusted ciphers should be used, but this method of extending block size could be used as an emergency stop-gap measure if a serious breakage occurs across a set of widely-used encryption algorithms to buy sufficient time to come up with a real solution.

Doing all of this may, and I repeat MAY, have a side effect of effectively increasing cryptographic strength of the underlying algorithm itself. By what factor, I'm not sure, if any. Questions have to be answered first: Is there an increase? A decrease? No effect? This is the part where someone who likes doing cryptanalysis comes in. I'll err on the very safe side and say that this does absolutely nothing to make any algorithm used with this stronger cryptographically until someone with more chops than me proves otherwise. I vaguely remember reading somewhere that encrypting twice does nothing to increase the strength and may do the exact opposite, but also seem to remember that the article had to do with using the same algorithm twice back to back with the same key and IV. At any rate, don't fool yourself into thinking that this is somehow stronger than a base algorithm used until someone who isn't just dabbling in crypto evaluates this theory.

Sunday, January 13, 2013

Are Internet mailing lists a dying breed?

That is the question I've been asking myself lately. Mailing lists used to be a staple communication mechanism on the Internet. Now nearly all of the mailing lists I'm subscribed to are very quiet - still have 10,000+ subscribers on most, just no one seems to use them. The distinct trend I am noticing is that people are forgoing mailing lists and using quick question and answer sites like Yahoo! Answers and StackOverflow to get the answer to their questions. (Or using the Facebook commenting system or Twitter - but that depends on your friends, connections, and followers). Experts Exchange used to hold the position and used to do quite well but then shot themselves in the foot by putting annoying barriers in the way. People went back to mailing lists after that fiasco.

Unfortunately, there are two significant problems with the Q&A websites out there that mailing lists solve and StackOverflow is demonstrating the problems quite well.

The first problem is community and ownership. On a mailing list, anonymity is possible but you don't get to be a community leader let alone the mailing list owner by being anonymous. About half of the StackOverflow questions get shut down before they get started by moderators who hide behind rather anonymous-sounding usernames. Yahoo! Answers would have suffered a similar fate had spammers not found it to be an effective medium. StackOverflow/StackExchange is growing by leaps and bounds and experiencing growing pains, but it is suffering the Wikipedia effect, where there is significant power in the hands of a few people who aren't vetted very well and repeatedly show up to cause long-term damage to the site. The current set of SO moderators police the site and abuse their power by shutting down valid questions that have only been up for a few minutes. On the other side of things, mailing lists are generally open to everyone to ask questions and any moderation queues are used to just filter spam from reaching list members. The current SO strategy will ultimately kill the site in the long run and significant damage has already been done, but most people don't realize it yet. There is also the question of who owns the site. Ownership is important because it creates the important hierarchy of accountability. StackOverflow, from the observer perspective, appears to have no owners, which makes it seem like a free-for-all website. That also causes problems that are a lot harder to pin down in a single sentence but suffice it to say that where obvious ownership exists, chaos, which always exists, is better kept under control.

The second problem is continuity and continual learning. StackOverflow, Yahoo! Answers, forums, and other media are hit-and-run. You ask your question, you get your answer from someone, you give them karma/points/whatever, and you go away and generally forget that there are other people who need help. These sites breed selfishness, whereas a mailing list is a continual stream of thoughts - there are regulars but other people help out too as part of a continual community effort to improve each other. Everyone picks up tidbits of information here and there and refines knowledge in a common area as well as occasionally replying to posts, which further contributes to the stream of thoughts.

Now I'm going to give one downside of mailing lists, and it is an area where StackOverflow really excels: SO is great at bringing together a collection of strategies for software development and selecting the best approach at the point in time that the answer is selected. It is basically a Wikipedia for common software development questions and answers and, specifically, produces a set of best practices that are impossible to obtain elsewhere in a single location. Which is why the site, if it doesn't change, will suffer the same fate as Wikipedia, only it is more deviously hidden. In fact, we can already see SO turning into a Wiki where the answers can be edited by other people.

This puts me in a bit of a dilemma: Should we use mailing lists for our questions and answers? Should we use StackOverflow/StackExchange/Yahoo! Answers/forums? Twitter/Facebook? It would be nice if we could somehow have the best of all of these worlds. This seems to be what we have been striving for over the years of IRC, mailing lists, and Q&A websites: Hey, I asked a question that someone else may have gotten a great answer for, so I should use that, but I also want the personal touch rather than "Closed as Duplicate/Too Localized/etc." without any interaction by those closing the question. (Side note - "Too Localized" is irritating because it comes off as "You asked a Dumb Question, go away" - imagine how that would make you feel being the recipient of that.) Closure of questions is the equivalent of "This conversation is over because we decided it is over and there is no disputing our decision." Humans have emotions and desire interaction. Therefore, canned responses and question closures are too stoic and drive people away. Okay, so you've repeated yourself a zillion times already and it is kind of boring to do it again, but to someone else, it is that direct response that says, "You are important so I won't brush you off as a nobody."

If we can achieve the above while simultaneously having an effective database of best practices, it won't matter if mailing lists die. I'll be sad that I can't simply use my desktop e-mail client for community communication, but I'll move on too.

Saturday, January 05, 2013

Setting up MySQL + Postfix + Dovecot to do Gmail-like 'youremail+whatever@domain.com' plus multiple delimiters

When I set up my Postfix + Dovecot + MySQL installation, I wanted GMail-like filtering for my domain. GMail allows you to do 'youremail+whatever@gmail.com' and it will automatically be delivered to 'youremail@gmail.com' with the label 'whatever'. From there, it can be filtered into the folder of your choice. I figured something similar would be very useful when registering on websites where I'm not necessarily wanting to use a mailinator address but do want to track whether they sell my e-mail address or not. Well, I thought doing the same thing would be useful, but more on the difficulties I've encountered with special characters like '+' in a bit.

In MySQL, I have 'virtual_aliases' and 'virtual_users' tables. If I remember correctly, Postfix first attempts to find an e-mail address in users, then aliases, then tries again without the extension specified by 'recipient_delimiter'.

To set up Postfix with a 'recipient_delimiter', open up '/etc/postfix/main.cf' and add:

recipient_delimiter = +

Save the file. I seem to recall that 'recipient_delimiter' only works with aliases, not mailboxes, so that might explain why I had to add the 'virtual_mailbox_maps' value to the 'virtual_alias_maps' line. Or maybe I'm confused. I set all of this part up almost two years ago, so the memory's a bit rusty because I've been busy working on my software products.

Now open up '/etc/postfix/master.cf' and change the dovecot delivery line from '-d ${recipient}' to '-d ${user}@${nexthop}'. Here are the full lines:

dovecot unix - n n - - pipe
  flags=ODRhu user=vmail:vmail null_sender= argv=/usr/lib/dovecot/deliver -c /etc/dovecot/dovecot.conf -f ${sender} -d ${user}@${nexthop}

Save the file. Run the whole 'postfix reload' thing. At this point, Postfix and Dovecot will handle e-mail just like GMail does. However, as I said earlier, the '+' symbol doesn't work very well. You'll quickly discover, as I have, that there are a lot of broken web forms out there that won't accept it because web developers use regular expressions to "validate" e-mail addresses. E-mail addresses are so complex that regular expressions don't cut it. Since there are a billion different regexes out there for doing broken "validation", I wanted to use "multiple recipient delimiters" with my Postfix + MySQL setup. Scouring Google, I found exactly one post that started to deal with this issue, but only barely. Their solution was to create a new MySQL-based 'virtual_alias_maps' entry that runs the query:

select concat(replace(left('%s', length('%s') - instr(reverse('%s'), '@')), '_', '+'), '@', reverse(substring_index(reverse('%s'), '@', 1))) "goto" from domain where name = reverse(substring_index(reverse('%s'), '@', 1)) and active = 1 and instr(left('%s', length('%s') - instr(reverse('%s'), '@')), '_') > 0

Yeah, my eyes bleed too. It is an interesting approach but they are replacing every '_' character with a '+' character such that 'my_user_name_extension@domain.com' becomes 'my+user+name+extension@domain.com' and Postfix will then look up 'my@domain.com' and not find anything when it should have looked up 'my_user_name@domain.com'. It also risks potential delivery issues to random addresses depending on how Dovecot is set up. It seems that a better approach would be to first ignore any address that already contains a '+' symbol, since Postfix handles that natively. Then, make sure the address part contains the current symbol. Finally, locate the first instance of the current symbol and construct a new string excluding everything after it up to the domain part and look THAT address up in the appropriate table. This approach will work with small and large databases:

SELECT dest FROM virtual_aliases WHERE INSTR('%s', '+') = 0 AND INSTR('%s', '@') > 0 AND INSTR(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@')), '_') > 0 AND source = CONCAT(SUBSTRING_INDEX(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@')), '_', 1), '@', REVERSE(SUBSTRING_INDEX(REVERSE('%s'), '@', 1))) AND active = 1

MY EYES! THEY BLEEEEED! I feel like I've done an injustice to the Internets after writing that. Let's break it down. Basically, it returns a 'dest' (this is the aliases table after all) if it finds a matching 'source' based on string manipulation that strips everything from the first match of a specific symbol up to the '@', i.e. the symbol plus the extension. In this case, it is looking for the '_' character. But, before it even does a lookup, it verifies that the string being searched for doesn't contain a '+' and does contain both an '@' and a '_' for sanity checking purposes. MySQL will resolve the INSTR() and CONCAT() mess first before doing anything else. Then it will start looking at the database table. If there is a MySQL index on the 'source' column, then this will be a super-fast, index-based lookup regardless of database size. But good grief, look at that awful mess again! I don't think Postfix will let me have that on multiple lines to make it potentially readable.
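For anyone who would rather read the string manipulation outside of SQL, here is a rough Python mirror of what the query above computes before it touches the table. The function name is hypothetical and this is only an aid for reasoning about the query. It is not something Postfix runs.

```python
from typing import Optional

def first_delimiter_lookup_key(address: str, delimiter: str = "_") -> Optional[str]:
    """Return the value the query compares against 'source', or None if a guard fails."""
    if "+" in address:                 # INSTR('%s', '+') = 0
        return None
    if "@" not in address:             # INSTR('%s', '@') > 0
        return None
    # The REVERSE()/INSTR() trick in the query splits on the LAST '@'.
    local, _, domain = address.rpartition("@")
    if delimiter not in local:         # INSTR(LEFT(...), '_') > 0
        return None
    # SUBSTRING_INDEX(local, '_', 1): everything before the FIRST delimiter.
    return local.split(delimiter, 1)[0] + "@" + domain

print(first_delimiter_lookup_key("me_shopping@domain.com"))             # me@domain.com
print(first_delimiter_lookup_key("my_user_name_extension@domain.com"))  # my@domain.com
print(first_delimiter_lookup_key("me+shopping@domain.com"))             # None
print(first_delimiter_lookup_key("plain@domain.com"))                   # None
```

The second call shows the limitation discussed next: with a first-match rule, an account whose name itself contains the delimiter can never be reached this way.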

Okay, now let's assume we want some additional flexibility with '_'. Let's say that we want to remove everything after the last underscore instead of the first underscore (e.g. so 'my_user_name_extension' becomes 'my_user_name' instead of just 'my'). Postfix will execute things in order and stop when it gets a non-empty response, so the above query could be run first and then this can be run second to catch this new special case:

SELECT dest FROM virtual_aliases WHERE INSTR('%s', '+') = 0 AND INSTR('%s', '@') > 0 AND INSTR(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@')), '_') > 0 AND source = CONCAT(REVERSE(SUBSTRING_INDEX(REVERSE(LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@'))), '_', 1)), '@', REVERSE(SUBSTRING_INDEX(REVERSE('%s'), '@', 1))) AND active = 1

If you don't see the difference, look for the additional REVERSE() calls. Okay, the above might not actually work since I didn't try it out and I think the SUBSTRING_INDEX() call will pull off the part I want to remove instead of getting rid of it. I'm using the first query, but not this one. My head hurts from looking at this mess. If you fix the second query, let me know. It might just be better for 'my_user_name' to use a different delimiter such as a hyphen and then have a '-' based query because doing this could potentially introduce some weird issues.
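To check the suspicion about SUBSTRING_INDEX() without a mail server handy, the two behaviours can be modelled in Python. This is a sketch with hypothetical function names. The first function models what the second query appears to compute as written, and the second models the intended "strip after the last underscore" behaviour.

```python
from typing import Optional

def _split_address(address: str, delimiter: str):
    """Apply the same guards as the query; return (local, domain) or None."""
    if "+" in address or "@" not in address:
        return None
    local, _, domain = address.rpartition("@")
    if delimiter not in local:
        return None
    return local, domain

def key_as_query_is_written(address: str, delimiter: str = "_") -> Optional[str]:
    parts = _split_address(address, delimiter)
    if parts is None:
        return None
    local, domain = parts
    # REVERSE(SUBSTRING_INDEX(REVERSE(local), '_', 1)) keeps the text AFTER
    # the last delimiter, i.e. the extension itself.
    return local.rsplit(delimiter, 1)[-1] + "@" + domain

def key_as_intended(address: str, delimiter: str = "_") -> Optional[str]:
    parts = _split_address(address, delimiter)
    if parts is None:
        return None
    local, domain = parts
    # Keep everything BEFORE the last delimiter instead.
    return local.rpartition(delimiter)[0] + "@" + domain

addr = "my_user_name_extension@domain.com"
print(key_as_query_is_written(addr))   # extension@domain.com
print(key_as_intended(addr))           # my_user_name@domain.com
```

If this model is right, the second query would need to take a LEFT() of the local part, bounded by the position of the delimiter in the REVERSE()d local part, rather than using SUBSTRING_INDEX(). That is an untested suggestion, so try it against a test database before trusting it.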

Each query has to be in its own file and you'll want to check the 'users' table as well. So, to do both queries above for each character for both tables, you'll need four distinct files on the file system. This gets to be rather messy if you want to do a combination of characters such as '+', '_', '-', and '.' which results in a lot of distinct files (and potentially lots of distinct MySQL connections?). I suppose you could make one file and use OR with the query but you risk losing index functionality (the OR keyword tends to have the effect of doing full table scans) and the SQL query would become a vomitastic monstrosity. Actually, if each file means a separate database connection, it might be faster to not worry about the index. I honestly don't know which will be faster. I assume indexes will be faster and that Postfix intelligently maintains connections to MySQL, but I don't know and I don't really want to go digging around source code to find out. Postfix runs a ton of SQL queries per e-mail already but it is preferable to run a lot of fast queries instead of a few slow ones. At any rate, make it work first with semi-readability, worry about performance issues later. Don't forget to 'postfix reload' and be sure to check file permissions so that Postfix can read the file.
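Since every delimiter and table combination needs its own copy of the query, one way to keep the files consistent is to generate the query text instead of hand-editing it. The sketch below is Python and rebuilds the first query from its parts. The function name is hypothetical, and so are the 'virtual_users' column names ('email' for both the key and the result), which are placeholders to be replaced with whatever the real table uses. The 'query = ' prefix is the parameter name used in Postfix MySQL map files.

```python
# Reusable fragments of the first query. Postfix substitutes the looked-up
# address for every '%s'.
LOCAL_PART = "LEFT('%s', CHAR_LENGTH('%s') - INSTR(REVERSE('%s'), '@'))"
DOMAIN_PART = "REVERSE(SUBSTRING_INDEX(REVERSE('%s'), '@', 1))"

def build_first_delimiter_query(table: str, key_column: str,
                                result_column: str, delimiter: str) -> str:
    """Build the 'strip after the first delimiter' query for one table and delimiter."""
    # Refuse characters that would break the SQL string literal, collide with
    # Postfix's own '%' expansion, or make no sense as a delimiter here.
    if len(delimiter) != 1 or delimiter in "'\\%+@":
        raise ValueError("unsupported delimiter: %r" % delimiter)
    return (
        f"SELECT {result_column} FROM {table} "
        f"WHERE INSTR('%s', '+') = 0 AND INSTR('%s', '@') > 0 "
        f"AND INSTR({LOCAL_PART}, '{delimiter}') > 0 "
        f"AND {key_column} = CONCAT(SUBSTRING_INDEX({LOCAL_PART}, '{delimiter}', 1), "
        f"'@', {DOMAIN_PART}) "
        f"AND active = 1"
    )

# One 'query = ...' line per map file: delimiter x table.
for delimiter in ("_", "-"):
    for table, key_column, result_column in (
        ("virtual_aliases", "source", "dest"),
        ("virtual_users", "email", "email"),   # hypothetical column names
    ):
        print("query = " + build_first_delimiter_query(
            table, key_column, result_column, delimiter))
```

Generating the text does not reduce the number of files or connections, but it does mean a fix to the query only has to be made in one place.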

If you do use this query, please document the file(s) so someone (including you) viewing the query later doesn't have a "what in the..." head-scratching moment.

Disclaimers: I have never applied this many MySQL functions to a single query before. My knowledge of MySQL functions is fairly limited. I call CHAR_LENGTH() because it seems more Unicode-friendly than just plain LENGTH(). I also use the specific table for aliases instead of the domain as in the original example because, looking elsewhere on this topic, I've learned that Postfix likes 'specific' rather than 'general' definitions and that not creating a potentially exploitable delivery vulnerability in the system is important. I blame the authors of Postfix for this major oversight in their product: it requires a hack in MySQL to fix a problem that is clearly within their domain and capability to fix by simply allowing multiple delimiters. I'm also not responsible for any heart attack, stroke, stress, panic, heart palpitations, sweating, shock, bleeding from the eyes, weight gain, weight loss, hair loss, mutation, rage quit, and other medical disorders as a result of this blog post.