Saturday, June 20, 2015

How to call select() - the CORRECT way!

There is a TON of broken code out on the Internet. Lots of programmers enter the world of TCP/IP socket development, think they have figured out how to write socket code, and then disseminate their broken code to others who, in turn, pass it on to still more people.

One of the most egregious problems plaguing the world of software development today is the use, or abuse, of select(). Today, you are going to contribute to fixing this problem once and for all by reading what I have to say and then ingraining it into your brainz.

There are two types of file descriptors/sockets/what-have-you:

Blocking and non-blocking, sometimes referred to as synchronous and asynchronous.

If you are using select() on synchronous sockets in your code, you are doing it wrong!

select() is ONLY for asynchronous sockets. Think of it this way: A synchronous socket is you telling the OS that you know the exact order of operations on that socket (e.g. POP3) and are willing to wait until hell freezes over for that read/write operation to complete.

Read that over again and you should come to the same conclusion: Calling select() on a synchronous socket is WRONG. Although, if you've been doing it wrong for decades, this fact becomes a lot harder to accept.

Where does this misunderstanding come from? A lot of people misunderstand select() because the book/teacher/website they learned *NIX socket programming from got it wrong, having learned the wrong approach from someone else. Also, most socket programmers start out using synchronous sockets in simple client-side applications and only later want a way to handle multiple sockets at one time. select() is referenced all over the manpages/MSDN and, when the programmer reads about it, it sounds like it will work, so they try it and it seems to work. That's the real problem: select() seems to work, which is why programmers use it improperly. In reality, select() on a synchronous socket introduces bugs that are hard to trace and happen randomly.

select()'s purpose in life is to let you avoid a busy loop in an asynchronous environment, where no read/write operation will ever block and you would otherwise have to keep polling. It is entirely possible, if you pass a synchronous descriptor to select(), that select() will indicate that the socket is readable but, when you go to read data, the synchronous socket will block. You might say that can't possibly happen, but guess what? It does happen! The Linux select(2) manpage, for example, warns that data can arrive, get reported as readable, and then be discarded because of a bad checksum before you read it. This is why select() being only for asynchronous sockets makes much more sense. Once you learn this, the code for asynchronous sockets becomes surprisingly clean and is only marginally more complex than synchronous socket code. If you ever thought your synchronous socket code using select() was kind of hacky/messy, then you now know why. This is a harsh lesson to learn for many people.
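
Here is a minimal sketch of the pattern described above, written in Python since its select module wraps the same OS call. The helper name read_ready() and the use of socket.socketpair() as a stand-in for a real network connection are my own assumptions for illustration, not anything from a particular library. The point is that the socket is non-blocking, so if select() says "readable" and the data is not actually there, recv() raises instead of hanging.

```python
import select
import socket

def read_ready(sock, timeout=1.0, bufsize=4096):
    """Wait up to `timeout` seconds, then try one non-blocking recv().

    Returns the bytes read, b"" if the peer closed the connection, or None
    if no data was available (timeout, or select() reported readable but
    nothing could actually be read).
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    try:
        return sock.recv(bufsize)
    except BlockingIOError:
        # select() said "readable", yet nothing is there. On a blocking
        # socket this recv() would have stalled instead of raising.
        return None

# Hypothetical demo: a connected pair of local sockets stands in for a
# real client/server connection.
a, b = socket.socketpair()
a.setblocking(False)  # the important line: select() + non-blocking

nothing = read_ready(a, timeout=0.1)  # nothing has been sent yet
b.sendall(b"hello")
data = read_ready(a)                  # the 5 bytes just sent
b.close()
eof = read_ready(a)                   # peer closed, so recv() returns b""
a.close()
```

Drop the setblocking(False) call and the code still "seems to work" in this demo, which is exactly the trap.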

Therefore, to process more than one descriptor per thread at one time, use asynchronous descriptors and select().
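
As a rough sketch of what that looks like in practice, the loop below services several non-blocking sockets from a single thread. The function name drain() and the three local socket pairs are hypothetical, made up for this example; a real server would also watch for writability and handle errors more carefully.

```python
import select
import socket

def drain(socks, timeout=1.0):
    """Read from several non-blocking sockets until every peer has closed.

    Returns a dict mapping each socket to the bytes received on it.
    """
    received = {s: b"" for s in socks}
    open_socks = list(socks)
    while open_socks:
        readable, _, _ = select.select(open_socks, [], [], timeout)
        if not readable:
            break  # timed out with nothing to do
        for s in readable:
            try:
                chunk = s.recv(4096)
            except BlockingIOError:
                continue  # false alarm: go back to select()
            if chunk:
                received[s] += chunk
            else:
                open_socks.remove(s)  # peer closed its end
    return received

# Hypothetical demo: three local socket pairs stand in for three clients.
pairs = [socket.socketpair() for _ in range(3)]
readers = [r for r, _ in pairs]
for i, (r, w) in enumerate(pairs):
    r.setblocking(False)  # every descriptor handed to select() is non-blocking
    w.sendall(f"message {i}".encode())
    w.close()

results = drain(readers)
messages = [results[r] for r in readers]
for r in readers:
    r.close()
```

Notice that no single recv() can stall the loop, so one slow or misbehaving peer never holds up the others.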

The best way to fix this entire problem would be for select() to determine if a descriptor is synchronous and then reject it outright. Such a solution would break every major application and then, lo and behold, everyone would fix their code! The world would then have less broken code and we'd all be happier.
