Saturday, November 21, 2015

Why developers should do their own documentation and code samples

I was recently on the Microsoft Developer Network website (aka MSDN) looking at some API documentation. Many of the more popular APIs have code examples so the developer can see example usage rather than have to try to understand every nuance of the API before using it. The particular API that I was looking to use had an example, so I made the unfortunate decision to look at the code. The example was a turd. It wasn't a polished turd. It was just a normal, run-of-the-mill turd. The code had HANDLE leaks, memory leaks, and a bunch of other critical issues. It looked like it was written by a 20 line Norris Number programmer (aka newbie).

Being rather bothered by this, I set out to learn how Microsoft produces its code samples. According to one source I found, the company hands the task off to interns. So, sample code that a whole bunch of other programmers are going to simply copy-pasta into their own code is being written by amateur programmers. Nothing could possibly go wrong with that. If the examples are indeed written by interns, it certainly explains why the quality of the code samples in the documentation is all over the map ranging from really bad to barely passable. It's certainly not what I would expect from a professional organization with 50,000 employees. If you open a HANDLE, close it. Allocate memory? Free it. Simple things that aren't hard to do but help achieve a level of professionalism because you know that other people are just going to copy the example into their code, expect it to work, and not have unforeseen bugs in production.

MSDN is the face of Microsoft most people don't really get to see unless they start developing for the Windows OS. But it matters who produces the documentation because a single mistake is going to affect (tens/hundreds of) thousands of applications and millions (billions?) of people. API documentation is almost always too intricate for most other developers to fully understand. While it is the be-all-end-all definitive overview of any give API call, code examples provide context and meaning. A lot of people struggle with "so if I use this API, what do I do next" but have the "aha!" moment when they see a working example connecting the API to other code. Developers will copy and paste an example long before they fully comprehend any given API. For this reason, code examples need to have the same care and professionalism applied to them as the API itself. Passing this responsibility off to an intern is going to create significant long-term problems.

Writing your own code examples for an API also has the benefit of revealing bugs in the API. If the developer who made the API is writing the documentation for it and the code sample, they are 15 zillion times more likely to spot mistakes and correct them before they get released into the wild. Pass that responsibility off to an intern? Well, the intern is going to not run into or just ignore the bugs in the API because THEY DON'T CARE. They want the paycheck and the checkmark on their graduation forms that says they did their internship. Users (developers) have to live with the disaster that interns leave behind in their wake. Putting them on documentation and code example writing tasks means interns will be the face of the company that developers (i.e. the people who matter the most) will see. That strikes me as unprofessional.

In short: Develop an API? Do your own documentation and code sample writing. Is it tedious and boring? Yes. But it is important to do it anyway. In fact, it is infinitely more important than the API you wrote.

Thursday, November 12, 2015

Let's NOT Encrypt - Critical problems with the new Mozilla-sponsored CA

Starting a new Certificate Authority is a time-consuming, expensive, and difficult task. It is also annoying to set up and maintain SSL/TLS certificates. So I completely understand what Let's Encrypt is trying to do. Their goal? Free, functional SSL/TLS certificates that are easy to create, install/deploy, and even keep up-to-date. What's not to like about that? Well, it turns out there are some serious problems with this up-and-coming Certificate Authority (CA). I'm going to list the issues in order of concern:

  1. Doesn't solve the problems of storing roots in the browser or global trust issues.
  2. A U.S.-based company.
  3. Browser support/acceptance.
  4. Sponsored by Mozilla.
  5. Other, publicly traded, corporate sponsors.
  6. A brand-new, relatively untested, and complex issuance protocol (ACME).
  7. Limited clients (Python bindings only) and no libraries.
  8. Linux only.
Each of these issues in detail:

For the first issue, even though it is all we have got, SSL/TLS is fundamentally broken. Let's Encrypt builds upon broken technology and is therefore also fundamentally broken. Instead of fixing the core problem, it merely obscures it. We need to scrap the current mess and start over, using the understanding of what we have learned over the years, not bury broken technology with more broken technology - see the spam in your in-box to learn how well that's worked out for you. Distributed authorities and/or trusted peering, sensible user-presentations (instead of today's scary-looking warning dialog boxes), NOT distributing default roots (we shouldn't even have root certificate stores - it should be root-per-domain), and web of trust are better steps in the right direction and lets people do things with certificates currently not possible (e.g. issuing their own signed cert chains without raising warnings), and possibly redesigning portions of TLS from the ground-up. Ultimately, each individual and company should be able to be their own CA free and clear on the Internet for true Internet security.

For the second issue, Let's Encrypt is a U.S.-based company. They proudly display that information when they say they are a non-profit 501(c)(3) organization. This is a HUGE problem because being a U.S.-based company makes that company susceptible to secret FISA rulings. As a result, a FISA court could order them to turn over their root certificates AND not say a word to the public with severe penalties if they violate the ruling. FISA courts are in cahoots with the NSA, CIA, and FBI and rarely rule in favor of companies or citizens. Until this relationship is resolved amicably (e.g. dissolve/neuter FISA and reset all root certs), it is extremely dangerous to have a Root Certificate Authority operate within U.S. borders.

For the third issue, Let's Encrypt has a huge uphill battle to get added to the root certificate store of every major browser and OS. StartCom, an Israeli-based company which also offers free domain validated certificates today via StartSSL, took years to get through the process to be added to browser and OS root certificate stores, and then even longer to get enough market share to be deemed viable for use. Let's Encrypt has to go through the same process that StartCom did, which means they are about 5 years away from viability. The only positive side to Let's Encrypt is they plan to offer free certificate revocation, whereas StartCom does not. Again, all of this process is required because, as the first issue pointed out, SSL/TLS is broken technology. Instead of fixing SSL/TLS, they opted to adopt it.

For the fourth issue, Mozilla appears to be the primary sponsor. Mozilla makes Firefox and they now basically own Let's Encrypt. It smacks of collusion and that can be quite dangerous. It certainly will be extremely suspicious if Mozilla is the first to adopt the Let's Encrypt root into the root certificate store of Firefox. Browser/OS vendors seem to wait until someone else includes the root first, so this is highly advantageous for Mozilla because they can artificially accelerate the process. If they pull such a stunt, it could result in a lawsuit from other CAs who had to go through the extended process and/or extremely ironic antitrust litigation against Let's Encrypt and Mozilla by the Department of Justice. I say ironic because Mozilla used to be Netscape, who was the source of antitrust litigation against Microsoft when they bundled Internet Explorer with Windows back in the day. Mozilla getting slapped with antitrust litigation would be the most entertaining thing for us tech watchers that could happen - if that happens, grab your popcorn and sit back and enjoy the show!

For the fifth issue, while I understand that a public Root Certificate Authority is expensive to start (estimated initial costs are at least $50,000 USD) and that corporate sponsors have that kind of money, it is rather inappropriate. There needs to be complete, full transparency with regards to the money here. It is extremely important during the setup phase of a CA like this. As far as I can tell, the project is distinctly missing that information. Also their financials aren't readily available online on their website despite being a non-profit organization that claims to increase web friendliness. According to Charity Navigator, they have collected about $100,400 to date, which is on par for starting up a CA.

For the sixth issue, the ACME protocol is a draft specification that I assume will eventually be sent to the IETF. However, it forms the basis of Let's Encrypt. It's a beta protocol and subject to change. As a software developer, I also feel like it is overly and unnecessarily complex as most IETF documents are wont to be. There are a number of issues with the ACME protocol that I feel are vague and therefore open to interpretation. As a counter-example, JSON-Base64 is NOT open to interpretation - it is an extremely clear file format and defers entirely to the TWO nearly identical, official public domain implementations of the library if there is any doubt as to how an implementation MUST implement JSON-Base64. As a result, there is no doubt about how JSON-Base64 works. This, of course, leads me to the next issue...

For the seventh issue, additional clients in other programming languages and libraries to talk ACME may come. Eventually. I have a serious problem with writing a spec before writing an implementation: Real implementations reveal flaws in the spec and updating the spec after it is written is always a low priority. Whereas writing the spec afterwards results in a clean, clear document that can defer to the implementations. Always write general guidelines for the implementation, THEN develop a couple of nearly identical implementations in a couple of different languages, hammer out the bugs, and FINALLY write the final specification based on the implementations BUT defer to the implementations. As usual with IETF related cruft that gets dumped into the wild, the reverse has been done here and this annoying habit results in inevitable problems later on. Again, see the spam in your in-box - you can thank the IETF for that. Tightly-controlled implementations first, specification second.

For the last issue, I give a great, big sigh with gentle facepalm. The authors claim a Windows Powershell solution is coming but that ignores, well, pretty much everything rational. Are they going to support Portable Apache + PHP + Maria DB too? People who develop first for Linux almost always leave cross-platform development as an afterthought and up to other people to resolve because they are too lazy to do the right thing. It's a shameful practice and there should be great amounts of public humiliation heaped on anyone who does it. Windows still dominates the desktop market share, which is where local corporate development boxes live. To choose to ignore the platform users actually work on is just plain stupid. The more critical issue is that supporting only web server software is going to result in headaches when people want it to work for EVERY piece of SSL-enabled software (e-mail servers, chat servers, etc) and supporting just a few products has opened a can of worms they can't close. The Let's Encrypt developers will forever be running around getting nothing of value done.

At the end of the day, Let's Encrypt solves nothing and creates a lot of unnecessary additional problems. It's also a long way off from being viable and there are plenty of legal landmines they have to navigate with extreme care. In short, I'd much rather see a complete replacement for the disaster that is SSL/TLS. Also, people need to stop getting so excited about Let's Encrypt, which simply builds upon fundamentally broken technology.

Saturday, September 12, 2015

GitHub commits publicly reveal your private life

GitHub is a great tool. It enables software developers to work together on open source projects. That's pretty awesome. However, it also unfortunately exposes your personal life to the entire world. It is easy to look at the history log of commits for any given GitHub user and identify their an incredibly creepy level.

Using GitHub histories, an attacker can identify when you are probably awake, asleep, at home, and at work. They can also identify habits such as what days of the week you tend to commit code. As well as what days of the week you never commit code. Which days and months you commit the most code and which days and months you do not as well as the frequency of commits. All of that information can be used to derive your physical location in the world, your religion, your favorite sports team(s), and your relationship status with your significant other (if you are on good terms or not, having sex or not, etc). And possibly your hobbies and general interests.

If you don't think your personal data isn't already being mined for the above, you are quite mistaken. It is.

In my opinion, commit timestamps are a security vulnerability. Let's say an attacker wants to "send a message" to a software developer they don't like. They simply figure out when the person is going to be away from their home, show up, do their thing (tag/graffiti, rob/steal/destroy property, drop a threatening letter, etc), and leave. GitHub commit timestamps provide a wealth of information and, according to the field of statistics, an attacker only needs 35 data points to achieve what is known as "statistical significance". Each commit timestamp is a valuable data point. Therefore, all someone needs is 35 commits to start building a profile. So the attacker may notice that the commit history is devoid of commits during the week Monday through Friday during "normal" business hours when mapped to a specific timezone (i.e. narrowed to a specific region of the world). They can reasonably assume that the target is at some form of a day job. Physical addresses are pretty easy to obtain when the real name is acquired, so the commit history just confirms what is already published information. The lack of commits is information that is just as important as the actual commits.

More data points simply improves the accuracy of the information. Thus, the more frequently you commit, the more information about your personal life that you give away! At some point, with enough commits, everything about your personal life can be determined. Oh, you don't commit code whenever a specific sports team plays a game and it airs on television? You might be a fan of that team because humans are literally incapable of doing two things at once - despite the fact that some people that claim the contrary humans are single-taskers. When you are committing code, you are in front of a computer screen and focused on that singular task. Your favorite TV shows are also possibly able to be determined with GitHub commits simply because you aren't committing code during the time you watch those shows - although if you use Netflix, Hulu, etc., then that information can be a lot harder to determine but you generally won't be committing code while watching any given show.

So how should this be fixed? First off, the entire world doesn't need to see commit timestamps. Timestamps should only be accessible to trusted users and services. I realize timestamps are part of the commit log, so Git itself will have to be changed to accommodate fixing this issue. Second, there should be tight controls over how much timestamp information is disseminated even to trusted users and services. And finally there should be a timestamp privacy (mangling) option to set commit timestamps to specific/random times according to a ruleset that the committer makes (e.g. hardcode all commits to Mondays at random times of the day regardless of the fact the code was committed on Thursday at some specific time of the day).

Wednesday, July 22, 2015

Solving "unresolved external symbol ___report_rangecheckfailure" Visual Studio linker errors

Let's say you import a library from Visual Studio 2012 or later into your project in an older version of Visual Studio (e.g. Visual Studio 2008 or Visual Studio 2010) but now get linker errors:

error LNK2019: unresolved external symbol ___report_rangecheckfailure referenced in function ...
error LNK2001: unresolved external symbol ___report_rangecheckfailure ...

Sad day. Especially since you don't really get a say in how that library is being built. Your options are:

  1. Upgrade your version of Visual Studio. That includes going through the whole project upgrade cycle. We know how well that usually goes.
  2. Recompile the library yourself. Sad day turns into sad week.
  3. Hack it.
The function __report_rangecheckfailure() is called when the /GS compiler option is used. The option enabled buffer overflow security cookie checking, which, in this day and age, is a good option to have enabled. Unfortunately, that causes problems with older versions of Visual Studio. Let's take a look at what the function does - the source code from 'VC\crt\src\gs_report.c' has this code:

// Declare stub for rangecheckfailure, since these occur often enough that the code bloat
// of setting up the parameters hurts performance
__declspec(noreturn) void __cdecl __report_rangecheckfailure(void)
Hmm...not really helpful since it calls another function. However, that function contains this very interesting comment after a lot of inline assembler and macros:

     * Raise the security failure by passing it to the unhandled exception
     * filter and then terminate the process.
So, knowing this normally triggers an unhandled exception and exits the process, we can hack it:

__declspec(noreturn) void __cdecl __report_rangecheckfailure(void)
I'm not sure whether to congratulate myself on this evil hack or cry. I think I'll do a little of both. You're welcome.

Oh, and if you work on the Microsoft Visual Studio development team, please develop a compatibility library that implements stuff correctly for older Visual Studio environments. Doesn't have to go back to 1995 VS6, but something reasonable like a 10 year window that addresses issues like these.

Monday, July 06, 2015

The Death Master File...a blackhat's dream come true

First, watch this CBS 60 Minutes special on the Social Security Administration's Death Master File:

The ultimate hack, from a blackhat/rogue government perspective, is the one that has significant negative impact on the financial stability of a country one can figure out who is responsible.

The Death Master File meets all of the prerequisite criteria:

  • Large quantities of data? Check.
  • Has significant financial consequences for anyone who gets into it? Check.
  • Individuals can't readily find out if they are on the file or not? Check.
  • Relatively easy to add anyone to the file? Probably check (e.g. plop some malware on funeral home computers and get remote access to adding entries to the file).
  • Takes years to get off the file? Check.
  • Has recurrent consequences for the rest of the individual's life? Check.
  • No way to track additions back to the original source? Check.
  • The head honcho at the Social Security Administration doesn't really care about "accidental" additions and only seems to care about paying out too much money? Check.

You really couldn't ask for a more perfect combination. It's pretty shocking when you think about it - zero safeguards, no one seems to care, and it has major repercussions for affected individuals (e.g. homelessness). Destroying the U.S. is quite literally available on an unprotected digital silver platter. There are so many different ways that this could go sideways I'm not really sure where to start other than to write a blog post about it to raise awareness.

As a software developer, the one thing that REALLY irks me is this:

There is a $200 annual subscription fee to access the data and is restricted to government entities and businesses with a need for the data. Individuals can't write a script to watch for the unfortunate event of being added to the list. The list is supposedly a very lucrative source of income, which means that every business out there seems to use it. Sorry, but my tax dollars aren't for NTIS to run an e-commerce store. Data for all or data for none.

The U.S. government is ill-equipped to handle modern threats - writing laws and charging money for access to the data doesn't close blatant security holes. Who was the person who decided to not bother with change tracking in this rather critical database? That's ridiculous and they should be fired and drop-kicked out the door. Also, to simply not care about those people whose lives the Social Security Administration has messed up is rather messed up too. The U.S. has enemies who would love nothing more than to destroy the country. Adding people to the Death Master File seems like a pretty easy way to accomplish such a task.

Saturday, June 20, 2015

How to call select() - the CORRECT way!

There is a TON of broken code out on the Internet with lots of programmers who enter the world of TCP/IP socket development and think they have figured out how to write socket code. They then disseminate their broken code to others who, in turn, disseminate broken code to other people.

One of the most egregious problems plaguing the world of software development today is the use, or abuse, of select(). Today, you are going to contribute to fixing this problem once and for all by reading what I have to say and then ingraining it into your brainz.

There are two types of file descriptors/sockets/what-have-you:

Blocking and non-blocking. Sometimes referred to as synchronous and asynchronous.

If you are using select() on synchronous sockets in your code, you are doing it wrong!

select() is ONLY for asynchronous sockets. Think of it this way: A synchronous socket is you telling the OS that you know the exact order of operations on that socket (e.g. POP3) and are willing to wait until hell freezes over for that read/write operation to complete.

Read that over again and you should come to the same conclusion: Calling select() on a synchronous socket is WRONG. Although, if you've been doing it wrong for decades, this fact becomes a lot harder to accept.

Where does this misunderstanding come from? A lot of people misunderstand select() because the book/teacher/website they learned *NIX Socket Programming from got it wrong because they learned the wrong approach from someone else. select() on a synchronous socket introduces bugs that are hard to trace and happen randomly. Also, most socket programmers start out using synchronous sockets with simple client-side applications and then later want a way to handle multiple sockets at one time. select() is referenced all over the manpages/MSDN and, when the programmer reads about it, it sounds like it will work, so they try it and it seems to work. That's the real problem: select() seems to work, which is why programmers use it improperly.

select()'s purpose in life is to offer a way to not have a busy loop in an asynchronous environment since no read/write operation will ever block. It is entirely possible, if you pass in a synchronous descriptor to select(), that select() will indicate that the socket is readable but when you go to read data, the synchronous socket will block. You might say that can't possibly happen but guess does happen! This is why select() being only for asynchronous sockets makes much more sense. Once you learn this, the code for asynchronous sockets becomes surprisingly cleaner and is only marginally more complex than synchronous socket code. If you ever thought your synchronous socket code using select() was kind of hacky/messy, then you now know why. This is a harsh lesson to learn for many people.

Therefore, to process more than one descriptor per thread at one time, use asynchronous descriptors and select().

The best way to fix this entire problem would be for select() to determine if a descriptor is synchronous and then reject it outright. Such a solution would break every major application and then, lo-and-behold, everyone would fix their code! The world would then have less broken code and we'd all be happier.

Friday, June 19, 2015

Dear WebSocket, 1980 called and wants its text mode back among other things

This is a mostly tongue-in-cheek response to RFC 6455, which defines the WebSocket protocol, which I recently built a client for and can be found in the Ultimate Web Scraper Toolkit. Certain things annoyed me.

Dear WebSocket,

FTP called (RFC 765, circa 1980) and wants its text mode back. Please return it to the nearest Internet Engineering Task Force (IETF) member as soon as possible. You may have shined it up a bit with UTF-8, which was basically designed on a napkin. Of course, Unicode has never had any implementation problems with it whatsoever. Ever.

Your masking key is for clients only and not even being optional for servers defies the core Internet tenet of being liberal with what you accept, strict with what you send. Technically, both are peers and therefore both are clients of each other since you are, after all, bi-directional and the client could easily function as a server after the connection is established. Because this has absolutely never ever been done before.

Your comments about the Origin header only being sent by web browsers enables differentiation from an automated script and a web browser. Or does it? No, it does not. For I can simply send to the target whatever Origin header I want and therefore look like any web browser of my choosing. Your false defenses are no match for my Kung Fu.

Your reserved bits per frame are silly and probably full of arbitrary limitations. From the comfort of my keyboard and screen, I fart in your general direction.

Your required closing packets are unnecessary and most likely a security risk with anyone assuming that such frames will ever be sent. TCP already has built-in shutdown mechanisms. Ping/pong frames are more than sufficient.

Since it is important to always remember the childrens, you get an F--. Enjoy.


The portion of the Internet that actually knows how to design rational communication protocols.