Saturday, May 31, 2008

Get ready to spew your morning coffee...

...onto your computer monitor and keyboard. Literally.

Following hot on the heels of the "Perl is a terrible language" post comes the reason why I had to re-learn Perl in the first place.

I just barely released MyProBB 2.3. It took nearly two months to get this release ready. About 1.4 of those two months went into writing a single plugin for the forum. What follows is a shamelessly copied portion of the MyProBB 2.3 announcement post:

[This plugin is] perhaps the best little gem to hit the Internet since AJAX-driven websites, er, Google...

The Official Instant Message plugin. Yup. That's right. I single-handedly hold the distinction of having the first web software package that sends Instant Messages. I hold the distinction of being the first to send IMs to five major IM networks from a web forum. I also hold the distinction of having the only web forum software package that sends real Instant Messages. I even hold the distinctions of having the most advanced Pidgin plugin using the Perl plugin for Pidgin, the first Perl plugin for Pidgin that contacts the Internet, and the first ever multi-protocol automated IM web-driven bot.

That's a lot of firsts. I've done something no one else on this planet has done. It took a lot of sweat, tears, and one-and-a-half months to get here, but I did it.

Despite making the plugin as simple as possible, the plugin itself has over five pages of documentation. However, while it does take a bit to set up, it really does send IMs. To five different networks: AOL IM (AIM), MSN Messenger, Yahoo Messenger, ICQ, and Jabber/XMPP.

There are three parts to the plugin: The 'User Profile' part (what the user sees), the Management Interface part (handles requests), and the Pidgin part.

The first two parts are straightforward (more or less). The Pidgin part is not. Pidgin is a multi-protocol IM chat client that has Perl scripting support for writing plugins (it also supposedly supports Tcl plugins, but I don't know Tcl). I looked at the documentation and knew it would take a while, but a Perl plugin for Pidgin was the best route: it would let users run the plugin with Pidgin on any OS that Pidgin supports (which is all the major OSes).

My first step was to get Pidgin installed with the Perl plugin. That took several days. (The instructions cut through the awful steps I went through). Then I had to re-learn Perl. For the zillionth time. I hate Perl. Once I figured that out, I had to download the source code to Pidgin and learn how that worked. I was digging around in the C code far more than I cared to just to figure out what in the world was going on. I ran into so many ridiculous bugs (fully documented in the plugin source), I ended up writing all sorts of cheesy hacks to get the darn thing to not do weird things. Then I had to wait for Pidgin 2.4.2 to be released because they were making changes to the Perl part of the plugin, which resulted in more hacks. Then I had to figure out how to communicate between the Perl and PHP parts of the plugin over the Internet in a bandwidth-friendly fashion. And then I ran into rate limit and size limit issues of sending messages over IM. By the time I figured everything out, a whole month-and-a-half had passed.

Sorry for the rant. But hopefully that thoroughly explains the reason for the delayed release. Instant Message support for a web forum: A very cool feature. And it totally abuses someone else's software application (Pidgin).

And now here is why you should spew your morning coffee onto your computer screen: The Instant Message plugin significantly lowers the bar to creating IM spam bots for every protocol Pidgin supports. Pidgin supports every major IM protocol on the planet (and a few minor ones).

I wrote the plugin such that it ties into MyProBB and requires verification, but I've basically done the hard work of figuring out how to turn innocent, sweet Pidgin into an evil IM spam bot. My purpose is for good. Someone else's purpose will be for awesome.

If you were holding it in, you can spew your coffee now.
Feel better? Good.

Every new innovation is a double-edged sword...

Microwave ovens can be used to cook food or they can be rigged to stop air-to-ground missiles from homing in on and hitting your very expensive radar dish that the enemy wants to take out. The missiles take out the microwave ovens instead. Fun.

Perl is a terrible language

Every time I go to use Perl, I end up having to re-learn it entirely from the ground up. That is how bad the language is. Most languages I can come back and look at some code and say, "Oh, I remember what that does." Not Perl. Perl is the only language I've ever used that I come back to the code and say, "Huh? What in the world did I do there?" And then as I read yet another Perl tutorial (how many do we really need?) to re-learn Perl for the zillionth time (okay, more like 25 times), I say, "Good grief. This language is terrible."

Plus Perl has these weirdly named modules and, by default, sticks everything, and I mean everything, in the global namespace. Including variables you define inside a function, unless you remember to declare them with 'my'. Perl is the only language I know of where, if you forget to use the word 'my' before using a variable for the first time, it royally messes up the entire script execution and takes hours to diagnose (unless 'use strict' is turned on to catch it).

Additionally, every last Perl module reeks to high heaven of bad design. An example of a poorly designed Perl module is LWP. Whoever named it that should be soundly beaten. Not that I'm advocating violence. Microsoft would have named it Internet. Or Web. But, no, it has to be an acronym for "libwww-perl". Totally obvious. Plus the module itself is poorly written for the average programmer. What, precisely, does the average programmer want to do with a web module? Well, probably access the web and, almost always, a webpage sitting on a web server. So, someone created LWP::Simple. That is great if all you want to do is run GET requests all day, but if you want to do a simple POST request? Not allowed - you have to go use the full-blown LWP library for that.

Here's a wonderful little function that does a POST request LWP::Simple style:

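use LWP::UserAgent;
use HTTP::Request;

sub LWP_Simple_Post {
  my ($URL, $Content) = @_;
  my $Request = HTTP::Request->new(POST => $URL);
  # Send $Content as a form-encoded POST body.
  $Request->content_type('application/x-www-form-urlencoded');
  $Request->content($Content);

  my $UserAgent = LWP::UserAgent->new();
  my $Response = $UserAgent->request($Request);
  if ($Response->is_success()) {
    return $Response->content;
  }

  return undef;
}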

Now, really, how hard would that be to have quickly thrown together in LWP::Simple? Probably not even five minutes. Perl programmers are clearly lazy. If you write a module for CPAN, please make it versatile enough to not be so incredibly painful to use. Thank you.

I swear that only bad programmers actually use Perl and swear by it. That statement includes you, Larry Wall, Mr. high-and-mighty creator of the second-worst popular language on the planet. Only COBOL holds the dishonor of being worse than Perl.

PHP 5 is an infinitely better scripting language. I've settled on PHP for major web development projects and C++ for client-side projects. It is a pain to have to know more than that.

And now a word from our sponsors...

Oh. Wait. I don't have sponsors. Never mind.

I feel dirty after having to learn Perl again. I'll just go wash my brain or something.

Saturday, May 24, 2008

Sins of improper website operation

People who run websites are generally clueless about securing their websites. What follows is a short list of "seven sins" (I know...a cliché) that are committed by those who operate a website that uses a dynamic backend scripting language (PHP, Perl, etc.):

1) Installing third-party components without first reviewing them for how well-written they are, if they have had major security vulnerabilities and/or exploits in the past 12 months, how well defended against automated scripts they are, and how well each component defends itself from known and unknown exploits. If you don't know how to do this, then employ the services of a security expert.

2) Not upgrading components the same day an upgrade becomes publicly available. PHP, MySQL, third-party components, etc. Most major releases include fixes for security vulnerabilities. And exploits for those vulnerabilities are likely already floating around in the wild. Security firms track both the vulnerability and the exploit and will themselves release the exploit a couple of weeks after a fix becomes available (a couple of disreputable firms will release the exploit within 24 hours). Tracking upgrades is easy - use a website monitor program to watch for changes to the project's site - or sign up for e-mail notifications.

3) Choosing weak passwords for users with access to administrative interfaces (i.e. those with a greater than "generic user" status). A proper administrative password is at least 35 characters in length and completely random, with care taken to ensure a cryptographically secure random number generator is used (i.e. not an ordinary, predictable PRNG).

4) Writing PHP code without considering the security implications. The usual trouble spots are file manipulation, file upload handling, assuming that anyone who can reach a specific portion of code or script is allowed to execute it, and cross-site scripting vulnerabilities.

5) Writing code that interacts with MySQL (or another database) without considering the security implications. The usual results are SQL injection, someone gaining access to a higher privilege level than allowed, and theft of the data in the database, which is usually critical to the business. If you lost your database data today due to someone stealing it, would you go out of business?

6) Not setting up firewall rules that block ports to anyone but authorized personnel. Blocking access to server resources is critical. If your web host control panel doesn't allow you to restrict access to server resources (e.g. FTP) on an IP address basis (e.g. restrict FTP access to one IP address), then you need to change hosts.

7) Not logging administrative activities somewhere. If something goes horribly wrong, and it can happen to anyone, a log can quickly tell you what the damage is and what needs to happen to fix it. If you don't have a log of all administrative level activities, then you are left to guess. Reinstalling a whole website after a hacker has done their thing can take weeks.
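
Sins 3 and 5 are the easiest ones to show in code. What follows is only a rough sketch, written in Python with the standard sqlite3 module standing in for PHP and MySQL. The function names, the 'users' table, and the choice of allowed password characters are my own assumptions for illustration, not anything a particular product requires.

```python
import secrets
import sqlite3
import string

# Assumption: letters, digits, and punctuation are all acceptable password characters.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_admin_password(length=35):
    """Build a random password using the operating system's CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def find_user(conn, username):
    """Look up a user with a parameterized query.

    The user-supplied value is passed separately from the SQL text,
    so it cannot change the structure of the statement.
    """
    cursor = conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (username,)
    )
    return cursor.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES (?)", ("admin",))

    print(generate_admin_password())
    print(find_user(conn, "admin"))         # the one real user
    print(find_user(conn, "x' OR '1'='1"))  # injection attempt is treated as a plain name
```

The same two ideas carry over to other stacks: take randomness from the operating system's secure generator, and hand user input to the database as a bound parameter rather than gluing it into the query string.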

Well, hopefully this will improve the security of some websites and improve a few clueless website operators.

Monday, May 12, 2008

The day of the first mandated rolling blackout

Fossil fuels are important to programmers. We use computers, which rely on electricity, which relies on power plants, which rely on transportation, which relies on fossil fuels. The day we run out of fossil fuels is the first day you won't be able to turn on your computer. Or much of anything else. I try to avoid doom-and-gloom in general but this is something that has been on my mind for a while. Basically, if we do nothing, all we will be able to say is, "Well, it was fun while it lasted." So, what should we do? Let's start with automobiles, the biggest consumer of fossil fuel/oil. I'm going to go for a myth vs. fact approach here.

Myth: We have many, many years left before we run out of fossil fuels.
Fact: Nope. We've got maybe 30 years left. If even that. Some very conservative figures state 15 years before the first mandated rolling blackout. 30 years would be entertaining for sure - the magical year "2038" should ring some bells. You know - when signed 32-bit Unix clocks overflow and wrap around to 1901. Computers that work tend to not get replaced. Some line of code might be, "if (lasttime > time()) LaunchNuclearMissile();" thrown in perhaps as a dummy line that some engineer thought would be funny. Two disasters in one year would be...hilarious.

Myth: Oil is unlimited.
Fact: When I found out about this completely off the wall idea that oil is unlimited, I laughed. Apparently the Cassini-Huygens mission dropped a probe onto Titan (Huygens) and one instrument, called GCMS, collected data that suggests the methane in that methane-rich environment is not biological in origin (its carbon isotope ratio lacks the Carbon-12 enrichment that living things leave behind). People then extrapolated from those scientific observations that oil is not necessarily based on fossils as previously assumed - only they conveniently left out the word "necessarily". Then other people (not scientists) further extrapolated that oil is infinite. Then the conspiracy theorists decided to have their own take and declare a coverup: Big Oil knew about this all along, there is actually an infinite supply of oil, and they are merely taking advantage to make a huge profit. All scientists know is that the methane on Titan comes from below the surface of that moon. They never made ANY claims that oil on earth is infinite. All they were doing was raising the possibility that hydrocarbons do not have to come from fossils. Sigh. Leave it to people's wild imaginations... Anyway, we probably won't step away from calling oil "fossil fuel" any time soon, since that phrase is so ingrained in us.

Myth: Biomass fuels (biofuels) based on corn are the answer.
Fact: Nope. Video 1. Video 2. Video 3. Video 2 - gotta love politicians in the hotseat. Producing a gallon of viable biofuel takes as much energy as that gallon delivers, if not more. It causes more pollution, not less, due to lower burning efficiency (a problem I see resolving itself more or less in time - being a relatively new thing). But the food issue is the main problem. We don't have enough food to begin with and farming is unfortunately viewed as a menial and dirty task by us city-slickers.

Myth: Hydrogen cars are the answer.
Fact: Nope. While this is currently hip and cool and trendy, there are so many obstacles to overcome. Pure hydrogen is hard to come by. We know that a water molecule contains two hydrogen atoms and one oxygen atom. However, separating a water molecule into its component pieces takes a LOT of energy. The most common method for getting water to separate is electrolysis. And the fuel cells in hydrogen cars, like many electrolyzers, typically rely on a platinum-based catalyst. Platinum is an expensive and rare metal. Storage of extracted hydrogen (to keep it from bonding to something else) is also an issue but someone seems to have come up with an interesting solution using titanium disilicide, sunlight, darkness, and heat. But anything viable there is a long way off from reality.

Myth: Electric cars are the answer.
Fact: Possibly. But the electric car unfortunately had an untimely demise. Stuff that has an untimely demise, typically caused by an idiot who claims they have the solution and can't reproduce it, usually takes decades before anyone seriously bothers again. The main problems with electric cars are range and the fact that they cause the same amount of pollution as gas-powered vehicles, or more (the pollution is just centralized at the power plant).

Myth: Wind power is the answer.
Fact: Er. Where are you sticking the windmill? On the roof of the vehicle? I don't know why I bother sometimes.

Politicians are great. They have a one-track mind that believes anything they are told to believe. The problem with energy is that you need a multi-pronged approach to make it work efficiently and in an environmentally friendly way. People love the idea of a "silver bullet" solution, but you would think that we would have already figured out that just isn't how the world works.

Let me now move on to alternate sources of electric power. Again, myth vs. fact seems appropriate.

Myth: Geothermal plants are permanent.
Fact: Geothermal is a good idea but hardly permanent. Geothermal plants are heavily dependent upon the stability of the crust. A single, small earthquake could easily and permanently shut down a geothermal vent. Additionally, forcing the earth to provide geothermal power over the long-term has unknown effects. Geothermal plants are also required to stay up and running 24/7.

Myth: Hydro/Tidal/Wave power is a good idea.
Fact: Yes and no. The latter two typically sit in salt water, which is highly corrosive and will require significant maintenance. The former is typically a dam which can have undesired consequences both upstream and downstream in the long-term. But moving water does create significant amounts of energy.

Myth: Wind power is weak and towers are ugly.
Fact: Well, it depends on the area of the world and where the turbine gets placed. Most turbines are placed well away from residences and businesses, which makes getting power to the destination difficult. For the ugliness, I recommend the engineers at Apple, Inc. You know - the people who came up with the iPod, the iMac. So we need the iTurbine. Hardware engineers are generally not interested in aesthetics. They just want to get it working, whatever it is. While Apple is at it, they should make the iTower to get rid of unsightly cellular towers.

Myth: Off-grid solar is the answer.
Fact: A lot of people are making a big deal about going completely off-grid. Going off-grid supposedly means no more bills from your utility company. The downside is that, while it is touted as paying for itself in 20 years, what the proponents fail to mention is battery replacement. The energy collected during the day has to be stored somewhere to last the night.

Myth: On-grid solar power is cheaper.
Fact: Well, people still have to be paid. And setting up a solar operation is not cheap. And there is the issue of how to store the power, but the power company can store massive amounts of energy and shove it around the grid as needed. And the big ol' power lines still have to be maintained. It may not be cheaper, per se, but it is more eco-friendly. Usually. Some solar plants have emissions but they are far lower than those of most power plants.

Myth: Nuclear plants blow up and irradiate stuff.
Fact: That hasn't happened for a while. Plants blowing up is mostly the stuff of movies. It is true, though, that we haven't figured out what to do with all the waste material. Perhaps we shouldn't use nuclear fuel for power plants.

Okay, those are the basic myths I could think of off the top of my head. Wikipedia has a lot of information on all this stuff since it is a hot topic.

The world sees the problem either as one issue or two issues: Stationary and mobile energy. I see a third problem: Localized energy. A viable solution for clean, renewable energy is going to be able to combat all three problems.

Stationary energy is "solved" using power plants. Power plants have the ability to store and move massive amounts of energy across long distances. In terms of renewable energy, there are really only two options - wind and solar. Of those, solar is the more consistent. The amount of sun isn't predictable but day and night are quite consistent. Wikipedia has a visual map of how much centralized solar is needed to power the world's current needs.

Mobile energy is harder to "solve". The current hoopla is over E85 (ethanol 85%, gas 15%). Ethanol in the U.S. is based on corn/maize, which is a primary food source for livestock. Corn-based ethanol is less energy efficient, only fractionally cuts down on emissions, consumes much more farmland, and costs more to make. The problem is that the ethanol is based on corn. Corn has to be converted to sugar before it can be processed. Brazil, on the other hand, also makes ethanol but the basis of their ethanol is sugar cane. The end result is that Brazil's sugar cane ethanol is significantly more cost-effective and efficient than corn ethanol. Why not import? Well, there is an import tax that currently makes it more expensive to import sugar cane ethanol than to grow corn. The difference is that sugar cane is not essential to life. We can do just fine without refined sugar in our diets. We can't grow sugar cane in the U.S. very well or at least not in sufficient quantity. We could annex Brazil but the world might not be too happy with us.

But there are still emissions. Electric cars would be better. But only if an emissions-free source (i.e. power plant) was used and batteries were eco-friendly as well instead of leaking toxins all over the place.

And now I come to the fun part: Localized energy. All the current solutions (pun intended) try to integrate with existing systems. Remember how I said earlier that it takes decades to return to a "blunder" (untimely demise) and try again? The famous inventor Thomas Edison actually had his own blunder back in the day. When electricity was first being distributed, it came in D.C. form (Direct Current). The A.C. form (Alternating Current) that we use almost exclusively today had not caught on yet. D.C. was big, expensive, and only traveled short distances. A.C. was able to travel longer distances and therefore required fewer power plants and therefore electricity was cheaper. A.C. is great for high-power applications such as refrigerators, freezers, central air conditioning, furnaces, washers, dryers, etc. You get the idea. What A.C. is NOT great for is small-power applications such as computers, laptops, iPods, cell phones and pretty much anything else with near 100% electronic components inside. These items are usually combined with the infamous block...the power adapter. Power adapters keep converting A.C. to D.C., and wasting power, even when there is nothing physically connected to the other side of the unit (I assume that is the case since I can touch the unit and it is warm/hot).

Now, why the history lesson? Well, solar power for the home comes in as D.C. The panels that people advocate "require" an inverter to convert D.C. to A.C. A lot of energy is lost in that continual conversion process. If all you are going to do is convert that energy back from A.C. to D.C. with a power adapter (along with more wasted energy) then why bother with the conversion at all? I hereby propose the reinstatement of the D.C. wall socket, connected to localized solar power, to power your tiny devices. Then, combine that with centralized solar power plants for powering the big-ticket items. If we're going to stop wasting energy then we need to stop wasting it on silly things like needless power conversions.
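
To put a rough number on the double conversion, here is a back-of-the-envelope sketch. The efficiency figures are hypothetical placeholders that I picked purely for illustration; real inverters and adapters vary, and even a D.C. socket would need some voltage regulation with losses of its own.

```python
def chained_efficiency(*stages):
    """Multiply per-stage efficiencies (each between 0 and 1) into one overall figure."""
    total = 1.0
    for efficiency in stages:
        if not 0.0 <= efficiency <= 1.0:
            raise ValueError("each efficiency must be between 0 and 1")
        total *= efficiency
    return total


# Hypothetical figures, for illustration only (not measurements).
INVERTER_DC_TO_AC = 0.90  # solar inverter feeding the house wiring
ADAPTER_AC_TO_DC = 0.80   # wall adapter feeding a small device

if __name__ == "__main__":
    overall = chained_efficiency(INVERTER_DC_TO_AC, ADAPTER_AC_TO_DC)
    print(f"panel -> inverter -> adapter -> device delivers {overall:.0%} of the panel output")
    print(f"lost along the way: {1 - overall:.0%}")
```

The point is just that losses multiply: every extra conversion stage takes its cut of whatever the previous stage let through.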

Additionally, instead of wasting fuel to power things like the radio, clock, etc. in a vehicle, slap some decent solar cells on the roof and power them that way instead. Again, localized solar power for devices that make sense.

Just some stuff to think about as we code our way to 2038.
Launch that missile!
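
For anyone wondering what the 2038 rollover looks like, here is a small sketch. It simulates a signed 32-bit seconds counter in Python (the helper names are mine) and shows why a check like "lasttime > time()" would suddenly become true.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit counter can hold


def wrap_int32(value):
    """Reduce an integer to the range of a signed 32-bit counter."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds):
    """Convert seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


if __name__ == "__main__":
    last_time = INT32_MAX               # the final second that still fits
    now = wrap_int32(last_time + 1)     # one tick later, the counter wraps

    print("last good second:", to_utc(last_time))
    print("one second later:", to_utc(now))
    print("lasttime > time()?", last_time > now)
```

The last good second lands in January 2038 and the very next tick lands in December 1901, so any code comparing an old timestamp against the current time sees time running backwards.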

Sunday, May 04, 2008

Google CAPTCHA broken

CAPTCHAs are those annoying little images that we have to use now to stop spammers from creating free e-mail accounts on the Internet. GMail, for several years now, has been considered a "safe haven" free e-mail address site where only manual signups were possible. This was made possible via their own homegrown CAPTCHA technology. But now it has been broken:

Article on The Register

Google actually uses its own CAPTCHA technology across multiple sites. For instance, Blogger requires solving a CAPTCHA to post comments on blog entries or to create an account, and that CAPTCHA is, unfortunately, the Google CAPTCHA.

What triggered this post is something I heard combined with a recent comment on an older blog posting.

At the top of every blog on Blogger is a little "Flag This Blog" button. I suspect that if enough people click that, it causes Blogger to declare the blog spam. Or at least it factors in. Other factors could include sudden bursts in traffic and maybe actual analysis of the blog entries themselves. I'm assuming the person has to be logged in to use the button, which means they have to have used Google CAPTCHA. If they don't have to be logged in, this paragraph will make me look really silly.

At any rate, what I want to talk about is when to use third-party components instead of rolling your own. The authors of the original CAPTCHA have a pretty good idea of how spammers think and operate and have created reCAPTCHA. Programmers tend to think this way: "I'm going to reinvent the wheel regardless of what is out there already." Google has a lot of people who think that the world will end if they don't build it themselves.

When I was looking to implement a CAPTCHA plugin for MyProBB, I went looking around for the best CAPTCHA I could find first. reCAPTCHA quickly came to the top of my list. It is secure (public/private keys), the concepts seem fairly sound, they are the creators of the original CAPTCHA (so they know what they are doing), it helps make the Internet a better place (reads books), it has lots of features, it looks good, it has the right amount of visual "pop", it is free (nice plus but not important), and it is hosted on mostly neutral territory (an educational institution vs. a corporation). Plus, implementing it is super easy. The end result is that the reCAPTCHA plugin for MyProBB is the simplest plugin for MyProBB in terms of complexity - it makes a pretty good example plugin to learn from if you want to make new plugins for MyProBB.

Google programmers could learn a lot from me. Doing it yourself is not always a brilliant decision. Obviously, if Google relied on the reCAPTCHA servers, that would be a bad idea (someone would take down those servers real fast) but Google might be able to license/buy the back-end source code for a hefty chunk of change.