Let's talk about PHP. The scripting language, not the health insurance. PHP is, in my opinion, one of the greatest development tools ever created. It didn't start out that way, which is where most of its bad rap comes from, but it has transformed over the past decade into something worth using for any size project (and people do!). More specifically, I've personally found PHP to be an excellent prototyping and command-line scripting tool. I don't generally have to fire up Visual Studio to do complex things because I have access to a powerful cross-platform capable toolset at my fingertips. It's the perfect language for prototyping useful ideas without being forced into a box.
BUT! Some people WANT to force everyone into a box. Their box. Introducing the PHP-Framework Interop Group or PHP-FIG. A very professional-sounding group of people. They are best known as the folks who produce documents called PHP Standard Recommendations aka PSRs. This group of 20 or so people from a wide range of very popular projects have gotten together to try to work out some of the problems they have encountered when working with PHP. Their goal is simple:
"The idea behind the group is for project representatives to talk about the commonalities between our projects and find ways we can work together. Our main audience is each other, but we’re very aware that the rest of the PHP community is watching. If other folks want to adopt what we’re doing they are welcome to do so, but that is not the aim. Nobody in the group wants to tell you, as a programmer, how to build your application."
No, "We'll just let everyone else tell you how to build your application." At least that's the implication and it certainly is what seems to be happening.
There's nothing wrong with having Standards. In fact, I'm a strong advocate of them. What I'm NOT an advocate of is being told that my code has to be written a specific way by clueless people who blindly follow PHP-FIG PSRs without understanding where they are coming from. The worst offender is basically everyone in the Composer camp. In software development, the more dependencies you have, the more likely it is that your project will break in spectacular ways. And, as we all know, everything breaks at the most inopportune times. Composer takes that concept to its extreme conclusion and introduces the maximum number of dependencies into your software project all at once. No thank you very much. Correct software development attempts to reduce dependencies to the bare minimum to avoid costly breakages.
Composer exists because PSRs and lazy programmers who don't know how to develop software exist. PSRs exist because PHP-FIG exists.
The worst PSR in PHP-FIG is PSR-4, the successor to the now-deprecated PSR-0: The autoloader. As hinted by the zero (0) in "PSR-0", it was the first accepted "Standard" by PHP-FIG - and I use the word Standard loosely here. The concept of the autoloader stems from a very broken portion of PHP known as a namespace. In most normal programming languages that implement namespaces, the idea is to isolate a set of classes or functions so they won't conflict with other classes and functions that share the same name. Then the application developer can choose to 'import' (or 'use') the namespace into their project and the code compiler takes care of the rest at compile-time - all the classes and functions of the whole namespace become immediately available to the code.
That sounds great! So what could possibly go wrong?
In PHP, however, namespaces were only halfway implemented. PHP developers have to declare, up front, each class they want to 'use' from a namespace to simplify later code AND manually load each file that contains the appropriate class. This, of course, created a problem - how to get the files to load that contain the code for the class without writing a zillion 'require_once' lines? Instead of correctly implementing namespaces and coming up with a sane solution, a hack known as __autoload() was developed, which later became a formalized hack known as spl_autoload_register(). I call it a hack because the autoloader is effectively an exception handler for a traditional code compiler - something no one in their right mind would ever write. With an autoloader, at the very last moment before PHP would throw up an error about a missing class, the autoloader catches the exception and tells PHP, "Oh never mind about that, I got it." Thinking about all of the backend plumbing required to make THAT nonsense happen (instead of correctly implementing namespaces in PHP) makes my head hurt.
Exception handlers, when written correctly, do nothing except report the exception upstream and then bail out as fast as possible from the application. Exceptions happen when an unrecoverable error condition occurs. Good developers don't try to recover from an exception because they realize they are in a fatal, unrecoverable position. (This is why Java is fundamentally broken as a language and a certain company that shall not be named made many terrible decisions to ultimately select Java as their language of choice for a certain popular platform that shall also not be named.)
Instead of fixing the actual problem (i.e. broken namespace support), we PHP userland developers get the autoloader (i.e. a hack). Composer and its ilk then build upon the broken autoloader concept to create a much larger, long-term disaster: Shattered libraries that have dependencies on project management tools that someone may or may not want to use (Hint: I don't) and dependencies on broken implementations of certain programming concepts that should be fixed (i.e. PHP namespaces. By the way, don't use things that are broken until they have been fixed - otherwise you end up with hacks).
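For anyone who hasn't written one, here is a minimal sketch of the autoloader mechanism being discussed. It is not Composer's implementation, just the general shape of a prefix-to-directory autoloader. The "App\" prefix, the src/ directory, and the app_class_to_path() helper are all made-up names for illustration.

```php
<?php
// Hypothetical helper: map a fully qualified class name to a file path.
// The prefix and base directory are assumptions made for this sketch.
function app_class_to_path(string $class, string $prefix, string $basedir): ?string
{
    // Ignore classes that belong to some other namespace prefix.
    if (strncmp($class, $prefix, strlen($prefix)) !== 0)  return null;

    // Turn namespace separators in the remainder into directory separators.
    $relative = substr($class, strlen($prefix));

    return rtrim($basedir, "/") . "/" . str_replace("\\", "/", $relative) . ".php";
}

// PHP calls this closure when code references a class that is not yet defined.
spl_autoload_register(function (string $class) {
    $path = app_class_to_path($class, "App\\", __DIR__ . "/src");

    // Load the file only if it exists. Otherwise PHP reports the missing class as usual.
    if ($path !== null && is_file($path))  require $path;
});
```

With this registered, something like `new App\Http\Client()` would cause PHP to look for src/Http/Client.php at the moment the class is first referenced, which is exactly the last-second lookup described above.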
Another problem lies in the zillions of little files that PHP-FIG PSRs have directly resulted in (e.g. insane rules like "every class MUST be in its own file"), which result in huge increases in upload times to servers over protocols like SFTP (and FTP). What is known as a "standalone build" is pretty rare to see these days. A standalone build takes all of the relevant files in a project and merges them into one file. A good standalone build tool also allows users to customize what they receive so the file doesn't end up having more bloat than what they actually need.
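To make the standalone build idea concrete, here is a rough sketch of a merge step. It is not any particular build tool. The build_standalone() function is a hypothetical name, and the sketch assumes every input file starts with an opening PHP tag, contains only declarations, and declares no namespace.

```php
<?php
// Merge several class files into one "standalone" file (a sketch, not a real tool).
// Assumes each input starts with an opening PHP tag, holds only declarations,
// and declares no namespace.
function build_standalone(array $files): string
{
    $result = "<?php\n";

    foreach ($files as $file)
    {
        $code = trim(file_get_contents($file));

        // Drop the opening tag and an optional closing tag.
        if (substr($code, 0, 5) === "<?php")  $code = substr($code, 5);
        if (substr($code, -2) === "?>")  $code = substr($code, 0, -2);

        // Note where each chunk came from, then append it.
        $result .= "\n// Source: " . basename($file) . "\n" . trim($code) . "\n";
    }

    return $result;
}
```

Letting the user pick which files go into the `$files` array is the customization step: only the classes they actually need end up in the output.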
Last, but hardly least, the one-class-per-file rule and the autoloader concept directly violate operating system design. File I/O is extremely expensive (time-wise) compared to most operations in a computer. You want to load as few files as possible when loading an application. Every time a file is requested, PHP first checks the opcode cache (if enabled) to see if it has loaded/parsed the file recently. If it has, it may make a request to get the timestamp of the last modification. Again, PHP tends to cache this information in RAM. I wonder why that might possibly be the case? Perhaps it has a little something to do with the fact that opening a real file on the filesystem and reading data off disk is one of the most expensive system calls around. But, nah, that couldn't possibly be the reason. Oh wait! It is the reason! Reading 1,000 tiny files off disk is always going to be way more costly than reading a single file off disk.
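If you want to see the difference on your own machine, a quick and dirty timing sketch follows. The compare_read_cost() function and the file names are made up for this example. Because the files were just written, they will almost certainly be served from the OS cache, so this mostly measures per-file open/read/close overhead rather than raw disk speed. Numbers will vary by system, so treat it as a rough experiment and not a benchmark.

```php
<?php
// Create $count tiny files plus one merged file holding the same bytes,
// then time reading each set. All names here are hypothetical.
function compare_read_cost(int $count = 1000): array
{
    $dir = sys_get_temp_dir() . "/io_sketch_" . uniqid();
    mkdir($dir);

    $chunk = str_repeat("x", 512);
    for ($x = 0; $x < $count; $x++)  file_put_contents($dir . "/" . $x . ".txt", $chunk);
    file_put_contents($dir . "/merged.txt", str_repeat($chunk, $count));

    // Many small reads: one open/read/close cycle per file.
    $start = microtime(true);
    $manybytes = 0;
    for ($x = 0; $x < $count; $x++)  $manybytes += strlen(file_get_contents($dir . "/" . $x . ".txt"));
    $manysecs = microtime(true) - $start;

    // One large read: a single open/read/close cycle.
    $start = microtime(true);
    $onebytes = strlen(file_get_contents($dir . "/merged.txt"));
    $onesecs = microtime(true) - $start;

    // Clean up the temporary files.
    for ($x = 0; $x < $count; $x++)  unlink($dir . "/" . $x . ".txt");
    unlink($dir . "/merged.txt");
    rmdir($dir);

    return array(
        "many_bytes" => $manybytes, "many_secs" => $manysecs,
        "one_bytes" => $onebytes, "one_secs" => $onesecs
    );
}

$result = compare_read_cost(1000);
printf("1000 files: %.6f s, 1 file: %.6f s\n", $result["many_secs"], $result["one_secs"]);
```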
Congratulations, PHP-FIG (and anyone else who designs package dependency managers like Composer): You've successfully exchanged one problem (i.e. "poorly written classes") for a different problem (i.e. "poorly written classes spanning hundreds to thousands of files with massive, sprawling, unnecessary dependencies that take forever to upload and read off disk and now rely on broken-by-design non-features of PHP").
Update June 2017: I've created and released a powerful new tool called Decomposer. It basically does what you might imagine it would do in about as tongue-in-cheek a way as you might expect.
Update April 2, 2022: This post is still relevant. And in the intervening years, we've seen several popular package authors intentionally turn their dependent packages into malware. So not only is all of the above still valid, you now don't know which software authors will go rogue. The assumption that all software developers follow an ethical/moral code is false. This is yet another reason why automatic dependencies are so bad and why each component in an application should be carefully chosen and carefully vetted. Better yet, don't use package management for software development in the first place.
Since writing this rather ranty post (hey, it's a blog!), I have added three fully automated repositories to GitHub:
https://github.com/cubiclesoft/php-libs
https://github.com/cubiclesoft/php-libs-namespaced
https://github.com/cubiclesoft/php-libs-to-composer
These exist to deal with the occasional person who asks for Composer-enabled versions of my libraries. It is possible to take any flat class-based solution and convert it to (broken - as per this article) namespaces and then transform the namespaced version into a Composer-compatible version, which then automatically gets pushed out to Packagist (which is, BTW, a terrible name for a website). Everyone wins: Same working code, multiple flavors, all automated and I get to generally ignore Composer as should everyone else.