Monday, August 29, 2005

Google has a nifty feature.

If anyone can figure out how Google is selecting what links appear on certain searches, I would really like to know.

Here is what I'm referring to:

http://www.google.com/search?hl=en&q=cnet
http://www.google.com/search?hl=en&q=yahoo
http://www.google.com/search?hl=en&q=microsoft

Each of those three searches returns four select links to sections of each company's website. What I want to know is how Google decides which links come up. This is possibly the biggest feature they have put together in a long time. It seems to me that anyone could benefit from it, but the rules seem sketchy at best. It looks like different subdomains are favored, but then why is CNet inconsistent? It also appears that each displayed link is limited to 15 characters.

I've not cared much for SEO until now. This, however, piques my interest because it potentially allows searchers to quickly find what they are looking for when they know your company or website name but not the target product name. Or they know the product name and you want to show related products (e.g. popular plug-ins for the product) right inside Google. People tend to get lost on websites after finding them with Google - it makes sense to help them out as much as possible while they are still on Google.

Unfortunately, there will be those idiots out there who will try to abuse the feature, which means Google will probably be forced to lock it down or remove it. Those people should not be allowed to touch a computer, let alone have Internet access. That would save the rest of us from the usual migraines and wasted time.

Saturday, August 20, 2005

Misinformation

Misinformation is abundant. For instance, I needed a read/write locking mechanism for Windows, went looking for development tips, and ran across this as one of the top ten results from Google:

http://www.joecheng.com/blog/entries/Writinganinter-processRea.html

That's all fine and dandy...if you want application performance to, to use a technical term, suck. Performance really goes down the tubes if 10 threads want to write all at once: every thread wanting to write has to line up, and then 1000+ synchronization primitives have to be acquired before the last one can finish writing.

If you are a software developer with less than 5 years of experience and don't know the first thing about a topic, please, please, please do NOT blog on it, do not write an article, do not visit a forum declaring knowledge of it, do not pass GO, and do not collect $200. (If anything, you owe the world a formal apology and you owe me $200 for wasting my time.) Google and other search engines will pick up your misinformation, spider it, cache it, and spread it to the uninformed population, who in turn will think it is the answer to their question and blindly redistribute it in an equivalent format - or worse, link to the site with the misinformation - giving it a higher Google ranking.

Some people wonder why applications keep getting slower even though more CPU and RAM are available. This is why. Misinformation is quite abundant and, while there are millions of programmers out there, there are few people knowledgeable enough to understand what is wrong, why it is wrong, how it should have been done in the first place, and then actually accomplish it in a reasonable amount of time. Over time, misinformation builds on itself, and now we have programmers who think they are real software developers because they read books that give them warm fuzzies as they pop the bundled source-code CD into the drive.

For those interested, the solution to the problem of read/write locking involves using a mutex, two events, and four integers that track:

1) How many read locks there are.
2) How many threads want a read lock.
3) Whether or not a write lock is currently held.
4) How many threads want a write lock.
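For concreteness, here is a rough sketch of that state in C. The Win32 calls (CreateMutex, CreateEvent) are the real APIs; the struct and field names are just illustrative shorthand, not my actual code:

#include <windows.h>

/* Shared state for the read/write lock. The four counters are only
   touched while holding hMutex. */
typedef struct {
    HANDLE hMutex;       /* guards the counters below */
    HANDLE hReadEvent;   /* manual-reset: "readers may proceed" */
    HANDLE hWriteEvent;  /* auto-reset: wakes exactly one waiting writer */
    int ActiveReaders;   /* 1) how many read locks are held */
    int WaitingReaders;  /* 2) how many threads want a read lock */
    int WriterActive;    /* 3) whether a write lock is currently held (0 or 1) */
    int WaitingWriters;  /* 4) how many threads want a write lock */
} RWLock;

void RWLock_Init(RWLock *l)
{
    l->hMutex = CreateMutex(NULL, FALSE, NULL);
    l->hReadEvent = CreateEvent(NULL, TRUE, FALSE, NULL);   /* manual-reset, non-signaled */
    l->hWriteEvent = CreateEvent(NULL, FALSE, FALSE, NULL); /* auto-reset, non-signaled */
    l->ActiveReaders = l->WaitingReaders = 0;
    l->WriterActive = l->WaitingWriters = 0;
}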

With those four variables, it is easy to determine what to do - either wait on an event or acquire the lock. When releasing a write lock, check for other threads waiting for a write lock and hand it to one of them (it is still "hard" to obtain a write lock, so once you have it, don't let go); otherwise, let those waiting for a read lock have it. Those event objects I mentioned earlier come in handy here. This setup gives writers priority, which can starve readers if there are lots of writes and very few reads. However, read/write locks are supposed to be used where reads vastly outnumber writes. If it is mostly writes, a single mutex is more efficient.
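Here is roughly how the four operations wire those pieces together - again, a sketch of the approach just described, not the code I actually wrote:

void RWLock_AcquireRead(RWLock *l)
{
    WaitForSingleObject(l->hMutex, INFINITE);
    /* Writers have priority: back off if a writer is active OR waiting. */
    while (l->WriterActive || l->WaitingWriters > 0) {
        l->WaitingReaders++;
        ReleaseMutex(l->hMutex);
        WaitForSingleObject(l->hReadEvent, INFINITE);
        WaitForSingleObject(l->hMutex, INFINITE);
        l->WaitingReaders--;
    }
    l->ActiveReaders++;
    ReleaseMutex(l->hMutex);
}

void RWLock_ReleaseRead(RWLock *l)
{
    WaitForSingleObject(l->hMutex, INFINITE);
    l->ActiveReaders--;
    /* Last reader out hands the lock to a waiting writer, if any. */
    if (l->ActiveReaders == 0 && l->WaitingWriters > 0)
        SetEvent(l->hWriteEvent);
    ReleaseMutex(l->hMutex);
}

void RWLock_AcquireWrite(RWLock *l)
{
    WaitForSingleObject(l->hMutex, INFINITE);
    l->WaitingWriters++;
    ResetEvent(l->hReadEvent);  /* gate new readers while a writer is waiting */
    while (l->WriterActive || l->ActiveReaders > 0) {
        ReleaseMutex(l->hMutex);
        WaitForSingleObject(l->hWriteEvent, INFINITE);
        WaitForSingleObject(l->hMutex, INFINITE);
    }
    l->WaitingWriters--;
    l->WriterActive = 1;
    ReleaseMutex(l->hMutex);
}

void RWLock_ReleaseWrite(RWLock *l)
{
    WaitForSingleObject(l->hMutex, INFINITE);
    l->WriterActive = 0;
    if (l->WaitingWriters > 0)
        SetEvent(l->hWriteEvent);   /* hand off to the next writer */
    else if (l->WaitingReaders > 0)
        SetEvent(l->hReadEvent);    /* no writers waiting: release all readers at once */
    ReleaseMutex(l->hMutex);
}

Two details worth noting: the waits sit inside while loops, so a thread that wakes up always rechecks the counters under the mutex before proceeding, and the write event is auto-reset (it wakes exactly one writer) while the read event is manual-reset (it releases every waiting reader at once).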

The other difference between that misinformation and my information is that I wrote real code for a read/write locking mechanism, while the misinformer admits he never even tried his out - it merely "should work". Mine actually works, is performance-friendly, and there is real, working code using it. Granted, I didn't publish my actual implementation here (for numerous reasons - some legal), but between the description and the rough sketch above, someone could duplicate the effort without too much trouble. Besides, handing out a polished, drop-in implementation wouldn't teach anyone anything about how a read/write locking mechanism works. People would simply swipe the code and assume it worked properly. Then they would complain to me about problems they are having that aren't even related to read/write locking. I'm not going to hand-hold or wipe noses.

Today we learned that new programmers should not spout out information left and right. It is dangerous. And stupid. And it tends to come back to haunt us later. We also learned how to develop a real read/write locking mechanism for Windows. Yay for us.

This episode of the Cubicspot blog is brought to you by:
ShareWrap - the only product activation system you will ever need.
The letters W and Z.
And ramen. Everyone loves ramen (especially the Mi Goreng kind - mmmm...large quantities of MSG).