Troutgirl weighs in with some more information on the Friendster PHP conversion:
1) We had not one but TWO guys here who had written bestselling JSP books. Not that this necessarily means they’re great Java devs, but I actually think our guys were as good as any team.
2) We tried rewriting the site in Java twice, using MVC and all available best practices. It actually got slower. Anyway, what does MVC have to do with speed or scalability? I thought it was a design cleanliness and maintainability thing.
3) We tried different app servers, different JVMs, different machines.
4) Anything that money could do, it did.
George Schlossnagle has the best coverage of the topic, saying “It’s the App code, stupid” (as an Adult Swim fan, I have to appreciate the Big O reference):
My personal bias against Java is that many Java programmers seem to be thread-crazy, and as has been noted, humans just aren’t smart enough to program threads.
I once lost a job by saying “Threads are dangerous” at a job interview. Everything was going well until they asked me what I thought about threading (this was a business application). Obviously, my opinion didn’t agree with this company’s vision for their software. Ironically, I was being interviewed to help them fix their problems with frequent unrepeatable lockups and GPFs. They did not see the connection.
One advantage of PHP is that it simplifies your life by not bothering you with threading, just as it doesn’t bother you with memory management. That doesn’t mean that concurrency issues go away; any multi-user application will have them. It just means that they are made explicit at the application level.
George Schlossnagle:
‘Share nothing’ (which if you don’t want to click on the link basically means not performing much inter-process or inter-thread data sharing or pooling) is not something unique to PHP. In fact, I’m quite sure that you can implement it in Java as well. The problem is that Java gives you a number of powerful facilities with which to shoot yourself in the foot.
Here is a Java performance article that recommends against object pooling. The Struts vs. WebWork comparison has an interesting take on this:
Struts Actions must be thread-safe because there will only be one instance to handle all requests. This places restrictions on what can be done with Struts Actions as any resources held must be thread-safe or access to them must be synchronized.
WebWork Actions are instantiated for each request, so there are no thread-safety issues. In practice, Servlet containers generate many throw-away objects per request, and one more Object does not prove to be a problem for performance or garbage collection.
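To make the difference concrete, here is a minimal sketch in plain Java. It does not use the real Struts or WebWork APIs: GreetingAction, bind and render are hypothetical names invented for illustration, and the "interleaving" of two requests is simulated on a single thread so the outcome is deterministic. It only shows why per-request state stored in a field is a hazard when one instance serves every request, and why the hazard disappears when each request gets its own instance.

```java
import java.util.function.Supplier;

class ActionLifecycleSketch {

    // Hypothetical action that keeps per-request state in an instance field.
    static class GreetingAction {
        private String user;

        void bind(String user) {
            this.user = user;
        }

        String render() {
            return "Hello, " + user;
        }
    }

    // Simulates two requests whose steps interleave:
    // request A binds, request B binds, then request A renders.
    static String interleave(Supplier<GreetingAction> actionForRequest) {
        GreetingAction a = actionForRequest.get();
        GreetingAction b = actionForRequest.get();
        a.bind("alice");
        b.bind("bob");
        return a.render();
    }

    // One instance handles every request (the shared-instance model).
    static String sharedInstanceResult() {
        GreetingAction shared = new GreetingAction();
        return interleave(() -> shared);
    }

    // A fresh instance is created for each request (the per-request model).
    static String perRequestResult() {
        return interleave(GreetingAction::new);
    }

    public static void main(String[] args) {
        System.out.println(sharedInstanceResult()); // request A sees request B's user
        System.out.println(perRequestResult());     // request A sees its own user
    }
}
```

With the shared instance, request A ends up greeting bob. With one instance per request, A greets alice, at the cost of one extra short-lived object.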
It seems that the Java culture, if not the Java language, encourages premature optimization in the form of creating pools of shared information.
John Lim expresses some skepticism about the inherent scalability properties of any language, claiming it’s developer skill, not the language.
I think I’ll end this post with heresy. The field of web development seems to have a mental model of application development forged from the dot-com boom era. We operate with the vision that our applications are going to experience exponential usage growth. Perhaps this leads to an unhealthy focus on scalability in web applications versus other requirements. Perhaps this also leads us to employ optimizations prematurely before we can even understand their impact or even have a need for them. Perhaps these premature optimizations even hurt scalability and performance and needlessly complicate our applications.
Perhaps the Java culture is more infected with “dot-com-itis” than the PHP culture?
Scalability – YAGNI?
Hi Jeff,
Nice Summary.
>> John Lim expresses some skepticism about the inherent scalability properties of any language,
>> claiming it’s developer skill, not the language.
A slight correction. I certainly don’t believe it’s merely developer skill per se.
The technology needs to be mature enough to support scalability. But scalability does not come out of the box.
And you’re right, there is an obsession with scalability which is irrelevant for the typical website. Not many sites will ever get slashdotted.
Really excellent posts today by you, Schlossnagle and Fuecks.
I find one of the great strengths of PHP is that I can code small, medium and large sites in a way appropriate to each. Small sites can be traditional mixed PHP and HTML pages because nothing is faster to code. Medium sites can use templates, db and some light controller logic. Large sites can deploy a framework to improve working as a team. This makes programmers scalable as well as their apps.
The great thing is that all of these applications use the same concepts, just rolled into or unrolled from libraries/frameworks. This allows you to code small, smart and fast, and makes it easy to scale up an app as needed.
I think PHP needs a whole new take on the idea of “design patterns.” In Java/C++ there tend to be bad ways and a good way. In PHP there are still bad ways, but there are multiple good ways. When there is more than one way to do the same thing, it’s about understanding the concept and choosing the right path given the requirements.
Perhaps these are more best practices, but there is a pattern feel to them as well. Traditional compiled language patterns apply to PHP when dealing with the purely algorithmic. But has any PHP group dealt with things like the Request and nailed down the key concepts and then shown 2-3 solid implementations depending on the scale and type of the application? We now need some guidance with how and when to best use exceptions. The list goes on.
Jeff, very good links on PHP scalability. It seems like PHP does try to focus on its niche and lets other technologies do what they do best. With this in mind, I’m paying close attention to MySQL Cluster and how it might handle scalability on the database side. I can only imagine how fast things can get when you use an in-memory database. My initial euphoria was dampened by the memory requirements (said to be 2.5x your DB size by some reviewers), but in looking over some papers, it seems like each DB node can serve up a portion of the DB. Why hasn’t there been more hysteria from PHP scalability guys on MySQL Cluster? PHP on the front-end and MySQL Cluster on the back-end sounds like a match made in “shared nothing” heaven.
Just throwing another link into the party: The Scalability Holy Grail – wise remarks.