One of the big things over the last few years in websites has been the rise of dynamically generated sites. Instead of writing each page by hand, developers use languages like PHP, Perl and Python to create pages. Websites are written as if they were software programs.
Microsoft in particular makes it very easy to produce sophisticated websites. The .NET architecture allows you to deploy programs in different ways and makes it easy for your program to become a webpage by writing a few new forms and connecting the dots together.
But should web pages be programs?
Obviously there are some things that must be dynamically generated: Shopping carts, interactive content, online games, etc. But there are a lot of sites with fairly static content that are using dynamic systems. Why?
The answer is pretty simple. It makes life easier. With dynamic content sites you can reuse bits of code really easily. Need a common toolbar? Add a line of code and it appears. Need to produce 5000 different product pages? Write a template that grabs data from a database.
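To make the template idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the products table, its columns, the page layout and the function names are my own assumptions, and an in-memory SQLite database stands in for whatever database a real site would use.

```python
import html
import sqlite3
from string import Template

# Hypothetical page template: $name and $price are filled in per product.
PAGE = Template("<html><body><h1>$name</h1><p>Price: $price</p></body></html>")


def render_product_pages(conn):
    """Return a dict mapping product id to the rendered HTML for that product."""
    rows = conn.execute("SELECT id, name, price FROM products ORDER BY id")
    return {
        product_id: PAGE.substitute(
            name=html.escape(name),        # escape so data cannot break the markup
            price=html.escape(str(price)),
        )
        for product_id, name, price in rows
    }


def demo_database():
    """Build a tiny in-memory database (assumed schema, for the example only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price TEXT)"
    )
    conn.executemany(
        "INSERT INTO products (name, price) VALUES (?, ?)",
        [("Widget", "4.99"), ("Gadget", "12.50")],
    )
    return conn
```

One template and one query cover two products here, and the same code would cover 5000. That is exactly why the dynamic approach is so tempting.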
But is this really the right way to go?
The main problem that always hits dynamic sites is performance. If you invest enough you can get excellent performance, but it will not usually be as efficient as writing a site using static HTML pages. The static page only has to be read from disk and sent to the browser. The dynamic page has to be read from disk, the program executed and the result sent to the user. The program execution time might be significant and various things may make it slower still, particularly if there is a database involved. This tends to affect scalability more than raw performance. The overhead of serving a single page to a single user is unlikely to cause problems no matter how inefficient. If you’re serving a page to a million visitors then scalability is very important.
As so often happens in computing there is a tradeoff. You can cache everything and you might get better performance but you have to store it and manage it.
Caching is more useful in some situations than others so you have to really think about what exactly you want to do.
There are also strategies like pre-compiling scripts and code accelerators that can help by reducing the execution time of the dynamic pages. This approach can be extremely successful if you need dynamic content.
So the question to be answered is: how dynamic do you need the site to be?
Many sites that I visit are much more dynamic than they need to be. Some of the sites that I’ve created myself fall into this category.
If the content isn’t changing for each user then you have to ask whether a dynamic approach is correct. In many cases it probably isn’t.
This is especially true of news sites. Many news and commentary sites I visit use dynamic page generation. Yet they aren’t customizing the page for me at all. Are the adverts customized? I guess they must be, but advertisers handle advertising in their own way whether the page is dynamic or static. The main body of the page isn’t changing.
Of course for all I know the sites may be using some kind of internal caching system (and I really hope that they are). But it becomes clear that many are not. Witness how many sites that get slashdotted go down with database errors. (As this one would probably do if the link ended up on Slashdot. WordPress doesn’t seem to cache anything, although I think there is an output cache plugin which I’m hoping to try soon.)
One approach that can be used quite successfully is a hybrid system. Pages are produced using a dynamic system and the output is then saved as a static page the first time it is requested. This reduces the overall load considerably because the page is only rendered once. It saves storage because only pages that are actually viewed need to be cached.
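A hybrid system like this can be sketched in a few lines of Python. This is only an illustration under my own assumptions: serve, render and cache_dir are hypothetical names, render stands in for whatever dynamic system builds the page, and page names are assumed to be simple, trusted strings.

```python
from pathlib import Path


def serve(page_name, render, cache_dir):
    """Return the HTML for page_name, rendering it at most once.

    render is a hypothetical callable that builds the page dynamically.
    The result is stored as a static file, so later requests skip rendering.
    """
    cached = Path(cache_dir) / f"{page_name}.html"
    if cached.exists():
        # Cache hit: serve the stored static copy.
        return cached.read_text(encoding="utf-8")
    # Cache miss: do the expensive dynamic work once and keep the result.
    page = render(page_name)
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_text(page, encoding="utf-8")
    return page
```

The catch is the one mentioned earlier: you now have to manage the cache. In this sketch nothing ever expires, so a changed page stays stale until its cached file is deleted.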
If you ever get really stuck and you know you’re about to get slashdotted or something, there is sometimes a simple solution. It works if you’re using PHP, you need immediate caching and you have a site structure using directories (e.g. /pagename/). You can use wget to grab a static copy of the page and save it as index.html in that directory. This will usually cause Apache to serve the static page instead of the dynamic PHP page, provided index.html comes before index.php in the DirectoryIndex setting. Just remember to delete the index.html file before making changes to the dynamic page.
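If wget isn’t to hand, the same emergency trick can be sketched in Python using only the standard library. The function names, the URL and the directory here are placeholders of my own, not part of any real setup.

```python
import urllib.request
from pathlib import Path


def snapshot(url, page_dir):
    """Fetch the rendered page at url and save it as index.html in page_dir."""
    target = Path(page_dir) / "index.html"
    with urllib.request.urlopen(url) as response:
        body = response.read()
    target.write_bytes(body)
    return target


def remove_snapshot(page_dir):
    """Delete the static copy so the dynamic page is served again."""
    (Path(page_dir) / "index.html").unlink(missing_ok=True)
```

You would call something like snapshot("http://example.com/pagename/", "/var/www/pagename") before the traffic arrives and remove_snapshot("/var/www/pagename") once it has passed. Both arguments are examples only.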
Of course the best approach is probably a reverse proxy cache. But that’s a bit much for most sites.
So the key conclusion is: make your sites only as dynamic as they need to be.
[This entry has been completely rewritten from the previous version by AngusHardie]