CMS - Content Management Systems Wish List

Keeping redundant information consistent

Keeping redundant code consistent is probably the reason why we need a CMS at all.

A page like this one would benefit from having a table of contents generated automatically whenever a new h2 element is added.
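A minimal sketch of such a generator, assuming every h2 that should appear in the table carries an id attribute. The names H2Collector and build_toc are made up for this illustration and do not come from any particular CMS.

```python
from html import escape
from html.parser import HTMLParser


class H2Collector(HTMLParser):
    """Collects the id and text of every h2 element in a page."""

    def __init__(self):
        super().__init__()
        self.headings = []       # list of (id, text) pairs
        self._current_id = None
        self._buffer = None      # None means "not inside an h2"

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._current_id = dict(attrs).get("id")
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "h2" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            self.headings.append((self._current_id, text))
            self._buffer = None


def build_toc(page_html):
    """Return an HTML list linking to every h2 that has an id."""
    collector = H2Collector()
    collector.feed(page_html)
    collector.close()
    items = [
        f'<li><a href="#{escape(anchor)}">{escape(text)}</a></li>'
        for anchor, text in collector.headings
        if anchor  # headings without an id cannot be linked to
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Running build_toc again after each edit keeps the table in step with the headings, so nobody has to maintain it by hand.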

Whenever a new page is added, menus and next/prev links need updating, too.
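As a rough illustration, next/prev links can be derived from one ordered list of pages, so adding a page means editing one list only. The function name and the example URLs in the usage note are assumptions.

```python
def prev_next_links(pages):
    """Map each page URL to its (previous, next) neighbours.

    `pages` is the ordered list of URLs in one sequence, e.g. a gallery.
    The first page has no previous link and the last has no next link.
    """
    links = {}
    for i, url in enumerate(pages):
        prev_url = pages[i - 1] if i > 0 else None
        next_url = pages[i + 1] if i < len(pages) - 1 else None
        links[url] = (prev_url, next_url)
    return links
```

For the hypothetical list ["/a/", "/b/", "/c/"], inserting "/new/" after "/a/" and calling the function again updates the neighbours of "/a/" and "/b/" without touching the pages themselves.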

Plain Vanilla vs. detailed control

Plain Vanilla

Most editing tasks should be straightforward and as much what-you-see-is-what-you-get as possible.

Exceptions

But plain vanilla is often simply too plain:

Canonical URLs

Each page in the site should have a canonical URL. Maybe a page is available as

and many more. But one URL, e.g. http://eiði.com/%FAr%20ei%F0i/, should be the canonical one; the others should redirect (301 Moved Permanently) to the canonical one.
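The redirect rule could look something like this. It is a sketch only: the example.com URLs and the respond function are hypothetical, and a real server would normalise the request URL before the lookup.

```python
# Hypothetical canonical URL and some aliases that reach the same page.
CANONICAL = "http://example.com/tv/"

ALIASES = {
    "http://example.com/tv",
    "http://www.example.com/tv/",
    "http://example.com/tv/index.html",
}


def respond(url):
    """Return (status, headers) for a request to `url`."""
    if url == CANONICAL:
        return 200, {}
    if url in ALIASES:
        # 301 Moved Permanently tells browsers and robots to update their links.
        return 301, {"Location": CANONICAL}
    return 404, {}
```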

Human readable URLs

Using real words

The URL http://gadget.com/tv/ is easier to remember than
http://gadget.com/bin/iis/display-page.asp?product-category={af34fef887}

Use encoded non-ASCII

E.g. http://heima.olivant.fo/~egilstro/eiði.html
is encoded as http://heima.olivant.fo/~egilstro/ei%F0i.html
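The %F0 above is the ISO-8859-1 (Latin-1) byte for ð. A small sketch with Python's standard urllib.parse shows that encoding next to UTF-8, which is what most current software produces. The helper name encode_path is made up for this example.

```python
from urllib.parse import quote, unquote


def encode_path(path, encoding="latin-1"):
    """Percent-encode the non-ASCII characters in a URL path.

    "/" and "~" are left alone because they are legal in paths.
    """
    return quote(path, safe="/~", encoding=encoding)


path = "/~egilstro/eiði.html"
latin1_form = encode_path(path)                   # ð becomes %F0
utf8_form = encode_path(path, encoding="utf-8")   # ð becomes %C3%B0

# Decoding needs the same encoding that was used for encoding.
assert unquote(latin1_form, encoding="latin-1") == path
assert unquote(utf8_form, encoding="utf-8") == path
```

Whichever encoding the CMS picks, it has to pick one and redirect the other, or the same page ends up with two URLs.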

Don't force structure on the URL

http://gadget.com/products/tv/ may be a good URL if TVs are placed under the Products menu item. But the relation to the menu structure should not be enforced. http://gadget.com/tv/ is easier to remember, and more robust if the TVs are moved from the Products menu to the Consumer Electronics submenu.

Change URL without breaking links

E.g. hoppa.com changed from "Defence" to the more objective term "Military" in a URL, without breaking the old link to http://hoppa.com/Defence/index.en.pl.gz?. (2014 update: with new owners of the hoppa.com domain, the link is now broken. But broken in a technically correct and graceful way: 410 Gone.)
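Both outcomes fit in one lookup. This is a sketch under assumptions: the paths in MOVED and GONE are invented for illustration and are not the actual hoppa.com paths.

```python
# Hypothetical tables a CMS could maintain as pages are renamed or retired.
MOVED = {"/Defence/": "/Military/"}   # old path -> new path
GONE = {"/old-campaign/"}             # paths removed on purpose


def resolve(path):
    """Return (status, location) for a requested path."""
    if path in MOVED:
        return 301, MOVED[path]   # Moved Permanently: the old link keeps working
    if path in GONE:
        return 410, None          # Gone: an honest answer, unlike a bare 404
    return 200, None
```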

A choice of formats

The content should be deliverable in XHTML (for easier consumption by other applications), and maybe HTML for better legacy browser compatibility, WML for mobile phones and PDF or XSL for printing. Highly structured data maybe as XML, comma-separated files, OpenOffice (or legacy Excel) spreadsheets...

WML pages are generally smaller than HTML pages. And PDF/XSL should generally contain everything the user wants to know in one file.

Graphs

Graphs should be presented as SVG, with fallback to GIF, and the underlying data available as XML.

Accessibility

See w3.org/WAI

Validity

Validity is not an end in itself, but it is a good indication that the code is clean, and that it will do what is expected on most platforms.

In general, it is OK for a page to be invalid, as long as there is a reason for it, and the consequences, if any, are known and accepted.

Import data from existing sites

The most practical way to do that is probably to rewrite the original system to create a set of XML files suited for import into the new system.

But if the new system can read the database of the old one, or simply read its generated pages, nothing could be better.

Import URLs from existing sites

Changing CMS should not break the links to the old site. Old links should still work: if the new system uses new URLs, the old URLs should be redirected (301 Moved Permanently) to the new ones.

Run on mainstream web servers

If the CMS requires special support from the web hotel, that severely restricts our freedom to choose a web hosting provider.

Running on static HTTP server

If the site doesn't require data from the users, the CMS site could run offline. The CMS could FTP a complete site to the web hotel once a day, or FTP pages as they change.

Running on mainstream CMS

If special CMS software must be installed on the server, it should be generally available software that is running on many different web hotels. Being locked to one web provider is dangerous.

At the very least, you should have access to all source, and the option of running the CMS on your own server.

JavaScript allows some dynamics on a static server. Incredibly, the language setting of IE, Mozilla and Opera is invisible to JavaScript. So language negotiation is not an option on static servers, at least not if the client is one of these mainstream browsers.

Running on mainstream dynamic servers

When dynamics are needed, the CMS should still not require anything more exotic than JSP, ASP, PHP, XSLT or MySQL. Maybe stuff like Cold Fusion, Oracle and MS SQL; these are rare, but still more common than any CMS offered by web hotels.

Correct timing and caching

The Last-Modified header should be correct.

This is not easy, because the page is usually made from different 'atoms' that may have been updated at different times. Last-Modified of the page should be the maximum of the Last-Modified values for the atoms in the page.
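A minimal sketch of that rule, assuming the CMS stores a timezone-aware UTC timestamp for each atom. The function names and example dates are illustrative only.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def page_last_modified(atom_times):
    """The page changed when its most recently changed atom changed."""
    return max(atom_times)


def last_modified_header(atom_times):
    """Format the page's modification time as an HTTP date."""
    # usegmt=True needs an aware datetime in UTC and produces "... GMT".
    return format_datetime(page_last_modified(atom_times), usegmt=True)


# Hypothetical atoms: body text, menu, footer.
atoms = [
    datetime(2004, 1, 5, 12, 0, tzinfo=timezone.utc),
    datetime(2004, 3, 1, 8, 30, tzinfo=timezone.utc),
    datetime(2003, 11, 20, 9, 15, tzinfo=timezone.utc),
]
header = last_modified_header(atoms)
```

Note the catch hidden in this rule: a menu that appears on every page is an atom of every page, so changing it legitimately updates Last-Modified everywhere.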

Almost all CMS 'solve' this problem by claiming that all pages have been modified right now. (If they don't know the modification time, they should at least shut up, and omit the Last-Modified header.) This causes browsers to retrieve updates of pages that have not been changed, and makes it harder for search robots to determine what they should bother to re-index.
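A correct Last-Modified pays off when the client sends If-Modified-Since: the server can answer 304 Not Modified and skip the body. A sketch, with status_for as an assumed name and last_modified assumed to be a timezone-aware UTC datetime:

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def status_for(last_modified, if_modified_since):
    """Return 304 if the client's copy is still current, else 200."""
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable header: just send the page
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200
```

A CMS that stamps every page with the current time can never return 304 here, which is exactly the waste described above.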

Editorial control over what is considered new

Any change in a page, no matter how small, will cause Last-Modified to be updated, and the browser to retrieve the entire page when refreshing.

But smaller corrections should not trigger emails announcing updates to the site, nor new items in RSS feeds. Only a human editor can determine when a new page is really new.
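One way to model this is to keep two timestamps per page: one for the HTTP header, one that only an editor advances. This is a sketch of the idea, not a description of any existing CMS; the names Page, save and feed_items are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Page:
    url: str
    last_modified: datetime   # drives the Last-Modified header
    announced: datetime       # last time an editor declared the page "new"


def save(page, when, announce=False):
    """Every save updates Last-Modified; only an editor's decision announces."""
    page.last_modified = when
    if announce:
        page.announced = when


def feed_items(pages, since):
    """URLs that belong in the RSS feed or update mail sent after `since`."""
    return [p.url for p in pages if p.announced > since]
```

Fixing a typo is then save(page, now), while a real update is save(page, now, announce=True).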

Compression

When the user agent supports it, pages should be delivered in compressed form.
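A sketch of the check, using gzip from Python's standard library. The helper names are made up, and the parsing is deliberately simple: it looks only for an explicit gzip entry and ignores the "*" wildcard.

```python
import gzip


def accepts_gzip(accept_encoding):
    """True if the Accept-Encoding header lists gzip with a non-zero q."""
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()   # e.g. "q=0"
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def encode_body(body, accept_encoding):
    """Return (bytes to send, extra response headers)."""
    # Vary tells caches that the response depends on Accept-Encoding.
    if accepts_gzip(accept_encoding or ""):
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return gzip.compress(body), headers
    return body, {"Vary": "Accept-Encoding"}
```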

Access control without passwords

The CMS should accept client SSL certificates as authentication, when authentication is needed.

Alternative forms

Pages should be presented for both HTML browsers and WAP. There should not be a 1-to-1 mapping between WAP and HTML pages, because WAP pages need to be smaller. RSS feeds and other metadata usually don't map to specific HTML pages.

Moving an item without changing its URL

A web site always has a structure, often a tree. A page must keep its URL, even if it is moved to a different location in that tree.
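The simplest way I can think of to get this is to store the URL and the position in the tree as two separate fields, so that moving a page only touches the second one. A sketch, with Node, move and breadcrumb as assumed names and the gadget.com-style paths reused as sample data:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    url: str                  # permanent; never derived from the menu
    title: str
    parent: Optional[str]     # URL of the parent item, None at the top level


def move(pages, url, new_parent):
    """Move a page in the tree. Its URL is not touched."""
    pages[url].parent = new_parent


def breadcrumb(pages, url):
    """Titles from the top of the tree down to the page itself."""
    trail = []
    while url is not None:
        trail.append(pages[url].title)
        url = pages[url].parent
    return list(reversed(trail))


pages = {
    "/products/": Node("/products/", "Products", None),
    "/electronics/": Node("/electronics/", "Consumer Electronics", "/products/"),
    "/tv/": Node("/tv/", "TV", "/products/"),
}
move(pages, "/tv/", "/electronics/")
```

After the move the breadcrumb for "/tv/" runs through Consumer Electronics, while every link to "/tv/" still works.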

Handling document versions

As W3C often does: quote two URLs in a document, the URL of this version and the URL of the newest version. (These URLs must be different, even when this version is the newest.)

Handling exceptions

I have a hard time defining this point, because exceptions are, well, exceptional.

One example: I have this group of gallery pages. CMS would be a tremendous help in keeping link titles consistent with headers, thumbnails consistent with big images, <prev> links consistent with <next> links &c. Except that I want something special for leynar_surf.html and spot.html.

A CMS should be able to handle this kind of thing, but I don't know how.

Receiving external links

Generate bookmarks

The CMS should generate unique bookmarks to make deep linking from the outside easy.

And the URLs of internal pages should be visible to the user. I.e.: no frames, and no full page Flash.

Provide context

When following a deep link, maybe from a search engine, the page should provide some context, e.g. with pointers up and home.

Redirect unwanted external links

If deep links for some reason are unwanted, they should be stopped politely. Simply redirecting to the home page, or displaying 404 Not Found, is not quite polite enough.

For an image, one way to do it could be sending an image of the URL of the page that the image is stolen from. Another could be returning the page that the image belongs in.

Titles

Titles should, even with only the first few characters shown, help the user navigate between his windows. E.g. this page is titled CMS - Content Management Systems Wish List, and the first two characters, CM, are enough to identify it in my task bar.
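A CMS could check this automatically by computing how many leading characters are needed to tell a title apart from the others. A small sketch; the function name and the comparison titles in the usage note are invented.

```python
def shortest_unique_prefix(title, others):
    """Smallest n such that title[:n] is not the start of any other title.

    Comparison ignores case. Returns None if another title starts with
    the whole of `title`, so that no prefix can distinguish it.
    """
    lowered = [o.lower() for o in others]
    for n in range(1, len(title) + 1):
        prefix = title[:n].lower()
        if not any(o.startswith(prefix) for o in lowered):
            return n
    return None
```

Compared with hypothetical sibling titles such as "Canonical URLs" and "Home", this page's title needs two characters. A CMS could warn the editor when the number gets large.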

Terminate menus on different levels

There may be a fixed maximum for menu nesting, but there should be no minimum: A top menu item can exist without sub-items. We don't want:

just to get down to level 3.

Help Google

Help Google by generating clean code, using informative file names, title elements etc., and supplying correct update metadata.

Don't try to trick Google. That will backfire when Google catches up. And I haven't really gained anything, if I get a million hits on this page from people looking for Britney Spears.

Language meta data

The CMS should keep track of what language is used in each text, and declare the main language in the HTTP header. Page elements in other languages should be declared with the lang (or xml:lang) attribute.

Language negotiation

Default language should be determined by browser settings. Translations should be assigned a lower q than originals, so someone who prefers English but understands Faroese will get a Faroese original, rather than the English translation.
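The arithmetic behind this is to multiply the user's q for a language by a quality value for each variant and pick the highest product. A sketch under stated assumptions: the quality numbers are invented, the function names are made up, and only exact language tags are matched (no en-GB to en fallback).

```python
def parse_accept_language(header):
    """Parse 'en, fo;q=0.8' into {'en': 1.0, 'fo': 0.8}."""
    prefs = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, param = part.partition(";")
        q = 1.0                      # q defaults to 1 when omitted
        param = param.strip()
        if param.startswith("q="):
            try:
                q = float(param[2:])
            except ValueError:
                q = 0.0
        prefs[tag.strip().lower()] = q
    return prefs


def choose_variant(header, variants):
    """Pick the language with the highest (user q) x (source quality).

    `variants` maps language tag -> source quality: 1.0 for an original,
    something lower for a translation. Returns None if nothing is acceptable.
    """
    prefs = parse_accept_language(header)
    best, best_score = None, 0.0
    for lang, source_q in variants.items():
        score = prefs.get(lang, 0.0) * source_q
        if score > best_score:
            best, best_score = lang, score
    return best
```

With a Faroese original at 1.0 and an English translation at 0.7, a reader sending "en, fo;q=0.8" gets Faroese (0.8 against 0.7), while a reader sending "en, fo;q=0.5" gets the English translation.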

Links between languages should link to the similar page in the other language. E.g. Uttanlands, International and International should link to each other, not to various translations of the home page.

When there is no similar page in the other language, there are two possibilities:

If you don't read Faroese, and someone sends you a link to Familjuferðir, you would expect to find an English translation behind the Union Jack (English) and a Danish one behind the Dannebrog (Dansk), but you won't.

Open source

Yeah, open source is OK. But from the above, it is clear that I consider other forms of openness more important.

Technology

I guess that XML, XSLT and JSP are good technologies to satisfy these wishes. But technology is not part of my demands; these are guesses.