I meant to write this article in the last big set of publications, but I ran out of time, and it dropped off the end of my queue. The strapline on my site is “Literate programming: 150,000 words and rising”. Am I using the term correctly?

At the start of this website project, my objectives were 1) get experience in PHP5; 2) organise my older work into a structure, so I could talk about it more easily; 3) write some new code, so I could delete older, very rushed code; 4) get practice writing long texts, as I was poor at this when I was 20. My initial objective was to write twice or thrice as many lines of English as lines of code. I also wanted to be in a position where I used references, as though my degrees had been a successful learning experience. Most of my budget has gone on the content. The strapline was a sarcastic jibe that I would be literate by the end of the project, even if I wasn't at the start; literate in the sense of writing, rather than reading.


Literate programming was invented by Prof D Knuth 1. The paradigm was to reduce errors by writing the entire solution as a narrative in English (or another natural language). Tools extracted the structured text and published it to a viewing format (normally TeX or HTML); they also extracted the source code, which was then compiled. The structure of the code was to be built out of the phrases in the text. The items should be presented in a “psychologically correct order”, which matters if you are using Pascal, as D Knuth was, because Pascal expects declarations before use. The same source lists proper tools for literate programming.
A more readable summary is 2, with much better examples. There is some analysis of outcomes for literate programming 3. It argues that thinking in English would improve the quality of the source, but lacks actual study results. There is further work by another academic 4, again with no study results. There seems to be a website 5 that purely lists references to literate programming. Quite a few of the XML Schemas I have handled are quite similar to literate programming, a fact that other people have also observed 6. Tools are still being produced for literate programming 7 8, and it seems to have a following in the R language (example 9). It should be remembered that Prof Knuth didn't have access to “the web” that current users do, as it hadn't been invented yet.
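The extraction step described above can be sketched in a few lines. This is a toy, not Knuth's tool: it is written in Python, it borrows the noweb-style markers (`<<name>>=` opens a chunk, a lone `@` closes it, `<<name>>` refers to one), and the function names `parse_chunks` and `tangle` and the sample narrative are made up for illustration.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")      # "<<name>>=" opens a chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")  # "<<name>>" refers to a chunk


def parse_chunks(document: str) -> dict[str, list[str]]:
    """Collect the named code chunks from a narrative; prose is skipped."""
    chunks: dict[str, list[str]] = {}
    current = None  # name of the chunk being read, or None while in prose
    for line in document.splitlines():
        definition = CHUNK_DEF.match(line)
        if definition:
            current = definition.group(1)
            chunks.setdefault(current, [])
        elif line.strip() == "@":
            current = None  # a lone "@" returns to prose
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks: dict[str, list[str]], root: str = "*", indent: str = "") -> list[str]:
    """Expand chunk references recursively, starting from the root chunk."""
    output = []
    for line in chunks[root]:
        reference = CHUNK_REF.match(line)
        if reference:
            # Substitute the referenced chunk, keeping the reference's indentation.
            output.extend(tangle(chunks, reference.group(2), indent + reference.group(1)))
        else:
            output.append(indent + line)
    return output


NARRATIVE = """\
The program greets the reader. The greeting is explained first,
because that is the interesting part.

<<greeting>>=
print("hello, reader")
@

Only afterwards do we show the boilerplate that wraps it.

<<*>>=
def main():
    <<greeting>>

main()
@
"""

if __name__ == "__main__":
    print("\n".join(tangle(parse_chunks(NARRATIVE))))
```

The narrative introduces the greeting before the function that contains it, which is the order that suits the reader; the tangled output puts the pieces back in the order the interpreter needs.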

My practice:

Just listing relevant details...

  • Put things into objects ~ it is easier to talk or write about a piece of functionality or a feature when it is decomposed into nouns.
  • Create the interfaces/API for the whole platform before writing any code ~ this is just to ensure consistency. As I am designing the entire thing in one go, it will read better. Thinking about information flow, rather than implementation details, will often lead to less code being needed. Secondly, where the same action seems to be needed repeatedly, I can refactor ahead of any actual cost.
  • Where I am uncertain about the implementation, I document my entire proposal as short statements in Engineering English. This is pseudo-code, but it is only needed for new technologies. As new technologies involve lots of reading, this helps me focus the research. These statements are left in the code as a class doc header, and updated as the code changes. What is atypical to me will be atypical to other developers. I add my references after these statements.
  • Use tools to generate API documentation for the unit. This extends to the other tools, such as PHPUnit, FOS REST etc. API docs should list all exception or error states, so other developers know that they may occur.
  • When I am writing the code and need to do something that is stupid, I document why it is necessary. If the reason becomes invalid, I know I can correct my practice. This normally only takes 30s per stupid; I don't have high levels of stupid in my code.
  • Everything that is visible should be named to be self-explanatory. If it's not visible, it should have a short lifespan. Obvious isn't the same as written out long-windedly. Avoid un-cited cultural references.
  • When my code is well factored, each method is fairly short; this means that API docs normally cover everything that is needed. Needing to add a lot of docs inside a function means you can't test it easily, which is bad for its own reasons. Knuth wrote the code sections as macros; I use methods.
  • Use task trackers intelligently. Eschew binary formats that you can't diff.
  • Store things like SQL installation scripts with the source in the VCS. This means that each version of a feature has the correct and matching environment. This fights silo-coding, which is probably a good idea.
  • Tools like Sass and minify are useful, but don't check the results into the VCS. In high-performance places, I have minify tools for the PHP, and frameworks like Symfony often compile the source. To borrow Knuth's term, this is “tangle”.
  • For my open source items, I log my thoughts and details as I go in a wiki page, similar to what you are reading.
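A minimal sketch of several of the points above in one unit: the proposal kept as a class doc header, the error states listed in the API docs, a note explaining an odd-looking choice, and short self-explanatory methods. It is written in Python rather than PHP purely for brevity, and every name in it (`RequestThrottle`, `RateLimitExceeded`, `admit`) is hypothetical.

```python
import time


class RateLimitExceeded(Exception):
    """Raised when a client has used up its allowance for the current window."""


class RequestThrottle:
    """Limit how many requests one client may make per time window.

    Proposal (kept in step with the code):
      1. Keep one window per client, keyed by client id.
      2. Start a fresh window once the old one has expired.
      3. Refuse the request once the window's count reaches the limit.

    References: none needed for this toy; a real header would list them here.
    """

    # Looks odd: time.monotonic rather than time.time as the default clock.
    # Why: wall-clock time can jump when the system clock is adjusted, which
    # would corrupt the windows. If windows ever need to survive a restart,
    # this reason no longer holds and the choice should be revisited.
    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        """Create a throttle.

        Raises:
            ValueError: limit is smaller than 1.
        """
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self._limit = limit
        self._window_seconds = window_seconds
        self._clock = clock
        self._windows = {}  # client id -> (window start, request count)

    def admit(self, client_id: str) -> None:
        """Record one request for client_id.

        Raises:
            ValueError: client_id is empty.
            RateLimitExceeded: the client already made `limit` requests
                in the current window.
        """
        if not client_id:
            raise ValueError("client_id must not be empty")
        now = self._clock()
        started, count = self._windows.get(client_id, (now, 0))
        if now - started >= self._window_seconds:
            started, count = now, 0  # the old window expired; start a new one
        if count >= self._limit:
            raise RateLimitExceeded(client_id)
        self._windows[client_id] = (started, count + 1)
```

Because the clock is passed in, a test can replace it with a function that returns a fixed value and move time forward by hand, without sleeping.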

It is probably more accurate to say “semi-literate programming”.