src on GitHub

If you are on a small screen, the academic habit of putting reference links as superscripts is annoying and fiddly. I think the content should still carry the links in the normal fashion for bigger-screen users. This module extracts the links and displays them as an OL list at the end of the page. The result list will need to be styled so it can be operated with a thumb. I am putting my wiki content inside “screenful” blocks, so the columnisation is easier to read. This means the link numbers keep restarting from 1; and the resulting biblio needs to be similarly paginated. This requirement isn't likely to be needed by other people.
As this library targets links created by wiki software, there is a shortage of metadata. I can code to inject values when they are supplied, and fall back to generic values otherwise. It would be nice to have automatic publication-date and title extraction. The publication date can be approximated from the Last-Modified HTTP header, which can be fetched via AJAX by setting the request type to HEAD. HTTP doesn't offer any means to get a title without a full request, as not all resources have titles.
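A minimal sketch of that HEAD-request idea, assuming jQuery is loaded; the function names here are illustrative, not the plugin's real internals:

```javascript
// Pure helper: turn a Last-Modified header value into Y-M-D text.
function formatLastModified(headerValue) {
  if (!headerValue) { return ""; }           // header absent or blocked
  var d = new Date(headerValue);
  if (isNaN(d.getTime())) { return ""; }     // unparsable date
  return d.getUTCFullYear() + "-" + (d.getUTCMonth() + 1) + "-" + d.getUTCDate();
}

// HEAD request via jQuery: only the headers travel, not the body.
function fetchPublishedDate(url, onDate) {
  jQuery.ajax({
    url: url,
    type: "HEAD",
    success: function (data, status, xhr) {
      onDate(formatLastModified(xhr.getResponseHeader("Last-Modified")));
    },
    error: function () { onDate(""); }       // cross-origin or network failure
  });
}
```

Note that cross-origin targets will refuse the header read unless they send permissive CORS headers, so this degrades to an empty date in practice.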

I have put an MVP solution in the drupal-notes page. Real code will need to be more flexible and think about HTML generation. This is lower down the priority list than modules that are likely to be used. Update: I bumped this up the work list as I have another boring train journey, with an intermittent net connection.

Dev notes

  • I built a static demo, and a unit test (linking in QUnit);
  • I will remodel my resources to be clearer (jQuery articles will now start with “jQuery”);
  • Part of the responsibility of the plugin will include fixing the TOC;
  • All the HTML will need to be in callbacks or be done with templates; so it is brandable;
  • I think I now have enough local jQuery dev to create a template project (for expected future developments);
  • Work will be hosted on github along with the other projects;
  • This uses prototype-based objects, rather than object literals, to make it a bit more readable;
  • Callbacks are part of the structure to make localisation to sites easier. These will be done with apply(), and run in the context of the plugin. Callbacks are expected to be really simple and stateless. They either return HTML text (so may need to be altered to match the current site); or adjust attributes of structural elements (so will definitely need to match the params fed into the plugin).
  • Semantically this is applied to a document, and doesn't reference the “selected element”. One could write this as a plugin to extract links from the selected element, but semantically where does that imply that one should insert them? This is a bit confused, but does occur in other plugins.
  • I will add some tests specifically for jQuery mobile, but this is only expected to be small patches if anything.
  • Logging is controlled via option.debug, and is currently output via console. Developers who use MSIE first are expected to load the dev tools, as that platform doesn't do this automatically. With dev tools loaded (via <f12>), the browser can execute console statements. For casual web use, logging should be disabled for performance (and lost souls who use MSIE won't know what a debug is anyway).
  • My usecase requires altering a TOC. As of Tue 20th, I have added code to update the TOC. Practically this requires a small change to the TOC folding code, as the TOC wasn't expected to change size.
  • I need to build a sensible solution for tagging my source with version numbers. Obviously development is done inside a VCS, which holds all the granularity on versions that you can need or want. The common way to tag releases is to include the plugin version in the filename. I have solved this problem previously, but I'm not building RPM, NPM or similar. At present I am hand-building these filenames, but this wouldn't scale very well. As a second issue, I'm not tagging the version in the metadata anywhere, so can't reference it.
  • I have tested this with QUnit rather than Jasmine, as QUnit is really small and light to set up. I can include the QUnit source in my src tree without much effect. Secondly, as a standalone plugin, I can't do CI, as I have no target to integrate into.
  • I note that when I integrated the plugin into my site, I suddenly started to behave like a designer: I wrote a version, looked at it, and rewrote it until the layout “worked”.
  • My usecase has paged data, as very long columns of text are hard to use. Print media such as books put the references at the end of each page or chapter; journal print media at the end of the paper, but many journal articles are only a few pages. I think it's better narrative to put the references at the end of the content.
  • Although I have edited the DOM via JS, this is actually the first time I have edited HTML headers or HTTP headers from JS. I will move the high risk sections to the start of the second edition.
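The callback mechanism from the notes above (apply(), run in the context of the plugin) can be sketched like this; the class name matches the API notes below, but the hook names and shapes are illustrative:

```javascript
// Sketch of the callback mechanism: callbacks run in the context of the
// plugin instance, so `this.options` is reachable inside them.
function BibliographyExtractor(options) {
  this.options = options || {};
}

BibliographyExtractor.prototype.runCallback = function (name, args) {
  var cb = this.options.callbacks && this.options.callbacks[name];
  if (typeof cb !== "function") { return null; }
  // apply() sets `this` inside the callback to the plugin instance.
  return cb.apply(this, args || []);
};

// Usage: a stateless callback that returns HTML text for one biblio entry.
var plugin = new BibliographyExtractor({
  cssClass: "biblio-item",
  callbacks: {
    renderItem: function (href, title) {
      return '<li class="' + this.options.cssClass + '"><a href="' + href +
             '">' + title + "</a></li>";
    }
  }
});
var html = plugin.runCallback("renderItem", ["http://example.com/", "Example"]);
```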

By April, I had built the second edition, which can pull a references cache from the original server. This gives reasonable performance for the indices. To build the cache, use the tests/extractor.php file, run as a webpage. This will extract the references on a URL that you set (assuming the same host, due to CORS), and return a JSON array. This should be written into your CMS via your normal process (a jQuery library shouldn't dictate the CMS). The version that I publish copes with the erratic headers and metadata of the real internet. My personal site has a wrapper library that integrates the plugin into the site, allowing for resources which are cached and ones which are not. I have worked around the erratic source data as far as possible.
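To make the flow concrete, here is a hedged sketch of consuming that cache on the client side; the {href, title} entry shape is an assumption, so use whatever shape tests/extractor.php actually emits:

```javascript
// Pure helper: index the cache entries by href for quick lookup while
// rendering the biblio. Entry shape ({href, title}) is assumed.
function indexCache(entries) {
  var byHref = {};
  for (var i = 0; i < entries.length; i++) {
    byHref[entries[i].href] = entries[i];
  }
  return byHref;
}

// Fetch the cache from the referencesCache URL (same host, due to CORS).
function loadReferencesCache(url, onReady) {
  jQuery.getJSON(url, function (entries) {
    onReady(indexCache(entries));
  });
}
```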


  • Be an OO jQuery plugin (a cheap solution will be tightly dependent on the framework, so separation would make little practical difference). This mentions a solution, but should be taken as focussed on budget.
  • Copy all the links inside an HTML element.
  • Neuter the original links to point to the new biblio (those links shouldn't be used; otherwise it is confusing).
  • Append the semantics of the original links to a visible OL list, inserted at a target block element.
  • My prime usecase supplies data in “pages”; sensible output numbering will need to be applied, so the links are numbered usefully.
  • Library to strip the “display:none” CSS attribute from targeted elements (I additionally use notices at the head of the article to inform mobile users).
  • As an extension, get the publish date from the HTTP headers of the link's target, and the resource title, for a more informative biblio entry.
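The per-page numbering requirement above can be sketched as a pure function; the input and output shapes are illustrative, not the plugin's real internals:

```javascript
// Sketch: number links per "screenful" block, restarting at 1 on each
// page, so each paginated biblio OL counts from 1.
function numberLinksPerPage(pages) {
  return pages.map(function (links) {
    return links.map(function (href, i) {
      return { n: i + 1, href: href };  // numbering restarts for every page
    });
  });
}

// Two pages of links; the second page's numbering starts again at 1.
var numbered = numberLinksPerPage([
  ["http://a.example/", "http://b.example/"],
  ["http://c.example/"]
]);
```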

Usability concerns / meta data

  • My prime usecase is wiki links, these are terse on meta data.
  • Good UI/ UX / accessible sites have detail on each link.
  • I can optionally read ARIA attributes off links when they are available.
  • For my usecase, I will explore to see if it is possible to inject data into the wiki structure.
  • A third option is to write JS to pull the linked documents and append the meta DESCRIPTION tag, where found. This will need to work with CORS. Most web browsers allow a maximum of 4 concurrent requests, and frequently only 1 AJAX request %%. HTTP/2 will address this, but not all hosting supports it. This architecture is slow, and requires a net connection (I mostly use mobile sites on the train, so have a spotty connection). It may be easier to cache the descriptions as a second resource on the server. Empirical testing demonstrates more user-friendliness with cached extensions.
  • As an addendum to the previous: if this is used, it should provide a visible notice that the link is being altered.

%% I am citing my testing of a pre-jQuery AJAX library I wrote.
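For the third option above, extracting the meta DESCRIPTION from a fetched document might look like the sketch below. A regex is used for brevity and assumes the common name-before-content attribute order; real-world markup is messier:

```javascript
// Sketch: pull the meta DESCRIPTION out of a document's raw HTML text.
// Assumes name="description" appears before content=; returns "" when
// no description is found.
function extractMetaDescription(html) {
  var m = /<meta\s+name=["']description["']\s+content=["']([^"']*)["']/i.exec(html);
  return m ? m[1] : "";
}
```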

API and interface notes

This has no public API functions, just a single access point via biblio(). The function takes a hash with the following keys:

  • debug ~ whether to write to console.log, or not.
  • width ~ default: 400 ~ the screen size that triggers the extraction.
  • wholeURL ~ default: 8 ~ how long a URL must be before it is “converted” to a simpler name.
  • wholeTitle ~ default: 50 ~ when titles are supplied with links, the maximum size that is used.
  • extendViaDownload ~ default: 0 ~ attempt to download further information from the target link. See list at end.
  • referencesCache ~ default: "" ~ if using the single remote download option, what URL?
  • selector ~ default: 'sup a' ~ what to look for, WRT the links being extracted.
  • gainingElement ~ default: '#biblio' ~ where to add the generated OL.
  • loosingElement ~ default: '.lotsOfWords' ~ where to look for the links (you probably don't want footer links to show up, for example).
  • tocEdit ~ default: 0 ~ revise the TOC? Probably only useful for my own site.
  • tocElement ~ default: 'fieldset.h4_menu > .h4_lean' ~ which element to add content to.
  • textAsName ~ default: 3 ~ if using the visible text in a link, the minimum number of chars before guessing it's a word.
  • limitDesc ~ default: 200 ~ really long meta descriptions break the layout; trim them to this length.
  • callbacks ~ see following list:



Callbacks are functions called in the context of the BibliographyExtractor class, so may access the options hash. They should return HTML, or perform the stated change. If you want to override many of these, it may be faster to fork the plugin and edit the library.
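Pulling the option list together, a caller's hash merges over the defaults roughly like this. This is a plain-JS sketch: the real plugin may well use jQuery.extend, and debug's default (false) is my assumption as the list above doesn't state one:

```javascript
// Documented defaults from the option list; debug: false is assumed.
var BIBLIO_DEFAULTS = {
  debug: false,
  width: 400,
  wholeURL: 8,
  wholeTitle: 50,
  extendViaDownload: 0,
  referencesCache: "",
  selector: "sup a",
  gainingElement: "#biblio",
  loosingElement: ".lotsOfWords",
  tocEdit: 0,
  tocElement: "fieldset.h4_menu > .h4_lean",
  textAsName: 3,
  limitDesc: 200,
  callbacks: {}
};

// Shallow merge: caller-supplied keys win over the defaults.
function mergeOptions(supplied) {
  var out = {}, k;
  for (k in BIBLIO_DEFAULTS) { out[k] = BIBLIO_DEFAULTS[k]; }
  for (k in (supplied || {})) { out[k] = supplied[k]; }
  return out;
}
```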

Value for extendViaDownload

The downloading has to be organised sequentially, due to web browsers sensibly limiting the available threads. Additionally I add a small delay between each one, as I don't assume that this is the only page loaded.

  • 0: Off
  • 1: Download the extra data from '3rd parties' when requested; show a button.
  • 2: Download the extra data from '3rd parties' on module load.
  • 4: Download the extra data as JSON from 'referencesCache' URL.

Everything is tagged '3rd party', as the links may be on your own website.
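That sequential organisation with a small delay can be sketched as below; worker stands in for the per-link request, and the names are illustrative:

```javascript
// Sketch: process downloads one at a time with a small gap between each,
// so we don't hog the browser's small pool of concurrent connections.
// `worker(item, next)` must call next() when its request completes.
function processSequentially(items, worker, delayMs, done) {
  var i = 0;
  function step() {
    if (i >= items.length) { return done(); }
    worker(items[i++], function () {
      setTimeout(step, delayMs);  // breathing room between requests
    });
  }
  step();
}
```

In the plugin, worker would be the per-link HEAD or JSON request; here it can be any async function taking a completion callback.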

Smoke Tested on:

In normal TDD fashion, I wrote real unit tests at the point I wrote the units. I iterated the UX side enough times with a range of data. Recently I had access to a machine with a lot of different browsers installed on it, so I can state that this works on:

  • recent FF/linux
  • recent FF/Windows
  • recent chrome/linux
  • recent opera/Windows
  • safari/droid
  • chrome/droid

As I am importing es5-shim, the older MSIE will work...


  • Add aria attributes DONE
  • impl TOC editing DONE
  • impl extendViaDownload DONE
  • make the downloaded stuff state file size and last edit date DONE
  • apply to other articles on my site DONE
  • test on other web browsers; not expected to be a problem, as it doesn't do anything awkward. DONE mostly