My first project at MT was to create a visual interface for the tracking and usage data that site users generate. Marketeers need to be able to see what happens when they launch a new campaign, and whether it reaches its audience.
This was work ahead of a sale, so it had to be generically branded. Because the goals were uncertain, the implementation needed to be flexible. I have no knowledge of any current release schedule.

What data?

The business outcome was a tool for seeing what end-users were doing, against time, location, and a few data categories. The data needed to be rasterised in a way that let sales people see large volumes of results easily and, secondly, in a fashion that made trends visible.
Data sources are what make or break this type of application. Google spends something like a small country's GNP on hardware [1] to allow quick data lookups by a lot of the planet concurrently. MT is contractually prohibited from using Google Analytics services on its main platforms, so other data sources are required. This is quite a difficult problem, but not the focus of this text. The raw data was converted so that each IP was also recorded with a GPS-style location. The data was categorised via the URLs into action types, so that it could be used more effectively.
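
A minimal sketch of how the URL categorisation could look. The rules, URL patterns and action names below are hypothetical and only illustrate the idea; they are not MT's real categories.

```python
import re
from urllib.parse import urlparse

# Hypothetical rules: (pattern matched against the URL path, action type).
# The first matching rule wins, so more specific patterns go first.
ACTION_RULES = [
    (re.compile(r"^/enquir(y|ies)(/|$)"), "sales_enquiry"),
    (re.compile(r"^/search(/|$)"), "search"),
    (re.compile(r"^/(product|listing)s?/"), "detail_view"),
]


def categorise(url: str) -> str:
    """Return the action type for a URL, or 'other' if no rule matches."""
    path = urlparse(url).path  # ignore scheme, host and query string
    for pattern, action in ACTION_RULES:
        if pattern.search(path):
            return action
    return "other"
```

Keeping the rules in a plain list means new action types can be added without touching the matching code, which suits a project whose goals are still moving.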

This project is:

A point usage pin map

  • Rendering aggregate data points onto a Google map;
  • Each point has an attached template that provides non-graphical information;
    • in addition to the details on the aggregates;
  • The pins are colour coded, so that sales enquiries look different to general searches;
  • A client-side event loop polls the server periodically and downloads more recent pins;
  • When enabled, older pins are deleted, to conserve RAM on the client's machine.
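
The polling and pruning behaviour can be sketched as below. The real client was JavaScript; this is only a Python illustration of the logic, and `PinBuffer`, `fetch_since` and the pin fields are names invented for the sketch.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, List, Optional

# Hypothetical pin shape: {"ts": ..., "lat": ..., "lng": ..., "kind": ...}
Pin = Dict[str, Any]


class PinBuffer:
    """Holds the pins currently on the map, optionally capped in size."""

    def __init__(self, fetch_since: Callable[[float], List[Pin]],
                 max_pins: Optional[int] = None) -> None:
        # fetch_since(ts) stands in for the server call: pins newer than ts.
        self._fetch_since = fetch_since
        # max_pins=None means pruning is disabled; otherwise the deque
        # discards the oldest pin whenever a new one would overflow it.
        self._pins: Deque[Pin] = deque(maxlen=max_pins)
        self._last_ts = 0.0

    def poll(self) -> List[Pin]:
        """One tick of the event loop: fetch, store and return new pins."""
        new_pins = sorted(self._fetch_since(self._last_ts),
                          key=lambda p: p["ts"])
        for pin in new_pins:
            self._pins.append(pin)
            self._last_ts = max(self._last_ts, pin["ts"])
        return new_pins

    def pins(self) -> List[Pin]:
        """Snapshot of the pins that should currently be drawn."""
        return list(self._pins)
```

Asking the server only for pins newer than the last timestamp seen keeps each poll small, and the size cap bounds memory however long the page stays open.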

Alternatively a heatmap

  • The same data, but rendered as an intensity gradient heatmap;
  • Contained a slide-show function, to refresh the data at predetermined intervals;
  • The IP of the company.
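
As a rough illustration of what an intensity heatmap computes, the sketch below bins points into grid cells and scales the counts to the range 0 to 1. It is a simplification made for this text: the function name and cell size are invented, and a real heatmapping tool would normally smooth each point over neighbouring cells before mapping intensity to a colour gradient.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]


def intensity_grid(points: Iterable[Tuple[float, float]],
                   cell_deg: float = 1.0) -> Dict[Cell, float]:
    """Bin (lat, lng) points into square cells and normalise the counts.

    Returns {(row, col): intensity} where the busiest cell has intensity 1.0.
    """
    counts = Counter(
        (math.floor(lat / cell_deg), math.floor(lng / cell_deg))
        for lat, lng in points
    )
    if not counts:
        return {}
    peak = max(counts.values())
    return {cell: n / peak for cell, n in counts.items()}
```

Normalising against the busiest cell keeps the colour scale meaningful whether the selected time window holds ten hits or ten thousand.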

As a discussion on technology

  • Some care was needed to avoid flooding the data collector during busy patches;
    • The data needed to be mapped URLs => clients, so that the right data could be returned to each client;
  • The raw data off the webserver obviously held the client IP, but no co-ordinates;
    • A lookup service was used for IP => [long,lat] mapping;
    • Obviously my test data from the LAN is meaningless, as LAN IPs have no geographical location;
    • The currency and accuracy of the mapping data are important. Knowing the location of the registered offices of an ISP (which rents out the IPs) is of no use;
    • People with static IPs sometimes give away more information;
    • Quite a few of the IP mappers held useful data for a single continent, but were vaguer elsewhere. Chinese IPs tend to be useless, as they all resolve to the centre of that country;
    • There are still some IPs with no known location;
  • The most configurable heatmapping tool was written in Python (an OSS product);
    • I edited it so that it read from MT's DB (the OSS version ran from text files);
    • I tuned the script for a speedup of about 1000% (although it was originally really slow). The original script was designed to be run asynchronously to the webserver, so performance had not been a concern;
    • I adjusted the class hierarchy to allow interaction with various DBs in a credible fashion for OO;
    • I added range clipping so it only rendered the currently visible bits of the planet;
  • To implement a heatmap via Google's heatmapapi (one of the tech options), one needs to proxy the data via one's own server, to stay within the browser's cross-site scripting restrictions;
    • Second to this, one needs to manage one's Google licence carefully; a heatmap of 100 points isn't professionally useful;
    • One may clip the off-screen data from the heatmap request to make best use of that 100.
    • It is unfortunate that the map cells in the Google overlay are controlled by the client (Google API). The data goes from the server to the client, where the Google Maps API requests it back from the server, which forwards it as a proxy request to the Google heatmap server;
    • The Google image rendering code was either written on a faster platform than Python, or they had some sort of cache (convert the data to a fast Fourier transform, use the result as a hash key, and select data rendered last week from a large memcache?);
    • I had plans for more complex use of the DB, and a few other enhancements in the Python build;
  • I spent some time trying to get a “nice looking slider” for the heatmap. After the images are computed, they can be rendered for the user at the speed of the user's internet connection;
    • This obviously gives good visibility of trends;
    • The values in the slider need to be dynamically computed to match the user's current start and stop selection;
    • Start and stop entries are comparatively simple;
    • The existing CMS needed updates to support complex libraries interacting with tabs;
  • The existing CMS supported the logins, and user identities, so nothing needed changing there;
  • It should be noted that PHP 5.4 is a much better language for doing OO than earlier builds. The amount of OO practice that can be applied in older code is limited. This comment is focused on the established CMS.
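
Several of the points above (LAN IPs with no location, IPs with no known location, clipping to the visible map, and the 100-point limit) can be sketched together. This is an illustration under assumptions, not the code that shipped: `lookup` stands in for whichever IP lookup service is used, coordinates are held as (lat, lng) tuples, and all names are invented for the example.

```python
import ipaddress
from typing import Callable, Iterable, List, NamedTuple, Optional, Tuple

LatLng = Tuple[float, float]


class Bounds(NamedTuple):
    """The visible map area, in degrees."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, lat: float, lng: float) -> bool:
        if not (self.south <= lat <= self.north):
            return False
        if self.west <= self.east:
            return self.west <= lng <= self.east
        # The viewport crosses the antimeridian (e.g. west=170, east=-170).
        return lng >= self.west or lng <= self.east


def locate(ip: str,
           lookup: Callable[[str], Optional[LatLng]]) -> Optional[LatLng]:
    """Return a location for an IP, or None when there is nothing usable."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None  # malformed log entry
    if not addr.is_global:
        return None  # LAN, loopback etc. have no geographical location
    return lookup(ip)  # may still be None: some IPs have no known location


def visible_points(points: Iterable[LatLng], bounds: Bounds,
                   limit: int = 100) -> List[LatLng]:
    """Drop off-screen points first, then apply the point limit."""
    inside = [p for p in points if bounds.contains(*p)]
    return inside[:limit]
```

Clipping before applying the limit matters: otherwise some of the 100 points are spent on places the user cannot currently see. Which points to keep when more than 100 are visible is a policy choice; this sketch simply keeps the first ones.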

As a discussion on PM

This list will be shorter.

  • My original spec was vague, so I made a point of showing my stakeholders something every so often (at least once a week). I was asking for feedback regularly.
  • Further to this, I did a static visualisation prototype in Gimp (there were some L&F updates going through the CMS, and I didn't want to clash with the new release). As I could steal screenshots of the existing site, this process didn't take more than a few hours.
  • As I needed to integrate quite a few JS libraries (which may or may not operate together), I built quick UX demos. The second reason for this was the ability to demo something before the data collectors were installed.
  • My process was fairly Agile, in that I did short sprints and had a fairly lean process. In my opinion the application was too complicated, and had too much data, to do without some structuring. Even in Agile, everything must work, so I wrote quite a lot of test cases (using differing approaches and tool kits for different aspects).
  • I followed the IEEE mandate of getting something running, then making it fast.
  • I learned my way around MT's systems whilst building this.
  • The project was about 60% frontend technologies, of various types. It covered two server-side languages, three front-end languages, five data storage systems and four test platforms.

As a discussion on testing

  • The Python project shipped with test cases; I extended them as I wrote more code. I had plans to do various implementations and monitor the performance of the outcomes, but this was clearly a “do last” thing.
  • I created test cases to the limits of the platform for the work inside the established CMS. The underlying CMS isn't built in a way that suits unit tests, although to be fair dependency injection and interface isolation are harder in the older versions of PHP.
  • With the JS, I had a lot more control. My JS was created in objects, and I unit tested it. This process was a bit more complex than desirable, as I needed to mock the communications with the server in the test cases.
  • When the basic system was together, I started updating the UX (this is roughly testing). I demonstrated the system to some people and monitored their responses, but this uncovered no major changes required.
  • I ran the system under heavy usage over a period of time, and logged the memory consumption across time for a range of browsers.