The source is available as a compressed bundle. Further reading on Symfony will be published as symfony2-overview. This work still seems relevant [1].

Requirements

Your task is to write an application that, upon request, can get data from the Openaid API and aggregate the data into the specified format. The response also needs to be cached somehow; whether that is in the PHP session, in a text file or somewhere else doesn't matter. TicAid needs this as a security precaution so that they don't misuse the Openaid API due to malicious users of their site. Given a country id, your app should return a JSON structure matching the one in the example. It does not matter whether you implement your solution as a web service or a CLI app.
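To make the requested flow concrete, here is a minimal sketch of a fetch-aggregate-cache pipeline, assuming a file-based cache; the function name, cache path and endpoint are my assumptions, not part of the spec:

```php
<?php
// Hedged sketch of the requested flow: fetch, aggregate, cache, return JSON.
// The function name, cache path and endpoint are assumptions for illustration.

function aggregateForCountry($countryId, $cacheDir = 'app/cache/json')
{
    $cacheFile = $cacheDir . '/' . (int) $countryId . '.json';

    // The spec only asks that the response be cached "somehow";
    // a local file is the simplest option.
    if (is_file($cacheFile)) {
        return file_get_contents($cacheFile);
    }

    // Placeholder endpoint: the real Openaid URL is not reproduced here.
    $endpoint = 'http://example.invalid/openaid/' . (int) $countryId;
    $raw = json_decode(file_get_contents($endpoint), true);

    // Aggregation into the specified format is elided in this sketch.
    $summary = array('country' => (int) $countryId, 'data' => $raw);

    $json = json_encode($summary);
    file_put_contents($cacheFile, $json);

    return $json;
}
```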

Metrics:

  • General code layout
  • OO-principles
  • Abstraction of underlying data API
  • Documentation
  • Performance

Some useful terms:

  • UD = 'Utrikesdepartementet' = 'Ministry for Foreign Affairs' [2]
  • SIDA = 'Styrelsen för internationellt utvecklingssamarbete' = 'Swedish International Development Cooperation Agency' [3]

Problem Analysis

A display of the data [4]. Please read Tests/DataExtractTest.php; this is R&D on how to get the data. As requested in the spec, I have added a -5Y (five-year) limit to the data.
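A minimal sketch of that cut-off, assuming each row carries a 'year' field (the field name is my assumption about the row structure):

```php
<?php
// Drop rows older than five years; the 'year' key is an assumption
// about the shape of the rows returned by the API.
function limitToFiveYears(array $rows)
{
    $cutoff = (int) date('Y') - 5;

    return array_filter($rows, function (array $row) use ($cutoff) {
        return isset($row['year']) && (int) $row['year'] >= $cutoff;
    });
}
```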

I imported an OO layer for the network requests (Buzz), as I am not aware of an equivalent feature in the SF2 core.
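A minimal usage sketch of that layer, assuming the Buzz 0.x API with an explicit cURL client; the URL is a placeholder:

```php
<?php
require __DIR__ . '/vendor/autoload.php';

use Buzz\Browser;
use Buzz\Client\Curl;

// Buzz's Browser wraps the cURL extension in a small OO API.
$browser = new Browser(new Curl());
$response = $browser->get('http://example.invalid/endpoint'); // placeholder URL

$body = $response->getContent(); // response body as a string
```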

I looked at using an XPath-style library to access the data, before deciding that this was too complicated.

Errata

  • I wanted to do a CLI just to be different. I can do really simple HTML quickly, but without a pre-made theme a web page would exceed the expected time.
  • This data is mostly historical, and so may be cached cleanly.
  • I am currently applying no localisation to the error messages (just less typing).
  • I should add PHP version checking.
  • I wrote some tests that are properly [unit] isolated, and some tests that use bits of the framework. I think it is important to test the services.yml in addition to the classes; the latter tests do this.
  • Mocks have been built (a sketch of the style follows after this list).
  • I directly use many of the associated classes, as this means the code is more resilient to refactoring.
  • Iteration 1: code presented before work on Tuesday 27th is Services, Entities and Tests. It doesn't contain any controllers or CLI events.
  • Iteration 2: code presented before work on Thursday 29th now includes further tests, and a CLI processor.
  • Possible Iteration 3: add a web front end.
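The mocking libraries imported are xpmock and Mockery (see the composer.json additions below). As an illustration of the style, here is a minimal Mockery sketch; the class and method names are hypothetical, not taken from the actual code:

```php
<?php
require __DIR__ . '/vendor/autoload.php';

// Hypothetical service to isolate; the names are assumptions for illustration.
$client = \Mockery::mock('OpenaidClient');
$client->shouldReceive('fetchPayments')
       ->once()
       ->with(238)
       ->andReturn(array()); // canned response instead of a real HTTP call

// ... exercise the class under test with $client injected ...

\Mockery::close(); // verifies the expectations set above
```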

Data aggregation

I worked out what is meant by the “sida” and “ud” columns. They could have been ids, but are contributions made by those organisations. The agreement ids don't look useful for sorting or searching; they are non-unique.

  • We have tools to list DONATIONS by source (according to the manual).
  • But the data looks like donations to the destination (look at 238 as an example).
  • I have a tool to list country ids, but not to convert those into country names.
  • I have a tool to list contributions by name, but this always returns an empty array.
  • I can't find a means to list payments by SIDA or UD as agencies. If they show up, I will have them in the standard results...

I have built a summarisation of the payments made to Sudan, grouped by year (your goal). As there are no useful ids, I have had to group by the name fields.
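A minimal sketch of that grouping, assuming each row carries 'year' and 'name' keys alongside the 'sida' and 'ud' columns discussed above (the first two key names are my assumptions):

```php
<?php
// Group payment rows by year and name, summing the sida and ud columns.
// The 'year' and 'name' keys are assumptions about the row structure.
function summarisePayments(array $rows)
{
    $summary = array();

    foreach ($rows as $row) {
        $year = (int) $row['year'];
        $name = $row['name'];

        if (!isset($summary[$year][$name])) {
            $summary[$year][$name] = array('sida' => 0, 'ud' => 0);
        }

        $summary[$year][$name]['sida'] += (float) $row['sida'];
        $summary[$year][$name]['ud']   += (float) $row['ud'];
    }

    return $summary;
}
```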

Discussion on solution

  • I have been trying to do good-quality OO in this code. This means development is comparatively slow.
  • Where I have had to check something, I have quickly built a script in the test dir. These are non-OO, but allow visibility into the libraries' functionality.
  • I am using $projectroot/app/cache/json as the cache directory; as this is just code without an installer, please create this directory manually.
  • When running this, don't forget to rebuild the Symfony caches, or the module doesn't show in the console output.
  • As this is now a complete solution, the services.yml file is useful. As the classes are all decoupled and set up for DI, this is important.
  • I built the first iteration as TDD, because it's good practice. The test cases took more time to fill in than the code in most cases. I dislike the time it takes to build mocks, and so have imported libraries for doing these [1] [2].
  • The most expensive section was getting the mocks to run without error. This is in line with my previous experience (because these tests are faking type structures in the most strongly typed segment of the PHP language ~ object definition and inheritance). Similar feelings to mine are discussed on Y Combinator ~ Y Combinator is a VC organisation.
  • I ought to write a test that looks at the cache by running the code twice. I did that step by hand, by running the tests twice. A more thorough implementation would look at the age of the cache and rebuild it every so often; see the sketch after this list.
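A minimal sketch of such an age check, assuming a hypothetical one-day TTL; the file name and rebuild callback are placeholders:

```php
<?php
// Rebuild the cache when it is missing or older than a TTL.
// The one-day TTL, file name and rebuild callback are assumptions.
$cacheFile = __DIR__ . '/app/cache/json/sudan.json'; // hypothetical file name
$ttl = 86400; // one day, in seconds

if (!is_file($cacheFile) || (time() - filemtime($cacheFile)) > $ttl) {
    $data = rebuildSummary(); // hypothetical function that re-queries the API
    file_put_contents($cacheFile, json_encode($data));
} else {
    $data = json_decode(file_get_contents($cacheFile), true);
}
```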

I added the following to my composer.json requires:
```json
"kriswallsmith/buzz": "dev-master",
"ptrofimov/xpmock": "dev-master",
"mockery/mockery": "0.9.*@dev"
```

I have opened a fault against Buzz [3]; when one uses file:// protocol URLs there is a big error (leading to test failure). This is not a problem for realistic use (fetching from the Openaid host).

Performance

  • I haven't done any optimisation of the OO classes presented here. Code compilation, networking and file IO are much more expensive than PHP execution. On a production environment, I would recommend the use of APC or other PHP caching/optimising technologies; this step is applied after the PHP source is complete. Merging all the files into one binary will reduce disk reading times when loading the app, and most merging processes strip comments, leading to less disk volume used.
  • Data is stored in JSON format, as it is frequently faster to parse than XML or SQL. This means it has the same format as the input and output, which is thematically pleasing. Another option would be to store the data as a PHP array, but I like that JSON is less likely to crash the interpreter (see the sketch after this list).
  • As requested in the spec, I am caching the summarised data in a local file. If the cache was larger I would compress it, but it seems below the minimum worthwhile size for gzip (as the data has a high level of entropy, it won't compress very well). If this was used very frequently, I would move the storage to SHM or memcache, rather than the file system; this would require configuration.
  • I have used the fastest “off the shelf” file IO implementation inside the PHP interpreter, for data less than the kernel page size. A buffering approach would give higher performance for large files, but this doesn't seem relevant to the requirements. I could also improve performance by altering the file system settings, kernel settings, or adding more RAM.
  • I have used an OO wrapper for cURL. cURL as a command-line tool is relatively high performance, with the TCP and HTTP(S) protocols and the kernel implementation of sockets as the main slow-downs.
  • Alongside APC, changing the PHP interpreter to HHVM would obviously lead to a performance increase.
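To illustrate the JSON-versus-PHP-array trade-off above, here is a short sketch of both storage options; the file names and stand-in data are my own:

```php
<?php
$summary = array(2013 => array('total' => 42)); // stand-in data for illustration

// Option used here: JSON, matching the input and output formats.
file_put_contents('summary.json', json_encode($summary));
$fromJson = json_decode(file_get_contents('summary.json'), true);

// Alternative: a PHP array file, read back via include.
file_put_contents('summary.php', '<?php return ' . var_export($summary, true) . ';');
$fromPhp = include 'summary.php';
```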
