This is an introduction for a couple of later articles (see links at end). What do I hope you gain from reading these articles? An actual list of reasons to use or not use a NoSQL system. There are too many articles on the web that assume technical skills below GCSE level. I am personally interested in “degree level” texts, which tend to have a long bibliography. This is the style I am writing towards. I want clear English, but expect a limited audience.

Relationship between the NoSQL implementations

As stated in the image, it is owned by 451. This seems to be 1; the blog 2 redirects to the first URL for a home page. I found the image without attribution on a blog, but have backtracked to its source.

I claim I am a software engineer, but increasingly feel the most important aspect is librarianship. Whether the requirements are accurate decides whether a project succeeds or fails. Articulate and concise writing to record a conversation assists all later work. Updating all the internal data assets ensures they stay useful.
The words that you are reading are rendered by a documentation library 1, controlled by a documentation system 2, run on a protocol 3 originally designed for academic documentation 4. This cluster of articles is about a documentation system 5: human-to-human asynchronous communication, to ensure that ideas are conveyed.

“Why mongo” is the title of this resource, and the answer is roughly “because of documentation”. The question is more generically “why NoSQL”, but Mongo is a good concrete example. RDBMS are an established solution for managing and reporting on volumes of data on slow hardware, where access restrictions are needed. The operational platform for information systems is about 10^3 times faster than in the 1970s [1]. Strict control of data structures and types matters less when “computer operations per second” are quoted at 10^9, while human operations per second are unchanged since the 1970s at about 10^-2 (e.g. 1 thing per 100 seconds). Structuring concepts in mutable 'documents' rather than normalised 'entities' achieves a functional solution faster.
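To make the 'documents' versus 'entities' contrast concrete, here is a minimal sketch in plain Python, with no database involved. The schema, the field names and the sample values (`authors`, `articles`, `tags`, "Ada") are all hypothetical and chosen only for illustration; they are not taken from Mongo or from any system discussed here.

```python
# Normalised 'entities': three tables linked by keys (hypothetical schema).
authors = {1: {"name": "Ada"}}
articles = {10: {"author_id": 1, "title": "Why mongo"}}
tags = [(10, "nosql"), (10, "documentation")]  # (article_id, tag) rows


def load_article_normalised(article_id):
    """Reassemble one article by joining the three tables by key."""
    row = articles[article_id]
    return {
        "title": row["title"],
        "author": authors[row["author_id"]]["name"],
        "tags": [tag for (aid, tag) in tags if aid == article_id],
    }


# The same concept held as one mutable 'document'.
article_doc = {
    "title": "Why mongo",
    "author": "Ada",
    "tags": ["nosql", "documentation"],
}

# A new field is added in place; no table or schema has to change first.
article_doc["revision"] = 2
```

The normalised form needs a join to answer "show me this article", and a schema change to gain a new attribute. The document form trades that strictness for speed of development, which is the argument made above.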

Systems like Mongo are important, as they reflect today's data and operating conditions. The second reason for NoSQL is map-reduce operations: parallelised crunching of >10GB datasets to show trends. With NoSQL this is much faster to build than with an RDBMS; and as RDBMS aren't generally parallelised, it also performs faster.
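The shape of a map-reduce job can be sketched in a few lines of plain Python. This is an illustration of the general technique only, not Mongo's own map-reduce interface; the record fields (`year`, `bytes`) and the function names are assumptions made up for the example. The thread pool merely marks where the map step could fan out; a real system distributes that step across processes or machines.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_record(record):
    """Map step: emit (key, value) pairs for a single record."""
    return [(record["year"], record["bytes"])]


def reduce_values(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return sum(values)


def map_reduce(records, workers=4):
    # Map: each record is independent, so this step can run in parallel.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_record, records))

    # Shuffle: group the emitted values by key.
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)

    # Reduce: one call per key.
    return {key: reduce_values(key, values) for key, values in groups.items()}


# Hypothetical sample data: bytes recorded per year.
records = [
    {"year": 2009, "bytes": 120},
    {"year": 2010, "bytes": 300},
    {"year": 2010, "bytes": 80},
]
totals = map_reduce(records)
```

Because the map step never looks at more than one record, and the reduce step never looks at more than one key, both can be split across workers without coordination. That independence is what makes the approach suit large datasets.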

The amount of data we record is at least 10^3 times larger than in the 1970s, and we expect to be able to access any or all of it quickly. Mongo as a data store was introduced in 2010 6, and focuses on moving data storage from rigid tables to documents. This closely matches the concepts I have been working with for the last decade, which is why I write about it.

Taking 7 as an example: your entire infrastructure is supposed to be ported to NoSQL. My earlier reading held that NoSQL was very good at certain things, but as RDBMS were established with a large tool set, they were more efficient.

Lastly I note the revision to the licence. As licences are supposed to reflect and restrict the social role of the system's use, the changing world requires their update. From now on, everything should be marked with AGPL, until they publish the next one (there is a positional omission, which they have yet to resolve).

  • structured-storage ~ is background material, on which NoSQL to use.
  • mongo-interface ~ is about how to access this tool.
  • When v2.6 of iceline is done, I may add Mongo as a second DB access technology.
  • A third party article on using larger in-memory data structures.
  • A resource which I have yet to write on index management.

[1] This is rough data. The use of computing has changed a lot in the last forty years, and the scale of the business is vastly larger. As such it's an apples-to-oranges comparison, depending on what you use for reference figures. I spent 10 minutes pulling an average machine stat from 1970, and then one for a recent 1U rack box. The numbers are about 10^3 larger in all cases, other than physical size, cost and power consumption. Even a naive analysis must include the effects of caching and parallel hardware/multiple cores.