Software Design

Premise: the primary function of any computer program is to produce outputs derived from some inputs.

The correctness of a program is defined by its ability to consistently produce the correct outputs for any set of inputs. To this end, every program could, given sufficient storage and an appropriate retrieval mechanism, be represented by a finite state machine. All computer programs are bounded by limits (the size of an integer, the amount of addressable memory, etc.), so they theoretically have a finite number of possible inputs. Given time to generate the full set of inputs, and the appropriate outputs, every program could be represented by a single lookup operation. Put another way: every program essentially represents a transformation of a set of inputs to a set of outputs, and that transformation is entirely deterministic (provided that everything the program reads, including clocks and random sources, is counted as an input).
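To make the lookup idea concrete, here is a minimal sketch in Python. The "program" is a made-up example of mine (counting the set bits in one 8-bit value), and the names `popcount`, `POPCOUNT_TABLE` and `popcount_lookup` are hypothetical. Because the input is bounded to 256 values, the whole transformation can be generated ahead of time and replaced with a single lookup.

```python
# Hypothetical example: a "program" whose only input is one 8-bit integer.
def popcount(x: int) -> int:
    """Count the set bits in an 8-bit value by computing them."""
    count = 0
    while x:
        count += x & 1
        x >>= 1
    return count


# The input is bounded (0..255), so every output can be generated up front.
POPCOUNT_TABLE = [popcount(x) for x in range(256)]


def popcount_lookup(x: int) -> int:
    """The same transformation, expressed as a single lookup operation."""
    return POPCOUNT_TABLE[x]


# From the outside, the two are indistinguishable for every possible input.
all_match = all(popcount(x) == popcount_lookup(x) for x in range(256))
```

A caller who only sees inputs and outputs cannot tell `popcount` from `popcount_lookup`. The same trick applied to a 64-bit input would need an absurdly large table, which is why the lookup is a theoretical representation rather than a practical one.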

The most important piece of information to take away from this is: how a program transforms its inputs to outputs is irrelevant.

That statement requires some qualification.

The transformation of inputs to outputs is primarily measured by the correctness of the outputs; however, other metrics can be applied, depending on the nature of the program. I assert that the most important metric (after correctness) is the time taken to derive outputs from inputs. Since our usual tolerance for erroneous output is "none", it can be assumed that all programs produce the "right" outputs for a given set of inputs; any program that doesn't is buggy. Since all correct programs produce the same outputs, a good differentiator between them is how quickly they do it. In all cases, the faster the results are made available, the better. As such, how a program transforms its inputs to outputs is relevant after all, since a "better" algorithm may produce results more quickly than a "worse" one.
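As an illustration (my own example, not anything special), here are two Python functions that produce identical outputs for every non-negative input but take very different amounts of time to do it. The names `sum_to_n_loop`, `sum_to_n_formula` and `seconds_for` are hypothetical, and the timing helper is only a rough sketch built on the standard `timeit` module.

```python
import timeit


def sum_to_n_loop(n: int) -> int:
    """Add up 1..n one term at a time: time grows with n."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_to_n_formula(n: int) -> int:
    """Same output via the closed form n(n+1)/2: a fixed number of operations."""
    return n * (n + 1) // 2


def seconds_for(fn, n: int, repeats: int = 100) -> float:
    """Rough wall-clock time for calling fn(n) `repeats` times."""
    return timeit.timeit(lambda: fn(n), number=repeats)
```

Both functions are equally "correct", so correctness alone cannot choose between them. Comparing `seconds_for(sum_to_n_loop, 100_000)` with `seconds_for(sum_to_n_formula, 100_000)` on your own machine should show the gap; the exact figures will depend on the hardware.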

A third metric, easily overlooked in the modern age (especially with the "infinite storage" and "infinite processing" options provided by things like cloud computing), is data usage. To wit: using less space is better. This has been accepted as fact since the dawn of computing, possibly born of the historical need to work within the limits of a computer's hardware. Trade-offs can and always will be made between space and time, trying to find the appropriate balance, so that correct outputs are produced in reasonable time without over-allocating the computer's available resources.

That is the key to the third metric: resource allocation and utilisation. If your transformation's "how" makes optimal use of the computer's resources, to produce the correct outputs in minimal time, that transformation is as good as it can be.
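One way to picture the space/time balance is a cache with a fixed budget. The sketch below is my own example: it counts Collatz steps twice, once with no extra storage and once with Python's standard `functools.lru_cache`. The function names and the 1024-entry limit are assumptions picked for illustration, not recommendations.

```python
from functools import lru_cache


def collatz_steps_plain(n: int) -> int:
    """No extra storage: every call recomputes the whole trajectory."""
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps


@lru_cache(maxsize=1024)  # assumed budget: remember at most 1024 results
def collatz_steps_cached(n: int) -> int:
    """Spend bounded memory to avoid recomputing shared trajectories.

    Recursive, so very long trajectories could hit Python's recursion limit.
    """
    if n == 1:
        return 0
    next_n = n // 2 if n % 2 == 0 else 3 * n + 1
    return 1 + collatz_steps_cached(next_n)
```

The `maxsize` argument is the knob: a bigger value trades more memory for fewer recomputations, a smaller one does the opposite, and the least recently used entries are discarded so the cache never grows past its budget. `collatz_steps_cached.cache_info()` reports the hits, misses and current size if you want to see how the budget is being spent.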

It's late and I should be in bed, so I'll stop rambling and summarise the steps of software design as I see them:

1. define the outputs that are considered "correct" for any set of inputs: use boundaries, induction, arithmetic, really really big truth tables, whatever is required
2. define how quickly the outputs should be derived (hint: the answer is usually "as fast as possible")
3. enumerate the resources available on the target computer/platform: processors, storage (registers, cache, memory, hard discs), I/O devices, everything
4. sketch out an algorithm that will produce the correct outputs for all inputs, and in tandem devise a resource allocation scheme that supports it
5. write it
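The resource enumeration step can at least be started from inside a program. Here is a rough Python sketch using only the standard library; the function name `enumerate_resources` and the dictionary keys are my own invention, and it covers only a small slice of the list above (it says nothing about registers, cache sizes or I/O devices, which usually need platform-specific tools).

```python
import os
import shutil
import sys


def enumerate_resources(path: str = ".") -> dict:
    """Collect a few coarse facts about the machine this code is running on."""
    disk = shutil.disk_usage(path)  # total/used/free bytes for the disk holding `path`
    return {
        # os.cpu_count() may return None if it cannot tell; assume one CPU then.
        "logical_cpus": os.cpu_count() or 1,
        "disk_total_bytes": disk.total,
        "disk_free_bytes": disk.free,
        # Largest length a Python list, string, etc. can have on this build.
        "max_container_length": sys.maxsize,
        "byte_order": sys.byteorder,
    }


resources = enumerate_resources()
```

Numbers like these are the raw material for the allocation scheme in the algorithm-sketching step: how many workers to run, how much can be precomputed and stored, and where the hard limits are.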

Fortunately that fifth step is pretty well understood, and is actually (probably) the easiest of all. Really it all comes down to step four — when you find a good solution to that one, the world is your mollusc.

... Matty /<

Matthew Kerwin

Published: 2010-03-09
Modified: 2012-09-21
License: CC BY-SA 4.0
Tags: development, software