2. Choice of models to be compared

The choice of models to be tested was made after an extensive study of the relevant literature and after discussions with other workers in this field.

        In addition to the references which are given below, special mention should be made of the outstanding bibliography in the book by Gavrilov and Gavrilova (1991). It was also helpful to have the very comprehensive review of current actuarial methods by Forfar, McCutcheon and Wilkie (1988).

        We have already mentioned in Chapter 1 the distinction between models which are explanatory and those which are purely descriptive. In selecting models for further study, there is a natural inclination to prefer the explanatory models. However, we shall also include some descriptive models in the selection, for comparison.

Gompertz's law of mortality

Early attempts to find a "law of mortality" go back at least as far as de Moivre, and an interesting account has been given by Hald (1990). However, it was not until the 19th century that Gompertz (1825) discovered that, over a large part of the age range (though not including infancy and youth or very old age), the force of mortality $\mu_x$ increases with age $x$ at a steady exponential rate. Thus the model can be written in the form

$$\mu_x = a e^{bx} \qquad (1)$$

Gompertz's approach was very pragmatic. He discovered his law by studying the survival curves in the life tables which were available to him. He described it as an hypothesis and he considered the consequences if it should continue to apply to still higher ages, but he did not insist that this would necessarily be the case.

        The law has been found to apply (over appropriate age ranges) in many countries during the last 170 years. It is a recurring pattern. The problem is not to find fault with it, but to explain why it works so well.

        Gompertz himself put forward a possible physiological explanation: that a man's power to avoid death is gradually exhausted as his age increases, "congruous with many natural effects, as for instance, the exhaustions of the receiver of an air pump by strokes repeated at equal intervals of time". Most modern attempts to explain the law are linked to steady bodily deterioration, perhaps due to the accumulation of molecular and cellular damage, over the age ranges concerned.

Makeham's law

It was found by Makeham (1860) that Gompertz's law could be improved by adding a constant term, so that

 

$$\mu_x = c + a e^{bx} \qquad (2)$$

The constant c can be explained as the risk of death from all causes which do not depend on age.
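As a small illustration, the two laws can be written as Python functions. This is only a sketch: it assumes the forms $\mu_x = a e^{bx}$ for Gompertz and $\mu_x = c + a e^{bx}$ for Makeham, and the parameter values are hypothetical, chosen to give plausible magnitudes rather than fitted to any data.

```python
import math

def gompertz(x, a, b):
    """Force of mortality under the assumed Gompertz form a * exp(b * x)."""
    return a * math.exp(b * x)

def makeham(x, a, b, c):
    """Assumed Makeham form: Gompertz plus an age-independent term c."""
    return c + gompertz(x, a, b)

# Hypothetical illustrative parameters, not fitted to any data.
a, b, c = 2e-5, 0.1, 5e-4

# Under Gompertz, ln(mu) is linear in age with slope b,
# so the force of mortality doubles every ln(2) / b years.
doubling_time = math.log(2) / b
ratio = gompertz(80 + doubling_time, a, b) / gompertz(80, a, b)

# The Makeham and Gompertz values differ by exactly c at every age.
gap_at_50 = makeham(50, a, b, c) - gompertz(50, a, b)
gap_at_90 = makeham(90, a, b, c) - gompertz(90, a, b)
```

With these values the constant term dominates at younger adult ages and becomes a small fraction of the total at high ages, which is the sense in which c represents age-independent risk.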

The logistic model

The logistic model is known under a variety of names. It was first discovered by Perks (1932), who found empirically that the values of $\mu_x$ in a life table which he was examining could be fitted by a certain curve, which was in fact a logistic function (though he did not describe it as such at the time). Different authors have used a confusing variety of notations in this field. For the present purpose it is convenient to express the logistic function in the following form:

$$\mu_x = c + \frac{a e^{bx}}{1 + \alpha e^{bx}} \qquad (3)$$

 

We can see at once that this includes Makeham's law as the special case when $\alpha = 0$. When $\alpha$ is small, any theories which may explain why $\mu_x$ should follow a logistic function will also help to explain why the Makeham and Gompertz laws work so well over much of the age range.

        Beard, who was a colleague of Perks, wrote several papers on this subject, which were summarised in a paper published in 1971. He identified (3) as a logistic curve and showed how it could arise in a simple model of a heterogeneous population. If the members of the population are subject to hazards of the Makeham form (2), but with the parameter $a$ varying from individual to individual in such a way that its values have a gamma distribution at birth, then the average value of $\mu_x$ for the survivors who reach age $x$ will have the logistic form (3). This result, published in 1959, is the first appearance of the "Gamma-Makeham" model. It was later discovered independently and developed extensively by Vaupel et al (1979) as a model of "frailty".
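        The heterogeneity argument can be checked numerically. The sketch below assumes that each individual has hazard $c + z e^{bx}$, where $z$ (playing the role of the Makeham parameter $a$) has a gamma distribution at birth with shape `k` and rate `lam`. These names and all the parameter values are hypothetical. Under these assumptions the average hazard among survivors works out to $c + a e^{bx} / (1 + \alpha e^{bx})$ with $a = kb/(\lambda b - 1)$ and $\alpha = 1/(\lambda b - 1)$. This is our own algebra for the sketch, not a formula quoted from Beard.

```python
import numpy as np

# Hypothetical parameters: individual hazard is c + z * exp(b * x),
# where z is gamma distributed at birth with shape k and rate lam.
b, c = 0.1, 1e-3
k, lam = 4.0, 2e5          # mean of z at birth is k / lam = 2e-5

def mean_hazard_numeric(x, n_grid=200_001):
    """Average hazard among survivors to age x, by a sum over a grid of z."""
    z = np.linspace(0.0, 20 * k / lam, n_grid)
    cum = (np.exp(b * x) - 1.0) / b      # integral of exp(b*t) from 0 to x
    # Unnormalised density of z among survivors: gamma density at birth
    # times the survival probability; factors not involving z cancel below.
    w = z ** (k - 1) * np.exp(-(lam + cum) * z)
    z_bar = np.sum(z * w) / np.sum(w)    # grid spacing cancels in the ratio
    return c + z_bar * np.exp(b * x)

def mean_hazard_logistic(x):
    """Closed logistic form implied by the same assumptions."""
    a = k * b / (lam * b - 1.0)
    alpha = 1.0 / (lam * b - 1.0)
    return c + a * np.exp(b * x) / (1.0 + alpha * np.exp(b * x))

ages = np.array([60.0, 80.0, 100.0, 120.0])
numeric = np.array([mean_hazard_numeric(x) for x in ages])
closed = mean_hazard_logistic(ages)
```

The two arrays should agree closely, which illustrates how selection of the less frail survivors bends an individual-level exponential hazard into a logistic curve at the population level.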

        Beard also showed how the logistic curve could arise from a very simple type of stochastic process, in which individuals accumulate "shots" from random firings and are assumed to be dead when the total reaches a given figure. Special assumptions were made about the initial conditions.

        Le Bras (1976), quite independently, discovered that the logistic function could arise if health is treated as a stochastic process. He considered a cohort which was homogeneous at birth, so that all its members were in the same state of health. Heterogeneity then develops during life, as people move from one state of health to another. Within a given time interval, there are probabilities that a person in a given state of health will either remain in that state, move to the next state, or die. Using some very simple illustrative assumptions, Le Bras found an expression for the average value of $\mu_x$ among the survivors who reach age $x$. He then showed that with a suitable choice of parameters, $\mu_x$ could follow Gompertz's law, approximately, over much of the age range. Le Bras's results, slightly generalised, are also given in Gavrilov and Gavrilova (1991, pages 247-251). It was later shown by Yashin et al (1994) that the Le Bras formula for $\mu_x$ was essentially the same as that given by the Gamma-Makeham model, despite the fact that they were derived from completely different assumptions. In fact, it can be shown that all these formulae can be transformed into (3) above, so we shall describe them collectively as the logistic model.

The Kannisto model

It is a remarkable fact that the modern data for $\mu_x$ at high ages are very close to one of the simplest forms of the logistic model, in which logit($\mu_x$) is a linear function of $x$. This was noticed by Kannisto (1992) and was also used independently by Himes, Preston and Condran (1994).

        Kannisto was not proposing a general law; he was simply observing an empirical finding. However, it is convenient to give this a name and we shall describe it as the Kannisto model. The relevant formula is

$$\mu_x = \frac{a e^{bx}}{1 + a e^{bx}} \qquad (4)$$

 

which can also be written as

$$\operatorname{logit}(\mu_x) = \ln a + bx \qquad (5)$$
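The step from (4) to (5) can be spelled out as follows. This is a short worked derivation, assuming that (4) has the form $\mu_x = a e^{bx} / (1 + a e^{bx})$ and that logit denotes the usual log-odds transformation.

```latex
\begin{align*}
1 - \mu_x &= \frac{1}{1 + a e^{bx}}, \\
\frac{\mu_x}{1 - \mu_x} &= a e^{bx}, \\
\operatorname{logit}(\mu_x) = \ln\frac{\mu_x}{1 - \mu_x} &= \ln a + bx .
\end{align*}
```

In this form the model has only two parameters and can be examined by plotting logit($\mu_x$) against age and looking for a straight line.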

 

The Weibull model

We next come to a model proposed by Weibull (1951) to represent the failure of technical systems due to wear and tear. The model is

$$\mu_x = a x^{b} \qquad (6)$$

 

and this, along with the Gompertz model, is one of the limiting forms of the distribution of the lowest observed value in a large sample. If we consider the distribution of the times to failure of bodily organs or even damage to cells which may lead to death, and suppose that death results when the first such failure occurs, we have an analogy with mortality. Accordingly, this is an explanatory model which needs to be included in the analysis.
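The "first failure" analogy can be illustrated by simulation. The sketch below assumes that each of several independent components has a hazard of the form $a x^b$ and that death occurs at the first failure. The number of components and all parameter values are hypothetical. Because the hazards of independent risks add, the time to first failure again has a hazard of Weibull form.

```python
import math
import numpy as np

# Hypothetical parameters for a component hazard of the assumed form a * x**b.
a, b = 1e-15, 7.0
n_components = 5          # death occurs at the first of n independent failures

# A hazard a*x**b has cumulative hazard a*x**(b+1)/(b+1), i.e. a Weibull
# distribution with the shape and scale below.
shape = b + 1.0
scale_one = ((b + 1.0) / a) ** (1.0 / shape)

rng = np.random.default_rng(0)
# rng.weibull draws from the standard Weibull (scale 1); multiply to rescale.
failures = scale_one * rng.weibull(shape, size=(100_000, n_components))
time_to_first = failures.min(axis=1)

# Hazards of independent risks add, so the minimum has hazard n * a * x**b:
# the same Weibull shape, with the scale reduced by n**(-1/shape).
scale_min = scale_one * n_components ** (-1.0 / shape)
expected_mean = scale_min * math.gamma(1.0 + 1.0 / shape)
simulated_mean = time_to_first.mean()
```

The simulated mean time to first failure should be close to the value implied by a single Weibull model with hazard $n a x^b$.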

The Heligman & Pollard model

We now turn to a descriptive "law of mortality" which was proposed by Heligman & Pollard (1980). Their full law has three terms and eight parameters, and covers the whole of the age range. However, above age 50 the first two terms can be neglected and their expression reduces (in their notation) to

$$\frac{q_x}{p_x} = G H^{x} \qquad (7)$$

where $p_x = 1 - q_x$ and where $G$ and $H$ are constants. This implies that $q_x$ can be written in the form

$$q_x = \frac{G H^{x}}{1 + G H^{x}} \qquad (8)$$

which shows that $q_x$ follows a logistic function and lies between the limits 0 and 1. The model can also be written in the form

$$\operatorname{logit}(q_x) = \ln G + x \ln H \qquad (9)$$

By taking logarithms of (7), it can be shown that in this model, $\mu_x$ tends to a linear asymptote which increases with age.
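One way to see the linear asymptote is sketched below. It assumes the reduced Heligman & Pollard form $q_x / p_x = G H^x$ with $H > 1$, and uses the standard relation between the one-year survival probability and the force of mortality.

```latex
\begin{align*}
p_x &= 1 - q_x = \frac{1}{1 + G H^{x}}, \\
\int_x^{x+1} \mu_t \, dt &= -\ln p_x = \ln\!\left(1 + G H^{x}\right) \\
 &= \ln G + x \ln H + \ln\!\left(1 + \frac{1}{G H^{x}}\right).
\end{align*}
```

The last term tends to zero as $x$ increases, so the average force of mortality over each year of age approaches the straight line $\ln G + x \ln H$.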

The Quadratic model

The idea that $\ln \mu_x$ can be fitted by a quadratic function of $x$ over a limited range of ages was used by Coale & Kisker (1990) for the purpose of interpolating in the range of ages from 85 to 110, between data up to age 85 and an assumed value at age 110. The relevant formula is

$$\ln \mu_x = a + bx + cx^{2} \qquad (10)$$

 

where $c$ is negative. For obvious reasons, we shall describe this as the quadratic model.

        It is important to note that Coale & Kisker only propose that this model should be used in a limited range of ages. Below age 85, it would conflict with findings of Horiuchi & Coale (1990). The model has also been used by Wilmoth (1995), in his case for estimating $\mu_x$ at age 110 from data which extended above age 85, but again only in this limited range of ages.

        We may observe that the model is purely descriptive and cannot possibly continue to hold indefinitely, to higher and higher ages. If it did, this would imply that the expectation of life is infinite at all ages. This follows because $\int_x^\infty \mu_t \, dt$ is always finite when $c$ is negative, so that the probability of surviving for ever from age $x$ would be greater than zero.
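        The argument can be made explicit as follows. This is a sketch which assumes the quadratic form $\ln \mu_t = a + bt + ct^2$ with $c < 0$, and writes $S(x)$ for the probability of surviving from birth to age $x$ and $e_x$ for the expectation of life at age $x$.

```latex
\begin{align*}
a + bt + ct^{2} &= c\left(t + \frac{b}{2c}\right)^{2} + a - \frac{b^{2}}{4c}, \\
\int_x^{\infty} \mu_t \, dt \;\le\; \int_{-\infty}^{\infty} e^{a + bt + ct^{2}} \, dt
  &= e^{\,a - b^{2}/(4c)} \sqrt{\frac{\pi}{-c}} \;<\; \infty, \\
\frac{S(x+t)}{S(x)} = \exp\!\left(-\int_x^{x+t} \mu_s \, ds\right)
  &\ge \exp\!\left(-\int_x^{\infty} \mu_s \, ds\right) = \pi_x > 0, \\
e_x = \int_0^{\infty} \frac{S(x+t)}{S(x)} \, dt &\ge \int_0^{\infty} \pi_x \, dt = \infty .
\end{align*}
```

Here $\pi_x$ is simply shorthand for the positive lower bound on the survival probability. It is not a symbol used elsewhere in the text.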

Actuarial methods

Forfar, McCutcheon & Wilkie (1988) have given a most comprehensive review of current actuarial methods for using mathematical formulae to graduate life tables. They give a general system (pages 15-17) in which the dependent variable can be either or or , and this can be graduated by a function F which can be either a polynomial of low order in , or the exponential of a low order polynomial, or a mixture of the two, or a logit transformation of the form . This system covers the Gompertz, Makeham and Heligman & Pollard models, and also the quadratic model and the Kannisto model. Surprisingly, though, since Perks and Beard were actuaries, it does not cover the full logistic model (3). The authors describe step-wise procedures which can be used to determine whether extra terms should be added to the polynomials. They then illustrate their methods by applying them to data from insurance companies. Their analyses, discussion and methods of fitting are extremely elaborate and thorough, and they lead to graduating functions which are remarkably simple. For example, for pensioners' widows the graduating functions for turn out to be the Gompertz law and the Kannisto model. The authors find (page 71) that a simple two-parameter model is the most complex that the data can support. For male pensioners, their preferred graduating function for (pages 84-5, 89) differs by only a constant from the quadratic model (10) above. Their other examples are more complicated because they depend on the duration of the contract.

The six selected models

In modern data, the fitted value of the constant term in the Makeham model (2) is very small, so at high ages the difference between the Gompertz and Makeham laws is negligible. For the present purpose, we shall therefore concentrate on the following six models: Gompertz, logistic, Kannisto, Weibull, Heligman & Pollard and quadratic.

        At this point, the reader may find it helpful to have a broad preliminary idea of what these six models look like. As an illustration, Figure 2.1 shows plots of the values of $\mu_x$ when the six models are all fitted to the same set of data.

        It will be seen that at ages 80-95 the models are very close, but above age 95 they start to diverge. These broad features are always present; they do not depend on which particular data set is used to fit the models. In fact, the divergence is inevitable. In the Gompertz model, $\mu_x$ increases exponentially with age. In the Weibull model, it increases as a power function of age. In Heligman & Pollard, it tends to an asymptote which increases linearly with age. In the logistic and Kannisto models, it tends to a constant (but not necessarily the same constant).
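        The limiting behaviour described above can be checked with a few lines of Python. This is a sketch under assumed functional forms (Gompertz $a e^{bx}$, Weibull $a x^b$, logistic $c + a e^{bx}/(1 + \alpha e^{bx})$, Kannisto $a e^{bx}/(1 + a e^{bx})$). All the parameter values are hypothetical and are not fitted to any data, and the very large "age" is only a numerical stand-in for the limit.

```python
import math

# Hypothetical parameter values, chosen only for illustration.
A, B, C, ALPHA = 2e-5, 0.1, 5e-4, 1e-5
A_W, B_W = 1e-15, 7.0      # Weibull parameters, also hypothetical

def gompertz(x):
    return A * math.exp(B * x)

def weibull(x):
    return A_W * x ** B_W

def logistic(x):
    e = math.exp(B * x)
    return C + A * e / (1.0 + ALPHA * e)

def kannisto(x):
    e = math.exp(B * x)
    return A * e / (1.0 + A * e)

# Gompertz: a constant proportional increase for each extra year of age.
yearly_growth = gompertz(101) / gompertz(100)      # equals exp(B)

# Weibull: doubling the age multiplies the hazard by 2**B_W (a power law).
doubling_growth = weibull(200) / weibull(100)

# Logistic and Kannisto: the hazard levels off, at different constants.
logistic_plateau = C + A / ALPHA
kannisto_plateau = 1.0
far = 400                  # not a real age, only a stand-in for the limit
```

With these values the logistic curve levels off at about 2.0 while the Kannisto curve levels off at 1, which illustrates that the two plateaus need not coincide.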

        Figure 2.2 shows the corresponding values of which naturally look rather different in appearance, though the structure is the same.




Updated by V. Castanova, 1 March 1999