Ill-posedness of inverse problems

Rukolaine, Sergey

ILL-POSEDNESS OF INVERSE PROBLEMS

Sergey A. Rukolaine

Leading to: Concept of Regularization

1 AN EXAMPLE

Major peculiarities, inherent in inverse problems, can be illustrated by the following synthetic example. Consider radiative heat transfer in nonabsorbing, nonemitting, and nonscattering gray plane-parallel “medium” of thickness h (see Fig. 1). The bounding surfaces, y = 0 and y = h, are assumed to be black. The temperature of the upper surface is T(x) (i.e., it depends on x). The temperature of the bottom surface is equal to zero. In view of the symmetry, radiative heat transfer (Özişik, 1973; Siegel and Howell, 2002; Modest, 2003) is described by the radiative transfer equation

with boundary conditions

where I ≡ I(x,y,ϕ) is the radiation intensity, q_h^out is the radiosity on the upper surface, σ is the Stefan–Boltzmann constant, T(x) is temperature at the point x of the surface y = h, the refraction index is equal to unity. Clearly, I(x,y,ϕ) = 0 for 0 < y < h and 0 < ϕ < π.

Figure 1. Plane-parallel “medium” and bounding surfaces.

Radiative heat flux, incident onto the bottom surface,

is “measured”. An inverse problem is posed as follows: given the incident radiative heat flux q₀^inc(x), measured on the bottom surface, restore temperature T(x), or, equivalently, the radiosity q_h^out(x) (which is also, in this case, the blackbody flux) on the upper surface. This problem admits analytic solution, and the incident flux q₀^inc can be expressed explicitly in terms of the radiosity q_h^out. Therefore, the inverse problem can be written in the form of the Fredholm integral equation of the first kind

(1)

with the kernel

(2)

Because the integral equation (1) is of convolution type, it can be solved with the help of the Fourier transform (Tikhonov and Arsenin, 1977; Tikhonov et al., 1995; Bertero and Boccacci, 1998)

The Fourier transform of the integral equation (1) gives

This equation gives the Fourier transform _h^out of the radiosity, and therefore, the inverse Fourier transform gives the sought-for radiosity

(3)

However, in reality, the measured incident flux q₀^inc always contains an error, i.e., q₀^inc(x) = q₀^inc(x)+e(x), where q₀^inc is an exact flux and e is the error. Therefore, the obtained radiosity (3) differs from the exact one

and the difference between the obtained and exact radiosities is

(4)

The problem is that (ξ) tends to zero very rapidly as ξ → ∞. Indeed,

(5)

where K₁ is the modified Bessel function of the order 1 (Abramowitz and Stegun, 1972), and

(designation of the kernel by K and of the modified Bessel function by K₁ is just a coincidence, both designations are conventional). The plot of , for h = 1, is shown in Fig. 2.

Figure 2. Fourier transform (5) of the kernel K(2) for h = 1.

Therefore, if the error e contains high-frequency Fourier harmonics, such that ê(ξ) is not small, then ê(ξ)/(ξ) takes very large values for these frequencies ξ, and the difference (4) between the obtained and exact radiosities takes very large values. For example, if e(x) = εcos(ωx), ω > 0, then ê(ξ) = επ[δ(ξ+ω)+δ(ξ-ω)], where δ is the Dirac δ-function, and the difference (4) is q_h^out(x)-q_h^out(x) = εcos(ωx)/hωK₁(hω) ≡ e(x)/hωK₁(hω).

Figure 3 shows the errors e(x) = 0.01cos(4πx) and e(x) = 0.01cos(6πx) and corresponding differences q_h^out(x)-q_h^out(x) for h = 1. In Fig. 3, the differences between the obtained and exact radiosities are 6.27×10⁴ and 2.77×10⁷ times larger than the errors, respectively. All the quantities are dimensionless.

Errors e(<i>x</i>) = 0.01cos(4πx) (top) and e(<i>x</i>) = 0.01cos(6πx) (bottom) and corresponding differences between the obtained and exact radiosities qhout(<i>x</i>)-qhout(<i>x</i>) for h = 1. The differences q1out(<i>x</i>)-q1out(<i>x</i>) are 6.27×104 and 2.77×107 times larger than the errors, respectively. All quantities are dimensionless.

Figure 3. Errors e(x) = 0.01cos(4πx) (top) and e(x) = 0.01cos(6πx) (bottom) and corresponding differences between the obtained and exact radiosities q_h^out(x)-q_h^out(x) for h = 1. The differences q₁^out(x)-q₁^out(x) are 6.27×10⁴ and 2.77×10⁷ times larger than the errors, respectively. All quantities are dimensionless.

2 INVERSE PROBLEMS OF INTERPRETATION OF EXPERIMENTAL DATA VS. OPTIMAL DESIGN AND CONTROL PROBLEMS

The example, given above, is considered as an inverse problem, of interpretation of experimental data. In this inverse problem one needs to restore the temperature T(x) on the upper surface from the radiative heat flux q₀^inc(x), incident onto the bottom surface. The flux q₀^inc is obtained experimentally, and therefore contains an error, i.e., q₀^inc(x) = q₀^inc(x)+e(x), where q₀^inc is an exact flux, which is unknown.

The same example can be considered as an optimal design problem. In this case, the exact flux q₀^inc, incident onto the bottom surface, is prescribed. The optimal design problem consists of finding the temperature T(x) or, equivalently, the radiosity q_h^out, on the upper surface, which provides the flux q₀^inc. The example is synthetic and admits analytic solution. Real optimal design problems are usually solved numerically. In this case, an exact solution cannot generally be obtained and a designer obtains an approximate solution q_h^out, providing the flux q₀^inc. Just as in the given example, even if the difference between the flux q₀^inc and the exact one q₀^inc is very small, the difference between the obtained approximate solution q_h^out and the exact one q_h^out may be very large. Therefore, optimal design and control problems should be considered as inverse problems.

3 INTEGRAL EQUATIONS OF THE FIRST KIND

The integral equation (1) is a particular case of the linear Fredholm integral equation of the first kind (Wing, 1991)

(6)

where K is a given kernel of the integral equation (the kernel can be considered as the apparatus function of a measuring device), f is a given function (input data), u is a sought-for solution, in the general case -∞≤ a < b ≤∞ and -∞≤ c < d ≤∞.

A more general form of the linear Fredholm integral equation of the first kind, encountered in practice, is

(7)

where D₁ and D₂ are domains in ⁿ.

The integral equations (6) and (7) are conveniently expressed in the form of a linear operator equation of the first kind

(8)

where the operator (mapping) A is linear [recall that the operator A is linear if A(u+v) = Au+Av].

If the dependence of f on u is not linear, then the operator equation of the first kind is conveniently expressed by

(9)

[though, sometimes, the form (8) is also used, if A is not linear]. Equation (9) can be considered as the most general form of inverse problems.

In Eqs. (6)–(9), u is an unknown “cause”, f is an observed (or desired) “effect” or “result” (at the same time, f presents input data of an inverse problem), and A expresses the dependence of the effect on the cause. “Inverse problems are associated with the reversal of the cause-effect sequence, and consist of finding the unknown causes of known consequences” (Turchin et al., 1971). A more general definition, including optimal design and control problems, is “Inverse problems are concerned with determining causes for a desired or an observed effect” (Engl et al., 1996). In this sentence, the word “desired” is related to optimal design or control problems, while the word “observed” is related to problems of interpretation of experimental data, such that restoration (recovery) and parameter identification problems.

The cause of the difficulties in solving Eqs. (8) and (9) is the compactness of the operator A (Engl et al., 1996; Kress, 1999), which is usually the case in practice [the notion of compactness can be easily understood with the help of the singular value expansion (SVE); see Hansen (1998); Engl et al. (1996); Bertero and Boccacci (1998); Kress (1999), which is a generalization of the singular value decomposition (SVD), see details below]. The compactness of the operator A implies the ill-posedness of the inverse problems (8) and (9), particularly, the instability of the solutions u with respect to small variations of f (Engl et al., 1996; Kress, 1999).

4 WELL- AND ILL-POSEDNESS

The concept of well- and ill-posedness goes back to Hadamard (1923). However, the definition was given by Courant in Courant and Hilbert (1937). According to the definition, a mathematical problem is called well posed (in the sense of Hadamard) or properly posed or correctly posed if:

a solution of the problem exists in a given set of “admissible” solutions (existence condition),
the solution is unique (uniqueness condition),
the solution depends continuously on input data (stability condition).

If at least one of the three conditions fails, then the problem is called ill-posed or improperly posed or incorrectly posed.

It is necessary to emphasize here that Hadamard was never engaged in the study of inverse problems. On the contrary, he had come to this concept when he was studying direct problems (Hadamard, 1923). Moreover, the reasoning behind the concept was mainly mathematical.

5 POOR CONDITION OF DISCRETIZED ILL-POSED INVERSE PROBLEMS

Equations (6)–(9) are continuous mathematical models of inverse problems. To solve an inverse problem numerically, one has to discretize the equation in one way or another (Wing, 1991; Hansen, 1998). Note that, in inverse problems of interpretation of experimental data, the input data f are always given in a discrete form because they are obtained experimentally. Major peculiarities of discretized ill-posed inverse problems can be illustrated by linear problems.

Discretization of the integral equations (6) and (7) [or the operator equation (8)] results in a linear algebraic system

(10)

where A ^m×n [i.e., A is a real (m×n)-matrix, or a matrix with m rows and n columns], f ≡ (f₁,...,f_m) ^m is a given vector, u ≡ (u₁,...,u_n) ⁿ is a sought-for solution.

The matrix A inherits the compactness of the integral operator A if the discretization is fine enough, then the matrix A is ill-conditioned. The notion of matrix condition is conveniently explained with the help of the SVD.

Suppose for simplicity that m ≥ n. The SVD is that the matrix A ^m×n [real (m×n)-matrix] can be represented in the form (Forsythe and Moler, 1967; Golub and Van Loan, 1996; Björck, 1996; Bertero and Boccacci, 1998; Hansen, 1998)

(11)

where V = [v₁...v_m] ^m×m and W = [w₁...w_n] ^n×n (i.e., V and W are real square matrices of the order m and n, having columns v₁,...,v_m ^m and w₁,...,w_n ⁿ, respectively), the superscript T indicates the transposition, the matrices V and W are orthogonal, i.e.,

(or, equivalently, V ^T = V ^-1 and W^T = W^-1), where I_m ^m×m and I_n ^n×n are the identity matrices; in other words, the columns v₁, ..., v_m of the matrix V and the columns w₁, ..., w_n of the matrix W are orthonormal, i.e.,

and

where δ_ij is the Kronecker delta.

Σ is a rectangular “diagonal” matrix,

where σ₁ ≥ σ₂ ≥ ... ≥ σ_n ≥ 0. From here on, it is supposed for simplicity that σ_n > 0.

The numbers σ₁, ..., σ_n are called the singular values of A, while the vectors v₁, ..., v_m and w₁, ..., w_n are called the left and right singular vectors of A, respectively.

The condition number of the matrix A (with respect to the Euclidean vector norm ) is defined by

If cond(A) is large, then the matrix A is said to be an ill-conditioned or poorly conditioned matrix. The condition number measures the sensitivity of the solutions (in a general sense) of the system (10) to perturbations of A and f.

The sensitivity analysis of the system (10) is conveniently explained with the help of the SVD (11). For more general sensitivity analysis, see (Golub and Van Loan, 1996) and (Björck, 1996).

If m = n, then the matrices V, Σ, and W are square and the (unique) solution of the system (10) is

If m > n [i.e., the number of equations in the system (10) is more than the number of unknowns (the system is overdetermined)], then the solution of the system (10) does not generally exist (the inverse of the matrix Σ is not defined). In this case, a more general definition of the solution of the system (10) is introduced: instead of the system (10), the least-squares problem

(12)

where is the Euclidean norm, is considered. Clearly, one can assume that in (12), m ≥ n, because in the case m = n, the least-squares solution is exactly the solution of the system (10). The least-squares problem (12) has a unique solution (since σ_n > 0), which is a (unique) solution of the system of normal equations (Forsythe and Moler, 1967; Golub and Van Loan, 1996; Björck, 1996)

(13)

The solution of the system (13) (the least-squares solution) is

(14)

which is considered as a generalized solution of the system (10) in the case m ≥ n.

In reality, f contains an error (i.e., f = f + e, where f is an exact “value” and e is the error). The matrix A may also be known with an error because it depends on radiative properties of a medium, whose numerical values are obtained from experimental data. But here the matrix A is supposed, for simplicity, to be exact. In this case, the solution (14) differs from the exact solution

and the difference between the obtained and exact solutions is

As is written above, the compactness of the integral operator A leads to the poor condition of the matrix A. Moreover, the better the system (10) approximates the integral equation [(6) or (7)], the larger the condition number of the matrix A (Bertero and Boccacci, 1998). The behavior of the singular values σ_i, as a “function” of the index i, is similar to the behavior of the Fourier transform (ξ) in Fig. 2 as a function of the frequency ξ: the singular values σ_i tend to zero, more or less gradually, as i increases [examples are given in Hansen (1998)].

Because the matrix A is ill-conditioned, the singular values σ_i with large indices i are very small. Therefore, it may be that, for some errors e, the corresponding dot products v_i · e are not small, and hence, the coefficients (v_i · e)/σ_i are large. In this case, the difference u - u is large. And, the larger the condition number of the matrix A is the larger the difference u - u may be.

REFERENCES

Abramowitz, M. and Stegun, I. A. (eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Washington, 1972.

Bertero, M. and Boccacci, P., Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, Bristol and Philadelphia, 1998.

Björck, Å., Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.

Courant, R. and Hilbert, D., Methoden der mathematischen Physik, Bd. II, Julius Springer, Berlin, 1937; revised English ed., Methods of Mathematical Physics, Vol. II, Partial Differential Equations, Wiley, Hoboken, NJ, 1962.

Engl, H. W., Hanke, M., and Neubauer, A., Regularization of Inverse Problems, Kluwer, Dordrecht, 1996.

Forsythe, G. E. and Moler, C. B., Computer Solution of Linear Algebraic Systems, Prentice Hall, Englewood Cliffs, NJ, 1967.

Golub, G. H. and Van Loan, C. F., Matrix Computations, John Hopkins University Press, Baltimore, 1996.

Hadamard, J., Lectures on Cauchy’s Problem in Linear Partial Differential Equations, Yale University Press, New Haven, 1923 (reprinted by Dover Publ., New York, 1952); revised French ed., Le problème de Cauchy et les équations aux dérivées partielles linéaires hyperboliques, Hermann, Paris, 1932.

Hansen, P. C., Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, SIAM, Philadelphia, 1998.

Kress, R., Linear Integral Equations, Springer–Verlag, New York, 1999.

Modest, M. F., Radiative Heat Transfer, Academic Press, New York, 2003.

Özişik, M. N., Radiative Transfer and Interactions with Conduction and Convection, Wiley, Hoboken, NJ, 1973.

Siegel, R. and Howell, J. R., Thermal Radiation Heat Transfer, Taylor & Francis, New York, 2002.

Tikhonov, A. N. and Arsenin, V. Y., Solution of Ill-Posed Problems, Wiley, Hoboken, NJ, 1977.

Tikhonov, A. N., Goncharsky, A. V., Stepanov, V. V., and Yagola, A. G., Numerical Methods for Solving Ill-Posed Problems, Kluwer, Dordrecht, 1995.

Turchin, V. F., Kozlov, V. P., and Malkevich, M. S., The Use of Mathematical-Statistics Methods in the Solution of Incorrectly Posed Problems, Sov. Phys. Usp., vol. 13, pp. 681–703, 1971.

Wing, G. M., A Primer on Integral Equations of the First Kind: The Problem of Deconvolution and Unfolding, SIAM, Philadelphia, 1991.

参考文献

Abramowitz, M. and Stegun, I. A. (eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Washington, 1972.
Bertero, M. and Boccacci, P., Introduction to Inverse Problems in Imaging, Institute of Physics Publishing, Bristol and Philadelphia, 1998.
BjÃ¶rck, Ã…., Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.
Courant, R. and Hilbert, D., Methoden der mathematischen Physik, Bd. II, Julius Springer, Berlin, 1937; revised English ed., Methods of Mathematical Physics, Vol. II, Partial Differential Equations, Wiley, Hoboken, NJ, 1962.
Engl, H. W., Hanke, M., and Neubauer, A., Regularization of Inverse Problems, Kluwer, Dordrecht, 1996.
Forsythe, G. E. and Moler, C. B., Computer Solution of Linear Algebraic Systems, Prentice Hall, Englewood Cliffs, NJ, 1967.
Golub, G. H. and Van Loan, C. F., Matrix Computations, John Hopkins University Press, Baltimore, 1996.
Hadamard, J., Lectures on Cauchyâ€™s Problem in Linear Partial Differential Equations, Yale University Press, New Haven, 1923 (reprinted by Dover Publ., New York, 1952); revised French ed., Le problÃ¨me de Cauchy et les Ã©quations aux dÃ©rivÃ©es partielles linÃ©aires hyperboliques, Hermann, Paris, 1932.
Hansen, P. C., Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion, SIAM, Philadelphia, 1998.
Kress, R., Linear Integral Equations, Springerâ€“Verlag, New York, 1999.
Modest, M. F., Radiative Heat Transfer, Academic Press, New York, 2003.
Ã–ziÅŸik, M. N., Radiative Transfer and Interactions with Conduction and Convection, Wiley, Hoboken, NJ, 1973.
Siegel, R. and Howell, J. R., Thermal Radiation Heat Transfer, Taylor & Francis, New York, 2002.
Tikhonov, A. N. and Arsenin, V. Y., Solution of Ill-Posed Problems, Wiley, Hoboken, NJ, 1977.
Tikhonov, A. N., Goncharsky, A. V., Stepanov, V. V., and Yagola, A. G., Numerical Methods for Solving Ill-Posed Problems, Kluwer, Dordrecht, 1995.
Turchin, V. F., Kozlov, V. P., and Malkevich, M. S., The Use of Mathematical-Statistics Methods in the Solution of Incorrectly Posed Problems, Sov. Phys. Usp., vol. 13, pp. 681â€“703, 1971.
Wing, G. M., A Primer on Integral Equations of the First Kind: The Problem of Deconvolution and Unfolding, SIAM, Philadelphia, 1991.

Ill-posedness of inverse problems