Arch Comput Methods Eng (2009) 16: 109–160 DOI 10.1007/s11831-009-9029-2
ORIGINAL PAPER
Regularization Methods for the Solution of Inverse Problems in Solar X-ray and Imaging Spectroscopy
Marco Prato

Received: 13 November 2008 / Accepted: 13 November 2008 / Published online: 13 February 2009 © CIMNE, Barcelona, Spain 2009

Abstract Astronomical practice often requires addressing remote sensing problems, in which the radiation emitted by a distant source in the sky and measured through ‘ad hoc’ observational techniques contains only very indirect information on the physical process underlying the emission. The main difficulties in these investigations stem from the poor quality of the measurements and from the ill-posedness of the mathematical model describing the relation between the measured data and the target functions. In the present paper we consider a set of problems in solar physics in the framework of the NASA Ramaty High Energy Solar Spectroscopic Imager (RHESSI) mission. The data analysis activity is essentially based on the regularization theory for ill-posed inverse problems, and a review of the main regularization methods applied in this analysis is given. Furthermore, we describe the main results of these applications, in the case of both synthetic data and real observations recorded by RHESSI.
M. Prato, Dipartimento di Matematica Pura e Applicata, Università di Modena e Reggio Emilia, Via Campi 213/b, Modena, 41100, Italy, e-mail: marco.prato@unimore.it
M. Prato, CNR-INFM LAMIA, Via Dodecaneso 33, Genova, 16146, Italy

1 Introduction
A typical problem in applied sciences is the description of a physical system when only indirect information about it is available. This happens, for example, when we want to recover some properties of a very distant source from the knowledge of its emitted radiation, or when we use acoustic or electromagnetic radiation as a probe, making it interact with a physical system which cannot be explored directly in order to get information about its state. In all these cases, the detectors measure physical quantities which are related to the emitted or diffracted radiation; the result of the experiment is a function (or, more realistically, a vector) g which depends on one or more variables (space, time, energy, temperature), and the aim is the reconstruction, through the processing of g, of a function f which describes an unknown geometrical or physical property of the system.
When we have to deal with this kind of data, a first method of analysis is a direct approach: we describe the unknown property of the examined system as a function and we assume several hypotheses about its explicit form. Then we simulate the action of the physical system on these candidate forms and we identify the ones which provide simulated data very close to the real data. Unfortunately, this method can lead to very unreliable results; in fact, for reasons related to intrinsic mathematical properties of the problem, it may happen that among all the functions which faithfully reproduce the data some are very different from one another. A physical problem with such features is called an ill-posed problem in the sense of Hadamard [35]. An effective way to solve an ill-posed problem is to treat the data analysis as an inversion problem, taking into account the pathological nature of most inverse problems, which are numerically unstable. Reliable solutions can be obtained by applying a so-called regularization method, which looks for an optimal trade-off between stability and data reproducibility.
In addition to inverse problems in classical frameworks such as medical imaging, optics or geophysics, increasing attention has recently been devoted to the application of inversion methods in astronomy and plasma physics. Indeed, astronomy is based on the observation of phenomena which are not reproducible, and in this sense it is an example of an observational rather than experimental science. Moreover, these phenomena take place at great distances and concern physical systems with a high number of degrees of freedom, so that in general very different theoretical models are formulated in order to describe them. In other terms, the typical inverse problem in astronomy is based on the reconstruction of a source function which, through an unknown physical process, produces radiation which propagates in the interstellar medium, whose properties are unknown, goes through the atmosphere, whose properties are uncertain, and finally is intercepted by a detector in order to provide the data. In a situation of such great ambiguity it becomes crucial to take into account the intrinsic mathematical properties of the inversion process, describing the problem rigorously in an analytical way and properly applying the inversion techniques of regularization theory in order to obtain an approximate solution of the inverse problem which is numerically stable.
We find a very significant example of this kind of problem in plasma physics when we deal with solar flares. Solar flares [55, 71, 77] are the most dramatic and mysterious events in the solar system. These transient phenomena are characterized by a sudden release of huge amounts of energy, and their typical manifestations are the acceleration of electrons in the solar plasma, a notable heating of the solar atmosphere and a significant electromagnetic emission, particularly in the X-ray range. All the theoretical models of the physics of solar flares formulated in the last thirty years are essentially based on the well-established equations of solar plasma physics and magnetohydrodynamics [27]. Common ground for these models is the assumption that the X-ray emission during flares is the consequence of a collisional interaction between the accelerated electrons and the ions of the plasma. Such an event, known as bremsstrahlung collision, is fully described by a quantity, named bremsstrahlung cross section, which represents the probability that an X-ray photon of given energy is produced by an electron in the plasma of given (higher) energy. However, for most flares, none of these theoretical models is able to predict in a fully quantitative way the amounts of energy release observed by the detectors. As clearly explained by Brown et al. [17], the only way to bypass this puzzling situation is to focus on the analysis of the observed X-ray data to determine the mean electron spectrum associated with the electrons accelerated in the solar plasma. This function represents the electron flux that would be required to produce the observed X-ray flux in a homogeneous plasma source of given ion density and volume. The importance of this function lies in the fact that its determination from the observed X-rays depends only on the bremsstrahlung cross section and does not require any modeling assumption on the acceleration or propagation mechanisms. Two ingredients are necessary in order to infer information on the mean electron flux from the observed X-ray spectra, the first one being the availability of a notable amount of high resolution X-ray measurements. On February 5, 2002 the NASA Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) mission was launched with the precise intent of providing X-ray data of unprecedented spectral resolution and of combining them with X-ray 2D images of unprecedented spatial resolution [57]; RHESSI is currently operating and its spectra are at the disposal of the scientific community. The second ingredient concerns the mathematical aspects of the equation relating the X-ray flux to the unknown mean electron flux. In fact, this equation is a linear Volterra equation of the first kind whose integral kernel is given by the bremsstrahlung cross section. In this paper the bremsstrahlung Volterra equation is studied within a functional analytic setting and an approach to its solution based on regularization theory is discussed. Such an approach has been applied in the case of very general forms of the bremsstrahlung cross section and validated with both synthetic spectra and real spectra measured by RHESSI [59, 60].
Further interpretation of the mean electron spectrum requires model-dependent assumptions. For example, it is possible that bremsstrahlung arises not from energetic electrons moving in a cool background plasma but rather from an inhomogeneous hot plasma with electrons locally Maxwellian everywhere, with the combined distribution of plasma density and temperature (the emission measure differential in temperature) governing the photon spectrum. This assumption leads to a Fredholm integral equation with a Laplace kernel and consequently to an inverse problem whose severe ill-posedness is well known [7]. This is due to the very broad filtering action of the negative exponential kernel (compared to that in the basic bremsstrahlung inverse problem, which is of Volterra type and not severely filtering). The severe ill-posedness of the Laplace inversion problem again requires the use of regularization methods to avoid the solution being swamped by amplified noise. In the paper the solution of this thermal bremsstrahlung inverse problem is addressed by means of an ‘ad hoc’ formulation of the Tikhonov regularization method in which the choice of the function space where the solution is reconstructed and smoothed (i.e., the form of the a priori information) plays a key role. It must be pointed out that for the present application the destructive effects of ill-posedness are even more dramatic, due to the huge dynamic range of the input data and the complexity of typical source functions. In this context a particular implementation of the method is proposed which makes it possible to account for the variability of the data while preserving the reliability of the reconstructions. In this case, one finds that the usual choice of the space of square summable functions (zero-order regularization) does not provide a reliable approximation of the true solution. Better reconstructions can be achieved by choosing a first-order regularization approach, implemented by the assumption that the first derivative of the solution belongs to the space of square summable functions (i.e. by the choice of an appropriate inner product in the source space) and by the addition of some boundary conditions [69].
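The contrast drawn above between the severe ill-posedness of a Laplace-type inversion and the milder behavior of a Volterra-type problem can be illustrated numerically. The following is only a minimal sketch and not the discretization used in this paper: the grid, the rectangle quadrature, the kernel exp(-s t) and the running-integral Volterra kernel are all illustrative assumptions.

```python
import numpy as np

# Illustrative grid (an assumption, not taken from the paper).
n = 40
t = np.linspace(0.1, 5.0, n)
dt = t[1] - t[0]

# Laplace-type Fredholm kernel K(s, t) = exp(-s t), rectangle quadrature.
laplace = dt * np.exp(-np.outer(t, t))

# Volterra-type kernel: running integral, K(s, t) = 1 for t <= s and 0 otherwise.
volterra = dt * np.tril(np.ones((n, n)))

# 2-norm condition numbers (ratio of largest to smallest singular value).
cond_laplace = np.linalg.cond(laplace)
cond_volterra = np.linalg.cond(volterra)
print(f"Laplace-type kernel condition number:  {cond_laplace:.2e}")
print(f"Volterra-type kernel condition number: {cond_volterra:.2e}")
```

Under these assumptions the Laplace-type matrix is numerically singular, while the triangular Volterra-type matrix has a condition number that grows only moderately with the grid size.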
The availability of spectroscopic methods is only one step toward the investigation of solar flares. The other step is developing X-ray imaging methods which provide spatial information on the flaring mechanisms. Combining imaging and spectroscopy is one of the main goals of the RHESSI mission; the only practical method of obtaining ∼arcsecond angular resolution in hard X-rays and gamma-rays within the cost, mass, and launch constraints of a small satellite is to use Fourier transform imaging [70]. One of the most powerful Fourier techniques is rotational modulation synthesis, first proposed by Mertz [62] and implemented by Schnopper et al. [74]. This idea is at the basis of RHESSI, which uses nine bi-grid collimators, each consisting of a pair of widely separated grids (each pair with a different pitch) in front of a Germanium detector, to modulate the X-ray emission during solar flares [40]. Thanks to this hardware, RHESSI is able to perform, in particular, hard X-ray imaging at an angular resolution in the range 2–7 arcseconds, with a temporal resolution of tens of milliseconds in the energy range from 3 keV to 400 keV, and hard X-ray spectroscopy with a spectral resolution from 0.5 keV to 2 keV in the same energy range. So far, typical imaging spectroscopy techniques have combined these tools according to the following scheme: a) build a set of count images of the source at different count energies; b) extract count flux spectra from specific regions of the count maps; c) reconstruct the corresponding electron flux spectra through regularized inversion. What we want to do now is to go one step further: we propose a method which uses the same regularization technique in order to produce spatial maps of the electron flux spectrum. Once these electron maps are available, electron flux spectra from local regions can be directly extracted and compared.
The method involves three steps [68]: first, for each count energy channel, a set of count visibilities (i.e., of calibrated measurements of spatial Fourier components of the source distribution) is extracted from RHESSI data at different spatial frequencies; then, the Tikhonov method is applied to obtain a set of regularized electron visibilities for each electron energy channel; finally, Fourier-based imaging techniques are applied to these reconstructed electron visibilities to obtain the two-dimensional electron flux maps at different energies. This technique has three main advantages: first, it provides information on the spatial distribution of the electron flux, which is a quantity of greater interest than the spatial distribution of the count flux; second, it uses count visibilities as input data for the inversion, which are the best measurements available in the RHESSI framework; finally, it imposes a certain level of smoothness in the electron energy direction, thus avoiding the unphysical artifacts produced by traditional imaging spectroscopy methods, which build up each count image independently and without any spectral correlation.
This paper is divided into eight sections. The first two sections are devoted to the theoretical background which will be used in the whole paper. In the first one a rigorous mathematical formulation of a linear inverse problem is given and the existence and uniqueness of its solution are discussed. Particular attention is given to the concepts of generalized inverse operator and generalized solution, both in the case of problems formulated in function spaces (essentially Hilbert spaces) and in the case of problems with discrete data, which are more significant in applications. In the second section we describe in detail the Tikhonov regularization method, which represents the fundamental tool that we will use in the numerical applications.
The next three sections are dedicated to the analysis of the linear Volterra equation of the ﬁrst kind which relates the X-ray ﬂux to the unknown mean electron ﬂux. In Sect. 3 the inverse problem is described providing an analytical study of the bremsstrahlung equation with three different cross sections; in Sect. 4 the regularization method described in the previous section is applied to both simulated and real data in order to test the efﬁciency of the method itself (in the simulated cases) and to show the reconstructed solutions (in the real cases). In all the reconstructions of the mean electron spectrum a solid-angle-averaged form [37] for the bremsstrahlung cross section has been used. However, for a given photon emission, the cross section is in general a function not only of the photon energy and electron energy, but also of the incoming and outgoing electron directions and of the polarization state of the emitted photon [32, 33]. In Sect. 5 the angle-dependency of the bremsstrahlung cross section is considered and the consequences of this are investigated by comparing the solutions in the case of a real spectrum.
Section 6 is devoted to the thermal problem, namely to the inversion of the Fredholm integral equation which relates the mean electron spectrum to the differential emission measure. In particular, the inverse problem is described, introducing and justifying the regularization method that is used in order to find numerically stable solutions. Then, as in the case of the non-thermal problem, the regularization method is applied to both simulated and real data.
Finally, in the last part of the paper, the imaging spectroscopy topic is analyzed. In particular, in Sect. 7 we describe the new method for the imaging spectroscopy of RHESSI and we validate it on both synthetic and real flares. Some comments and conclusions are offered in Sect. 8.

2 Linear Inverse Problems and Regularization Theory
2.1 Ill-posedness
In 1976 Professor J.B. Keller gave the following definition of what an inverse problem means: one calls two problems inverse to each other if the formulation of one problem involves the solution of the other one [43]. What one usually does is to call one of these problems (usually the simpler one or the one which was studied earlier) the direct problem, while the other one is the inverse problem. This choice is less arbitrary when a mathematical problem is applied to a concrete situation; in this case, there is a quite natural distinction between the direct and the inverse problem. For example, if we know a particular state of a physical system and the physical laws that rule it and we want to predict its future behavior, we are facing a direct problem. On the contrary, if we want to calculate the evolution of the system backwards in time or we want to identify some physical parameter from observations of the evolution of the system, we are dealing with inverse problems.
In other words, we might say that inverse problems regard the study of the causes for a speciﬁc effect.
Such inverse problems most often do not fulﬁll Hadamard’s postulates of well-posedness [35].
Definition 1 A problem is said to be well-posed in the sense of Hadamard if all the following properties hold:
1. for all admissible data, a solution exists;
2. for all admissible data, the solution is unique;
3. the solution depends continuously on the data.
A problem is said to be ill-posed if at least one of the three well-posedness conditions is not satisfied.
Let us consider an ill-posed problem. If we are handling exact data, the existence of a solution is an important requirement and its violation can usually be repaired by relaxing the notion of a solution. In the case of perturbed data, the problem has to be “regularized” and hence changed anyway.
If the condition of uniqueness of the solution is not fulﬁlled, we have to decide in some way which one among all the solutions is of interest, typically through the assumption of additional information about the solution itself. We observe that, when we work with real data, non-uniqueness is usually introduced by the need for discretization.
Finally, a solution of an inverse problem which does not depend continuously on the data causes serious numerical problems. In fact, if we tackle the problem by using a “traditional” numerical method (as in the case of well-posed problems), then we have to expect that the numerical solution becomes unstable. We can partially repair this problem with the use of “regularization methods”, although we have to keep in mind that no mathematical trick can make an inherently unstable problem stable. All that a regularization method can do is to recover partial information about the solution as stably as possible. Our goal when we apply regularization methods must always be to find the right compromise between accuracy and stability.
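The compromise between accuracy and stability can be made concrete with a small numerical experiment. The sketch below is only illustrative: it assumes the standard zero-order Tikhonov functional with penalty lam * ||f||^2 (the Tikhonov method itself is presented later in the paper), and the Gaussian smoothing kernel, the grid, the noise level and the value of lam are hypothetical choices made for this example.

```python
import numpy as np

def tikhonov(A, g, lam):
    """Zero-order Tikhonov solution: argmin ||A f - g||^2 + lam * ||f||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ g)

# Hypothetical smoothing operator: Gaussian kernel on [0, 1], rectangle quadrature.
n = 50
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]
A = h * np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * 0.03 ** 2))

f_true = np.sin(np.pi * x)
rng = np.random.default_rng(0)
g_noisy = A @ f_true + 1e-3 * rng.standard_normal(n)   # small additive noise

f_naive = np.linalg.solve(A, g_noisy)      # "traditional" direct inversion
f_reg = tikhonov(A, g_noisy, lam=1e-4)     # regularized inversion

err_naive = np.linalg.norm(f_naive - f_true) / np.linalg.norm(f_true)
err_reg = np.linalg.norm(f_reg - f_true) / np.linalg.norm(f_true)
print(f"relative error, direct inversion: {err_naive:.2e}")
print(f"relative error, Tikhonov:         {err_reg:.2e}")
```

The direct inversion reproduces the noisy data exactly but amplifies the noise, whereas the regularized solution gives up a little data fidelity in exchange for stability.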

Deﬁnition 2 A general linear inverse problem can be formulated according to a general scheme. First the corresponding direct problem is deﬁned; the solution of the direct problem introduces a linear (continuous) operator A whose domain is in the Hilbert space X of the solutions and range in the Hilbert space Y of the functions representing the measured data: the inverse problem consists in determining f ∈ X from the knowledge of g ∈ Y when g and f are related by the equation

\[ g = Af. \tag{1} \]

In the following, we assume that X and Y are Hilbert spaces and we denote with L(X, Y ) the space of the linear and continuous operators between X and Y .

In the case of a linear inverse problem (i.e., when $A \in L(X, Y)$), the well-posedness conditions can be written concisely as:
1. $D(A^{-1}) = Y$;
2. $N(A) = \{0\}$;
3. $A^{-1}$ is continuous,
where $N(A)$ is the kernel of $A$ and $D(A^{-1})$ is the domain of its inverse operator.
Ill-posedness is a typical feature common to most inverse problems. For example, in the case of linear inverse problems with discrete data, i.e. in the case where $Y$ is a finite dimensional Euclidean space, uniqueness does not hold. Existence fails for many linear integral equations of the first kind. In fact, when $X = L^2(a, b)$ and $Y = L^2(c, d)$,

\[ g(x) = (Af)(x) = \int_a^b K(x, y) f(y)\,dy, \quad x \in [c, d] \tag{2} \]

is an analytic function of $x$ if the integral kernel $K(x, y)$ is an analytic function of $x$. This implies that the set of functions defined by (2) is a proper subset of $Y$ and existence does not hold for all $g$ in $L^2(c, d)$. Finally, let us consider the case where $X = L^1(E)$ and $Y = C(F)$, $E$ and $F$ are compact sets and $A$ is the linear integral operator

\[ (Af)(x) = \int_E K(x, y) f(y)\,dy, \quad x \in F, \tag{3} \]

with $K(x, y)$ continuous in $x$ and $y$. It can be proved that in this case $A^{-1}$ is unbounded.
Well-posedness is not a sufficient condition for the stability of the solution of a linear inverse problem. Indeed, let us assume that $A^{-1}$ is well-defined and continuous. Then, if in (1) $\delta g$ is a small variation of the datum and $\delta f$ is the corresponding variation of the solution, the continuity of $A^{-1}$ implies

\[ \|\delta f\|_X \le \|A^{-1}\| \, \|\delta g\|_Y. \tag{4} \]

On the other hand the continuity of $A$ implies

\[ \|f\|_X \ge \frac{\|g\|_Y}{\|A\|} \tag{5} \]

so that

\[ \frac{\|\delta f\|_X}{\|f\|_X} \le \|A\| \, \|A^{-1}\| \, \frac{\|\delta g\|_Y}{\|g\|_Y}. \tag{6} \]

Definition 3 The real positive number

\[ C(A) = \|A\| \, \|A^{-1}\| \tag{7} \]

is called the condition number and provides an estimate of the instability of the problem.

From the inequality (6) we see that, if the condition number $C(A)$ is much greater than 1, then a small relative variation of the data can produce very large oscillations in the solution. It follows that, in general, the presence of even a small error on the data of an ill-posed inverse problem (with a high condition number) may make its solution extremely unstable.

2.2 Pseudosolutions and Generalized Inverse Operator

Definition 4 Let $A \in L(X, Y)$, $g \in Y$. A function $u \in X$ is called normal solution or pseudosolution of the linear inverse problem (1) if

\[ \|Au - g\|_Y = \inf\{ \|Af - g\|_Y : f \in X \}. \tag{8} \]

The following theorem gives a complete description of the pseudosolutions of a linear inverse problem:

Theorem 1 If $A \in L(X, Y)$, $u \in X$, $g \in Y$, $A^*$ is the adjoint operator of $A$ and $P : Y \to \overline{R(A)}$ is the linear projection onto the closure of the range of $A$, then the following conditions are equivalent:
(1) $Au = Pg$;
(2) $\|Au - g\|_Y \le \|Af - g\|_Y$ for all $f \in X$;
(3) $A^*Au = A^*g$.

Proof (1) ⇒ (2): from the decomposition $Y = \overline{R(A)} \oplus R(A)^\perp$ we have that $Au - g = Pg - g \in R(A)^\perp$; moreover, given $f \in X$, we have that $Af - Pg \in \overline{R(A)}$ and then from (1) it follows that

\[ \|Af - g\|^2 = \|Af - Pg\|^2 + \|Au - g\|^2 \ge \|Au - g\|^2. \tag{9} \]

(2) ⇒ (3): since $Pg$ belongs to the closure $\overline{R(A)}$ of the range, there exists a sequence $\{f_n\}_{n=1}^{\infty} \subseteq X$ such that $Pg = \lim_{n\to\infty} Af_n$. Then

\[ \|Au - g\|^2 = \|Au - Pg\|^2 + \lim_{n\to\infty} \|Af_n - g\|^2 \ge \|Au - Pg\|^2 + \|Au - g\|^2. \tag{10} \]

It follows that $Au - Pg = 0$ and then $Au - g = Pg - g$. Since $Pg - g \in R(A)^\perp = N(A^*)$, (3) is true.
(3) ⇒ (1): if (3) holds, then $Au - g \in N(A^*) = R(A)^\perp$ and (1) follows.

From Theorem 1 it follows that a pseudosolution of the linear inverse problem (1) exists if and only if the datum $g$ belongs to $R(A) \oplus R(A)^\perp$ (which is dense in $Y$). In this case, the set of pseudosolutions is convex and closed in the Hilbert space $X$, and therefore there always exists a unique pseudosolution $u^\dagger$ of the linear inverse problem (1) with minimum norm. This property leads us to the following definition:

Definition 5 Let $A \in L(X, Y)$, $g \in R(A) \oplus R(A)^\perp$. A function $u^\dagger \in X$ is called generalized solution of the linear inverse problem (1) if $u^\dagger$ is the only pseudosolution of (1) such that

\[ \|u^\dagger\|_X = \inf\{ \|u\|_X : u \text{ is a pseudosolution of (1)} \}. \tag{11} \]

The operator $A^\dagger : R(A) \oplus R(A)^\perp \to X$ defined by

\[ A^\dagger g = u^\dagger \tag{12} \]

is called the generalized inverse operator.

The generalized solution $u^\dagger$ is the unique pseudosolution in $N(A)^\perp$. In fact, the decomposition

\[ u^\dagger = u_1 + u_2 \tag{13} \]

with $u_1 \in N(A)^\perp$ and $u_2 \in N(A)$ implies that $u_1$ is a pseudosolution too. Then

\[ \|u^\dagger\|^2 = \|u_1 + u_2\|^2 = \|u_1\|^2 + \|u_2\|^2 \ge \|u_1\|^2. \tag{14} \]

The definition of generalized solution implies that $u_1 = u^\dagger$ and $u_2 = 0$.
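In finite dimensions the notions of pseudosolution and generalized solution can be checked directly. The following sketch uses a small rank-deficient matrix chosen purely for illustration and NumPy's Moore-Penrose pseudoinverse as the generalized inverse; it verifies the normal equations of Theorem 1, the orthogonality of the generalized solution to the kernel, and its minimum-norm property.

```python
import numpy as np

# Rank-deficient operator: N(A) is spanned by (1, -1), R(A) by (1, 1, 0).
A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
g = np.array([1.0, 3.0, 5.0])       # g does not belong to R(A)

u_dag = np.linalg.pinv(A) @ g       # generalized (minimum-norm) solution
null_vec = np.array([1.0, -1.0])    # spans the kernel N(A)
u_other = u_dag + null_vec          # another pseudosolution

# Both satisfy the normal equations A^T A u = A^T g (condition (3) of Theorem 1).
res_dag = A.T @ A @ u_dag - A.T @ g
res_other = A.T @ A @ u_other - A.T @ g

print("generalized solution:", u_dag)
print("norms:", np.linalg.norm(u_dag), np.linalg.norm(u_other))
```

Here the projection of the datum onto the range is (2, 2, 0), so every pseudosolution has components summing to 2, and the one orthogonal to the kernel is (1, 1).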
The previous remark allows us to show that the operator $A^\dagger$ is linear. In fact, if $g_1$ and $g_2$ belong to $R(A) \oplus R(A)^\perp$, then

\[ AA^\dagger g_1 + AA^\dagger g_2 = Pg_1 + Pg_2 = P(g_1 + g_2) = AA^\dagger (g_1 + g_2). \tag{15} \]

Therefore $A^\dagger g_1 + A^\dagger g_2 - A^\dagger (g_1 + g_2)$ is in the kernel of $A$. But $A^\dagger g_1$, $A^\dagger g_2$ and $A^\dagger (g_1 + g_2)$ are the generalized solutions of the linear inverse problems (1) corresponding to the different data $g_1$, $g_2$ and $g_1 + g_2$, and each generalized solution is orthogonal to the kernel of $A$. Since $N(A)^\perp$ is a linear subspace of $X$ and $N(A) \cap N(A)^\perp = \{0\}$, we have that

\[ A^\dagger (g_1 + g_2) = A^\dagger g_1 + A^\dagger g_2. \tag{16} \]

On the other hand

\[ AA^\dagger \alpha g = P\alpha g = \alpha Pg = \alpha AA^\dagger g = A\alpha A^\dagger g. \tag{17} \]

Therefore $A^\dagger \alpha g - \alpha A^\dagger g$ is in $N(A) \cap N(A)^\perp$, i.e. $A^\dagger \alpha g = \alpha A^\dagger g$.
The relation between the range of the generalized inverse operator and the range of the adjoint operator is described by the following theorem:

Theorem 2 Let $A \in L(X, Y)$. Then $R(A^*) \subseteq R(A^\dagger)$. Moreover, if $R(A)$ is closed, then $R(A^*) = R(A^\dagger)$.

Proof Let us assume that $u \in R(A^*)$; then $u \in \overline{R(A^*)} = N(A)^\perp$. If we define $g = Au$, then $u$ is the generalized solution of the linear inverse problem (1) corresponding to the datum $g$ (because $u$ is a pseudosolution and belongs to $N(A)^\perp$). It follows that $u = A^\dagger g$ and therefore $R(A^*) \subseteq R(A^\dagger)$.
Let us now assume that $u \in R(A^\dagger)$; since $u$ is a generalized solution, we have that $u \in N(A)^\perp$. But $R(A)$ closed implies $R(A^*)$ closed, and then $N(A)^\perp = \overline{R(A^*)} = R(A^*)$; it follows that $u \in R(A^*)$ and $R(A^\dagger) \subseteq R(A^*)$.

The introduction of the generalized solution and the generalized inverse operator allows us to formulate a new inverse problem which consists of the solution of two successive minimum problems, described by the two equations (8) and (11). This problem is well-posed if, for every function $g$ in $Y$, the generalized solution exists and is unique and the generalized inverse operator is continuous. We now show that if the range of the operator $A$ is closed, then the well-posedness of the problem is guaranteed. First of all, if $R(A)$ is closed, then we have that $Y = R(A) \oplus R(A)^\perp$; it follows that the projection of $g$ onto $R(A)$ belongs to $R(A)$ and consequently the set of pseudosolutions is not empty. The existence and the uniqueness of the generalized solution are direct consequences of its definition, while the continuity of the generalized inverse operator is provided by the following lemma and theorem:

Lemma 1 If $R(A)$ is closed, then there exists a positive constant $m$ such that

\[ \|Af\|_Y \ge m \|f\|_X \tag{18} \]

for all $f$ in $N(A)^\perp$.

Proof We observe that the restriction $\tilde{A} : N(A)^\perp \to R(A)$ of $A$ is a bijective operator between two Hilbert spaces; then, by the Open Mapping Theorem, $\tilde{A}^{-1}$ is continuous. It follows that, for any function $f$ in $N(A)^\perp$,

\[ \|f\|_X = \|\tilde{A}^{-1} A f\|_X \le \|\tilde{A}^{-1}\| \, \|Af\|_Y \tag{19} \]

so we can choose $m = 1/\|\tilde{A}^{-1}\|$ and the claim is proved.

Theorem 3 Let $A \in L(X, Y)$. Then $A^\dagger$ is continuous if and only if $R(A)$ is closed.

Proof Let $A^\dagger$ be a continuous operator, and let us suppose that $R(A)$ is not closed. Then the domain of $A^\dagger$ is $R(A) \oplus R(A)^\perp$, which is dense in $Y$; moreover, for any function $g$ in $R(A) \oplus R(A)^\perp$, we have

\[ AA^\dagger g = Pg, \tag{20} \]

where $P$ is the linear projection onto $\overline{R(A)}$. But $A^\dagger$ is linear and so, by the B.L.T. Theorem (see [72]), there exists a continuous linear extension $\hat{A}^\dagger$ of $A^\dagger$ whose domain is the space $Y$ and such that

\[ A\hat{A}^\dagger g = Pg \tag{21} \]

for all $g$ in $Y$. This is a contradiction, because we can always find a function $g$ which belongs to $\overline{R(A)} \setminus R(A)$, and for such a $g$ equation (21) would give $g = Pg = A\hat{A}^\dagger g \in R(A)$.
Let us suppose now that $R(A)$ is closed. From Lemma 1 we have that, for any function $g$ in $Y$,

\[ \|g\|_Y \ge \|Pg\|_Y = \|AA^\dagger g\|_Y \ge m \|A^\dagger g\|_X \tag{22} \]

and, consequently,

\[ \|A^\dagger g\|_X \le \frac{1}{m} \|g\|_Y. \tag{23} \]

Theorem 3 shows that, if the range of the operator A is closed, then the problem of the determination of the generalized solution is well-posed; on the contrary, if R(A) is not closed, the determination of u† is an ill-posed problem. Among the operators whose range is not closed, we ﬁnd very signiﬁcant classes of operators that people frequently use in applications. An important example is given by the compact operators, as we can see in the following theorem:

Theorem 4 If A ∈ L(X, Y ) is compact and R(A) is closed, then R(A) is ﬁnite-dimensional.

Proof The operator $A : X \to R(A)$ is linear, bounded and surjective between two Hilbert spaces. Then, by the Open Mapping Theorem, given $g = Af$, the open unit ball in $X$ with center $f$ is mapped by $A$ onto an open set of $R(A)$ containing $g$. Since $A$ is compact, the closure of this open set (which is still contained in $R(A)$) is compact; this means that $R(A)$ is locally compact and therefore that it has finite dimension [73].
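Theorems 3 and 4 together imply that a compact operator with infinite-dimensional range has a discontinuous generalized inverse. A standard textbook illustration, not taken from this paper, is the diagonal operator on the space of square summable sequences which divides the k-th coordinate by k. The sketch below looks at its n x n sections and shows that the norm of the generalized inverse grows without bound as n increases; the function name `pinv_norm` is introduced here only for the example.

```python
import numpy as np

def pinv_norm(n):
    """Operator norm of the generalized inverse of diag(1, 1/2, ..., 1/n)."""
    A_n = np.diag(1.0 / np.arange(1, n + 1))
    return np.linalg.norm(np.linalg.pinv(A_n), 2)

# The norm of the generalized inverse of the n x n section equals n.
norms = [pinv_norm(n) for n in (5, 50, 500)]
print(norms)
```

Each finite section is perfectly invertible, but no bound on the inverse survives in the limit, which is the finite-dimensional shadow of the non-closed range.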

In conclusion we notice that, as we have already observed in the previous section, the closedness of the range of the operator $A$ does not guarantee the stability of the generalized solution: the inequality

\[ \frac{\|\delta u^\dagger\|_X}{\|u^\dagger\|_X} \le C(A) \, \frac{\|\delta g\|_Y}{\|g\|_Y} \tag{24} \]

holds with, this time, $C(A) = \|A\| \, \|A^\dagger\|$. If $C(A)$ is large and the datum is affected by noise, the generalized solution is numerically unstable.
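The role of the condition number in the bound (24) can be checked on a small example. The sketch below is an illustration under stated assumptions: it uses a 5 x 5 Hilbert matrix (a classical ill-conditioned but invertible matrix, not an operator from this paper), computes $C(A)$ as the product of the spectral norms of the matrix and of its pseudoinverse, and compares the relative error on the solution with the relative error on the data for one random perturbation.

```python
import numpy as np

# Hilbert matrix: invertible but ill-conditioned (illustrative choice).
n = 5
i = np.arange(n)
A = 1.0 / (i[:, None] + i[None, :] + 1.0)

# Condition number C(A) = ||A|| * ||A^dagger|| in the spectral norm.
C = np.linalg.norm(A, 2) * np.linalg.norm(np.linalg.pinv(A), 2)

rng = np.random.default_rng(1)
f = np.ones(n)
g = A @ f
dg = 1e-8 * rng.standard_normal(n)          # small perturbation of the datum
df = np.linalg.solve(A, g + dg) - f         # induced variation of the solution

rel_data = np.linalg.norm(dg) / np.linalg.norm(g)
rel_sol = np.linalg.norm(df) / np.linalg.norm(f)
print(f"C(A) = {C:.3e}")
print(f"relative data error     = {rel_data:.3e}")
print(f"relative solution error = {rel_sol:.3e}")
print(f"bound C(A) * data error = {C * rel_data:.3e}")
```

Since the matrix is square and invertible, its pseudoinverse coincides with the ordinary inverse and the value of `C` agrees with the usual 2-norm condition number.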

2.3 Linear Inverse Problems with Discrete Data

The usual linear inverse problem in physics can be formulated in the following way: an unknown property of the system we are studying is described by a function f and is related to another function g through a linear relation A. In actual applications, we do not have the function g for each value of the independent variable x; what we usually know is a set of numbers {g1, . . . , gN } which represents the output of the physical system and which are in some way connected to the values of the function f (x) in the set of points {x1, . . . , xN }. In particular, if the response of the system is linear, then each gn is linearly related to the value of the data function g in the point xn; it follows that, if we neglect the constant which depends on the efﬁciency of the instruments, the relation between the vector of the data and the unknown function is given by

\[ g_n = g(x_n) = (Af)_n \tag{25} \]

where $A$ is a linear operator which maps the Hilbert space $X$ into the Euclidean space $Y$ equipped with the inner product

\[ (g, h)_Y = \sum_{m,n=1}^{N} g_m w_{mn} h_n^*. \tag{26} \]

The numbers $w_{mn}$ are the elements of a positive definite weight matrix $W$, whose choice depends on the properties of the physical system.
If the operator $A$ is continuous, the Riesz lemma allows us to formulate the inverse problem with discrete data in a particular way: given the set of functions $\{\phi_1, \dots, \phi_N\}$ in the Hilbert space $X$ and the element $g = \{g_1, \dots, g_N\}$ in the Euclidean space $Y$, find $f \in X$ such that [8]

\[ g_n = (f, \phi_n)_X, \quad n = 1, \dots, N \tag{27} \]

where $(\cdot, \cdot)_X$ is the inner product in $X$. In this way, the $n$th component of the element $Af$ of the Euclidean space $Y$ corresponds to the value of the bounded linear functional described by

\[ (Af)_n = (f, \phi_n)_X. \tag{28} \]

If the functions {φ1, . . . , φN } are linearly independent, there exists at least one solution of (27) for each g ∈ Y . In
fact, if we suppose that A is not surjective, then R(A) is a
closed subspace of Y and, consequently, there exists an element c ∈ (Af )⊥ for all f ∈ X, i.e.

c1(f, φ1)X + · · · + cN (f, φN )X = 0 ∀f ∈ X.

(29)

From this relation it follows that

c1φ1 + · · · + cN φN = 0

(30)

and so the functions φ1, . . . , φN are linearly dependent, which is a contradiction. This solution is obviously not unique because, if XN = span{φ1, . . . , φN }, f0 is a solution of the problem and f1 ∈ (XN )⊥, then f0 + f1 is still a solution of the problem. But the set of the solutions of the prob-
lem is closed and convex, so there exists a unique solution
with minimum norm, called again generalized solution, denoted by u† and belonging to XN . The explicit form of the generalized solution can be easily obtained by writing u† as
a linear combination of the functions φn, i.e. posing

$$u^\dagger = \sum_{n=1}^{N} a_n \phi_n. \tag{31}$$

If we insert this expression of u† in (27) we obtain

$$g_n = \sum_{m=1}^{N} a_m (\phi_m, \phi_n)_X, \quad n = 1, \dots, N. \tag{32}$$

In a natural way (32) leads to the following deﬁnition [8]:

Deﬁnition 6 We call the Gram matrix and we denote by G the matrix whose entries are given by

Gmn = (φm, φn)X, m, n = 1, . . . , N.

(33)

The Gram matrix G is invertible because the φn are linearly independent, so the coefﬁcients an in (32) become

$$a_n = \sum_{m=1}^{N} g_m (G^{-1})_{mn}, \quad n = 1, \dots, N. \tag{34}$$

If we insert this expression of the coefﬁcients an in (31) we obtain

$$u^\dagger = \sum_{m,n=1}^{N} g_m (G^{-1})_{mn} \phi_n. \tag{35}$$

Equation (35) shows that the generalized solution depends continuously on the data and so its determination is a well-posed problem.
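The construction (31)-(35) can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it assumes X = L²(0, 1) discretized on a uniform grid with trapezoidal weights, and the functions `phi` (Gaussian bumps) and the test function `f_true` are hypothetical choices made only for illustration.

```python
import numpy as np

# Assumed setting: X = L^2(0, 1) on a uniform grid with trapezoidal weights.
x = np.linspace(0.0, 1.0, 2001)
quad = np.full_like(x, x[1] - x[0])
quad[[0, -1]] *= 0.5

def inner_X(u, v):
    """Discrete approximation of the inner product (u, v)_X."""
    return float(np.sum(quad * u * v))

# Hypothetical, linearly independent functions phi_n (Gaussian bumps).
centers = np.array([0.2, 0.4, 0.6, 0.8])
phi = np.exp(-((x[None, :] - centers[:, None]) / 0.15) ** 2)
N = len(centers)

# Gram matrix (33) and data g_n = (f, phi_n)_X for a test function f.
G = np.array([[inner_X(phi[m], phi[n]) for n in range(N)] for m in range(N)])
f_true = np.sin(np.pi * x)
g = np.array([inner_X(f_true, phi[n]) for n in range(N)])

# Coefficients (34): G is real and symmetric here, so a = G^{-1} g.
a = np.linalg.solve(G, g)
u_dagger = a @ phi                       # generalized solution (35)

# u_dagger reproduces the data and has norm not larger than that of f_true.
data_misfit = np.array([inner_X(u_dagger, phi[n]) for n in range(N)]) - g
print("max |(u_dagger, phi_n) - g_n| =", np.abs(data_misfit).max())
print("||u_dagger|| =", np.sqrt(inner_X(u_dagger, u_dagger)),
      " ||f_true|| =", np.sqrt(inner_X(f_true, f_true)))
```

In this sketch `u_dagger` is the orthogonal projection of `f_true` onto span{φ1, . . . , φN }, which is why its norm cannot exceed that of `f_true`.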
This property is not true if the functions φn are linearly dependent. In this case, XN′ = span{φ1, . . . , φN } and YN′ = R(A) are two subspaces of X and Y respectively whose dimension is N′ < N; it follows that a solution of problem (27) exists if and only if the datum belongs to YN′. On the contrary, if g does not belong to the range of A, then it is natural to introduce the least-squares problem

$$\|Af - g\|_Y = \min \tag{36}$$

whose solutions are called, as in the previous section, pseudosolutions.
Theorem 1 holds even in the case of linear inverse problems with discrete data, so the pseudosolutions u satisfy the equations

Au = P g,

(37)

where P is the projection onto YN , and

A∗Au = A∗g.

(38)

We can obtain the explicit form of the adjoint operator A∗ if we observe that

$$(Af, g)_Y = \sum_{m,n=1}^{N} (Af)_m w_{mn} g_n^* = \sum_{m,n=1}^{N} (f, \phi_m)_X w_{mn} g_n^* = \left(f, \sum_{m,n=1}^{N} \phi_m w_{mn} g_n\right)_X, \tag{39}$$

so that A∗ : Y −→ X is the operator which maps a vector g ∈ Y in

$$A^* g = \sum_{m=1}^{N} \left(\sum_{n=1}^{N} w_{mn} g_n\right) \phi_m. \tag{40}$$

The set of the pseudosolutions is still closed and convex, so there exists a unique pseudosolution with minimum norm u† of problem (36).
If the generalized solution u† belongs to the subspace XN (or, more precisely, XN′ if the φn are not linearly independent; in the following we will assume for sake of simplicity that N′ = N) of X, then we can write it as a linear combination of the elements of any basis of XN. Among all the possible choices, there is one particular basis of XN which will prove very useful in the following. In order to construct this basis, we have to introduce the operators A∗A and AA∗; these two operators are linear, bounded, self-adjoint, positive semidefinite and with finite rank N. A∗A is defined on the Hilbert space X and its explicit form, using (40), is given by

$$A^*A f = \sum_{m=1}^{N} \sum_{n=1}^{N} w_{mn} (f, \phi_n)_X\, \phi_m \tag{41}$$

while AA∗ acts on the Euclidean space Y and its k-th component is given by

$$(AA^* g)_k = \sum_{m,n=1}^{N} (\phi_m, \phi_k)_X w_{mn} g_n, \quad k = 1, \dots, N. \tag{42}$$

If we recall the definition (33) of the Gram matrix, we obtain the relation

$$AA^* = G^T W. \tag{43}$$

The eigenvalues of AA∗, each counted with its multiplicity, are denoted by σn² and arranged in the non-increasing sequence

$$\sigma_1^2 \ge \dots \ge \sigma_N^2. \tag{44}$$

The corresponding eigenvectors $\{v_n\}_{n=1}^{N}$ of AA∗ form an orthonormal basis of Y. The operator A∗A has the same eigenvalues σn² of AA∗ with the same multiplicity and the corresponding eigenfunctions $\{u_n\}_{n=1}^{N}$ form an orthonormal basis in XN. It can be shown [8] that we can always choose the eigenfunctions un and the eigenvectors vn such that the shifted-eigenvalues problem

$$A u_n = \sigma_n v_n, \qquad A^* v_n = \sigma_n u_n \tag{45}$$

holds.

Deﬁnition 7 The set of triples {σn; un, vn}Nn=1 which satisfy the shifted-eigenvalues problem (45) is called the singular
system of the operator A; the real numbers σn are the singular values, the functions un are the singular functions and the vectors vn are the singular vectors.

The knowledge of the singular system of the operator represents a crucial point in regularization theory. The singular values and the singular vectors can be calculated in a simple way by diagonalizing the matrix (43), while the singular functions can be obtained from the relations (45) and the explicit form (40) of the adjoint operator:

$$u_k = \frac{1}{\sigma_k} \sum_{m=1}^{N} \left(\sum_{n=1}^{N} w_{mn} (v_k)_n\right) \phi_m. \tag{46}$$
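The recipe above (diagonalize the matrix (43), then build the singular functions through (46)) can be sketched as follows. This is an illustrative example under stated assumptions: the grid, the functions `Phi` and the diagonal weight matrix `W` are hypothetical, and the Cholesky-based symmetrization is one possible way to handle the weighted inner product (26), not necessarily the one used by the author.

```python
import numpy as np

# Assumed discretization of X = L^2(0, 1) and hypothetical functions phi_n.
x = np.linspace(0.0, 1.0, 1001)
dx = x[1] - x[0]
centers = np.array([0.15, 0.35, 0.5, 0.65, 0.85])
Phi = np.exp(-((x[None, :] - centers[:, None]) / 0.12) ** 2)   # rows: phi_n
N = Phi.shape[0]

G = dx * Phi @ Phi.T                              # Gram matrix (33), rectangle rule
W = np.diag(1.0 / np.linspace(1.0, 2.0, N) ** 2)  # assumed weight matrix

def apply_A(f):
    """(A f)_n = (f, phi_n)_X on the grid."""
    return dx * Phi @ f

# G^T W is self-adjoint for the weighted inner product (26). With W = L L^T,
# the symmetric matrix L^T G L has the same eigenvalues sigma_n^2.
L = np.linalg.cholesky(W)
eigval, Yv = np.linalg.eigh(L.T @ G @ L)
order = np.argsort(eigval)[::-1]                  # non-increasing order, as in (44)
sigma = np.sqrt(eigval[order])
V = np.linalg.solve(L.T, Yv[:, order])            # columns v_k, orthonormal w.r.t. W

# Singular functions from (46): u_k = (1/sigma_k) sum_m (W v_k)_m phi_m.
U = ((W @ V) / sigma).T @ Phi                     # rows: u_k sampled on the grid

print("singular values:", sigma)
print("max |A u_0 - sigma_0 v_0| =", np.abs(apply_A(U[0]) - sigma[0] * V[:, 0]).max())
```

The last line checks the first relation in (45) for the leading singular triple.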

From the singular system of the operator A we can obtain a very meaningful expression for the generalized solution of


the inverse problem (25). This function, as we have already
observed, has to be a pseudosolution (and so it solves (38))
and has minimum norm among all the pseudosolutions (and so it belongs to XN , i.e. it has no component in (XN )⊥). It follows that, if we replace the expansions

$$u^\dagger = \sum_{n=1}^{N} a_n u_n \tag{47}$$

and

$$g = \sum_{n=1}^{N} (g, v_n)_Y\, v_n \tag{48}$$

in (38) and we compare the two sums term by term, we can deduce the following expression for the coefﬁcients an:

$$a_n = \frac{1}{\sigma_n} (g, v_n)_Y, \quad n = 1, \dots, N. \tag{49}$$

Consequently, the generalized solution becomes

$$u^\dagger = \sum_{n=1}^{N} \frac{(g, v_n)_Y}{\sigma_n}\, u_n. \tag{50}$$

From this formula we can notice that, if we want to write the generalized solution of an inverse problem with discrete data explicitly, it is necessary to start with the choice of the topology on the data space Y . Then, from (50) we can deduce immediately the continuous dependence of the generalized solution on the datum (in accordance with the fact that the range of a linear bounded functional is closed).
The knowledge of the singular system of the operator A gives us also a quantitative analysis of the instability of the generalized solution; in fact, it can be shown that the condition number is given by

$$C(A) = \frac{\sigma_1}{\sigma_N}, \tag{51}$$

where σ1 is the greatest singular value while σN is the smallest one. Indeed we have that (see [72])

$$\|A\|^2 = \|A^*A\| = \sup_{k=1,\dots,N} \sigma_k^2 = \sigma_1^2 \tag{52}$$

while

$$\|A^\dagger\|^2 = \|A^\dagger (A^\dagger)^*\| = \sup_{k=1,\dots,N} \frac{1}{\sigma_k^2} = \frac{1}{\sigma_N^2}. \tag{53}$$

In order to show that the numbers 1/σk², k = 1, . . . , N are the eigenvalues of the operator A†(A†)∗, we observe that

$$A^\dagger v_k = \sum_{n=1}^{N} \frac{(v_k, v_n)_Y}{\sigma_n}\, u_n = \frac{1}{\sigma_k}\, u_k \tag{54}$$

and, since

$$\left((A^\dagger)^* u_k, v_n\right)_Y = \left(u_k, A^\dagger v_n\right)_X = \left(u_k, \frac{1}{\sigma_n} u_n\right)_X = \frac{1}{\sigma_n}\, \delta_{kn} \tag{55}$$

we obtain that

$$(A^\dagger)^* u_k = \sum_{n=1}^{N} \left((A^\dagger)^* u_k, v_n\right)_Y v_n = \sum_{n=1}^{N} \frac{1}{\sigma_n}\, \delta_{kn} v_n = \frac{1}{\sigma_k}\, v_k. \tag{56}$$

It follows that the set of triples $\{\frac{1}{\sigma_n}; u_n, v_n\}_{n=1}^{N}$ satisfies the shifted-eigenvalues problem

$$A^\dagger v_n = \frac{1}{\sigma_n}\, u_n, \qquad (A^\dagger)^* u_n = \frac{1}{\sigma_n}\, v_n \tag{57}$$

and so it is the singular system of the operator A†; from (57) we have that

$$A^\dagger (A^\dagger)^* u_k = \frac{1}{\sigma_k}\, A^\dagger v_k = \frac{1}{\sigma_k^2}\, u_k \tag{58}$$

and therefore the numbers 1/σk², k = 1, . . . , N are the eigenvalues of the operator A†(A†)∗.

Expression (51) of the condition number shows that, even if the determination of the generalized solution of an inverse problem with discrete data is a well-posed problem, the numerical stability is not guaranteed. In fact, the faster the singular values decrease, the larger the ratio (51) becomes and the more significant the effects of the propagation of the error from the datum to the generalized solution.
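The role of the condition number (51) can be made concrete with a small numerical experiment. The sketch below is only illustrative: the operator is a hypothetical matrix built with prescribed singular values, the inner products are the plain Euclidean ones, and the noise level is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical operator with prescribed, rapidly decaying singular values.
n = 30
Uy, _ = np.linalg.qr(rng.standard_normal((n, n)))
Vx, _ = np.linalg.qr(rng.standard_normal((n, n)))
s_true = np.logspace(0, -8, n)
A = Uy @ np.diag(s_true) @ Vx.T

# NumPy returns A = U diag(s) Vt: the columns of U play the role of the
# singular vectors v_n and the rows of Vt that of the singular functions u_n.
U, s, Vt = np.linalg.svd(A)
cond = s[0] / s[-1]                      # condition number (51)

t = np.linspace(0.0, 1.0, n)
f_true = np.sin(2 * np.pi * t)
g = A @ f_true
noise = rng.standard_normal(n)
g_noisy = g + 1e-6 * np.linalg.norm(g) * noise / np.linalg.norm(noise)

def generalized_solution(data):
    """Expansion (50): sum over n of (data, v_n) / sigma_n * u_n."""
    return Vt.T @ ((U.T @ data) / s)

u_exact = generalized_solution(g)
u_noisy = generalized_solution(g_noisy)
rel_data_err = np.linalg.norm(g_noisy - g) / np.linalg.norm(g)
rel_sol_err = np.linalg.norm(u_noisy - u_exact) / np.linalg.norm(u_exact)

print("condition number:", cond)
print("relative data error:", rel_data_err)
print("relative solution error:", rel_sol_err)
```

The relative error on the solution is bounded by the condition number times the relative error on the data, and in this example it is far larger than the data error itself.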

2.4 Regularization Theory: General Formulation

In general terms, regularization is the best approximation of an ill-posed problem by a family of neighboring well-posed problems. We motivate the deﬁnition of a regularization operator and of a regularization method in this way: we want to approximate the best-approximate solution u† = A†g of the inverse problem (1) for a speciﬁc right-hand side g in the situation where the “exact data” g are not known precisely, but only an approximation g(δ) with

$$\|g(\delta) - g\|_Y \le \delta \tag{59}$$

is available; we will call g(δ) the noisy data and δ the noise
level.
In the ill-posed case, and above all when the condition number is particularly great, A†g(δ) is certainly not a good approximation of A†g due to the high numerical instability.

In order to repair this pathology, we look for some approximation, say fλ(δ), of u† which does, on the one hand, depend continuously on the (noisy) data g(δ), so that it can be computed in a stable way, and has, on the other hand, the property that as the noise level δ decreases to zero and the regularization parameter λ is chosen appropriately (whatever this means), then fλ(δ) tends to u†.
These considerations lead to the following deﬁnition:

Definition 8 Let A be a linear continuous operator with domain in the Hilbert space X and range in the Hilbert space Y . The one-parameter family of operators {Rλ}λ>0 such that
(1) Rλ : Y → X is bounded ∀λ;
(2) $\lim_{\lambda\to 0} \|R_\lambda g - u^\dagger\|_X = 0$ for every g in Y such that P g ∈ R(A), where P is the linear projection onto the closure of R(A),
is called a regularization algorithm. The regularization algorithm {Rλ}λ>0 is said to be linear if Rλ is linear for each value of the regularization parameter λ.

The operators Rλ represent continuous approximations of the generalized inverse operator of A; in particular, condition (2) can be replaced by

$$\lim_{\lambda \to 0} \|R_\lambda g - A^\dagger g\|_X = 0, \tag{60}$$

and this relation is true for every g in R(A) ⊕ R(A)⊥, which is dense in Y .
The application of a regularization algorithm provides an approximation of the generalized solution of an ill-posed linear inverse problem in the case of noise-free data. However the use of regularization is crucial in the case where the data g is affected by experimental error. In this case g(δ) can be represented in the form

g(δ) = Au† + w(δ)

(61)

with

$$\|w(\delta)\|_Y = \delta. \tag{62}$$

Expression (61) may not be explicitly known but it is always
possible to assume that it exists. Furthermore, it does not necessarily mean that the noise is additive, since w(δ) may depend on u†. In the case of a linear regularization algorithm
one easily obtains

$$\|R_\lambda g(\delta) - u^\dagger\|_X \le \|R_\lambda A u^\dagger - u^\dagger\|_X + \delta\, \|R_\lambda\|. \tag{63}$$

Equation (63) represents a basic inequality in linear regularization theory. The ﬁrst term at the right hand side represents the approximation error due to the use of Rλ instead of the generalized inverse operator; from condition (2) it tends to zero when λ tends to zero. On the other hand, the second term measures the error on the regularized solution Rλg(δ)

due to the presence of noise on the data and typically grows to infinity when λ tends to 0. Every regularization algorithm requires a strategy for choosing the parameter λ in dependence on the error level δ in order to achieve an acceptable total error for the regularized solution. On the one hand, the accuracy of the approximation asks for a small error $\|R_\lambda A u^\dagger - u^\dagger\|_X$, i.e., for a small parameter λ. On the other hand, the stability requires a small $\|R_\lambda\|$, i.e., a large parameter λ. An optimal choice would find a value λopt(δ) of the regularization parameter such that the right hand side of (63) becomes minimal. This choice of the regularization parameter realizes a compromise between accuracy and stability. For a reasonable regularization strategy we expect the regularized solution to converge to the exact solution when the error level tends to zero. We express this requirement through the following definition:

Deﬁnition 9 A regularization algorithm {Rλ}λ>0 is said regular if, for δ → 0, λopt(δ) → 0 and Rλopt(δ)g(δ) → u†.
When we deal with a strongly ill-posed linear inverse problem, λopt(δ) changes very slowly for different values of δ. Moreover, we observe that, even if a function λopt(δ) exists, this does not mean that it is easy to determine it. In general, the estimate of a reliable λopt(δ) is the main problem in regularization theory.
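The compromise expressed by (63) can be visualized by evaluating the two terms of the bound on a grid of values of λ. The sketch below is an illustration under explicit assumptions: the singular values `sigma` and the coefficients `coeff` of u† on the singular functions are invented, and the family Rλ is taken to be the Tikhonov one of Sect. 2.5, for which the approximation error is filtered by λ/(σ² + λ) and the norm of Rλ equals the maximum of σ/(σ² + λ).

```python
import numpy as np

sigma = np.logspace(0, -6, 25)              # assumed singular values of A
coeff = 1.0 / (1.0 + np.arange(25)) ** 2    # assumed coefficients (u_dagger, u_k)_X

def bound(lam, delta):
    """Right-hand side of (63) for the Tikhonov family (assumed choice)."""
    approx_err = np.linalg.norm(lam / (sigma**2 + lam) * coeff)
    noise_err = delta * np.max(sigma / (sigma**2 + lam))
    return approx_err + noise_err

lams = np.logspace(-14, 0, 281)

def best_lambda(delta):
    """Grid minimizer of the bound, a stand-in for lambda_opt(delta)."""
    values = np.array([bound(lam, delta) for lam in lams])
    return lams[np.argmin(values)]

for delta in (1e-2, 1e-4, 1e-6):
    lam_opt = best_lambda(delta)
    print(f"delta = {delta:.0e}  lambda_opt = {lam_opt:.2e}  bound = {bound(lam_opt, delta):.3e}")
```

In this toy setting the minimizer of the bound decreases as the noise level decreases, in agreement with the requirement λopt(δ) → 0 of Definition 9.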

2.5 A Regularization Algorithm: The Tikhonov Method

The Tikhonov method is, historically, the ﬁrst algorithm rigorously described in regularization theory and it has been introduced in order to solve Fredholm integral equations of the ﬁrst kind [79, 80]. The ﬁrst step in the deﬁnition of such method (in a general context) is the minimization of the functional

$$\Phi_\lambda[f] = \|Af - g\|_Y^2 + \lambda \|f\|_X^2 \tag{64}$$

with λ a real positive number. It is not difﬁcult to prove that for each λ the minimum problem is equivalent to the Euler equation

(A∗A + λI )f = A∗g.

(65)

Indeed, fλ is a solution of the minimum problem if and only if, for any t in C and for any function φ in the Hilbert space X, we have

$$\|Af_\lambda - g\|_Y^2 + \lambda \|f_\lambda\|_X^2 \le \|A(f_\lambda + t\phi) - g\|_Y^2 + \lambda \|f_\lambda + t\phi\|_X^2. \tag{66}$$

By writing the norms as scalar products one easily obtains

$$|t|^2 \left(\|A\phi\|_Y^2 + \lambda \|\phi\|_X^2\right) + t \left\{(A\phi, Af_\lambda - g)_Y + \lambda(\phi, f_\lambda)_X\right\} + \bar{t} \left\{(Af_\lambda - g, A\phi)_Y + \lambda(f_\lambda, \phi)_X\right\} \ge 0. \tag{67}$$

From the arbitrariness of the complex number t we have that relation (67) holds if and only if the coefficient of the linear term in t

$$(Af_\lambda - g, A\phi)_Y + \lambda(f_\lambda, \phi)_X \tag{68}$$

is zero for all φ in X and therefore the Euler equation follows.

Since A∗A + λI is (strictly) positive definite, its inverse operator is continuous.¹ It follows that the function fλ which solves the Euler equation is given by

$$f_\lambda = (A^*A + \lambda I)^{-1} A^* g. \tag{69}$$

If we define the operator

$$R_\lambda = (A^*A + \lambda I)^{-1} A^*, \tag{70}$$

then the following theorem holds:

Theorem 5 The one-parameter family of operators {Rλ}λ>0 defined by

$$R_\lambda = (A^*A + \lambda I)^{-1} A^* \tag{71}$$

is a linear regularization algorithm. The algorithm is also regular.

Proof First of all, the relation

$$(A^*A + \lambda I) A^* = A^* (AA^* + \lambda I) \tag{72}$$

holds and, if we multiply both members on the left by (A∗A + λI)−1 and on the right by (AA∗ + λI)−1, we obtain

$$R_\lambda = A^* (AA^* + \lambda I)^{-1}. \tag{73}$$

This means that the range of the regularized inverse operator Rλ is contained in the range of A∗; then, from Theorem 2, we have that the range of Rλ is contained in the range of A†.
The continuity of Rλ is a consequence of the inequality

$$\|R_\lambda g\|_X^2 \le \|AA^*(AA^* + \lambda I)^{-1}\| \, \|(AA^* + \lambda I)^{-1}\| \, \|g\|_Y^2 \tag{74}$$

obtained by using the Schwarz inequality in the relation

$$\|R_\lambda g\|_X^2 = \left(A^*(AA^* + \lambda I)^{-1} g, A^*(AA^* + \lambda I)^{-1} g\right)_X = \left(AA^*(AA^* + \lambda I)^{-1} g, (AA^* + \lambda I)^{-1} g\right)_Y. \tag{75}$$

Now, if σ(AA∗) is the spectrum of AA∗, then

$$\|AA^*(AA^* + \lambda I)^{-1}\| = \sup_{\omega \in \sigma(AA^*)} \frac{\omega}{\omega + \lambda} \le 1 \tag{76}$$

and

$$\|(AA^* + \lambda I)^{-1}\| = \sup_{\omega \in \sigma(AA^*)} \frac{1}{\omega + \lambda} \le \frac{1}{\lambda} \tag{77}$$

so that

$$\|R_\lambda\| \le \frac{1}{\sqrt{\lambda}}, \tag{78}$$

i.e. Rλ is continuous for all λ > 0.

In order to prove the second property of the regularization algorithms, we need arguments based on the spectral theory of linear continuous operators. Let f ∈ N(A)⊥; then

$$\|R_\lambda A f - f\|_X = \left\| \int_0^{\|A\|^2} \frac{\lambda}{\omega + \lambda}\, dE_\omega f \right\|_X \tag{79}$$

where dEω is the spectral measure associated to the operator A∗A (which is self-adjoint and positive semidefinite).² The function ω → λ/(ω + λ) is integrable with respect to the spectral measure over [0, ‖A‖²] and is bounded by 1, integrable over the same interval. The Dominated Convergence Theorem now implies

$$\lim_{\lambda \to 0} \|R_\lambda A f - f\|_X = \|E_0 f\|_X \tag{80}$$

where E0 is the projection onto N(A∗A) = N(A). Since f ∈ N(A)⊥, it follows that E0f = 0.

Finally the regularity of the algorithm is guaranteed by the choice of a function λopt(δ) such that δ/√λ → 0 since in this case inequality (63) becomes

$$\|R_\lambda g(\delta) - u^\dagger\|_X \le \|R_\lambda A u^\dagger - u^\dagger\|_X + \frac{\delta}{\sqrt{\lambda}}. \tag{81}$$

¹ In fact, if λ > 0 and f ∈ X, then $\lambda \|f\|^2 = (\lambda f, f)_X \le ((A^*A + \lambda I)f, f)_X \le \|(A^*A + \lambda I)f\|_X \|f\|_X$ and the claim follows from Theorem 12.12(c) of [73].

² From the Spectral Theorem (see [72]) we have that $h(A^*A) = \int_{\sigma(A^*A)} h(\omega)\, dE_\omega$. In this case $h(\omega) = \left|\frac{\omega}{\omega + \lambda} - 1\right| = \frac{\lambda}{\omega + \lambda}$. Moreover, since $\sup_{\omega \in \sigma(A^*A)} |\omega| = \|A^*A\| = \|A\|^2$, it follows that $\sigma(A^*A) \subseteq [0, \|A\|^2]$.

In the case of linear inverse problems with discrete data, the knowledge of the singular system of the operator A allows us to write the regularized solution fλ (i.e. the function which minimizes the Tikhonov functional defined in (64)) in a very useful way for the applications. In fact, if we write the data vector g as

$$g = \sum_{n=1}^{N} (g, v_n)_Y\, v_n \tag{82}$$

and we recall the shifted-eigenvalues problem (45) which defines the singular system of the operator A, then the Euler equation (65) becomes [9]

$$f_\lambda = \sum_{k=1}^{N} \frac{\sigma_k}{\sigma_k^2 + \lambda}\, (g, v_k)_Y\, u_k. \tag{83}$$

However, it is interesting to notice that, in the case of a lin-
ear inverse problem with discrete data, the knowledge of the
singular system of the operator is not necessary if we want
to ﬁnd the function which minimizes the Tikhonov func-
tional (64). In fact we can obtain Rλ also from (73); but (AA∗ +λI ) assumes values in the Euclidean space Y with ﬁnite dimension N , so it can be described by a N × N matrix.
It follows that the determination of Rλ implies the inversion of a matrix followed by the application of the operator A∗.
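The equivalence between the different expressions of the Tikhonov solution can be checked on a small matrix example. This is a minimal sketch assuming Euclidean inner products on both spaces (W equal to the identity) and a random, hypothetical matrix `A`; the sizes `N`, `P` and the value of `lam` are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P = 8, 50                        # N data, P unknowns (assumed sizes)
A = rng.standard_normal((N, P)) / np.sqrt(P)
g = rng.standard_normal(N)
lam = 1e-2

# Form (73): only an N x N system in the data space has to be solved,
# followed by the application of the adjoint.
f_data_space = A.T @ np.linalg.solve(A @ A.T + lam * np.eye(N), g)

# Form (71): a P x P system in the solution space.
f_solution_space = np.linalg.solve(A.T @ A + lam * np.eye(P), A.T @ g)

# Form (83): filtered singular-value expansion.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
f_svd = Vt.T @ (s / (s**2 + lam) * (U.T @ g))

print("max |(73) - (71)| =", np.abs(f_data_space - f_solution_space).max())
print("max |(73) - (83)| =", np.abs(f_data_space - f_svd).max())
```

When the number of data is much smaller than the number of unknowns, as in this example, the data-space form (73) involves the smallest linear system.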

2.6 Optimal Choice of the Regularization Parameter

The parameter choice rule λ = λ(δ, g(δ)) depends explicitly on the noise level δ and on the actual perturbed data g(δ). Also, it usually depends on every speciﬁc g in the domain of the generalized inverse operator A†; since g is not known, this dependence can only be on some qualitative a-priori knowledge about g like smoothness properties. Finally, λ depends also on the operator A.
We distinguish between two types of parameter choice rules:
Definition 10 Let λ = λ(δ, g(δ)) be a parameter choice rule. If λ does not depend on g(δ), but only on δ, then we call λ an a-priori parameter choice rule. Otherwise, we call λ an a-posteriori parameter choice rule.

Thus, an a-priori parameter choice rule depends only on the noise level, not on the actual data and, hence, not on results obtained during the actual computation like the residual $\|Af_\lambda - g(\delta)\|_Y$, where fλ = Rλg(δ) is the regularized solution. Such a rule may be devised before the actual calculation, hence the name a-priori parameter choice rule.
Let us consider now the specific case of the Tikhonov regularization algorithm. According to Theorem 5, any a-priori choice of the regularization parameter λ = λ(δ) satisfying δ²/λ(δ) → 0 as δ → 0 leads to a regular algorithm for the solution of the linear inverse problem Af = g. Although this asymptotic result may be theoretically satisfying, it would seem that a choice of the regularization parameter that is based on the actual computations performed, that is an a-posteriori choice of the regularization parameter, would be more effective in practice. One such a-posteriori strategy is the discrepancy principle of Morozov (see [9]). The idea of the strategy is to choose the regularization parameter so that the size of the residual $\|Af_\lambda - g(\delta)\|_Y$ is the same as the error level in the data, i.e.,

$$\|Af_\lambda - g(\delta)\|_Y = \delta. \tag{84}$$

In the case of linear inverse problems with discrete data, assuming that the signal-to-noise ratio is larger than one, that is $\|g(\delta)\| > \delta$, and that g ∈ R(A), then it is not hard to see that there is a unique positive parameter λ satisfying (84). To do this, we use the singular value decomposition

$$\|Af_\lambda - g(\delta)\|_Y^2 = \sum_{k=1}^{N} \left(\frac{\lambda}{\sigma_k^2 + \lambda}\right)^2 |(g(\delta), v_k)_Y|^2 + \|P g(\delta)\|^2 \tag{85}$$

where P is the projector of Y onto R(A)⊥. From (85) we see that the real function

$$f(\lambda) = \|Af_\lambda - g(\delta)\|_Y^2 \tag{86}$$

is a continuous, increasing function of λ satisfying (since P g = 0)

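$$\lim_{\lambda \to 0^+} f(\lambda) = \|P g(\delta)\|^2 = \|P g(\delta) - P g\|^2 \le \|g(\delta) - g\|^2 \le \delta^2 \tag{87}$$

and

$$\lim_{\lambda \to +\infty} f(\lambda) = \|g(\delta)\|^2 > \delta^2. \tag{88}$$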

Therefore, by the Intermediate Value Theorem, there is a unique λ = λ(δ, g(δ)) satisfying (84).
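Since the residual is a continuous and increasing function of λ, the parameter selected by the discrepancy principle (84) can be computed with any bracketing root finder. The following sketch uses a bisection on log λ; the operator, the exact solution and the noise realization are hypothetical, Euclidean inner products are assumed, and the operator is invertible so that the term with P in (85) vanishes.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical ill-conditioned operator with prescribed singular values.
n = 30
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -6, n)
A = U @ np.diag(s) @ V.T

f_true = V @ (1.0 / (1.0 + np.arange(n)) ** 2)
g = A @ f_true
delta = 1e-3 * np.linalg.norm(g)
e = rng.standard_normal(n)
g_delta = g + delta * e / np.linalg.norm(e)       # ||g_delta - g|| = delta

def residual_norm(lam):
    """||A f_lam - g_delta|| computed from the expansion (85)."""
    c = U.T @ g_delta
    return np.linalg.norm(lam / (s**2 + lam) * c)

def discrepancy_lambda(lo=1e-20, hi=1e20, iters=200):
    """Bisection in log(lambda) on the increasing function residual_norm."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)
        if residual_norm(mid) < delta:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)

lam_star = discrepancy_lambda()
f_lam = V @ (s / (s**2 + lam_star) * (U.T @ g_delta))   # Tikhonov solution (83)

print("lambda from the discrepancy principle:", lam_star)
print("residual:", np.linalg.norm(A @ f_lam - g_delta), " delta:", delta)
```

The initial bracket is deliberately wide; it only needs to contain a value of λ with residual below δ and one with residual above δ.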
We close this section by showing that the choice λ(δ, g(δ))
as given by the discrepancy method (84) leads to a regular scheme for approximating A†g, that is

fλ(δ,g(δ)) → u† as δ → 0.

(89)

To do this it is sufﬁcient to show that for any sequence δn → 0 there is a subsequence, which for notational convenience we will denote by {δk}, such that

fλ(δk,g(δk)) → u†.

(90)

We are assuming that g ∈ R(A) and we recall that u† is the unique vector in X satisfying Au† = g and u† ∈ N (A)⊥. For sake of simplicity we will neglect the dependency of λ(δ, g(δ)) on g(δ).
From the functional characterization (64) of the Tikhonov approximation we have

$$\Phi_{\lambda(\delta)}[f_{\lambda(\delta)}] \le \Phi_{\lambda(\delta)}[u^\dagger] \tag{91}$$

that is,

$$\delta^2 + \lambda(\delta) \|f_{\lambda(\delta)}\|_X^2 = \|Af_{\lambda(\delta)} - g(\delta)\|_Y^2 + \lambda(\delta) \|f_{\lambda(\delta)}\|_X^2 \le \Phi_{\lambda(\delta)}[u^\dagger] = \|g - g(\delta)\|_Y^2 + \lambda(\delta) \|u^\dagger\|_X^2 \le \delta^2 + \lambda(\delta) \|u^\dagger\|_X^2 \tag{92}$$

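and hence $\|f_{\lambda(\delta)}\|_X \le \|u^\dagger\|_X$. Therefore, for any sequence δn → 0 there is a subsequence δk → 0 with $f_{\lambda(\delta_k)} \rightharpoonup z$ for some z ∈ X (where the arrow ⇀ denotes weak convergence). Since

$$f_{\lambda(\delta)} = A^*(AA^* + \lambda(\delta) I)^{-1} g(\delta) \in R(A^*) \subseteq N(A)^\perp \tag{93}$$

and N(A)⊥ is weakly closed, we find that z ∈ N(A)⊥. Also, since

$$\|Af_{\lambda(\delta_k)} - g(\delta_k)\|_Y \to 0 \tag{94}$$

we see that $Af_{\lambda(\delta_k)} \to g$. But A is weakly continuous and therefore

$$Af_{\lambda(\delta_k)} \rightharpoonup Az. \tag{95}$$

It follows that Az = g and z ∈ N(A)⊥, i.e., z = u†. We observe now that, since

$$f_{\lambda(\delta_k)} \rightharpoonup u^\dagger \tag{96}$$

and the functional $f \mapsto \|f\|_X$ is continuous and convex, from Corollary 1.8.3 of [2] we have that

$$\|u^\dagger\|_X \le \liminf_{k \to +\infty} \|f_{\lambda(\delta_k)}\|_X. \tag{97}$$

But $\|f_{\lambda(\delta_k)}\|_X \le \|u^\dagger\|_X$ ∀k, and therefore

$$\|f_{\lambda(\delta_k)}\|_X \to \|u^\dagger\|_X. \tag{98}$$

From relations (96) and (98) it follows that

$$f_{\lambda(\delta_k)} \to u^\dagger, \tag{99}$$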

(see Theorem 1.8.3 of [2]) and the proof is complete.

3 Application to Solar Physics: Non Thermal Bremsstrahlung

3.1 The Bremsstrahlung Equation

A solar flare is the rapid release of a large amount of energy stored in the solar atmosphere. During a flare, gas is heated to temperatures of 10 to 20 million kelvin and radiates both soft X-rays and shorter-wavelength emission (hard X-rays and γ-rays). It is important to notice that the analysis of the X-ray spectra provides information about the physical processes that take place in the magnetized plasma of the solar atmosphere during a flare, such as impulsive energy release, particle acceleration and particle and energy transport [78]. These high-energy processes play a major role at sites throughout the universe ranging from magnetospheres to active galaxies. Consequently, the importance of understanding these processes transcends the field of solar physics, and represents one of the major goals of space physics and astrophysics.
Hard X-rays are emitted by electrons with relativistic velocities, typically impossible to reach if we take into account simply the agitation due to the temperature of the plasma. Soft X-rays have lower energies and are emitted by electrons with lower or thermal velocities. In the last case the electrons assume shift velocities (about 0.05 times c) that make them leave their original atomic nuclei. These electrons are then attracted by other atomic nuclei which slow them down. The excess energy is released in the form of X-rays in a process known by the German term bremsstrahlung (“braking” radiation). The same process is followed also by electrons with relativistic velocities, but in this case the energies involved are much higher and cause the emission of hard X-rays.
The bremsstrahlung of electrons with the ions of the plasma is then a collisional process and so it is characterized by a cross section which, generally, depends on the energies of the photons and of the X-ray producing electrons. If this cross section is analytically known, the photon spectrum can be directly related to the electron distribution. Generally, the study of the emission process can be made under particular hypotheses about the physical conditions of the source. First of all, bremsstrahlung radiation is considered optically thin [15], that is absorption can be neglected; this implies that the observed X-ray spectra are quite similar to the ones emerging from the emitting region. Moreover the electron velocity distribution is characterized by isotropic conditions. Finally the plasma is assumed to be hydrogen dominated, so that the ions are almost completely protons. Under these hypotheses, the equation linking the distribution function of electrons with the hard X-ray intensity observed at distance R from a source can be written in the following way [20]:

$$I(\epsilon) = \frac{1}{4\pi R^2} \int_V n(\mathbf{r}) \int_\epsilon^\infty F(E, \mathbf{r})\, Q(\epsilon, E)\, dE\, d\mathbf{r} \tag{100}$$

where V is the source volume, n(r) is the local proton density in the plasma, E is the electron energy, ε is the photon energy, F (E, r) is the electron distribution function, Q(ε, E) is the bremsstrahlung cross section, differential in ε, and I (ε) is the total rate of photon emission measured in photons cm−2 s−1 keV−1. It must be noted that I (ε) represents the flux at the sun; in practice, measurements regard fluxes at the earth, but here only the shape of the spectrum and the relative errors in it are considered, so that the multiplicative constants are irrelevant.
Averaging the electron distribution function over the volume of the emitting region and weighting this function with the density of the ions, one obtains the relation [20]:

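$$I(\epsilon) = \frac{\bar{n} V}{4\pi R^2} \int_\epsilon^\infty \bar{F}(E)\, Q(\epsilon, E)\, dE, \tag{101}$$

where the mean electron spectrum $\bar{F}(E)$ is given by

$$\bar{F}(E) = \frac{1}{\bar{n} V} \int_V n(\mathbf{r})\, F(E, \mathbf{r})\, d\mathbf{r}, \tag{102}$$

and the mean target ion density in the source is defined as

$$\bar{n} = \frac{1}{V} \int_V n(\mathbf{r})\, d\mathbf{r}. \tag{103}$$

For sake of simplicity, we define

$$f(E) = \bar{n} V \bar{F}(E), \tag{104}$$

$$g(\epsilon) = 4\pi R^2 I(\epsilon) \tag{105}$$

and the integral equation we are interested in becomes

$$g(\epsilon) = \int_\epsilon^\infty f(E)\, Q(\epsilon, E)\, dE. \tag{106}$$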

Equation (106) is the bremsstrahlung equation in solar plasma physics: it can be written without any assumption on the physical processes in the source and just for this reason it is the most general equation describing the X-ray emission mechanism during solar flares. We point out that this approach is completely isotropic, i.e. no angular dependency in the mean electron spectrum or in the cross section is considered. First investigations toward anisotropic modelling of the emission will be described in Sect. 5, although further generalizations of (106) are in progress.
A crucial role in (106) is played by the cross section Q(ε, E) differential in the photon energy ε at electron energy E. In practice, Q(ε, E) measures the probability that an X-ray photon of energy ε is emitted by an electron of energy E ≥ ε. This function carries a significant physical meaning, describing the effectiveness of the bremsstrahlung process. Furthermore, on the mathematical side, its analytical shape has important consequences on the numerical stability of the solution process and provides a reliable indication of the information content which can be retrieved from the X-ray data. According to the physical conditions where the emission process takes place, many different forms of such a cross section can be written [44]. Among them, three representatives are particularly meaningful, representing three different amounts of the impact due to the relativistic effects on the emission process. These three cross sections, represented in Fig. 1, are:

• The Kramers cross section

$$Q_K(\epsilon, E) = \frac{Q_0}{\epsilon E}. \tag{107}$$

It is a completely classical formula, where the relativistic effects are neglected. In (107) the multiplicative constant factor Q0 is defined as

$$Q_0 = \frac{8}{3}\, \frac{Z r_0^2}{137}\, mc^2, \tag{108}$$

where r0 is the classical radius of the electron, m is its rest mass, c is the speed of light in the vacuum and Z is the average atomic number in the plasma (Z is of the order of 1, since the plasma is mostly made of ionized hydrogen).

• The Bethe-Heitler formula

$$Q_{BH}(\epsilon, E) = \frac{Q_0}{\epsilon E}\, \log \frac{1 + \sqrt{1 - \epsilon/E}}{1 - \sqrt{1 - \epsilon/E}}, \tag{109}$$

where Q0 is as in (108) and which accounts for mild relativistic effects in the logarithmic factor at the right hand side.

• The highly relativistic formula given by equation (3BN) in [44]. Such cross section is considered the most general one for this kind of process and its analytical form is

$$Q_{3BN}(\epsilon, E) = \frac{Z^2 r_0^2}{137}\, \frac{p_f}{k\, p_0} \Bigg\{ \frac{4}{3} - 2 E_0 E_f\, \frac{p_f^2 + p_0^2}{p_f^2 p_0^2} + \frac{\nu_0 E_f}{p_0^3} + \frac{\nu_f E_0}{p_f^3} - \frac{\nu_0 \nu_f}{p_0 p_f} + L \Bigg[ \frac{8 E_0 E_f}{3 p_0 p_f} + \frac{k^2 \left(E_0^2 E_f^2 + p_0^2 p_f^2\right)}{p_0^3 p_f^3} + \frac{k}{2 p_0 p_f} \left( \frac{E_0 E_f + p_0^2}{p_0^3}\, \nu_0 - \frac{E_0 E_f + p_f^2}{p_f^3}\, \nu_f + \frac{2 k E_0 E_f}{p_f^2 p_0^2} \right) \Bigg] \Bigg\} \tag{110}$$

where

$$k = \frac{\epsilon}{mc^2}; \quad E_0 = \frac{E}{mc^2} + 1; \quad E_f = E_0 - k; \quad p_0 = \sqrt{E_0^2 - 1}; \quad p_f = \sqrt{E_f^2 - 1};$$

$$L = 2 \ln \frac{E_0 E_f + p_0 p_f - 1}{k}; \quad \nu_0 = \ln \frac{E_0 + p_0}{E_0 - p_0}; \quad \nu_f = \ln \frac{E_f + p_f}{E_f - p_f}.$$

Fig. 1 Cross sections: (a) Kramers approximation; (b) Bethe-Heitler approximation; (c) highly relativistic cross section

There are two reasons why it is interesting to study (106) for different forms of the cross section. The first one is physical and is the fact that the differences in the solution f (E) for the three different forms of Q(ε, E) previously introduced can provide quantitative information on the incidence of the relativistic corrections on the photon production in the solar plasma. The second reason is computational and is the fact that the numerical solution of (106) is notably less onerous in the case of (107) and (109) and then it is helpful to verify to what extent the highly relativistic effects can be neglected in the solution procedure. In the next two sections an analytical study of the solution of the bremsstrahlung problem will be performed in the case of continuous and discrete photon spectra for the Kramers and Bethe-Heitler approximations. The case concerned with the more complicated cross section (110) will be treated in Sect. 3.4.

3.2 Analytical Study of the Bremsstrahlung Equation

In the case of the Kramers and Bethe-Heitler cross sections, (106) can be analytically solved by applying the theory of Mellin transform [76]. The Mellin transform of a function f : (0, ∞) −→ C such that

$$\int_0^\infty |f(x)|\, x^{\sigma - 1}\, dx < \infty \tag{111}$$

for some real number σ in (0, 1/2) is defined as

$$\hat{f}(\xi) = \int_0^\infty f(x)\, x^{-\frac{1}{2} + i\xi}\, dx, \tag{112}$$

where ξ belongs to (−∞, +∞).

Theorem 6 Let f : (0, ∞) −→ C be a continuously differentiable function such that condition (111) holds. Then

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat{f}(\xi)\, x^{-\frac{1}{2} - i\xi}\, d\xi \tag{113}$$

for every x ∈ (0, ∞).

Proof If we make the change of variable x = e−t in (112), we see that the Mellin transform of a function x → f (x) coincides with the Fourier transform of the function t → F (t) = f (e−t )e−t/2. In fact,

$$\hat{f}(\xi) = \int_0^\infty f(x)\, x^{-\frac{1}{2} + i\xi}\, dx = \int_{-\infty}^{+\infty} f(e^{-t})\, e^{(\frac{1}{2} - i\xi)t}\, e^{-t}\, dt = \int_{-\infty}^{+\infty} f(e^{-t})\, e^{-\frac{t}{2}}\, e^{-i\xi t}\, dt = \int_{-\infty}^{+\infty} F(t)\, e^{-i\xi t}\, dt = \hat{F}(\xi). \tag{114}$$

This equality and the inversion formula for the Fourier transform lead to the inversion formula for the Mellin transform.

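Indeed

$$f(e^{-t})\, e^{-\frac{t}{2}} = F(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat{f}(\xi)\, e^{i\xi t}\, d\xi. \tag{115}$$

Coming back to the variable x we have

$$f(x) \sqrt{x} = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat{f}(\xi)\, x^{-i\xi}\, d\xi \tag{116}$$

and the explicit form of the Mellin inverse transform follows:

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \hat{f}(\xi)\, x^{-\frac{1}{2} - i\xi}\, d\xi. \tag{117}$$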

The Mellin transform is particularly helpful in the case of linear integral equations of the ﬁrst kind when the kernel depends on the variables ratio. Indeed the following theorem holds:

Theorem 7 We consider the integral equation

$$g(x) = \int_0^{+\infty} k\!\left(\frac{x}{y}\right) f(y)\, \frac{dy}{y}. \tag{118}$$

The application of the Mellin transform to both sides of this equation leads to the diagonalization

$$\hat{g}(\xi) = \hat{f}(\xi)\, \hat{k}(\xi). \tag{119}$$

Proof We have

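$$\hat{g}(\xi) = \int_0^\infty g(x)\, x^{-\frac{1}{2} + i\xi}\, dx = \int_0^\infty \left[ \int_0^{+\infty} k\!\left(\frac{x}{y}\right) f(y)\, \frac{dy}{y} \right] x^{-\frac{1}{2} + i\xi}\, dx = \int_0^\infty f(y)\, y^{-\frac{1}{2} + i\xi}\, dy \times \int_0^\infty k\!\left(\frac{x}{y}\right) \left(\frac{x}{y}\right)^{-\frac{1}{2} + i\xi} \frac{dx}{y} = \int_0^\infty f(y)\, y^{-\frac{1}{2} + i\xi}\, dy \int_0^\infty k(z)\, z^{-\frac{1}{2} + i\xi}\, dz = \hat{f}(\xi)\, \hat{k}(\xi). \tag{120}$$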

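When Q(ε, E) assumes the forms (107) or (109) the bremsstrahlung equation (106) assumes the same form as (118). In fact, if one introduces the new integral kernel

$$K(\epsilon, E) = \begin{cases} 0, & E < \epsilon, \\ \dfrac{\epsilon E}{Q_0}\, Q(\epsilon, E), & E \ge \epsilon \end{cases} \tag{121}$$

and defines the new data function

$$J(\epsilon) = \frac{\epsilon\, g(\epsilon)}{Q_0}, \tag{122}$$

(106) becomes

$$J(\epsilon) = \int_0^\infty f(E)\, K(\epsilon, E)\, \frac{dE}{E}. \tag{123}$$

Now let us introduce the functions

$$k(\epsilon/E) = k_K(\epsilon/E) = \begin{cases} 1, & \epsilon/E \le 1, \\ 0, & \text{otherwise} \end{cases} \tag{124}$$

for the Kramers cross section and

$$k(\epsilon/E) = k_{BH}(\epsilon/E) = \begin{cases} \log \dfrac{1 + \sqrt{1 - \epsilon/E}}{1 - \sqrt{1 - \epsilon/E}}, & \epsilon/E \le 1, \\ 0, & \text{otherwise} \end{cases} \tag{125}$$

for the Bethe-Heitler cross section, so that (123) can be written as

$$J(\epsilon) = \int_0^\infty f(E)\, k(\epsilon/E)\, \frac{dE}{E}. \tag{126}$$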

It follows that, if the Mellin transform of the integral kernels (124) and (125) does not vanish anywhere, the analytical solution of (106) when these two cross sections are used can be easily obtained by using Theorems 6 and 7.
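The rescaling (121)-(126) and the forward model (106) can be checked numerically. The sketch below is illustrative only: the constant Q0 is set to 1, the electron spectrum is an assumed power law E^(−δ) with a hypothetical index, the photon energy of 20 keV is an arbitrary choice, and the integral is truncated at a large multiple of ε. For the Kramers cross section the power-law case has the closed form g(ε) = Q0 ε^(−δ−1)/δ, which is used as a reference.

```python
import numpy as np

Q0 = 1.0   # the constant (108) is set to 1: only shapes matter in this sketch

def q_kramers(eps, E):
    """Kramers cross section (107), set to zero for E < eps."""
    return np.where(E >= eps, Q0 / (eps * E), 0.0)

def q_bethe_heitler(eps, E):
    """Bethe-Heitler cross section (109), set to zero for E < eps."""
    r = np.sqrt(np.clip(1.0 - eps / E, 0.0, None))
    return np.where(E >= eps, Q0 / (eps * E) * np.log((1.0 + r) / (1.0 - r)), 0.0)

def rescaled_kernel(cross_section, eps, E):
    """Kernel (121): eps * E * Q / Q0, which depends on eps / E only."""
    return eps * E * cross_section(eps, E) / Q0

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def photon_data(eps, f, cross_section, E_max_factor=1e4, n=20001):
    """g(eps) from (106), integrated in log(E) between eps and E_max_factor * eps."""
    E = eps * np.logspace(0.0, np.log10(E_max_factor), n)
    return trapezoid(f(E) * cross_section(eps, E) * E, np.log(E))

delta_index = 3.0                          # assumed power-law index

def f_power(E):
    return E ** (-delta_index)

eps = 20.0                                 # photon energy in keV (illustrative)
g_num = photon_data(eps, f_power, q_kramers)
g_exact = Q0 * eps ** (-delta_index - 1.0) / delta_index

print("Kramers, numerical g:", g_num, " closed form:", g_exact)
print("Bethe-Heitler g:", photon_data(eps, f_power, q_bethe_heitler))
```

The function `rescaled_kernel` returns the same values when ε and E are multiplied by a common factor, which is the property that allows the use of the Mellin transform.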

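Theorem 8 Let f : (0, ∞) → C be a continuously differentiable function such that condition (111) holds. Then the solution of (106) is given by

$$f(E) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{1 + 2i\xi}{2}\, \hat{J}(\xi)\, E^{-\left(\frac{1}{2} + i\xi\right)}\, d\xi \qquad (127)$$

in the case of the Kramers cross section and by

$$f(E) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} \frac{1 + 2i\xi}{2\sqrt{\pi}}\, \frac{\Gamma(1 + i\xi)}{\Gamma\!\left(\frac{1}{2} + i\xi\right)}\, \hat{J}(\xi)\, E^{-\left(\frac{1}{2} + i\xi\right)}\, d\xi \qquad (128)$$

in the case of the Bethe-Heitler cross section.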

Proof Let us introduce the Beta function B(p, q) defined as

$$B(p, q) = \int_0^1 x^{p-1} (1 - x)^{q-1}\, dx, \qquad \mathrm{Re}(p) > 0,\ \mathrm{Re}(q) > 0 \qquad (129)$$

and its relation $B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}$ with the Gamma function defined as

$$\Gamma(z) = \int_0^{\infty} e^{-t}\, t^{z-1}\, dt, \qquad \mathrm{Re}(z) > 0. \qquad (130)$$

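We observe that, for the Kramers case,

$$\hat{k}_K(\xi) = \int_0^1 x^{-\frac{1}{2} + i\xi}\, dx = B\!\left(\tfrac{1}{2} + i\xi,\, 1\right) = \frac{\Gamma\!\left(\frac{1}{2} + i\xi\right) \Gamma(1)}{\Gamma\!\left(\frac{3}{2} + i\xi\right)} = \frac{2}{1 + 2i\xi}, \qquad (131)$$

where in the last equality we used the relations Γ(1) = 1 and Γ(1 + z) = zΓ(z), z ∈ C.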
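The closed form in (131) can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names, the truncation point `t_max` and the grid size are illustrative choices and not part of the original derivation. The substitution x = e^{-t} turns the integral over (0, 1) into an exponentially decaying integral over (0, ∞), which a plain trapezoidal rule handles well.

```python
import numpy as np

def mellin_kramers_numeric(xi, t_max=80.0, n=400_001):
    """Approximate int_0^1 x^(-1/2 + i*xi) dx using the substitution x = exp(-t).

    After the substitution the integrand becomes exp(-(1/2 + i*xi) * t) on (0, inf);
    the tail beyond t_max is of order exp(-t_max / 2) and is neglected.
    """
    t = np.linspace(0.0, t_max, n)
    y = np.exp(-(0.5 + 1j * xi) * t)
    h = t[1] - t[0]
    # Composite trapezoidal rule written out explicitly.
    return h * (y.sum() - 0.5 * (y[0] + y[-1]))

def mellin_kramers_exact(xi):
    """Closed form from (131): 2 / (1 + 2 i xi)."""
    return 2.0 / (1.0 + 2.0j * xi)

for xi in (0.0, 1.0, 3.0):
    num = mellin_kramers_numeric(xi)
    exact = mellin_kramers_exact(xi)
    print(xi, abs(num - exact))
```

The printed differences should be small compared with the size of the transform, which is consistent with the Beta-function evaluation above.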

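For the Bethe-Heitler case, integration by parts leads to

$$\begin{aligned}
\hat{k}_{BH}(\xi) &= \int_0^1 \log\!\left( \frac{1 + \sqrt{1 - x}}{1 - \sqrt{1 - x}} \right) x^{-\frac{1}{2} + i\xi}\, dx \\
&= -\int_0^1 \frac{-1}{x \sqrt{1 - x}} \cdot \frac{x^{\frac{1}{2} + i\xi}}{\frac{1}{2} + i\xi}\, dx \\
&= \frac{1}{\frac{1}{2} + i\xi} \int_0^1 \frac{1}{\sqrt{1 - x}}\, x^{-\frac{1}{2} + i\xi}\, dx \\
&= \frac{1}{\frac{1}{2} + i\xi}\, B\!\left(\tfrac{1}{2} + i\xi,\, \tfrac{1}{2}\right).
\end{aligned} \qquad (132)$$

Since Γ(1/2) = √π we have

$$\hat{k}_{BH}(\xi) = \frac{\sqrt{\pi}}{\frac{1}{2} + i\xi}\, \frac{\Gamma\!\left(\frac{1}{2} + i\xi\right)}{\Gamma(1 + i\xi)} = \frac{2\sqrt{\pi}}{1 + 2i\xi}\, \frac{\Gamma\!\left(\frac{1}{2} + i\xi\right)}{\Gamma(1 + i\xi)}. \qquad (133)$$

We finally observe that neither $\hat{k}_K(\xi)$ nor $\hat{k}_{BH}(\xi)$ vanishes anywhere.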

Even if Theorem 8 formally gives the solutions of (106) in the case of both Kramers and Bethe-Heitler cross sections, we have to point out the intrinsic ill-posedness of this procedure. In fact, kK and kBH are analytical functions which tend to zero when |ξ | → ∞. Indeed, while in the Kramers approximation the computation is trivial, in the Bethe-Heitler case we have that

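$$\begin{aligned}
\lim_{|\xi| \to \infty} \left| \hat{k}_{BH}(\xi) \right|
&= \lim_{|\xi| \to \infty} \left| \frac{2\sqrt{\pi}}{1 + 2i\xi} \cdot \frac{\Gamma\!\left(\frac{1}{2} + i\xi\right)}{\Gamma(1 + i\xi)} \right| \\
&= \lim_{|\xi| \to \infty} \frac{2\sqrt{\pi}}{\sqrt{1 + 4\xi^2}} \cdot \frac{\left| \Gamma\!\left(\frac{1}{2} + i\xi\right) \right|}{\left| \Gamma(1 + i\xi) \right|} \\
&= \lim_{|\xi| \to \infty} \frac{2\sqrt{\pi}}{\sqrt{1 + 4\xi^2}} \cdot \frac{\left| \Gamma\!\left(\frac{1}{2} + i\xi\right) \right|}{|i\xi|\, |\Gamma(i\xi)|} \\
&= \lim_{|\xi| \to \infty} \frac{2\sqrt{\pi}}{\sqrt{1 + 4\xi^2}} \cdot \sqrt{\frac{\pi}{\cosh(\pi\xi)}} \cdot \frac{1}{|\xi|} \cdot \sqrt{\frac{|\xi|\, |\sinh(\pi\xi)|}{\pi}} \\
&= \lim_{|\xi| \to \infty} \frac{2\sqrt{\pi}}{\sqrt{1 + 4\xi^2}} \cdot \sqrt{\frac{\tanh(\pi\xi)}{\xi}} = 0,
\end{aligned} \qquad (134)$$

where we used the properties of the Gamma function (see [1])

$$\Gamma(i\xi)\, \Gamma(-i\xi) = |\Gamma(i\xi)|^2 = \frac{\pi}{\xi \sinh(\pi\xi)}; \qquad (135)$$

$$\Gamma\!\left(\tfrac{1}{2} + i\xi\right) \Gamma\!\left(\tfrac{1}{2} - i\xi\right) = \left| \Gamma\!\left(\tfrac{1}{2} + i\xi\right) \right|^2 = \frac{\pi}{\cosh(\pi\xi)}. \qquad (136)$$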

Therefore a small perturbation of $\hat{J}(\xi)$ for large values of ξ produces a completely different solution, since, as follows from (127) and (128), this small perturbation is divided by the vanishing values of $\hat{k}_K$ and $\hat{k}_{BH}$. In general, the integrals (127) and (128) do not converge when (119) is replaced by the more realistic relation

$$\hat{g}(\xi) = \hat{f}(\xi)\, \hat{k}(\xi) + n(\xi) \qquad (137)$$

where n(ξ) is a function describing the effect of the noise on the data.
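The amplification of noise described here can be made concrete with a small numerical sketch. It assumes NumPy; the moduli used below follow from (131) and from (133) combined with (135) and (136), while the function names and the noise amplitude `noise_level` are hypothetical and chosen only for illustration.

```python
import numpy as np

def abs_k_kramers(xi):
    """Modulus of the Kramers transform (131): |2 / (1 + 2 i xi)|."""
    xi = np.asarray(xi, dtype=float)
    return 2.0 / np.sqrt(1.0 + 4.0 * xi**2)

def abs_k_bethe_heitler(xi):
    """Modulus of the Bethe-Heitler transform, as in the last line of (134):
    2 sqrt(pi) / sqrt(1 + 4 xi^2) * sqrt(tanh(pi xi) / xi)."""
    xi = np.asarray(xi, dtype=float)
    # tanh(pi xi) / xi tends to pi as xi -> 0, so xi = 0 is handled explicitly.
    safe = np.where(xi == 0.0, 1.0, xi)
    ratio = np.where(xi == 0.0, np.pi, np.tanh(np.pi * safe) / safe)
    return 2.0 * np.sqrt(np.pi) / np.sqrt(1.0 + 4.0 * xi**2) * np.sqrt(ratio)

xi = np.array([0.0, 1.0, 10.0, 100.0])
noise_level = 1e-3  # hypothetical size of |n(xi)|, assumed flat in xi

# Dividing the noise term of (137) by the kernel transform gives the error
# carried into the transform of the solution at each frequency.
amp_K = noise_level / abs_k_kramers(xi)
amp_BH = noise_level / abs_k_bethe_heitler(xi)
print(amp_K)
print(amp_BH)
```

Both sequences grow with |ξ|, and the Bethe-Heitler one grows faster because its transform decays like |ξ|^{-3/2} rather than |ξ|^{-1}.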
For practical applications, (127) and (128) are of little interest. Indeed, in order to obtain the mean electron spectrum from the analytic solutions of the bremsstrahlung equation provided by the previous theorem, the photon spectrum should be known over the whole energy interval (ideally, up to infinite photon energies) and with an accuracy sufficiently high to make the computation of the Mellin transform of the data possible. However, in real observations, the detectors provide data vectors whose components correspond to noisy values of the X-ray spectra sampled over a bounded range of photon energies. Therefore it is natural to study the functional analytic properties of the operator $A_1 : L^2(\epsilon_{\min}, \infty) \to L^2(\epsilon_{\min}, \epsilon_{\max})$ defined as

$$(A_1 f)(\epsilon) = \int_{\epsilon_{\min}}^{\infty} f(E)\, K(\epsilon, E)\, \frac{dE}{E} \qquad (138)$$

when $\epsilon_{\min} > 0$, $\epsilon_{\max} < \infty$ and the integral kernel is given by (121) with Q(ε, E) as in (107) for the Kramers case and as in (109) for the Bethe-Heitler case.
The following theorem proves that such an operator is compact.

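Theorem 9 Let us consider the linear integral operator $A_1 : L^2(\epsilon_{\min}, \infty) \to L^2(\epsilon_{\min}, \epsilon_{\max})$ defined in (138). For both the Kramers and the Bethe-Heitler integral kernels this operator is compact.

Proof Let us consider the functions

$$E \mapsto \frac{1}{E^2}\, K^2(\epsilon, E) \qquad (139)$$

and

$$\epsilon \mapsto \int_{\epsilon_{\min}}^{\infty} \frac{1}{E^2}\, K^2(\epsilon, E)\, dE; \qquad (140)$$

by Fubini's theorem, if we show that the first one belongs to $L^1(\epsilon_{\min}, \infty)$ and the second one belongs to $L^1(\epsilon_{\min}, \epsilon_{\max})$, we have that the function

$$(\epsilon, E) \mapsto \frac{1}{E}\, K(\epsilon, E) \qquad (141)$$

is in $L^2([\epsilon_{\min}, \epsilon_{\max}] \times [\epsilon_{\min}, \infty))$ and, therefore, that $A_1$ is Hilbert-Schmidt. But, in the Kramers case,

$$\int_{\epsilon_{\min}}^{\infty} \frac{1}{E^2}\, K^2(\epsilon, E)\, dE \le \frac{1}{\epsilon_{\min}} \qquad (142)$$

while, in the Bethe-Heitler case,

$$\begin{aligned}
\int_{\epsilon_{\min}}^{\infty} \frac{1}{E^2}\, K^2(\epsilon, E)\, dE
&= \int_{\epsilon}^{\infty} \frac{1}{E^2} \left( \log \frac{1 + \sqrt{1 - \epsilon/E}}{1 - \sqrt{1 - \epsilon/E}} \right)^2 dE \\
&\le \int_{\epsilon_{\min}}^{\infty} \frac{1}{E^2} \left( \log \frac{1 + \sqrt{1 - \epsilon/E}}{1 - \sqrt{1 - \epsilon/E}} \right)^2 dE \\
&\le \int_{\epsilon_{\min}}^{\infty} \frac{1}{E^2} \left( \log \frac{1 + \sqrt{1 - \epsilon_{\min}/E}}{1 - \sqrt{1 - \epsilon_{\min}/E}} \right)^2 dE < \infty
\end{aligned} \qquad (143)$$

where we use the fact that the function

$$\epsilon \mapsto \left( \log \frac{1 + \sqrt{1 - \epsilon/E}}{1 - \sqrt{1 - \epsilon/E}} \right)^2$$

is a decreasing function of ε. The proof is completed by observing that $[\epsilon_{\min}, \epsilon_{\max}]$ is compact.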

The linear inverse problem

$$J = A_1 f \qquad (144)$$

is ill-posed in the sense of Hadamard. In particular, the compactness of $A_1$ implies that the solution of the problem does not depend continuously on the data. From a practical viewpoint, the impact of ill-posedness on the inversion of real photon spectra is notable. In fact, any discretization of (144) must account for the numerical ill-conditioning that arises from the measurement noise on the observed spectra. It follows that at some stage of the inversion process a regularization procedure must be introduced, in order to reduce the numerical instabilities and to provide physically reliable approximate solutions of the inverse problem.

3.3 The Inverse Problem with Discrete Data

During flare observations, X-ray detectors provide sets of numbers which are proportional to the integral of the photon spectrum over small ranges of photon energies. However, in the case of last-generation devices such as RHESSI, it is realistic to assume that the difference between these counts and the point values of g(ε) at some energy value in the channel is negligible (for a quantitative discussion of this difference, see [63]). It follows that the equation we have to deal with when treating observed spectra is

$$g(\epsilon_n) = \int_{\epsilon_n}^{\infty} f(E)\, Q(\epsilon_n, E)\, dE, \qquad n = 1, \ldots, N, \qquad (145)$$

where $\epsilon_n$ denotes the sampled photon energies for n = 1, . . . , N. In the case of the Kramers and Bethe-Heitler integral kernels, we want to address the study of this equation by maintaining its solution in an infinite-dimensional space. In fact, we still assume that the source space is $L^2(\epsilon_{\min}, \infty)$ and we choose as data space the finite-dimensional vector space Y equipped with the weighted inner product

$$(g, h)_Y = \sum_{m,n=1}^{N} g_m\, w_{mn}\, h_n. \qquad (146)$$

Then we consider the finite-rank linear operator $A_2 : L^2(\epsilon_{\min}, \infty) \to Y$ defined by

$$(A_2 f)_n = \int_{\epsilon_n}^{\infty} f(E)\, Q(\epsilon_n, E)\, dE, \qquad n = 1, \ldots, N \qquad (147)$$

and the inverse problem we are interested in can be described by the equation

$$g = A_2 f. \qquad (148)$$

Equation (148) shows that the problem of the determination of the distribution function of electrons from the knowledge of the photon spectrum is a linear inverse problem with discrete data. In the cases of the Kramers and the Bethe-Heitler cross sections, we can follow the approach described in Sect. 2.3 and compute analytically the functions $\phi_n(E)$ defined in (27) and the entries of the Gram matrix given in (33). If we adopt the Kramers approximation, the explicit form of the functions $\phi_n$ is given by

$$\phi_n(E) = \begin{cases} 0, & E < \epsilon_n, \\ \dfrac{1}{\epsilon_n E}, & E \ge \epsilon_n \end{cases} \qquad (149)$$

and a straightforward integration shows that the mn entry of the Gram matrix is

$$G_{mn} = \begin{cases} \dfrac{1}{\epsilon_m^2\, \epsilon_n}, & m \ge n, \\ G_{nm}, & m < n. \end{cases} \qquad (150)$$
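A minimal sketch of how the Kramers Gram matrix (150) can be assembled is given below. It assumes NumPy, increasing sampled energies, and an identity weight matrix in (146); the latter is an assumption made only for this illustration, since the weights $w_{mn}$ actually used are not restated here, so the resulting numbers are not meant to reproduce the tabulated values. The function name `kramers_gram` is hypothetical.

```python
import numpy as np

def kramers_gram(eps):
    """Gram matrix (150) for increasing energies eps_1 < ... < eps_N.

    For m >= n the entry is 1 / (eps_m^2 * eps_n); written symmetrically this is
    1 / (eps_m * eps_n * max(eps_m, eps_n)).
    """
    eps = np.asarray(eps, dtype=float)
    em, en = np.meshgrid(eps, eps, indexing="ij")
    return 1.0 / (em * en * np.maximum(em, en))

# Uniform sampling from 10 keV to 99 keV with 1 keV step (N = 90).
eps = np.arange(10.0, 100.0, 1.0)
G = kramers_gram(eps)

# Assumption: identity weights in the data space, so that the singular values
# of the finite-rank operator are the square roots of the eigenvalues of G.
sigma = np.sqrt(np.linalg.eigvalsh(G)[::-1])
condition_number = sigma[0] / sigma[-1]
print(condition_number)
```

With a different choice of weights the singular values change, so the printed condition number should be read only as an illustration of the procedure.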

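If we adopt the Bethe-Heitler cross section, the explicit form of the functions $\phi_n$ is given by

$$\phi_n(E) = \begin{cases} 0, & E < \epsilon_n, \\ \dfrac{1}{\epsilon_n E} \log \dfrac{1 + \sqrt{1 - \epsilon_n/E}}{1 - \sqrt{1 - \epsilon_n/E}}, & E \ge \epsilon_n \end{cases} \qquad (151)$$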

and the mn entry of the Gram matrix is

• for m > n:

4

Gmn =

3

1+

( n m) 2 log 1−

n

m−
n

2

n

2 m

log

n m

m

+

2

n

2 m

1+ m
n

log 1 − n
m

;

(152)

• for m = n:

$$G_{mm} = \frac{8 \log 2}{\epsilon_m^3}; \qquad (153)$$

• for m < n:

$$G_{mn} = G_{nm}. \qquad (154)$$

In the case of the relativistic cross section (110), the explicit form of the functions φn(E) is extremely complicated and the entries of the Gram matrix can be computed only through some numerical integration.

3.4 The Relativistic Cross Section

For computational reasons, in the case of the highly relativistic cross section $Q_{3BN}(\epsilon, E)$ we have preferred not to maintain the solution in an infinite-dimensional space but to fully discretize the integral equation (106) and to study the resulting rectangular linear system

$$\mathbf{g} = A \mathbf{f}, \qquad (155)$$

where g is the data vector with components

$$(\mathbf{g})_n = g(\epsilon_n), \qquad n = 1, \ldots, N, \qquad (156)$$

f is the solution vector with components

$$(\mathbf{f})_m = f(E_m), \qquad m = 1, \ldots, M \qquad (157)$$

and A is the rectangular matrix with entries

$$A_{nm} = Q(\epsilon_n, E_m)\, \eta_{nm}, \qquad (158)$$

$\eta_{nm}$ being the quadrature coefficients (which depend on the kind of sampling used). Here, for reasons related to the physics of the problem and, in particular, to the characteristics of the acquisition procedure followed by RHESSI, we will always have M > N. The generalized solution of the linear system (155), i.e. the vector solving the constrained least-squares problem

$$\| \mathbf{g} - A \mathbf{f} \| = \min, \qquad \| \mathbf{f} \| = \min, \qquad (159)$$

is given by

$$\mathbf{f}^{\dagger} = \sum_{k=1}^{N} \frac{1}{\sigma_k}\, (\mathbf{g}, \mathbf{v}_k)\, \mathbf{u}_k, \qquad (160)$$

where $\{\sigma_k; \mathbf{v}_k, \mathbf{u}_k\}_{k=1}^{N}$ is the singular system of the matrix A defined by means of

$$A \mathbf{u}_k = \sigma_k \mathbf{v}_k, \qquad A^t \mathbf{v}_k = \sigma_k \mathbf{u}_k. \qquad (161)$$

Analogously, the Tikhonov regularized solution of (155) is given by

$$\mathbf{f}_{\lambda} = \sum_{k=1}^{N} \frac{\sigma_k}{\sigma_k^2 + \lambda}\, (\mathbf{g}, \mathbf{v}_k)\, \mathbf{u}_k \qquad (162)$$

and the optimal value of the regularization parameter can be fixed by applying the discrepancy principle in its obvious discrete formulation.

4 Reconstruction of the Mean Electron Spectrum

The aim of this section is two-fold. On the one hand we want to study the effectiveness of the Tikhonov regularization method for solving the general bremsstrahlung equation, by using simulated spectra. On the other hand, we want to describe the sensitivity of this inversion approach to the use of different integration kernels, i.e. to different choices of the bremsstrahlung cross section. In particular, our discussion will be organized into three sections. In the first one we will study the conditioning of the problem by means of the Singular Value Decomposition of the integral operator. In the second one we will use realistic synthetic data to test the effectiveness of the Tikhonov method and of the optimal criterion for choosing the regularization parameter. In the last section, we will give an example of application of the Tikhonov method to real X-ray data recorded by RHESSI.

4.1 Condition Number for the Different Cross Sections

A critical point in the numerical solution of integral equations is the dependence of the numerical instability degree

on the shape of the integral kernel. In the case of the bremsstrahlung equation, such an issue assumes a particularly relevant role, owing to the notable physical meaning of the approximations leading to the different forms of the bremsstrahlung cross section. The key question, here, is to assess if, and under which physical conditions, the use of simplified cross sections may increase the ill-conditioning of the inverse problem and, therefore, reduce the reliability of the corresponding regularized solutions. From a computational viewpoint, the use of the singular system of the finite-rank operator (147), in the case of the Kramers and Bethe-Heitler cross sections, and of the rectangular matrix (158) in the case of the relativistic cross section, makes it easy to determine the condition number

$$C = \frac{\sigma_1}{\sigma_N} \qquad (163)$$

for different values of the number of sampling points and for different values of the sampled photon energies.
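The singular-system expressions (160), (162) and (163) can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the results reported here: it relies on NumPy, the function names are hypothetical, and the discrepancy principle is approximated by a search over a user-supplied grid of λ values instead of a root-finding step.

```python
import numpy as np

def tikhonov_svd(A, g, lam):
    """Tikhonov solution (162); lam = 0 gives the generalized solution (160).

    numpy returns A = U diag(s) Vt. The columns of U live in data space
    (the v_k of (161)) and the rows of Vt live in solution space (the u_k).
    """
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    coeffs = U.T @ g                 # inner products (g, v_k)
    filt = s / (s**2 + lam)          # filter factors sigma_k / (sigma_k^2 + lambda)
    return Vt.T @ (filt * coeffs)

def condition_number(A):
    """Condition number (163): ratio of largest to smallest singular value."""
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]

def discrepancy_lambda(A, g, noise_norm, lams):
    """Grid version of the discrepancy principle (an assumption of this sketch):
    return the largest lambda whose residual norm does not exceed noise_norm."""
    for lam in sorted(lams, reverse=True):
        residual = np.linalg.norm(A @ tikhonov_svd(A, g, lam) - g)
        if residual <= noise_norm:
            return lam
    return min(lams)

# Toy system with M > N, mimicking the shape of (155); the data are synthetic.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 8))
g = A @ rng.normal(size=8)
f_reg = tikhonov_svd(A, g, lam=1e-2)
print(condition_number(A), np.linalg.norm(A @ f_reg - g))
```

The residual norm grows monotonically with λ, which is why scanning the grid from the largest value downward and stopping at the first admissible λ is a reasonable approximation.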
Let us consider first the computation of the singular system for a fixed range $[\epsilon_{\min}, \epsilon_{\max}]$ of photon energies and a fixed number N of sampling points. For this application we chose $\epsilon_{\min}$ = 10 keV, $\epsilon_{\max}$ = 99 keV, N = 90 in the case of a uniform sampling with sampling distance equal to 1 keV (these experimental conditions are compatible with the typical parameters used during the pre-processing of RHESSI observations). Table 1 contains the first ten singular values for the three cross sections while Fig. 2 shows the first four singular functions $u_k$ for the three cross sections. Finally, Table 2 contains the condition numbers corresponding to the three kernels with $\epsilon_{\min}$ = 10 keV and three different values of the maximum photon energy. Under the assumption that the cross section (110) represents the exact emission probability in the bremsstrahlung process (or, at least, its most accurate approximation), these results point out that:

• when the Kramers form is used, i.e., when all the relativistic effects are neglected, the resulting singular system is significantly different from the exact one. In particular, the Kramers singular spectrum runs first below and then above the exact one with relative differences of even more than 50% for the largest and smallest singular values.
• The singular functions corresponding to the Kramers cross section systematically reach larger maximum values than the exact ones and, what is more important, their zeros occur at smaller energy values. Since the regularized solution can be expanded on the basis given by the singular functions or vectors (see (83) and (162)), this implies that the spectral resolution achievable when the Kramers approximation is assumed for the inversion deteriorates with energy more rapidly than for the other two cross sections.
• The semi-relativistic Bethe-Heitler approximation provides a singular spectrum which is close to the exact one (the relative error is smaller than 15% for all the singular values). The singular functions are very similar too.
• As one may expect, the conditioning of the problem worsens when the sampling range increases. Anyway, the inverse problem corresponding to the Kramers kernel is always better conditioned.

Fig. 2 Singular functions of the finite-rank operator A2 for the Kramers and Bethe-Heitler integral kernels and of the matrix A in the case of the relativistic cross section: (a) first singular function; (b) second singular function; (c) third singular function; (d) fourth singular function

Table 1 Singular values of the finite-rank operator A2 for the Kramers and Bethe-Heitler integral kernels and of the matrix A in the case of the fully relativistic cross section. The experimental conditions are: ε_min = 10 keV, ε_max = 99 keV, N = 90, uniform sampling with 1 keV sampling distance

         Kramers       Bethe–Heitler   Cross 3BN
σ1       0.69675061    1.4327710       1.6616698
σ2       0.23456303    0.3009761       0.3251901
σ3       0.14120844    0.1354425       0.1442251
σ4       0.10133624    0.0817600       0.0865855
σ5       0.07936335    0.0561889       0.0589741
σ6       0.06555440    0.0418901       0.0434190
σ7       0.05611209    0.0329645       0.0335835
σ8       0.04916533    0.0270082       0.0269361
σ9       0.04366882    0.0227958       0.0221901
σ10      0.03908788    0.0196456       0.0186748

Table 2 Condition numbers for the linear inverse problem with discrete data when the cross sections are given by the Kramers and Bethe-Heitler formulas and for the fully discretized problem in the case of the relativistic cross section. The minimum sampled photon energy is ε_min = 10 keV in all cases, while three different values of the maximum sampled photon energy (in keV) are considered

               Kramers     Bethe–Heitler   Cross 3BN
ε_max = 99     764.32      9662.26         17572.21
ε_max = 149    1424.45     28069.19        54625.84
ε_max = 199    3043.64     60945.02        117049.32

4.2 Reconstructions: Synthetic Data

The inversion of simulated photon spectra makes it possible to validate two important aspects of the regularization approach: the robustness of data reduction to modifications of the integral kernel and the effectiveness of the discrepancy principle for the optimal choice of the regularization parameter. In order to study such issues we have generated simulated data corresponding to three different experimental situations.

Power-law: small solar flares typically generate photon spectra characterized by a monotonically steeply decreasing behavior approximated by

$$g(\epsilon) \sim \epsilon^{-\gamma} \qquad (164)$$

with, typically, γ ≥ 3. Simple asymptotic considerations on (106) show that this kind of photon spectrum is generated by an averaged electron spectrum of the kind

$$f(E) \sim E^{-\delta} \qquad (165)$$

with δ ∼ γ − 1. In our simulations we have assumed

$$f(E) = 10^4 \left( \frac{E}{50} \right)^{-\delta}, \qquad (166)$$

where the numerical constants have been introduced for physical reasons.

Dip: in [67] a rather surprising small wavelength structure has been reconstructed in the mean electron spectrum corresponding to a photon spectrum emitted during the July 23, 2002 flare. The nature and origin of such an intermediate energy dip are still under investigation. Here we mimic its presence by means of the analytical form

$$f(E) = \begin{cases} 0.25 \cdot 10^4 \left( \dfrac{E}{50} \right)^{-\delta}, & 50 \le E \le 60, \\ 10^4 \left( \dfrac{E}{50} \right)^{-\delta}, & E < 50,\ E > 60. \end{cases} \qquad (167)$$

Fig. 3 Synthetic data: (a) the three theoretical electron spectra; (b) the three corresponding photon spectra. In both figures, in order to distinguish the 'power-law' case (solid) and the 'power-law plus dip' case (dotted) from the 'power-law plus dip plus thermal component' case (dashed), artificial scale factors of ×1000 and ×100 have been used respectively

Fig. 4 Reconstruction of three different forms of the averaged electron spectrum for the three cross sections considered: (a) the power law when λ is chosen by minimizing the distance between the theoretical and regularized solutions; (b) as in (a), with λ chosen by the discrepancy principle. Artificial scale factors have been used to better distinguish the three cases

Fig. 5 Reconstruction of three different forms of the averaged electron spectrum for the three cross sections considered: (a) the power law plus an intermediate energy dip when λ is chosen by minimizing the distance between the theoretical and regularized solutions; (b) as in (a), with λ chosen by the discrepancy principle. Artificial scale factors have been used to better distinguish the three cases
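The power-law and dip spectra (166) and (167) are simple enough to write down directly. The sketch below assumes NumPy and energies in keV; the function names are hypothetical, and the default δ = 4 is the value quoted for the simulations.

```python
import numpy as np

DELTA = 4.0  # spectral index used in the simulations described in the text

def power_law(E, delta=DELTA):
    """Electron spectrum (166): 1e4 * (E / 50)^(-delta), with E in keV."""
    E = np.asarray(E, dtype=float)
    return 1e4 * (E / 50.0) ** (-delta)

def power_law_with_dip(E, delta=DELTA):
    """Electron spectrum (167): the power law reduced to 25% for 50 <= E <= 60 keV."""
    E = np.asarray(E, dtype=float)
    in_dip = (E >= 50.0) & (E <= 60.0)
    return np.where(in_dip, 0.25, 1.0) * power_law(E, delta)

E = np.arange(10.0, 200.0, 1.0)  # illustrative electron-energy grid in keV
f_pl = power_law(E)
f_dip = power_law_with_dip(E)
```

The two arrays differ only on the 50 to 60 keV interval, which is what makes the dip hard to detect after the smoothing action of the integral kernel.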

Thermal component: particularly in the case of intense flares, part of the low energy X-ray emission is not due to the injection of accelerated electrons into a cold plasma but comes directly from a thermal emission corresponding to peak temperatures of some million degrees. In the analytical form

$$f(E) = \begin{cases} 0.25 \cdot 10^4 \left( \dfrac{E}{50} \right)^{-\delta}, & 50 \le E \le 60, \\ 10^4 \left( \dfrac{E}{50} \right)^{-\delta} + 5 \cdot 10^5\, e^{-E/2.7}, & \text{otherwise}, \end{cases} \qquad (168)$$

the thermal component is represented by the negative exponential function at E < 50 keV.
In all simulations, we have assumed the realistic value δ = 4. The photon spectra have been computed by inserting the functions (166), (167) and (168) into the bremsstrahlung equation by using the relativistic formula $Q_{3BN}$ as integral kernel. The photon data have been sampled from $\epsilon_{\min}$ = 10 keV to $\epsilon_{\max}$ = 99 keV with a uniform sampling of 1 keV bin. Finally, realistic Poisson noise has been added to the photon data vectors. The three theoretical input mean electron spectra and the corresponding photon spectra are plotted in Fig. 3. This plot clearly shows that, due to the blurring action played by the integral kernel, barely distinguishable differences in the output data correspond to notably different input functions. The regularized solutions have been computed by means of (83) for the Kramers and Bethe-Heitler cases and of (162) for the highly relativistic cross section. The regularization parameter has been fixed according to two different approaches: 1) by minimizing the Euclidean distance between the regularized solution and the theoretical input function (criterion I); 2) by applying the discrepancy principle (criterion II). The results of this computation are described in Figs. 4–6 and Table 3. Figures 4–6 contain the reconstructions of the three theoretical averaged electron spectra provided by the Tikhonov regularization method for the three cross sections; in the top panels the regularization parameter has been fixed by minimizing the distance between the theoretical and regularized solutions while in the bottom panels the discrepancy principle has been applied. Table 3 contains the reconstruction errors defined as

$$\rho = \frac{\| f - f_{\lambda} \|_2}{\| f \|_2}, \qquad (169)$$

where $\| \cdot \|_2$ is the $L^2$ norm in the Kramers and Bethe-Heitler cases and the canonical Euclidean norm in the highly relativistic case, while λ is fixed with the two criteria. These results point out that:

• in general Tikhonov regularization provides satisfactory reconstructions of the theoretical mean electron spectra. The regularized solutions are sufficiently stable and the overall behavior is always correctly reproduced.
• The reconstructions given by criterion I are more stable but cannot reproduce the intermediate dip. This is due to the fact that in minimizing the overall error a more significant role is played by the low energy part of the spectrum, where the actual values of the function are significantly bigger. The use of the discrepancy principle focuses on the residuals and provides smaller λ: therefore the reconstructions are more unstable (see, in particular, the low energy parts) but more reliable in reproducing small features.
• If λ is chosen by means of criterion I, the inversion with the highly relativistic formula is the most reliable (as one may expect, since the data have been simulated by using just that cross section). If λ is chosen with criterion II, the minimum restoration error is obtained by using the Kramers approximation. This is again reasonable since the discrepancy principle provides smaller λ and, as pointed out in the previous section, the condition number is smaller when the Kramers kernel is adopted.

Table 3 Reconstruction errors provided by Tikhonov regularization while reproducing three different forms of the averaged electron spectrum for the three cross sections adopted. The regularization parameter is fixed by determining the minimum distance to the theoretical form (criterion I) and by applying the discrepancy principle (criterion II)

         P.l.              P.l.+dip          P.l.+dip+thermal
         I        II       I        II       I        II
K        0.045    0.082    0.045    0.081    0.136    0.144
BH       0.328    0.439    0.330    0.319    0.344    0.419
3BN      0.019    0.257    0.019    0.257    0.021    0.291

Fig. 6 Reconstruction of three different forms of the averaged electron spectrum for the three cross sections considered: (a) the power law plus dip plus a low energy thermal component, when λ is chosen by minimizing the distance between the theoretical and regularized solutions; (b) as in (a), with λ chosen by the discrepancy principle. Artificial scale factors have been used to better distinguish the three cases

4.3 Reconstructions: Application to RHESSI Data

As an example of application of the Tikhonov method to real X-ray data recorded by RHESSI, we consider the photon spectrum corresponding to the emission peak during the August 21, 2002 flare. In the restoration of real data two further issues must be accounted for, both concerning the physical interpretation of the regularized reconstructions. The first item is the determination of the uncertainty on the reconstruction. This assessment can be accomplished by producing the so-called confidence strip of the regularized solution [3]. Such a construction is performed by generating different random realizations of the photon data, produced by modifying each X-ray vector component according to a Gaussian distribution of zero mean and standard deviation equal to the experimental one. The regularized solutions for all these realizations are determined and superimposed in order to evaluate the robustness of the regularization algorithm to data instability. The height of the confidence strip at each value of the electron energy provides the propagation error on the regularized solution.
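The confidence-strip construction can be sketched as follows. This is a minimal illustration assuming NumPy and a fully discretized problem of the form (155); the function names, the number of realizations `n_real` and the use of the pointwise minimum and maximum as strip boundaries are assumptions of this sketch, not details taken from [3].

```python
import numpy as np

def tikhonov(A, g, lam):
    """Tikhonov solution via the normal equations (A^T A + lam I) f = A^T g,
    which is algebraically equivalent to the singular-system expansion (162)."""
    n_unknowns = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_unknowns), A.T @ g)

def confidence_strip(A, g, sigma, lam, n_real=200, seed=0):
    """Lower and upper envelopes of the regularized solutions obtained from
    n_real Gaussian perturbations of the data (zero mean, std sigma per component)."""
    rng = np.random.default_rng(seed)
    sols = np.empty((n_real, A.shape[1]))
    for i in range(n_real):
        g_pert = g + rng.normal(0.0, sigma, size=g.shape)
        sols[i] = tikhonov(A, g_pert, lam)
    return sols.min(axis=0), sols.max(axis=0)

# Toy example with synthetic data; sigma plays the role of the experimental errors.
rng = np.random.default_rng(1)
A = rng.normal(size=(5, 8))
g = A @ np.ones(8)
lower, upper = confidence_strip(A, g, sigma=0.05 * np.ones(5), lam=0.1)
strip_height = upper - lower  # propagation error at each solution component
```

Using percentiles instead of the minimum and maximum would give a strip that is less sensitive to the number of realizations; which choice is appropriate depends on how the strip is to be interpreted.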
The second item is concerned with the assessment of the energy resolution achievable by the method. Our approach follows the method introduced in [4] and is based on the observation that in the expansions (83) and (162) not all the singular components actually contribute to the form of the regularized solution with the same significance. In fact, for increasing k values, the contribution of the corresponding singular functions is increasingly filtered out by the coefficient $\frac{\sigma_k}{\sigma_k^2 + \lambda}$. We use the truncation rule

$$\sigma_k \ge \sqrt{\lambda} \qquad (170)$$

in these expansions and observe that, since the singular values are decreasingly ordered, there will be an index k for which condition (170) is satisfied for the last time. Therefore a reasonable estimate of the resolution achievable by the method is given by the distance between the successive zeros of $u_k$.

Figure 7 contains the result of the application of Tikhonov regularization to the inversion of the real spectrum recorded during the August 21, 2002 flare. The photon spectrum corresponding to the emission peak is given in Fig. 7(a) while Fig. 7(b) presents the corresponding reconstruction provided by the Tikhonov method. In this plot the confidence strip is represented together with the horizontal resolution bars. In this reconstructed spectrum the thermal component at low energies is fairly visible, together with a spectral 'knee' at energies in the range 20–40 keV. The asymptotic electron spectral index (i.e., the exponent of the power law best fitting the spectrum at high energies) is δ ≈ 2.45 while the photon spectral index is γ ≈ 3.54, and this is in accordance with the 'rule of thumb' γ ≈ δ + 1 based on asymptotic approximations.

Fig. 7 Inversion of a real spectrum emitted during the August 21, 2002 flare: (a) the photon spectrum at the emission peak; (b) the reconstructed mean electron spectrum provided by the Tikhonov method. The regularization parameter has been fixed by means of the discrepancy principle

5 Anisotropic Bremsstrahlung Emission

5.1 Angular Dependency of the Bremsstrahlung Equation

In the previous sections we showed how to apply regularization techniques to the inversion of high resolution X-ray spectra from solar flares with the aim of inferring information on the electron flux spectrum in the source. In our analysis we considered particular forms [37] for the bremsstrahlung cross section Q(ε, E) which depend only on the photon energy ε and the electron energy E. However, the correct cross section to use must in practice take into account two important aspects of the geometrical and physical environment. First, the direction of the precollision electron is not, in general, uniquely defined, but there will be a significant spread in the incoming directions of the electrons (e.g., [45, 46]). Second, the guiding magnetic field may be inclined away from the vertical toward the observer, and a similar inclination in guiding field direction angle may be appropriate for the bremsstrahlung-producing electron beam [75]. It follows that the probabilistic kernel which describes the bremsstrahlung phenomenon is, in general, a function also of the incoming and outgoing electron directions and of the polarization state³ of the emitted photon [33]. In principle, components due to both electron-ion and electron-electron bremsstrahlung have to be considered, but it can be shown that the latter is quite negligible except at mildly or extremely relativistic energies [50]. Several authors (e.g., [13, 25, 26, 36, 38, 53, 54]) showed how the X-ray emission from a particular electron source changes significantly for different viewing directions. In this section we consider the expression for the electron-photon bremsstrahlung cross section, integrated over the direction of the outgoing electron and summed over the polarization states of the emitted photon, provided by Gluckstern & Hull [32].
This kernel Q(ε, E; θ) is a function of three variables: besides the energies E of the electron (keV) and ε of the photon (keV), the dependency on the angle θ between the directions of the incoming electron and the emitted photon is also taken into account.

³ Because propagating light consists of a transverse electric and magnetic field, a single photon will oscillate on a line perpendicular to the propagation direction.

Let us now generalize the concept of mean source electron flux spectrum given in Sect. 3.1 [12, 17] to the case of an anisotropic cross section and/or source electron distribution function. The hard X-ray intensity I(ε) observed at distance R from a source can be written as

$$I(\epsilon) = \frac{1}{4\pi R^2} \int_{\epsilon}^{\infty} \int_{\Omega} \int_V Q(\epsilon, E; \theta)\, F(E, \mathbf{r}, \Omega)\, n(\mathbf{r})\, d\mathbf{r}\, d\Omega\, dE, \qquad (171)$$

where F(E, r, Ω) is the electron flux differential in electron energy E, position r, and solid angle Ω of the incoming electron direction, and Q(ε, E; Ω) is the cross section for bremsstrahlung emission in the direction of the observer. If we define

$$\bar{F}(E, \Omega) = \frac{\int_V F(E, \mathbf{r}, \Omega)\, n(\mathbf{r})\, d\mathbf{r}}{\int_V n(\mathbf{r})\, d\mathbf{r}} = \frac{\int_V F(E, \mathbf{r}, \Omega)\, n(\mathbf{r})\, d\mathbf{r}}{\bar{n} V}, \qquad (172)$$

where we recall that $\bar{n} = (1/V) \int_V n(\mathbf{r})\, d\mathbf{r}$, then (171) can be written as

$$I(\epsilon) = \frac{\bar{n} V}{4\pi R^2} \int_{\epsilon}^{\infty} \int_{\Omega} Q(\epsilon, E; \theta)\, \bar{F}(E, \Omega)\, d\Omega\, dE. \qquad (173)$$

If $\bar{F}(E, \Omega)$ is isotropic, then we define $\bar{F}(E) \equiv \bar{F}(E, \Omega)$ and let

$$Q(\epsilon, E) = \int_{\Omega} Q(\epsilon, E; \theta)\, d\Omega. \qquad (174)$$

Then we can write

$$I(\epsilon) = \frac{\bar{n} V}{4\pi R^2} \int_{\epsilon}^{\infty} \bar{F}(E)\, Q(\epsilon, E)\, dE. \qquad (175)$$

This is the expression we used to define the "mean electron spectrum" $\bar{F}(E)$ in (101) and (102). It should be noted that (175) also applies in the (somewhat hypothetical) case where Q(ε, E; θ) is independent of θ, with Q(ε, E) ≡ Q(ε, E; θ) and

$$\bar{F}(E) = \int_{\Omega} \bar{F}(E, \Omega)\, d\Omega. \qquad (176)$$

In the physically realistic case, neither Q(ε, E; θ) nor $\bar{F}(E, \Omega)$ is isotropic. To make progress, therefore, requires some further assumptions on the form of $\bar{F}(E, \Omega)$. In this section we restrict ourselves to the simplest assumption, namely that $\bar{F}(E, \Omega)$ is separable in E and Ω, i.e.,

$$\bar{F}(E, \Omega) = \bar{F}(E)\, \frac{h(\Omega)}{\int_{\Omega} h(\Omega)\, d\Omega}. \qquad (177)$$

With such an assumption, (173) can be written as

$$I(\epsilon) = \frac{\bar{n} V}{4\pi R^2} \int_{\epsilon}^{\infty} \bar{F}(E)\, dE\, \frac{\int_{\Omega} Q(\epsilon, E; \Omega)\, h(\Omega)\, d\Omega}{\int_{\Omega} h(\Omega)\, d\Omega}. \qquad (178)$$

If we define

$$Q(\epsilon, E) = \frac{\int_{\Omega} Q(\epsilon, E; \Omega)\, h(\Omega)\, d\Omega}{\int_{\Omega} h(\Omega)\, d\Omega}, \qquad (179)$$

then (178) is formally identical to (175) and can be solved for any adopted form of h(Ω) once Q(ε, E) is evaluated using (179). We remark that in practice, e.g., in a collisional thick target, the E and Ω dependencies of $\bar{F}(E, \Omega)$ are not separable and further vary along the electron paths. To deal with that situation more properly requires explicit modeling of the electron propagation, i.e., of electron scattering and energy losses; in general, this can probably be done only by forward modeling. Nevertheless, the results of our separable inversion formulation will provide a better starting point than the assumption of isotropy used hitherto.
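The weighted angular average (179) reduces to a one-dimensional quadrature when both the cross section and the angular distribution h depend only on the polar angle θ measured from the line of sight. That axial symmetry is an assumption of the sketch below, as are the function names and the number of quadrature nodes; under it dΩ = 2π sin θ dθ and the factor 2π cancels between numerator and denominator. NumPy is assumed.

```python
import numpy as np

def angle_averaged(q_of_theta, h_of_theta, n_nodes=64):
    """Weighted average of q over the sphere with weight h, as in (179).

    Assumes q and h depend only on the polar angle theta, so that
    dOmega = 2 pi sin(theta) dtheta = 2 pi dmu with mu = cos(theta).
    Gauss-Legendre quadrature in mu on [-1, 1] is used for both integrals.
    """
    mu, w = np.polynomial.legendre.leggauss(n_nodes)
    theta = np.arccos(mu)
    q = q_of_theta(theta)
    h = h_of_theta(theta)
    return np.sum(w * q * h) / np.sum(w * h)

# Toy angular dependence standing in for Q(eps, E; theta) at fixed eps and E.
toy_q = lambda theta: 1.0 + np.cos(theta)
isotropic_h = lambda theta: np.ones_like(theta)
forward_h = lambda theta: 1.0 + np.cos(theta)   # hypothetical forward-peaked h

print(angle_averaged(toy_q, isotropic_h))  # average over an isotropic distribution
print(angle_averaged(toy_q, forward_h))    # average weighted toward theta = 0
```

For a forward-peaked cross section, a forward-peaked h gives a larger effective Q(ε, E) than the isotropic average, which is the mechanism by which anisotropy changes the inferred electron spectrum.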

5.2 Anisotropic Form of the Cross Section

The form of the angle-dependent cross section Q(ε, E; θ) has been given by Gluckstern & Hull [32] and Koch & Motz ([44], formula 2BN). Here we reprint this result in our notation (and in units of cm² keV⁻¹ sr⁻¹),⁴ and we also present a polar diagram of the angular dependence for various electron and photon energies in order to make some of the discussion in the following sections more comprehensible.
Formally, the cross section Q(ε, E; θ) for electron-ion bremsstrahlung, differential in photon energy ε, electron energy E, and the angle θ between the incoming electron and the emitted photon (but integrated over the direction of the emergent electron and summed over the polarization states of the emitted photon) is

Q( , E; θ )

=

Z2

α 2

r02 me c2

1 ˜

(E˜ − ˜)2 − 1 E˜ 2 − 1

×

2E˜ 2 + 1 8 (E˜ 2 − 1)

4

sin2

θ

−

5E˜ 2 2

+ 2E˜ (E˜ − (E˜ 2 − 1)

˜)
2

+

3

−

2

E˜ 2

− ˜2 − T2 2

1

+

4

E˜ − ˜ (E˜ 2 − 1)

4The steradian (sr) is deﬁned as the solid angle subtended at the center
of a sphere of radius r by a portion of the surface of the sphere having an area r2. Since the surface area of this sphere is 4π r2, then the
deﬁnition implies that a sphere measures 4π steradians.

+

L ((E˜ − ˜)2 − 1)(E˜ 2 − 1)

×

4E˜ (3

˜

− (E˜ 2 − 1)(E˜ (E˜ 2 − 1) 4

−

˜))

sin2

θ

+

4E˜ 2(E˜ 2 + (E˜ − ˜)2) (E˜ 2 − 1) 2

+

−2(7E˜ 2 − 3E˜ (E˜ − ˜) + (E˜ − ˜)2) + 2 (E˜ 2 − 1) 2

+

2 ˜ E˜ 2

+ E˜ (E˜ − ˜) − 1 (E˜ 2 − 1)

+ T

γT (E˜ − ˜)2 − 1

×

4
2

−

6˜

−

2 ˜(E˜ 2 − T2

˜2

− 1)

⎫

−

4γ

⎬

(E˜ − ˜)2 − 1 ⎭

×

β (1 β (1

− −

e−2πZα/β )

e−2π Zα/β

, )

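where $Z$ is the atomic number of the ion, $\alpha \simeq 1/137$ is the fine-structure constant, $r_0 \simeq 2.8 \times 10^{-13}$ cm is the classical electron radius, $m_e c^2 \simeq 511$ keV is the electron rest energy, and

$$\tilde E = 1 + \frac{E}{m_e c^2}, \qquad \tilde\epsilon = \frac{\epsilon}{m_e c^2}, \qquad \beta = \sqrt{1 - \frac{1}{\tilde E^2}}, \qquad \beta' = \sqrt{1 - \frac{1}{(\tilde E - \tilde\epsilon)^2}},$$

$$\Delta = \tilde E - \sqrt{\tilde E^2 - 1}\, \cos\theta,$$

$$T = \left( \tilde E^2 - 1 + \tilde\epsilon^2 - 2 \tilde\epsilon \sqrt{\tilde E^2 - 1}\, \cos\theta \right)^{1/2},$$

$$L = \ln \left\{ \frac{\tilde E (\tilde E - \tilde\epsilon) - 1 + \sqrt{(\tilde E^2 - 1)[(\tilde E - \tilde\epsilon)^2 - 1]}}{\tilde E (\tilde E - \tilde\epsilon) - 1 - \sqrt{(\tilde E^2 - 1)[(\tilde E - \tilde\epsilon)^2 - 1]}} \right\},$$

$$\gamma_T = \ln \left[ \frac{T + \sqrt{(\tilde E - \tilde\epsilon)^2 - 1}}{T - \sqrt{(\tilde E - \tilde\epsilon)^2 - 1}} \right],$$

$$\gamma = \ln \left[ \frac{(\tilde E - \tilde\epsilon) + \sqrt{(\tilde E - \tilde\epsilon)^2 - 1}}{(\tilde E - \tilde\epsilon) - \sqrt{(\tilde E - \tilde\epsilon)^2 - 1}} \right].$$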

The last factor in the definition of $Q(\epsilon, E; \theta)$, involving the velocity $\beta$ (in units of the speed of light $c$), is the Elwert [24] Coulomb correction and does not appear in the expressions in [32]. This correction is sufficiently accurate (to within a few percent) except at electron energies above ∼100 keV and approaching the "high-frequency limit" $\epsilon \to E$; in such regimes more elaborate expressions are appropriate. For more details, see [44]. It should be noted that the expression of $Q(\epsilon, E; \theta)$ is indeterminate at $\epsilon = E$; in practice this can be handled in numerical computation by setting $E$ slightly higher than $\epsilon$.
Figure 8 shows the angular dependence of this cross section. For electron energies $E \gg \epsilon$, $Q(\epsilon, E; \theta)$ is a decreasing function of $\theta$; there is a preference for photons to be emitted in the direction of the incoming electron. However, at electron energies comparable to the photon energy there must be a substantial scattering angle between the incoming and outgoing electrons, and hence the photons tend to be emitted preferentially at a significant angle relative to the incoming electron velocity. Hence, at electron energies $E$ slightly greater than $\epsilon$, the cross section peaks not in the forward direction but rather at a modest angle (30°–40°; see Fig. 8).

Fig. 8 Polar diagram of the bremsstrahlung cross section for $E = 100$ keV and photon energies $\epsilon = 30$ keV (solid line), $\epsilon = 50$ keV (dotted line), and $\epsilon = 80$ keV (dashed line); the radial coordinate is proportional to the size of the cross section and the angle from the x-axis corresponds to the angle between the incoming electron direction and the line to the observer. Note that at energies $E \gg \epsilon$ the cross section peaks at $\theta = 0°$, while for $E \approx \epsilon$ the cross section peaks at $\theta \approx$ 30°–40°
5.3 Recovering the Electron Spectrum as a Function of Viewing Angle
In order to analyse the effects of the anisotropic cross section with respect to the results obtained in Sect. 3 with several angle-averaged kernels, we continue to consider the photon spectrum corresponding to the emission peak of the August 21, 2002 flare (represented in Fig. 7(a)). The location of the flare on the solar disk is x = 696″, y = −248″ (with respect to a fixed heliocentric coordinate system whose origin is placed at the center of the Sun), which corresponds to a heliocentric angle of ∼50°.

Fig. 9 The uniform distribution of the target-averaged incoming electron velocities. The z-axis (labeled with the electron energy E) represents the mean direction of the electron

Fig. 10 The infinitesimal solid angle $d\Omega$
We assume that, for each energy, all the possible directions of the precollision electrons are uniformly distributed over a solid angle within a cone of half-angle $\alpha$ centered on a direction that makes an angle $\theta_0$ relative to the direction of photon emission, i.e. (with $\beta$ as the polar angle relative to the axis of the cone; see Fig. 9),

$$h(\Omega) = \begin{cases} 1, & \beta \le \alpha, \\ 0, & \text{otherwise.} \end{cases} \tag{180}$$

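With this form of $h(\Omega)$ the average cross section $\bar Q(\epsilon, E)$ defined in (179) is

$$\bar Q(\epsilon, E; \theta_0, \alpha) = \frac{1}{2\pi (1 - \cos\alpha)} \int_0^{2\pi}\!\!\int_0^{\alpha} Q(\epsilon, E; \theta)\, \sin\beta\, d\beta\, d\varphi, \tag{181}$$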

where $\theta$ is the angle between the observer and an elementary electron beam oriented at polar coordinates $(\beta, \varphi)$ relative to the axis of the cone, viz.,

$$\cos\theta = \cos\theta_0 \cos\beta + \sin\theta_0 \sin\beta \cos\varphi. \tag{182}$$

We observe that in (181) we used the fact that

$$d\Omega = \frac{d\Sigma}{r^2} = \frac{(r \sin\beta\, d\varphi)(r\, d\beta)}{r^2} = \sin\beta\, d\beta\, d\varphi, \tag{183}$$

as we can see from Fig. 10.
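The cone average in (181), with the geometry of (182), can be sketched numerically as follows. This is only an illustrative sketch: the function name `cone_averaged`, the quadrature orders and the use of a generic callable `q(theta)` in place of the full cross section at fixed photon and electron energies are assumptions made here, not part of the analysis described in the text.

```python
import numpy as np

def cone_averaged(q, theta0, alpha, n_beta=200, n_phi=200):
    """Average q(theta) over a cone of half-angle alpha about theta0, as in (181).

    q      : callable returning the cross section for an array of angles theta
             (photon and electron energies are held fixed inside q)
    theta0 : angle between cone axis and photon direction [rad]
    alpha  : half-angle of the cone [rad], 0 < alpha <= pi
    """
    # Gauss-Legendre nodes in beta on [0, alpha]
    x, w = np.polynomial.legendre.leggauss(n_beta)
    beta = 0.5 * alpha * (x + 1.0)
    w_beta = 0.5 * alpha * w
    # Midpoint rule in phi on [0, 2*pi] (the integrand is periodic in phi)
    phi = (np.arange(n_phi) + 0.5) * 2.0 * np.pi / n_phi
    w_phi = 2.0 * np.pi / n_phi

    B, P = np.meshgrid(beta, phi, indexing="ij")
    # Equation (182): angle between each elementary beam and the observer
    cos_theta = np.cos(theta0) * np.cos(B) + np.sin(theta0) * np.sin(B) * np.cos(P)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))

    integrand = q(theta) * np.sin(B)
    integral = np.sum(w_beta[:, None] * integrand) * w_phi
    return integral / (2.0 * np.pi * (1.0 - np.cos(alpha)))
```

For a test function $q(\theta) = \cos\theta$ the cone average has the closed form $\cos\theta_0 (1 + \cos\alpha)/2$, which gives a simple check of the quadrature; for $\alpha = 180°$ the routine reduces to an average over the whole sphere.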
Expression (181) for $\bar Q(\epsilon, E; \theta_0, \alpha)$ has been used to invert (178) (with the atomic number Z set equal to 1.2) and regularized electron spectra $\bar F_{\theta_0,\alpha}(E)$ have been recovered for mean angles $\theta_0$ over the range from 0° (photon emission parallel to the direction of the incoming electron) to 180° (photon emission in the anti-parallel direction). Several values for the spread angle $\alpha$ (10°, 30°, 60°, 90°, and 180°) were used; from our reconstructions we found that angles $\alpha$ up to 10° provide essentially the same results. Moreover, we remark that the maximum value we chose (namely, $\alpha$ = 180°) corresponds to an integration over the entire sphere, i.e., to the angle-averaged cross section $Q(\epsilon, E)$ described in Sects. 3 and 4. Following what we have done in Sect. 4.3, a "confidence strip" of $\bar F_{\theta_0,\alpha}(E)$ forms based on different realizations of the (noisy) data was produced for each photon spectrum $I(\epsilon)$ and the mean of this confidence strip was used for further analysis.

Fig. 11 $\bar F_{\theta_0,\alpha}(E)$ for $\theta_0$ = 130° and the values of $\alpha$ (spread in incoming electron directions) shown. The dashed curve (labeled $\alpha$ = 180°) is the spectrum obtained using the angle-averaged cross section $Q(\epsilon, E)$
The August 21, 2002 ﬂare was located at a heliocentric angle of approximately 50◦. If we assume that the mean direction of the incoming electrons was vertically downward at this location, then the corresponding value of θ0 is 130◦. Figure 11 shows the reconstructed F θ0,α(E) corresponding to θ0 = 130◦ and various values of α, including α = 180◦, i.e., the solid-angle–averaged cross section.
It should be recognized that the assumption of a vertical mean incoming electron direction, and hence the choice of $\theta_0$ = 130°, is not rigorously justified. For comparison, therefore, Fig. 12 shows the same results for values of $\theta_0$ ranging from 0° to 180° in 45° steps.

Fig. 12 As for Fig. 11, but for values of $\theta_0$ = 0°, 45°, 90°, 135°, and 180°. The curves have the same significance as in Fig. 11, with the uppermost solid curves corresponding to the lowest values of $\alpha$
5.4 Results and Discussion
Let us consider the reconstructed F θ0,α(E) for the ﬁxed value θ0 = 130◦ of Fig. 11 and for different spread angles α. The case α = 0◦ corresponds to all the emission concentrated at θ = θ0. On the other hand, for values of α which are different from zero, the X-ray emission is spread over a range of θ from θ0 − α to θ0 + α. Owing to the asymmetry of the emission polar diagram which grows for increasing electron energies (see Fig. 8), the enhanced emission in the θ0 − α direction more than compensates for the decreased emission in the θ0 + α direction, so that fewer total electrons are required to produce a given photon ﬂux than for the unidirectional (α = 0◦) case. Figure 11 shows that the magnitude of the reconstructed F θ0,α(E) does indeed depend quite strongly on the range of incoming electron directions α, especially at high electron energies E (at 500 keV, the required electron ﬂux is less by a factor of ∼ 4). Since this effect becomes more important with increasing energy, the reconstructed F θ0,α(E) also becomes steeper with increasing α. Recognizing that the extreme maximum case α = 180◦ corresponds to the angle-averaged cross section used in the previous sections, we see that the use of a cross section that more realistically reﬂects the range of incoming electron velocity vectors always ﬂattens the high-energy part of the inferred electron spectrum relative to that found using the angle-averaged cross section Q( , E).
We now discuss the different forms of $\bar F_{\theta_0,\alpha}(E)$ when several viewing angles $\theta_0$ are used. For $\theta_0$ = 180° (Fig. 12), corresponding to vertically downward electrons in a disk-center flare, the recovered flux $\bar F_{\theta_0,\alpha}(E)$ for modest values of $\alpha$ is, at high energies, significantly (an order of magnitude or so) greater than the value of $\bar F_{\theta_0,180°}(E)$, i.e., the result using the angle-averaged cross section $Q(\epsilon, E)$. This is due to the very low values of the normalized cross section $Q(\epsilon, E; \theta)$ appropriate to this viewing angle (Fig. 8) and hence the inefficiency of photon production in such a direction. Such large fluxes may introduce issues of beam stability. Conversely, for $\theta_0$ = 90° (corresponding to vertically downward electrons in a limb flare; see Fig. 12), the enhancement over the angle-averaged ($\alpha$ = 180°) case is less pronounced; indeed the recovered spectra are remarkably similar to that derived using the angle-averaged cross section, particularly at energies slightly greater than 30 keV.

Fig. 13 Spectral index variation for $\theta_0$ = 130° and the values of $\alpha$ indicated
Values of $\theta_0$ < 90° correspond to the case in which the mean velocity of the electrons has a component toward the observer, and therefore away from the Sun. In general, $\theta_0$ values in this first quadrant lead to a decreased value of $\bar F_{\theta_0,\alpha}(E)$ relative to the angle-averaged result $\bar F_{\theta_0,180°}(E)$, because of the preferential tendency for photons to be emitted in the forward hemisphere (relative to the incoming electron velocity) and hence the smaller number of electrons needed. However, at very low values of $\theta_0$ (slightly smaller than 30°), the required $\bar F_{\theta_0,\alpha}(E)$ is, for low energies, greater than both the $\theta_0$ = 45° case and the angle-averaged case (dashed line); this is a consequence of the angular behavior of $Q(\epsilon, E; \theta)$ (in particular the low value near $\theta$ = 0°) shown in Fig. 8 and noted at the end of Sect. 5.2. The greatest deviations between the correct spectrum and the one deduced using the angle-averaged cross section are thus achieved for $\theta_0 \approx$ 45° (correct spectrum steeper) and for $\theta_0 \approx$ 180° (correct spectrum flatter).
Figure 13 shows the variation of the local spectral index $\delta_E$ as a function of E, for the $\theta_0$ = 130° spectra of Fig. 11. Compared to the results for the isotropic case ($\alpha$ = 180°), the spectral indices for the anisotropic electron distributions are substantially smaller (flatter spectrum) at low energies (between about 40 and 200 keV), and larger (steeper spectrum) at high energies (greater than about 200 keV). In all cases the value of $\delta_E$ increases with decreasing energy below ∼50 keV, indicative of the transition to a softer, more thermal, spectrum. At low energies (smaller than about 25 keV), the value of $\delta_E$ changes to a form that increases with E, indicative of the general steepening trend associated with thermal spectra.
In conclusion, in determining the mean source electron population F (E) responsible for a given hard X-ray spectrum, it is very important to use a bremsstrahlung cross section Q that accurately represents the geometric relationship between the source and the observer. As we have shown in our analysis, the use of the correct, direction-dependent cross section can yield recovered mean source electron spectra of signiﬁcantly different shape than the results using the usual angle-averaged cross section.

5.5 Open Problems

The results obtained in the simplified cases of the previous sections are a first attempt at understanding the angular dependence of the reconstructed electron spectrum. In the general case, the inversion of the anisotropic bremsstrahlung equation

$$I(\epsilon) = \frac{\bar n V}{4\pi R^2} \int_\epsilon^\infty\!\!\int_\Omega Q(\epsilon, E; \theta)\, F(E, \Omega)\, d\Omega\, dE, \tag{184}$$

where all the functions and constants are defined in Sect. 5.1, requires reconstructing the function of two variables $F(E, \Omega)$ from noisy measurements of the function of one variable $I(\epsilon)$. This kind of problem is known as a bivariate problem and its solution presents notable difficulties.
As described before, our results have been obtained under the simple assumption that $F(E, \Omega)$ is separable in $E$ and $\Omega$ (see (177)); furthermore, we chose a particular form for the function describing the angular dependence of $F(E, \Omega)$ and we reconstructed the mean electron spectrum $\bar F(E)$, which is the part of $F(E, \Omega)$ that depends only on the electron energy $E$. The results that we provided clearly show that the angular dependence of $F(E, \Omega)$ cannot be neglected.
A distinct but important problem is to assume that $F(E, \Omega)$ is still separable with a known $E$-dependence, and to see whether the $\theta$-dependence can be recovered from the photon spectrum.
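From (184) and (183) we can write

$$I(\epsilon) = \frac{\bar n V}{4\pi R^2} \int_\epsilon^\infty\!\!\int_0^\pi\!\!\int_0^{2\pi} f(E, \beta)\, Q(\epsilon, E; \theta(\beta, \varphi, \theta_0))\, \sin\beta\, d\varphi\, d\beta\, dE, \tag{185}$$

where $f(E, \beta) = F(E, \Omega)$ (by symmetry there is no dependence on $\varphi$). If we define

$$K(\epsilon, E, \beta) = \frac{\bar n V}{4\pi R^2} \int_0^{2\pi} Q(\epsilon, E; \theta(\beta, \varphi, \theta_0))\, d\varphi, \tag{186}$$

(185) becomes

$$I(\epsilon) = \int_\epsilon^\infty\!\!\int_0^\pi f(E, \beta)\, K(\epsilon, E, \beta)\, \sin\beta\, d\beta\, dE. \tag{187}$$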

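The problem is now univariate in $\beta$ if we assume, for example, $f(E, \beta) = E^{-\delta} G(\beta)$ (i.e., a power-law behavior for the $E$-dependence of $f$). It follows that

$$\begin{aligned}
I(\epsilon) &= \int_\epsilon^\infty\!\!\int_0^\pi E^{-\delta} G(\beta)\, K(\epsilon, E, \beta)\, \sin\beta\, d\beta\, dE \\
&= \int_0^\pi G(\beta) \left[ \int_\epsilon^\infty E^{-\delta} K(\epsilon, E, \beta)\, dE \right] \sin\beta\, d\beta \\
&= \int_0^\pi G(\beta)\, H(\epsilon, \beta)\, \sin\beta\, d\beta,
\end{aligned} \tag{188}$$

where

$$H(\epsilon, \beta) = \int_\epsilon^\infty E^{-\delta} K(\epsilon, E, \beta)\, dE. \tag{189}$$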

The inversion of (188) is a Fredholm problem: find $G(\beta)$ from $I(\epsilon)$.
Both the problem univariate in $E$ and the problem univariate in $\beta$ have the drawback that they do not invert (184) itself and that they need strong assumptions about the form of $F(E, \Omega)$. On the other hand, they have the great advantage of reducing the bivariate problem to simpler univariate problems that can be solved easily. Nevertheless, the main goal of the anisotropic problem is to invert (184) directly. To this aim, our idea is to perform a lexicographical re-arrangement of the variables followed by a standard zero-order regularization inversion. In other terms, we consider the discretized form of (187) (with weights included in $K$)

$$I_i = \sum_{j=0}^{n_{el}-1} \sum_{k=0}^{n_\beta-1} K_{ijk}\, f_{jk}, \tag{190}$$

where $n_{el}$ and $n_\beta$ are the number of sampled electron energies and the number of angles $\beta$ considered, respectively. Then we merge the indices $j$ and $k$ into a single one ($l = n_\beta \cdot j + k$, $l = 0, \ldots, n_\beta n_{el} - 1$), converting the matrix $f_{jk}$ into a vector $f_l$ and the tensor $K_{ijk}$ into a matrix $K_{il}$. It follows that (190) becomes

$$I_i = \sum_{l=0}^{n_{el} n_\beta - 1} K_{il}\, f_l, \tag{191}$$

so that we can recover $f_l$ and then deduce $f_{jk}$ from the knowledge of the one-to-one relationship between $l$ and $(j, k)$.
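The lexicographical re-arrangement leading from (190) to (191) can be sketched with array reshaping. The function names below are hypothetical and the arrays are assumed to be stored in row-major (C) order, for which a plain reshape reproduces exactly the index map $l = n_\beta \cdot j + k$.

```python
import numpy as np

def flatten_kernel(K):
    """Turn the tensor K[i, j, k] of (190) into the matrix K[i, l] of (191).

    With NumPy's default row-major order the merged index is
    l = n_beta * j + k, as in the text.
    """
    n_ph, n_el, n_beta = K.shape
    return K.reshape(n_ph, n_el * n_beta)

def unflatten_solution(f_flat, n_el, n_beta):
    """Recover the matrix f[j, k] from the vector f[l] using the same index map."""
    return np.asarray(f_flat).reshape(n_el, n_beta)
```

Any regularized solver for the matrix problem (191) can then be applied to `flatten_kernel(K)`, and its output mapped back to the $(E, \beta)$ grid with `unflatten_solution`.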

6 Application to Solar Physics: Thermal Bremsstrahlung

6.1 The Differential Emission Measure

In the previous sections we proposed some regularization techniques to infer information on the electron distribution in the source starting from its measured X-ray emission, given a certain form for the probabilistic kernel which describes the emission process. Once realistic electron spectra have been provided, the next step is to investigate to what extent the electron distribution responsible for the emission comprises

(a) non-thermal particles trapped in a low density plasma (thin-target);
(b) particles “injected into” and stopped in a dense plasma (thick-target);
(c) a spatial distribution of locally Maxwellian electrons with a location-dependent temperature (T ),

or some mixture of these three situations.
Under a purely thermal interpretation of (100) [14], the electron distribution is assumed to be locally Maxwellian, i.e. (with $T$ in energy units)

$$F(E, r) = \frac{2^{3/2}}{(\pi m_e)^{1/2}}\, \frac{n(r)\, E}{[T(r)]^{3/2}}\, e^{-E/T(r)}, \tag{192}$$

so that (100) becomes

$$I(\epsilon) = \frac{1}{4\pi R^2} \cdot \frac{2^{3/2}}{(\pi m_e)^{1/2}} \int_0^\infty\!\!\int_\epsilon^\infty n^2(T)\, \frac{E}{T^{3/2}}\, e^{-E/T}\, Q(\epsilon, E)\, dE\, \frac{dr}{dT}\, dT, \tag{193}$$

and the photon spectrum then provides information on the differential emission measure loosely defined, for stratified structures, by

$$\xi(T) = n^2(T)\, \frac{dr}{dT}. \tag{194}$$

A direct connection between I ( ) and ξ(T ) can be established by inserting expression (192) for the local electron distribution into the model-independent equation (100), with a Kramers approximation used for Q( , E) (see (107)). Because of the extreme simplicity of the Kramers form, the result is a Laplace-transform-like integral equation relating the photon spectrum (I ( )) directly to the differential emission measure (ξ(T )). This equation has been studied by Piana, Brown and Thompson [66] in the framework of regularization theory for inverse problems and applications to highresolution balloon data [56] have been considered. Craig & Brown [20] noticed that an analogous approximate equation could be obtained for the more complex Bethe-Heitler form


of the cross section Q( , E) (see (109)). However, both of these use a rather coarse approximation to the true cross section, which as well as being quite smooth, has a much more complex analytic form, like for example the highly relativistic formula given by equation (3BN) in [44] and reported in Sect. 3.1 (see (110)). As we showed in Sect. 4, the ill-posedness of the integral inversion problem implies that small changes in the kernel (Q here) can result in signiﬁcant changes in the solution [5, 42, 51] so that results using approximate representations of Q may not be reliable. It follows that a completely rigorous description of the thermal model requires the introduction of two integral equations. First, the (isotropic) source-averaged, effective electron ﬂux spectrum F (E), deﬁned in (102) is related to the photon spectrum I ( ) by means of the Volterra integral equation

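$$I(\epsilon) = \frac{\bar n V}{4\pi R^2} \int_\epsilon^\infty \bar F(E)\, Q(\epsilon, E)\, dE, \tag{195}$$

with a fully correct form of $Q(\epsilon, E)$. Then, (192), (194) and (102) lead to the Fredholm integral equation relating $\bar F(E)$ and $\xi(T)$ [15]

$$\bar F(E) = \frac{1}{\bar n V}\, \frac{2^{3/2} E}{(\pi m_e)^{1/2}} \int_0^\infty \frac{\xi(T)}{T^{3/2}}\, e^{-E/T}\, dT. \tag{196}$$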

The aim of this section is to address the following two basic questions concerning the thermal model: 1) is the available photon spectrum compatible with a thermal interpretation, i.e., can the observed I ( ) be fully explained by a non-negative ξ(T )? And 2) if the answer to 1) is yes, what is the actual form of ξ(T ) for that particular form of I ( )? One way to see whether an entire I ( ), or even part of it, is compatible with a thermal model for the emission process, is to test whether the corresponding F (E) obtained by solving (195) satisﬁes criteria arising from (196), making allowances for the data-induced noise. One such test is the “derivative test” for thermality found by Brown and Emslie [15]. This follows directly by differentiating (196) (with both sides divided by E) i times and states that an electron spectrum (F (E)) is compatible with a purely-thermal interpretation if and only if the quantity F (E)/E is “completely monotonic”, i.e. its i-th derivative has sign (−)i at all E. This approach has a technical limitation. Equation (195) can be solved by using regularization techniques (see Sects. 3 and 4) but derived electron spectra are affected by noise in the photon spectra used. Successive derivatives in the thermality test therefore have rapidly escalating errors, due to the instability of numerical differentiation. It follows that the computation of only the ﬁrst two or three orders of derivative is reliable [28], with the higher-i terms in the “derivative test” too noisy to be useful. On the other hand, we can be conﬁdent that any F (E) clearly failing the “derivative test” at a high conﬁdence level, for given noise,


can be ruled out as entirely due to a thermal distribution with an everywhere non-negative ξ(T ). However, even if the F (E) does pass the “derivative” test, this itself does not tell us the (non-negative) form of ξ(T ) to which F (E) corresponds. Therefore, in principle a much more effective technique would be to solve (196), where F (E) is obtained by solving (195), thus describing the ξ(T ) corresponding to an observed photon spectrum: if ξ(T ) ≥ 0 for all T , then the photon spectrum can be reliably interpreted according to a thermal model for the bremsstrahlung emission; if ξ(T ) < 0 for some temperature interval, then at least part of the emission is certainly non-thermal. Furthermore the knowledge of possible features in such reconstructed form could yield important information on plasma heating and conduction processes [14, 16]. In recent years this inversion problem has achieved an unprecedented level of importance because of the high-resolution photon spectra ( 1 keV) obtained from the RHESSI mission [57]. Combined with optimization of computational methods for regularized solutions of the ill-posed inverse problem involved [19, 20, 41, 48, 49, 58, 63, 67, 78] it is now possible to infer mean source electron spectra with which speciﬁc physical source models can be compared [17].
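The "derivative test" discussed above can be illustrated with finite differences. The sketch below is an assumption-laden toy version: the name `derivative_test`, the uniform energy grid and the use of raw finite differences (with no treatment of noise or confidence levels) are choices made here for illustration, and only the first few orders are examined, in line with the remark that higher derivatives are too noisy to be useful.

```python
import numpy as np

def derivative_test(E, F, max_order=3, tol=0.0):
    """Check the sign pattern required for complete monotonicity of F(E)/E.

    E         : uniformly spaced electron energies
    F         : mean electron flux spectrum sampled at E
    max_order : highest derivative order examined
    tol       : non-negative slack allowed on each sign check

    Returns a list of booleans, one per order i = 1..max_order, telling
    whether the i-th finite difference of F/E has sign (-1)^i everywhere.
    """
    E = np.asarray(E, dtype=float)
    F = np.asarray(F, dtype=float)
    g = F / E
    results = []
    for i in range(1, max_order + 1):
        d = np.diff(g, n=i)            # proportional to the i-th derivative
        results.append(bool(np.all((-1) ** i * d >= -tol)))
    return results
```

An isothermal spectrum, for which $\bar F(E)/E \propto e^{-E/T}$, passes at every order, whereas a power law with a low-energy cutoff inside the sampled range already fails at first order because $\bar F(E)/E$ rises across the cutoff.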
A basic technical difﬁculty in the reconstruction of ξ(T ) is due to the fact that solving the second inverse problem (196) is extremely problematic. Simple changes of variables reduce this problem to a Laplace transform inversion problem with noisy data. There is a vast literature [22, 31, 81] showing that this problem is intrinsically highly pathological, due to the very broad ﬁltering action of the Fredholm-Laplace integral kernel (compared to that in the basic bremsstrahlung inverse problem (195) which is of Volterra type and not severely ﬁltering). Several regularization methods [7, 11, 61] have been introduced to handle this inversion by reducing the unphysical oscillations due to the presence of noise. For all of them two considerations are mandatory: ﬁrst, that, as stated by Davies and Martin [22] “[in the Laplace inversion problem with noisy data] no single method gives optimum results for all purposes. . .”, and therefore no general method exists which is effective at the highest level for all physical situations and all kinds of data; second, that, whatever method is applied, even with very accurate data, only a coarse resolution will be achieved in the recovered solution [6].
Most inversion methods for the real Laplace transform have been formulated within the framework of regularization theory for ill-posed inverse problems [3]. At the core of these approaches there is the search for an optimal tradeoff between stability against unphysical oscillations and accurate reproducibility of the data. Such an optimization result is obtained either by ﬁxing a real positive regularization parameter in Tikhonov-like methods (see Sect. 2.5) or by

applying some stopping rule to iterative procedures. However, the present application is particularly challenging owing to the particular nature of the solar spectral data involved here. Typical solar F (E) are characterized by a large dynamic range (at least three orders of magnitude for around one order of magnitude in the E range) and, more significantly, the corresponding ξ(T ) have completely different forms at low and high T : at small T , a near-thermal (δ function) component which differs from zero only in a small T range (narrow support); at high T , a monotonic component spread over a large interval. A consequence of this complexity in the source function is that regularization approaches may lose some (or most) of their effectiveness. For example, the reconstruction of ξ(T ) at low T with classical Tikhonov regularization may correctly reproduce the location of the temperature peak but typically presents ringing effects whose negative components, which are numerical artefacts, might suggest that the spectrum is not thermally interpretable. Negative ringing can be eliminated by applying a reconstruction method with a positivity constraint. However such an approach is not effective at recovering the hightemperature part of ξ(T ) which has a power-law-like behavior and requires regularization methods with more smoothing power. To deal with these kinds of difﬁculty, here we utilize the following approach: an iterative scheme with a positivity constraint is applied for the inversion of the lowenergy part of F (E), in order to eliminate unphysical ringing effects with negative oscillations in the reconstruction of the part of ξ(T ) characterized by a narrow support; then, a ﬁrst-order Tikhonov regularization method is applied for the inversion of the high-energy part of F (E), where an appropriate boundary condition constrains the reconstructed ξ(T ) to behave well (i.e., with a slope compatible with the spectral index of the photon data) at high T . 
The two reconstructed ξ(T ) are then connected together noting that the connection temperature is easily determined by the T value where the thermal ξ(T ) goes to zero, i.e. the high-T limit of the narrow support of the thermal ξ(T ). We observe that, as far as the inversion of the low energy part of F (E) is concerned, the use of the positivity constraint in the inversion makes the thermality test based on the veriﬁcation that the reconstructed ξ(T ) is positive at all T , inappropriate, since positivity is forcefully imposed in the inversion procedure. Therefore for this inverse problem the compatibility between the data and the thermal model is tested by checking whether the residuals in F (E) corresponding to the ξ(T ) recovered by exploiting the positivity constraint are statistically acceptable.
6.2 Regularization Methods
We could in principle proceed directly from (196) to see whether some F (E) could be wholly thermal in origin if we had a completely reliable inversion method: given a data

vector $\bar F(E)$, a wholly thermal interpretation of it is possible if and only if the $\xi(T)$ obtained by the inversion method has no statistically significant negative values over any range of temperature $T$. Our aim here is to address this problem by means of two numerical algorithms based on regularization theory for ill-posed inverse problems, keeping in mind that the effectiveness of any regularized inversion approach in the present case is much weaker than for most other linear inverse problems due to the extreme numerical instability of the Laplace problem, with its very broad kernel. A quantitative estimate [34] for the instability of linear equations like (196) is given in terms of the condition number C of the kernel (cross section) matrix (see Sect. 2.1). It can be shown [20, 66] that, for typical solar data parameters, the condition number associated with the (Fredholm) equation (196) is of the order of $10^{10}$, which is much bigger, for similar parameters, than the condition number associated with the (Volterra) bremsstrahlung spectrum to electron spectrum inversion problem (195) (see Table 2). The actual consequences of ill-conditioning are highly significant. A regularization algorithm essentially expresses the approximate smoothed solution as a truncated linear sum of some basis functions. In the basic bremsstrahlung spectrum inversion problem (195), $\bar F(E)$ can be expressed in terms of around ten basis functions for typical noise in the case of a data vector with around 100 points, while in the differential emission measure inversion problem (196) we find that only two, or at most three, basis functions can be meaningfully included in the expansion of $\xi(T)$. Therefore, in the recovery of $\xi(T)$ it is necessary to introduce much more severe constraints than the ones adopted in the $\bar F(E)$ inversion procedure described in Sects. 3 and 4.
Even incorporating these constraints, it will be impossible to achieve a temperature resolution anywhere nearly comparable with the spectral resolution with which F (E) can be reconstructed through the solution of the bremsstrahlung equation (195) (cf. the analysis of the temperature resolution problem in [20]).
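The severity of the ill-conditioning can be explored with a small numerical experiment on a discretized Laplace kernel. The grids, the helper names `laplace_matrix` and `effective_rank`, and the 1% relative noise level used below are illustrative assumptions only; they are not the solar data parameters behind the condition number quoted in the text.

```python
import numpy as np

def laplace_matrix(E, y):
    """Discretized Laplace kernel exp(-E_n * y_m) * dy on a uniform y grid."""
    E = np.asarray(E, dtype=float)
    y = np.asarray(y, dtype=float)
    dy = y[1] - y[0]
    return np.exp(-np.outer(E, y)) * dy

def effective_rank(L, rel_noise):
    """Number of singular values above rel_noise times the largest one.

    This is a rough proxy for how many basis functions can be retained
    in a truncated expansion when the data carry that relative noise.
    """
    s = np.linalg.svd(L, compute_uv=False)
    return int(np.sum(s / s[0] > rel_noise))

if __name__ == "__main__":
    E = np.linspace(10.0, 100.0, 30)   # keV, assumed sampling
    y = np.linspace(0.01, 1.0, 30)     # keV^-1, i.e. T from 1 to 100 keV
    L = laplace_matrix(E, y)
    print("condition number:", np.linalg.cond(L))
    print("singular values above 1%:", effective_rank(L, 1e-2))
```

The rapid decay of the singular values is what limits the number of usable basis functions, and hence the achievable temperature resolution.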
Adopting the change of variable $y = 1/T$, (196) becomes

$$K\, \bar n V\, \frac{\bar F(E)}{E} = \int_0^\infty f(y) \exp(-Ey)\, dy, \tag{197}$$

where $K = \sqrt{\pi m_e/8} = 1.89 \times 10^{-14}$ g$^{1/2}$ $= 4.73 \times 10^{-10}$ keV$^{1/2}$ cm$^{-1}$ s and $f(y)$ (cm$^{-3}$ keV$^{-1/2}$) is defined as

$$f(y) = \frac{\xi(1/y)}{y^{1/2}}, \tag{198}$$

with $\xi(T)$ in units of cm$^{-3}$ keV$^{-1}$. Equation (197) involves a continuous representation of the model ($f(y)$) and of the data ($\bar F(E)$), while real data are discrete, truncated and affected by measurement and systematic noise. In reality, therefore, the situation is described by the (finite rank linear) operator $L : X \to Y$ such that

$$(Lf)_n = \int_0^\infty f(y) \exp(-E_n y)\, dy, \qquad n = 1, \ldots, N, \tag{199}$$
where the $\{E_n\}_{n=1}^{N}$ are the sampled electron energies, $X$ is the functional space containing the solution and $Y$ is the Euclidean space containing the data. Then our problem is to solve

$$Lf = g, \tag{200}$$

with data vector $g$ in $Y$ having components

$$g_n = K\, \bar n V\, \frac{\bar F(E_n)}{E_n}, \qquad n = 1, \ldots, N. \tag{201}$$
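As a consistency check on the change of variable $y = 1/T$, the temperature integral in (196) and the Laplace-type integral in (197) can be compared numerically for a test emission measure. The Gaussian form of $\xi(T)$, the grids and the helper names below are assumptions introduced purely for illustration; constant factors are ignored.

```python
import numpy as np

def trapezoid(values, x):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(x)))

def thermal_integral_T(xi, E, T):
    """Integral of xi(T) * T**(-3/2) * exp(-E/T) over the grid T, as in (196)."""
    return trapezoid(xi(T) * T ** -1.5 * np.exp(-E / T), T)

def thermal_integral_y(xi, E, y):
    """Same quantity after y = 1/T: integral of f(y) * exp(-E*y), with f from (198)."""
    f = xi(1.0 / y) / np.sqrt(y)
    return trapezoid(f * np.exp(-E * y), y)

if __name__ == "__main__":
    xi = lambda T: np.exp(-0.5 * ((T - 2.0) / 0.3) ** 2)  # toy emission measure
    T = np.linspace(0.2, 10.0, 20001)                      # keV
    y = np.linspace(0.1, 5.0, 20001)                       # keV^-1, same range as T
    print(thermal_integral_T(xi, 20.0, T), thermal_integral_y(xi, 20.0, y))
```

The two numbers agree to within the quadrature error, which is the discrete counterpart of the statement that (196) and (197) describe the same relation between the electron spectrum and the emission measure.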

As already stated, (200) is a strongly ill-conditioned linear problem and the only way to obtain a realistic approximate solution in the presence of noise is some reconstruction technique based on regularization theory for linear inverse problems. One approach is the ﬁrst order Tikhonov method, which solves the minimization over f of

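$$\| Lf - g \|_Y^2 + \lambda\, \| f' \|_X^2 = \text{minimum}, \tag{202}$$

where $\lambda$ is the (real positive) regularization parameter. It can be proved [66] that under boundary conditions

$$f(0) = 0 \tag{203}$$

and

$$\lim_{y \to \infty} f(y) = 0 \tag{204}$$

the analytical solution of (202) is

$$f_\lambda(y) = \sum_{k=1}^{N} \frac{\sigma_k}{\sigma_k^2 + \lambda}\, (g, v_k)_Y\, u_k(y), \tag{205}$$

where the $\sigma_k$ and $v_k$ are respectively the eigenvalues and eigenvectors of the Gram matrix (see Definition 6)

$$G_{nm} = \int_0^\infty \phi_n'(y)\, \phi_m'(y)\, dy, \tag{206}$$

$$\phi_n(y) = \frac{1}{E_n^2} \left( 1 - e^{-E_n y} \right) \tag{207}$$

and

$$u_k(y) = \frac{1}{\sigma_k} \sum_{n=1}^{N} (v_k)_n\, \phi_n(y). \tag{208}$$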

For this problem, ﬁrst-order regularization is more effective than zero-order regularization for two basic reasons.


First of all, it prescribes a bound on the ﬁrst derivative of the regularized solution which, in this case of large numerical oscillations, is a sensible thing to do. Second, in this particular implementation, condition (203) constrains the regularized solution to behave well at y = 0 (T → ∞), thus improving the restoration accuracy for ξ(T ) at high T . It is also true, however, that condition (204) has no physical basis, and hence may yield artefacts at low T .
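A minimal sketch of a first-order Tikhonov solution built on the functions $\phi_n(y) = (1 - e^{-E_n y})/E_n^2$ is given below. Several assumptions are made here and are not taken from the text: the Gram matrix is computed from the derivatives $\phi_n'(y) = e^{-E_n y}/E_n$, which gives the closed form $G_{nm} = 1/[E_n E_m (E_n + E_m)]$; the regularized solution is written as $f_\lambda = \sum_n c_n \phi_n$ with $(G + \lambda I)\, c = g$, which corresponds to the usual convention in which $\sigma_k^2$ are the eigenvalues of $G$; and the function names are hypothetical.

```python
import numpy as np

def gram_matrix(E):
    """Closed-form Gram matrix G_nm = 1 / (E_n * E_m * (E_n + E_m)).

    Assumes the inner product is the integral of phi_n'(y) * phi_m'(y)
    over y in [0, infinity), with phi_n'(y) = exp(-E_n * y) / E_n.
    """
    E = np.asarray(E, dtype=float)
    return 1.0 / (np.outer(E, E) * (E[:, None] + E[None, :]))

def tikhonov_first_order(E, g, lam, y):
    """Regularized solution f_lambda(y) = sum_n c_n * phi_n(y).

    E   : sampled electron energies
    g   : data vector, as in (201)
    lam : regularization parameter (positive)
    y   : points at which the solution is evaluated
    Returns (f_lambda evaluated on y, coefficient vector c).
    """
    E = np.asarray(E, dtype=float)
    y = np.asarray(y, dtype=float)
    G = gram_matrix(E)
    c = np.linalg.solve(G + lam * np.eye(E.size), np.asarray(g, dtype=float))
    phi = (1.0 - np.exp(-np.outer(E, y))) / E[:, None] ** 2
    return c @ phi, c
```

By construction every $\phi_n$ vanishes at $y = 0$, so the sketch satisfies the boundary condition (203) automatically; nothing in it enforces (204) or positivity, which is consistent with the limitations of the Tikhonov approach discussed in the text.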
The main disadvantage of using Tikhonov regularization is that solutions with negative components can result from noisy data. In particular, in the reconstruction of the low temperature component of ξ(T ), typically characterized by a very narrow support, an effective method would allow us to constrain the restored solution to be positive, thus avoiding unphysical ringing effects due to the presence of noise on the data. The introduction of such a constraint [64] has the effect of increasing the resolution power of the inversion approach, allowing reconstruction of more details in the source function. The method with positivity applied in this section is the projected Landweber method, ﬁrst formulated by Lagendijk, Biemond and Boekee [52] for the image restoration problem. The mathematical properties of this method are discussed, for example, by Eicke [23] and an accelerated version has been provided by Piana and Bertero [65]. We ﬁrst consider the discretized version of (200)

\[ g = L f \tag{209} \]

where f comes from the sampling of (198) and L is the matrix with entries

\[ L_{mn} = \exp(-E_n y_m)\,\delta y \tag{210} \]

where the y_m, m = 1, . . . , M, are uniformly sampled and δy is an appropriate integration weight. The projected Landweber method provides reconstructions of f(y) (and therefore of ξ(T)) by optimally stopping the iteration

\[ f_{k+1} = P_+\bigl(f_k + \tau L^T (g - L f_k)\bigr), \qquad f_0 = 0, \tag{211} \]

where τ is a relaxation parameter, L^T is the transpose matrix of L, and P_+ sets to zero all the negative components at each iteration.
As already stated in the previous section, the regularization effects on the approximate solutions provided by Tikhonov first-order regularization and by the projected Landweber method can be obtained by fixing λ in (205) and the iteration number in (211). To this purpose many criteria have been introduced [21]; here we adopted the same approach as in Piana et al. [67], based on the analysis of the regularized cumulative residuals. For example, in the case of first-order Tikhonov method we consider the function

\[ S_\lambda(k) = \frac{1}{k} \sum_{i=1}^{k} r_i^\lambda, \qquad k = 1, \dots, N, \tag{212} \]

where r_i^λ is the i-th normalized regularized residual corresponding to the regularized ξ(T). For completely uncorrelated noise, the normalized cumulative residuals exhibit a random walk with expected deviation 1/√k. In (212) the presence of the regularization parameter increasingly correlates the r_i^λ for increasing values of λ. Therefore an optimal criterion to fix λ is to look for the largest value of λ such that |S_λ(k)| is bounded by 3/√k. An analogous procedure is followed for stopping the projected iterations, whereby, in this case, the regularization parameter is represented by the iteration number.
6.3 Simulations
In this section we wish to test the effectiveness of the regularization approach as introduced before. In particular, we describe the case of a power law with a low-energy cutoff showing that if the mean electron spectrum is sampled starting from energies bigger than the cutoff, the reconstruction is rather accurate (in fact, the problem becomes that of recovering a pure power law in a limited domain) while the reconstruction dramatically fails if the minimum sampled energy is smaller than the energy cut-off, in accordance with the fact that a power law with a low-energy cutoff is not compatible with thermal bremsstrahlung emission. Then, the temperature resolution achievable by the method is discussed, the performance of the method in reconstructing power laws is tested and, finally, a realistic form of F(E) obtained by regularized inversion of a synthetic photon spectrum is considered.
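The projected Landweber iteration (211) and the cumulative-residual criterion (212) can be sketched in a few lines. The following is a minimal illustration in plain Python, not the code used for the results reported here. The function names are hypothetical, the residuals are assumed to be normalized by a known noise standard deviation sigma_i, and the relaxation parameter is set to the conservative value 1/||L||_F^2 (the Frobenius norm bounds the spectral norm from above, so this choice keeps the data misfit non-increasing).

```python
import math


def laplace_kernel(energies, y_nodes):
    """Kernel matrix with entries exp(-E_n * y_m) * dy, cf. (210).

    Rows follow the energies E_n and columns the nodes y_m, so that the
    matrix-vector product L f has one entry per sampled energy.
    """
    dy = y_nodes[1] - y_nodes[0]  # uniform sampling in y is assumed
    return [[math.exp(-e * y) * dy for y in y_nodes] for e in energies]


def matvec(a, x):
    """Product a @ x for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def rmatvec(a, r):
    """Product a^T @ r for a matrix stored as a list of rows."""
    return [sum(row[j] * ri for row, ri in zip(a, r)) for j in range(len(a[0]))]


def safe_tau(a):
    """Relaxation parameter 1 / ||a||_F^2 (an assumed, conservative choice)."""
    return 1.0 / sum(v * v for row in a for v in row)


def projected_landweber(a, g, tau, n_iter):
    """Run n_iter steps of f <- P_+(f + tau * a^T (g - a f)) from f = 0, cf. (211)."""
    f = [0.0] * len(a[0])
    for _ in range(n_iter):
        residual = [gi - pi for gi, pi in zip(g, matvec(a, f))]
        step = rmatvec(a, residual)
        # P_+ : set negative components to zero
        f = [max(0.0, fj + tau * sj) for fj, sj in zip(f, step)]
    return f


def cumulative_residuals(a, f, g, sigma):
    """S(k) = (1/k) * sum_{i=1..k} r_i, with r_i = (g_i - (a f)_i) / sigma_i, cf. (212)."""
    r = [(gi - pi) / si for gi, pi, si in zip(g, matvec(a, f), sigma)]
    out, running = [], 0.0
    for k, ri in enumerate(r, start=1):
        running += ri
        out.append(running / k)
    return out


def within_bound(s):
    """True if |S(k)| <= 3 / sqrt(k) for every k."""
    return all(abs(sk) <= 3.0 / math.sqrt(k) for k, sk in enumerate(s, start=1))
```

In practice one would call `projected_landweber` for increasing iteration numbers and use `within_bound(cumulative_residuals(...))` to decide where to stop; how exactly the stopping index is selected from this check is left open here.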

• Compatibility Test We want to verify whether an F(E) reconstructed from photon data I(ε) can be interpreted as consistent with a purely thermal model. The “derivative test” of Brown & Emslie [15] provides a possible approach, but does not yield information on the temperature structure of the source. A more informative approach is to apply a reconstruction method and to check if the reconstructed ξ(T) is non-negative for all T. As an example, let us consider the case of a mean source electron spectrum

\[ F(E) \propto \begin{cases} E^{-\delta}, & E \ge E_c, \\ 0, & E < E_c, \end{cases} \tag{213} \]

with Ec a low-energy cutoff. Before performing the inversion, however, we discuss some informative analytic aspects of (213) in relation to the general expression (196) for F (E) from a purely thermal source, which we rewrite, ignoring constant factors, as

\[ F(E) \propto E \int_0^\infty \frac{\xi(T)}{T^{3/2}}\, e^{-E/T}\, dT . \tag{214} \]

First it is obvious that if ξ(T) is greater than zero over any T interval then the corresponding F(E) is never zero at any E. Thus the F(E) in (213) cannot be purely thermal (it clearly fails the derivative test at E = Ec). Second, we note that for a pure power law ξ(T) ∝ T^{−α} at all T with α constant, the resulting F(E) is proportional to E^{−α+1/2} at all E. Consequently, for α = δ + 1/2 a pure (untruncated) power-law ξ(T) predicts F(E) in (213) perfectly in the range E ≥ Ec but completely contradicts it in the range E < Ec. Thus a thorough thermality test must be applied to all E; failure (within the allowed uncertainties) at even one value of E is enough to rule out a purely thermal model.
A somewhat surprising result here is that a wholly-thermal model is ruled out by the form of F(E) at low rather than at high energies. We also emphasize that the power-law relation between ξ(T) and F(E) only holds (at E ≥ Ec) for a complete power-law ξ(T). If ξ(T) is only a power law over some finite range, say (T1, T2), the corresponding F(E) is not a power law at any E but rather, with x = T/E,

\[ F(E) \propto E^{-\alpha+1/2} \int_{T_1/E}^{T_2/E} x^{-\alpha-3/2}\, e^{-1/x}\, dx. \tag{215} \]

Fig. 14 Reconstruction of the differential emission measure corresponding to an electron spectrum in the form of a power law with a low-energy cutoff, for two different values of the cutoff energy (Ec) and of the minimum sampled energy (Emin). The reconstruction method is first-order Tikhonov regularization with boundary conditions. If Emin ≥ Ec, the sampled electron spectrum does not include a cutoff and ξ(T) is faithfully recovered (dotted line). If Emin < Ec, the sampled electron spectrum does include a cutoff and so is not compatible with a thermal interpretation. In this case the reconstruction of ξ(T) is unphysical (dashed line). The solid line is proportional to T^{−5/2} in the range 10–100 keV

We now show that application of our inversion method to (213) agrees well with these analytic results.
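As a brief aside on the analytic results above, the scaling E^{−α+1/2} can be checked directly from (214). The worked step below substitutes ξ(T) ∝ T^{−α} and changes variable to x = T/E; the closed form of the remaining integral (valid for α > −1/2) is added here for completeness and is not taken from the text.

```latex
\begin{aligned}
F(E) &\propto E \int_0^\infty T^{-\alpha - 3/2}\, e^{-E/T}\, dT
      \qquad (T = xE,\ dT = E\,dx) \\
     &= E \cdot E^{-\alpha - 3/2} \cdot E \int_0^\infty x^{-\alpha - 3/2}\, e^{-1/x}\, dx \\
     &= \Gamma\!\left(\alpha + \tfrac{1}{2}\right) E^{-\alpha + 1/2}.
\end{aligned}
```

When ξ(T) is a power law only on (T1, T2), the same substitution turns the limits into T1/E and T2/E, which depend on E; this is why (215) is no longer a pure power law.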
For the inversion, the data (213) is discretized according to uniform sampling starting from a minimum sampled energy (Emin), realistic Poisson noise is added to the corresponding photon spectrum and errors on F(E) are generated by inverting the noisy I(ε). We applied the first-order Tikhonov inversion method for two possible experimental situations concerning the relative values of the pair Ec, Emin, and for δ = 2, with the results shown in Fig. 14. When Ec ≤ Emin (i.e., the cutoff is not sampled), a stable differential emission measure is restored. There are some slight, long-wavelength oscillations in the recovered ξ(T) of roughly the width of the kernel but the mean temperature spectral index is close to the theoretical value α = 2.5. On the other hand, when Emin < Ec (and so the cutoff is sampled), the reconstruction contains large negative ranges and is entirely unphysical, as expected. This behavior is consistent with the fact that a mean source electron spectrum with any cutoff is incompatible with a purely thermal interpretation of the emission (since any Maxwellian contains electrons of all E).
• Temperature Resolution. Heuristically, the effective temperature resolution achievable by our inversion method can be assessed by reconstructing ξ(T) from the F(E) corresponding to input δ functions ξ(T) ∼ δ(T − T0). The resulting reconstructed forms of ξ(T) are characterized by finite Full Widths at Half Maximum (FWHM), which estimate the resolution achievable around T = T0. Therefore for inverse problems the resolution power depends on the reconstruction method. In Table 4 the FWHM values realized through application of the two reconstruction methods discussed in Sect. 6.2 to F(E) spectra are given for different values of T0. The averaged electron spectra are obtained by inverting the corresponding photon spectra (affected by realistic Poisson noise) and contain 50 points in the energy range 1–50 keV. Table 4 also contains values for the “centroid” temperature of the reconstructed distributions, defined by ⟨T⟩ = ∫ T ξ(T) dT / ∫ ξ(T) dT. In the case of first-order regularization these ⟨T⟩ are 10% or so higher than the Tmax at which the recovered ξ(T) peaks because the ξ(T) are skewed, and they compare well with the single input T0 of the originally-assumed δ functions. In the case of the iterative projected Landweber method, ⟨T⟩ and Tmax coincide in most cases and are very close to the theoretical T0. When first-order regularization is applied in the case of multi-thermal sources, the FWHM values given in Table 4 may be overly optimistic estimates of the temperature resolution, particularly when trying to separate narrow features. As an example, we consider reconstructions of two δ functions with both methods, where the first is peaked at T1 = 2.5 keV and the second is peaked at T2 = 10, 7, 5, 4.5 keV, respectively (see Fig. 15). Also in this case, for the reconstruction we considered F(E) sampled in the energy range 1–50 keV.
Table 4 Full Widths at Half Maximum (FWHM) for the reconstruction of δ functions peaked at different temperatures. The maximum and the centroid values of T corresponding to the reconstructed ξ(T) are also given. The inversion methods are first-order regularization with boundary conditions and projected-Landweber method with positivity

Input T0 (keV)       2.5    3.5    4.5    5.5    6.5
Tmax (Tikhonov)      2.3    2.8    3.4    3.9    4.5
⟨T⟩ (Tikhonov)       2.6    3.3    4.6    5.7    7.1
FWHM (Tikhonov)      1.5    2.0    3.3    4.2    5.6
Tmax (positivity)    2.5    3.6    4.6    5.6    6.6
⟨T⟩ (positivity)     2.5    3.6    4.6    5.6    6.6
FWHM (positivity)    0.8    0.9    1.1    0.9    1.5

Input T0 (keV)       7.5    8.5    9.5   10.5   11.5
Tmax (Tikhonov)      5.2    6.1    7.3    7.9    8.6
⟨T⟩ (Tikhonov)       8.5    9.0    9.6   10.5   11.6
FWHM (Tikhonov)      7.3    9.4   11.8   12.3   13.2
Tmax (positivity)    7.6    8.3    9.6   10.6   11.6
⟨T⟩ (positivity)     7.6    8.5    9.7   10.8   11.6
FWHM (positivity)    1.3    2.6    2.3    2.3    2.4

Fig. 15 Reconstructions of two δ functions (solid line) by means of first-order regularization (dashed) and projected-Landweber method (dotted). The averaged electron spectrum contains 50 points uniformly sampled in the range 1–50 keV and is obtained by inverting the corresponding photon spectrum with realistic Poisson noise added: (a) T1 = 2.5 keV, T2 = 10 keV; (b) T1 = 2.5 keV, T2 = 7 keV; (c) T1 = 2.5 keV, T2 = 5 keV; (d) T1 = 2.5 keV, T2 = 4.5 keV

We note that the use of the positivity constraint increases the resolution limit as explained for example in Piana and Bertero [64] by means of arguments based on the analytic continuation principle. Furthermore, the ξ(T) reconstructed by means of the Tikhonov method have unphysical negative components. We conclude that for the recovery of the low-temperature part of the differential emission measure the projected algorithm is significantly more effective. We also observe that the energy range 1–50 keV, where F(E) was sampled for this inversion, is in some sense optimal, since it always includes the peak temperatures to be recovered. In real F(E), energies up to typically 10 keV must be avoided owing to the presence of (or problematical correction for) lines of non-bremsstrahlung origin or to systematic errors introduced by the hardware. In other words, a typical experimental situation is that F(E) is inverted from electron energies larger than the temperatures involved in the thermal process. In order to study the effect of this on the inversion method, we considered the test shown in Fig. 16.
The electron spectrum corresponding to an isothermal ξ(T) with T0 = 7 keV is inverted for different electron energy sampling ranges: 2–20 keV (solid), 2–7 keV (dashed), 7–20 keV (dotted) and 20–70 keV (dot-dashed). We found that if T0 is higher than the energy range considered, the reconstruction preserves the symmetry of the δ function (so that Tmax and ⟨T⟩ more or less coincide) but the peak temperature is notably overestimated (almost 20%). If T0 is smaller than the sampled energies (which is the realistic situation), the reconstruction is rather skewed (in such a way that Tmax is bigger than ⟨T⟩, as opposed to the case of Tikhonov regularization), presents a widened FWHM and the peak temperature is slightly underestimated: for example, if the selected range is 20–70 keV, the reconstructed Tmax is ∼5% smaller than the true one.

Fig. 16 Reconstruction of a δ function peaked at T0 = 7 keV when the corresponding F(E) is sampled over different electron energy ranges: 2–20 keV (solid); 2–7 keV (dashed); 7–20 keV (dotted); 20–70 keV (dot-dashed). The reconstruction method is the projected-Landweber method with positivity
Fig. 17 Inversion of F(E) corresponding to the case ξ(T) ∼ T^{−5/2}: (a) F(E) uniformly sampled with N = 140 points in the energy range 50–189 keV; (b) theoretical ξ(T) (solid), reconstruction given by first-order regularization (dashed), reconstruction given by the projected-Landweber method (dotted)
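The three summary quantities used in Table 4 and in the tests above (peak temperature, centroid and FWHM) are straightforward to compute from a sampled ξ(T). The sketch below is one possible implementation, assuming a single-peaked, non-negative profile on an increasing temperature grid; the function name is hypothetical, the centroid uses the trapezoidal rule, and the half-maximum crossings are located by linear interpolation.

```python
def peak_centroid_fwhm(temps, xi):
    """Return (T_max, centroid, FWHM) for a sampled, single-peaked xi(T)."""
    # Peak position: grid node where xi is largest
    i_max = max(range(len(xi)), key=lambda i: xi[i])

    # Centroid: ratio of trapezoidal estimates of int T*xi dT and int xi dT
    num = den = 0.0
    for i in range(len(temps) - 1):
        dt = temps[i + 1] - temps[i]
        num += 0.5 * (temps[i] * xi[i] + temps[i + 1] * xi[i + 1]) * dt
        den += 0.5 * (xi[i] + xi[i + 1]) * dt
    centroid = num / den

    # FWHM: first half-maximum crossing on each side of the peak
    half = 0.5 * xi[i_max]
    left = temps[0]  # fallback if the profile never drops below half maximum
    for i in range(i_max, 0, -1):
        if xi[i - 1] < half <= xi[i]:
            frac = (half - xi[i - 1]) / (xi[i] - xi[i - 1])
            left = temps[i - 1] + frac * (temps[i] - temps[i - 1])
            break
    right = temps[-1]  # same fallback on the high-temperature side
    for i in range(i_max, len(temps) - 1):
        if xi[i + 1] < half <= xi[i]:
            frac = (xi[i] - half) / (xi[i] - xi[i + 1])
            right = temps[i] + frac * (temps[i + 1] - temps[i])
            break
    return temps[i_max], centroid, right - left
```

For a skewed profile the centroid and the peak position differ, which is the effect discussed above for the Tikhonov reconstructions.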

• Power Laws The situation is notably different for the recovery of the high temperature part of ξ(T). In this range the typical behavior is close to a power law T^{−α}, which generates a power-law shape E^{−δ} in the corresponding high energy part of F(E) (the approximate relation α ∼ δ + 1/2 has already been discussed). In this case, the small y (high T) boundary condition (203) plays a constructive role in the recovery of ξ(T) and makes first-order regularization more effective than the projected iterative scheme (in this case the action of the positivity constraint is insignificant, since the ξ(T) to be recovered has a wide support, i.e. is everywhere far from zero, and possible residual oscillations do not induce negative components). A test example is represented in Fig. 17, where we invert the spectrum F(E) corresponding to the input form

\[ \xi(T) \sim T^{-5/2}. \tag{216} \]

This electron spectrum has been obtained by inverting the corresponding photon spectrum given by (195) with the addition of realistic Poisson noise. F(E) in Fig. 17(a) has been uniformly sampled with N = 140 points in the range 50–189 keV (in the case of power laws the reconstruction quality is not very sensitive to the energy range adopted for the inversion) and inverted in Fig. 17(b) by means of the first-order regularization method and of the projected Landweber method. The results of this computation clearly show that first-order regularization with the boundary condition (203) is particularly effective in this case. We finally note that for notably larger values of δ a certain deterioration of the reconstructions may occur, due to the fact that λ is a global regularization parameter which works in a less effective way when the function to reconstruct is steep. However, this deterioration can be reduced by means of an appropriate rescaling of the power law (see [49]).

• More Realistic Spectra A mean electron ﬂux spectrum reconstructed from a real photon spectrum is often assumed [39] to comprise an isothermal component at low electron energies plus a power-law behavior at high energies. A simple example is given by the model electron spectrum (with T , E in keV)

\[ F(E) = 100\, T_0^{-3/2}\, E\, e^{-E/T_0} + \frac{\Gamma(\delta+1)}{100} \left(\frac{E}{50}\right)^{-\delta}; \tag{217} \]

application of the derivative test [15] shows that this spectrum is consistent with a wholly thermal source; indeed the corresponding differential emission measure is

\[ \xi(T) = 100\,\delta(T - T_0) + 0.5\,(50)^{\delta-1}\, T^{-(\delta+0.5)}. \tag{218} \]

We discretized the F(E) form (217) with a uniform 1 keV sampling from 1 keV to 250 keV with T0 = 4 keV and δ = 2. We then generated the corresponding photon spectrum I(ε) using an exact (isotropic) cross section and added random Poisson noise, resulting in corresponding noise in F(E). Figure 18(a) shows the resulting simulated F(E) while Fig. 18(b) contains the restorations provided by the two methods. Both reconstructions present notable unphysical artefacts which are essentially due to the fact that neither method is able to fully restore the two completely different behaviors of the source function at low and high T. Therefore we considered an approach whereby the two different inversion methods are applied one to the low- and one to the high-energy part of the electron spectrum separately. More precisely, the projected Landweber method is applied to F(E) in the low-energy range (here we used 2–36 keV, which approximately corresponds to the range where the spectrum is optimally fitted by the isothermal component). On the other hand, first-order Tikhonov regularization with boundary conditions is applied to F(E) in the high-energy range (here we used 55–204 keV, which approximately corresponds to the range where the spectrum is optimally fitted by a power law). The two reconstructed ξ(T) are connected together at the temperature where the thermal peak goes to zero and plotted in Fig. 18(c), while Fig. 18(d) shows that the regularized cumulative residuals (212) are statistically reliable for the chosen value of the iteration number and of the Tikhonov regularization parameter.

Fig. 18 Inversion of F(E) corresponding to ξ(T) given in (218) for T0 = 4 keV and δ = 2: (a) electron spectrum with N = 250 sampled energies in the range 1–250 keV; (b) theoretical ξ(T) (solid) with the reconstructions given by first-order regularization (dashed) and projected-Landweber method (dotted); (c) theoretical ξ(T) (solid) and reconstruction (dashed) obtained by inverting the low-energy part of the electron spectrum with the projected-Landweber method and the high-energy part with first-order Tikhonov regularization and by connecting the two restorations; (d) cumulative regularized residuals (solid) for the method with positivity (upper panel) and Tikhonov regularization (lower panel) compared to the statistical bound ±3/√k (dashed)
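The consistency between the model spectrum (217) and the emission measure (218) can be verified numerically. The sketch below is an illustration only: it takes (214) with the proportionality constant set to one, integrates the power-law part of (218) with a trapezoidal rule on a logarithmic temperature grid, and compares the result with the second term of (217). The function names, the integration limits and the number of nodes are assumptions made for this example. The isothermal term needs no quadrature, since the δ function in (218) gives 100 T0^{−3/2} E e^{−E/T0} directly.

```python
import math


def thermal_spectrum(energy, xi, t_lo=1e-2, t_hi=1e6, n=20000):
    """Evaluate E * int xi(T) T^(-3/2) exp(-E/T) dT, cf. (214) with unit constant.

    The integral is computed in s = log(T), where dT = T ds.
    """
    s_lo, s_hi = math.log(t_lo), math.log(t_hi)
    h = (s_hi - s_lo) / n
    total = 0.0
    for i in range(n + 1):
        t = math.exp(s_lo + i * h)
        weight = 0.5 if i in (0, n) else 1.0  # trapezoidal end-point weights
        total += weight * xi(t) * t ** -1.5 * math.exp(-energy / t) * t
    return energy * total * h


def xi_power_law(t, delta=2.0):
    """Power-law part of (218): 0.5 * 50^(delta-1) * T^-(delta+0.5)."""
    return 0.5 * 50.0 ** (delta - 1.0) * t ** (-(delta + 0.5))


def f_power_law(energy, delta=2.0):
    """Second term of (217): Gamma(delta+1)/100 * (E/50)^(-delta)."""
    return math.gamma(delta + 1.0) / 100.0 * (energy / 50.0) ** (-delta)
```

Evaluating `thermal_spectrum(E, xi_power_law)` and `f_power_law(E)` at a few energies should give matching values to within the quadrature error.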

6.4 Application to RHESSI Data

In order to address the analysis of real spectra provided by RHESSI, we ﬁrst need to check the compatibility between condition (203) and the asymptotic behavior of the recorded photon spectrum at high energies. Such an issue can be addressed by simple integral computations showing that, if a function F (t), for t → 0, is

\[ F(t) \sim A\, t^{\beta} \tag{219} \]

with β > −1, then its Laplace transform (LF)(s), for s → ∞, is

\[ (\mathcal{L}F)(s) \sim A\, \frac{\Gamma(\beta+1)}{s^{\beta+1}}, \tag{220} \]

with

\[ \Gamma(z) = \int_0^\infty e^{-t}\, t^{z-1}\, dt. \tag{221} \]
Fig. 19 August 21 2002 flare recorded by RHESSI in the time interval 01:38:44–01:39:04 UT: (a) photon spectrum; (b) mean source electron spectrum reconstructed by zero-order Tikhonov regularization; (c) reconstruction obtained by inverting the low-energy part of the electron spectrum with the projected-Landweber method and the high-energy part with first-order Tikhonov regularization and by connecting the two restorations; (d) cumulative regularized residuals (solid) for the method with positivity (upper panel) and Tikhonov regularization (lower panel) compared to the statistical bound ±3/√k (dashed)
Therefore condition (203) is compatible with the mean source electron spectrum with an asymptotic (E → ∞) electron spectral index δ > 0 (corresponding to a photon spectral index γ > 1).
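For an exact power law F(t) = A t^β the Laplace transform can be written in closed form, which makes the origin of (220) explicit. For a function that only behaves like A t^β near t = 0, the same expression gives the leading behaviour at large s; that step is the standard Watson-lemma argument and is only quoted here, not proved.

```latex
(\mathcal{L}F)(s) = \int_0^\infty A\, t^{\beta} e^{-st}\, dt
  = \frac{A}{s^{\beta+1}} \int_0^\infty u^{\beta} e^{-u}\, du
  = A\, \frac{\Gamma(\beta+1)}{s^{\beta+1}},
  \qquad u = st,\quad \beta > -1 .
```

The condition β > −1 is what keeps the integral convergent at the lower limit.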
The reconstruction procedure described in the previous section has been applied to three photon spectra observed by RHESSI corresponding to three different ﬂares. Figure 19(a) shows the photon spectrum corresponding to the August 21 2002 ﬂare in the time interval 01:38:44–01:39:04 UT, while Fig. 19(b) shows the corresponding averaged electron spectrum obtained by using zero-order Tikhonov regularization.
The low-energy part of this spectrum (11–24 keV) has been inverted by means of the Landweber iterative scheme with positivity for 10^5 iterations, while the high-energy part (50–189 keV) has been inverted by using first-order Tikhonov regularization with boundary conditions (again the two electron energy ranges correspond to the intervals where F(E) is optimally fitted by a thermal component and a power law, respectively). The two reconstructed ξ(T) are connected together and plotted in Fig. 19(c), while the cumulative residuals in Fig. 19(d) show that the reconstruction is statistically reliable. At small T, ξ(T) presents a peak at T ∼ 2.9 keV, with FWHM ∼ 1.5 keV and centroid ⟨T⟩ ∼ 2.8 keV (the temperature provided by best-fitting F(E) is 2.6 keV). In order to study the compatibility of this spectrum with a single-temperature thermal interpretation we produced a synthetic F(E) corresponding to a δ function peaked at 2.9 keV and inverted it with the same projected Landweber method applied to the same electron energy range. The restoration presented a FWHM of around 1.5 keV, showing that this flare can be reliably interpreted according to an isothermal model. At higher temperatures, ξ(T) presents a dip between 60 and 70 keV and an asymptotic power-law-like behavior with α ∼ 2.8 (this value is in accordance with the fact that the asymptotic electron spectral index is δ ∼ 2.3). In order to study the statistical relevance of the non-monotonic structure in the 60–70 keV temperature range, in Fig. 20 we constructed the confidence strip for the regularized ξ(T) [63, 67] by means of repeated inversions using different realizations of the data set and by superimposing the corresponding regularized solutions. The strip is notably wide at the dip, so that this structure can be interpreted as a ‘plateau’, a broken power law or even a simple power law.
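The confidence-strip construction described above (repeated inversions of different realizations of the data, superimposed) can be sketched generically. In the example below the inversion routine is passed in as a callable `invert`, which is a hypothetical placeholder for whichever regularized method is used. The realizations are drawn by adding Gaussian noise with the given standard deviations, and the strip is summarized by the pointwise minimum and maximum of the solutions; both choices are assumptions made for illustration, not details stated in the text.

```python
import random


def confidence_strip(invert, data, sigma, n_real=20, seed=0):
    """Pointwise envelope of regularized solutions over noisy data realizations.

    invert : callable mapping a data vector to a solution vector (hypothetical)
    data   : measured values, e.g. a sampled electron spectrum
    sigma  : standard deviation of each data value (assumed Gaussian noise)
    """
    rng = random.Random(seed)
    solutions = []
    for _ in range(n_real):
        # One random realization of the data set
        noisy = [d + s * rng.gauss(0.0, 1.0) for d, s in zip(data, sigma)]
        solutions.append(invert(noisy))
    # Superimpose the solutions and keep the envelope at each solution point
    lower = [min(column) for column in zip(*solutions)]
    upper = [max(column) for column in zip(*solutions)]
    return lower, upper
```

A wide gap between `lower` and `upper` at some temperature indicates that features of the solution there are not statistically robust.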


Fig. 20 The conﬁdence strip for the regularized ξ(T ) at high T , corresponding to the August 21 2002 ﬂare in the time interval 01:38:44–01:39:04 UT. The strip has been obtained by repeated inversions of F (E) in Fig. 19(b) using 20 different random realizations of this data set. The inversion method is ﬁrst-order Tikhonov regularization
Fig. 21 November 3 2003 flare recorded by RHESSI in the time interval 09:57:00–09:57:20 UT: (a) photon spectrum; (b) mean source electron spectrum reconstructed by zero-order Tikhonov regularization; (c) reconstruction obtained by inverting the low-energy part of the electron spectrum with the projected-Landweber method and the high-energy part with first-order Tikhonov regularization and by connecting the two restorations; (d) cumulative regularized residuals (solid) for the method with positivity (upper panel) and Tikhonov regularization (lower panel) compared to the statistical bound ±3/√k (dashed)

An analogous procedure has been applied for the analysis of the photon spectrum in the time interval 09:57:00–09:57:20 UT of the November 3 2003 flare (see Fig. 21(a) for the photon spectrum and Fig. 21(b) for the inverted averaged electron spectrum). F(E) has been inverted with the positivity method in the 13.5–40.5 keV range and with first-order regularization in the 56.5–180.5 keV range. The reconstructed ξ(T) in Fig. 21(c) presents a peak at T ∼ 3.1 keV with FWHM ∼ 1.4 keV and centroid ⟨T⟩ ∼ 2.8 keV (the best-fitting temperature is 3.2 keV). As for the previous flare, a single-temperature interpretation of this part of the spectrum is acceptable. At higher T there is a feature in the range 70–80 keV, which is more pronounced than the one in the August 21 2002 flare (although, also in this case, the confidence strip at these temperatures is very wide). The asymptotic α is around 3.1, which must be compared with an asymptotic δ in F(E) of around 2.7 (once more, the asymptotic relation α ∼ δ + 0.5 is satisfied). The cumulative residuals in Fig. 21(d) show that these results are statistically reliable.
Things are notably different in the case of the photon and electron spectra in Fig. 22(a) and (b) respectively, corresponding to the time interval 00:30:00–00:30:20 UT of the July 23 2002 flare. This F(E) fails the derivative test at several points in the low-energy range. We have computed the first five derivatives of F(E)/E and found failures of the test for different points in the second, third, fourth, and fifth derivative. Figure 22(c) contains, for example, the third derivative, which should be negative for a thermal spectrum and is in fact positive (with statistical significance) at 16 and 19 keV. By applying the constrained-Landweber method to F(E) at low energies (for example between 12 and 21 keV) we found that the cumulative residuals never present the expected random walk, even for huge numbers of iterations (the residuals in Fig. 22(d), upper panel, correspond to 10^6 iterations). This behavior seems to suggest that a thermal interpretation of this photon spectrum could be problematic, although we also observe that this photon data set probably suffers a notable pulse pile-up, which may imply artefacts in the reconstruction of F(E). For the high-energy part of the spectrum, first-order regularization provides a power-law-like ξ(T) with α ∼ 2.7 at high T (δ for this spectrum is around 2).

Fig. 22 July 23 2002 flare recorded by RHESSI in the time interval 00:30:00–00:30:20 UT: (a) photon spectrum; (b) mean source electron spectrum reconstructed by zero-order Tikhonov regularization; (c) numerical third derivative of F(E)/E with corresponding statistical errors; (d) cumulative regularized residual (solid) for the method with positivity (upper panel) in the case of 10^6 iterations and Tikhonov regularization (lower panel) compared to the statistical bound ±3/√k (dashed)
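The sign check behind the derivative test can be sketched as follows. For a purely thermal source, (214) makes F(E)/E a superposition of decaying exponentials e^{−E/T} with non-negative weights, so its n-th derivative should have the sign (−1)^n; this agrees with the statement above that the third derivative should be negative. The code below is a simplified illustration with hypothetical names: it uses plain finite differences on a uniform energy grid and ignores statistical errors, whereas the analysis described here assesses the significance of each failure.

```python
def derivative_test(energies, spectrum, max_order=5):
    """Finite-difference sign check on the derivatives of F(E)/E.

    Returns a dict mapping each order n to the list of grid indices at which
    (-1)^n * d^n/dE^n [F(E)/E] is negative, i.e. where the test fails.
    Index i refers to the left-most node of the difference stencil.
    """
    h = energies[1] - energies[0]  # uniform sampling in E is assumed
    d = [f / e for f, e in zip(spectrum, energies)]
    failures = {}
    for n in range(1, max_order + 1):
        # Forward difference of the previous-order derivative
        d = [(b - a) / h for a, b in zip(d, d[1:])]
        sign = -1.0 if n % 2 else 1.0
        failures[n] = [i for i, v in enumerate(d) if sign * v < 0.0]
    return failures
```

An isothermal spectrum E e^{−E/T0} passes at every order, while a spectrum with a low-energy cutoff such as (213) already fails at first order across the cutoff.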
6.5 Conclusions
The inference of differential emission measure functions ξ(T) from observed photon spectra I(ε) with a realistic bremsstrahlung cross section is substantially more difficult than the single-step inversion analysis of Piana, Brown, and Thompson [66] based on an approximate Q. A proper procedure involves two inverse problems. The first of these is the inversion of I(ε), through an exact solid-angle-averaged bremsstrahlung cross section kernel and a zero-order Tikhonov regularization method, to obtain the mean source electron spectrum F(E). The second uses an approach for inverting F(E) which involves the application of a projected algorithm with positivity constraint in the inversion of the low energy part of the spectrum and of a first-order regularization method with boundary conditions in the inversion of the high energy part of the spectrum. The main findings are:
– the approach correctly identiﬁes certain properties of F (E) (such as bumps or energy cut-offs) as being inconsistent with any physical ξ(T ) ≥ 0;
– the use of the positivity constraint allows us to obtain a satisfactory temperature resolution in the recovery of δ functions while the use of ﬁrst-order regularization with boundary conditions provides reliable reconstructions for smooth forms such as power-laws;
– application of the method to observed RHESSI photon spectra has revealed two cases in which the recovered ξ(T) is spectrally consistent with a roughly isothermal low-temperature plasma plus a very broad form of ξ(T) at high temperatures. In a third case, a spectrum from the July 23 2002 flare, the reconstruction method at low temperatures produces unacceptably large residuals. This result is in accordance with the fact that the same spectrum fails to satisfy the derivative test which verifies the compatibility with a purely thermal interpretation. Possible physical motivations for this behaviour are still unclear and, for example, may be related to the fact that this flare produced spectra which deviate from a power-law behaviour in a manner consistent with non-uniform ionization [47]. However, we also observe that the spectrum used in our analysis suffers a notable pulse pile-up which may imply artefacts in the analysis results.
The availability of a reconstruction approach for addressing the difficult inverse problem of restoring ξ(T) from reconstructions of F(E) may have important consequences in the analysis and interpretation of RHESSI spectra. In future research we will apply the method to study the influence of albedo effects on the modification of the differential emission measure and to deduce important physical properties of the thermal plasma from the reconstructed ξ(T).
7 Application to Solar Physics: Imaging-Spectroscopy
7.1 From Spectroscopy to Imaging-Spectroscopy
The high X-ray energy resolution achievable with RHESSI’s hardware and the consequent measurements of the precise shape of the X-ray continuum, together with suitable mathematical tools to stably invert the bremsstrahlung equation, make it possible to obtain unique information on the spectrum of the accelerated electrons and on the heated plasma. However, the purposes of the RHESSI mission go further: the new approach is to combine, for the first time, high-resolution spectroscopy in X-ray and γ-ray with high-resolution images, so that a detailed energy spectrum can be obtained at each point of the map. In fact, RHESSI produces hard X-ray and γ-ray images with the finest angular and spectral resolutions ever achieved [57]; imaging spectroscopy analysis of this data is a powerful tool with which to explore the underlying physics of particle acceleration and transport in solar flares. Traditional imaging spectroscopy methods (e.g., [30]) start by constructing two-dimensional maps of the source at different count energies by applying image processing algorithms (e.g., back projection, CLEAN, Maximum Entropy, Pixon). This results in a series of images which are consistent both with the broad assumptions of the particular algorithm used and with the imaging information contained in the data. Spatially resolved count spectra are then obtained from this set of images by selecting particular regions in the field of view and comparing the intensity in those regions as a function of count energy. Finally, the corresponding spatially resolved electron spectra are constructed by applying regularized spectral inversion methods (e.g., [18]) to the spatially resolved count spectra.

In this section we introduce a new approach to imaging spectroscopy which is optimized to the distinctive way in which spatial information is encoded in the RHESSI data. The RHESSI instrument employs a rotation modulation collimator (RMC) imaging technique, in which rapid time variations of the detected counts are effected by the placement of a set of RMCs, each with a different pitch, in front of each detector. Spatial information is encoded in the temporal modulation of the detected flux [40]. As the RMC rotates, the amplitude and phase of this pseudo-periodic modulation over a limited range of angles provides a direct, calibrated measurement of a single Fourier component of the source distribution. Such a Fourier component is termed a visibility [70], and is the same quantity provided by the correlated signal from a pair of antennas in a radio interferometer. In this case, the spatial frequency of the measured visibility is determined by the angular resolution of the RMC and its instantaneous orientation. By combining data from multiple RMCs at a variety of orientations, the set of visibilities can then be used to reconstruct the spatial distribution of the source. Since visibilities can be summed linearly, this perspective on the data provides a convenient basis for combining data from multiple rotations into a tractable number of visibility measurements with well-defined statistical errors.
The “traditional” approach to imaging spectroscopy, in which images at different count energies are “stacked” and compared, not only fails to take full advantage of the particular nature in which spatial information is contained in the RHESSI data, it also has two signiﬁcant drawbacks:
– while imaging algorithms can reduce statistical and point-response artifacts in each image, they are completely ineffective in smoothing along the count energy direction, so that recovered images corresponding to adjacent energy bins can exhibit substantial differences;
– owing to these energy-dependent fluctuations, the determination of the count spectrum at a particular point (x, y) (or, more accurately, a particular region [x ± Δx, y ± Δy]) in the source image can be problematic, as is the determination of the statistical error on the count flux. As is well-known (e.g., [20]), such noise-related spectral variations are greatly magnified upon performing a spectral inversion to obtain the corresponding electron spectrum.
In addition, as with any indirect imaging technique, the incomplete spatial frequency sampling results in spatial sidelobes in the point response function which can cause contamination by neighboring sources. Further, the statistical noise from all source components contributes to the noise in each selected region of the source.
It is crucial to recognize that it is not the observed counts (or even photons) that are of interest per se, but rather the electrons that produce them: the real science goal is to obtain physically plausible (i.e., “sufficiently smooth”) electron spectra throughout the source. Our new method of imaging spectroscopy analysis therefore involves an interchange of two steps in the data processing chain. First, one applies a count to electron inversion algorithm to obtain smoothed electron spectra at each point in the spatial frequency domain. Once such electron flux visibility spectra have been obtained, they can be processed using standard image reconstruction techniques to yield electron flux images for the entire field of view. Since the electron flux visibility spectra are regularized, so also are the corresponding electron flux spectra at each location in the image. This renders these spatially-resolved spectra more suitable for further analysis.
We perform the count to electron inversion step using the familiar Tikhonov regularization technique that has proven so effective (see Sects. 3 and 4) in the inference of spatially-integrated electron spectra F(E) from observations of spatially-integrated count (or photon) spectra I(ε). Applied to visibilities, the Tikhonov regularization method forces smoothness in the inferred electron visibility spectra at each point in the spatial frequency domain and thus enhances real features that persist over a relatively wide energy band, while suppressing noise-related features that show up only over a narrow range of energies. The combination of visibility data and Tikhonov regularization methodology therefore allows us to derive the most robust information on the spatial structure of the electron flux spectrum image, the key quantity of physical interest.

7.2 Methodology

Define a Cartesian coordinate system (x, y, z) such that (x, y) (in units of arcseconds) represents a location in the image plane and z (cm) represents distance along the line of sight into the source. Let the local density of target particles along the line-of-sight depth ℓ(x, y) (cm) be n(x, y, z) (cm−3) and let the differential electron flux spectrum (electrons cm−2 s−1 keV−1) at the point (x, y, z) in the source be F(x, y, z; E).
Since the source is optically thin, the relation between F(x, y, z; E) and the corresponding observed photon spectrum image I(x, y; ε) (photons cm−2 s−1 keV−1 arcsec−2) is

$$ I(x, y; \epsilon) = \frac{a^2}{4\pi R^2} \int_{\epsilon}^{\infty} \int_{0}^{\ell(x,y)} n(x, y, z) \, F(x, y, z; E) \, Q(\epsilon, E) \, dz \, dE, \tag{222} $$

where a = 7.25 × 10⁷ cm arcsec−1 at R = 1 AU and Q(ε, E) (cm2 keV−1) is the cross section for emission of a photon at energy ε. Here we consider the isotropic bremsstrahlung cross section Q(ε, E) given by equation (3BN) in [44] and reported in Sect. 3.1 (see (110)).

We define the mean electron flux spectrum F̄(x, y; E) (electrons cm−2 s−1 keV−1 at the Sun) by

$$ \bar{F}(x, y; E) = \frac{1}{N(x, y)} \int_{0}^{\ell(x,y)} n(x, y, z) \, F(x, y, z; E) \, dz, \tag{223} $$

where the column density (cm−2) at each point (x, y) in the image is given by $N(x, y) = \int_{0}^{\ell(x,y)} n(x, y, z) \, dz$. Then, by (222) and (223), we may write

$$ I(x, y; \epsilon) = \frac{a^2}{4\pi R^2} \int_{\epsilon}^{\infty} N(x, y) \, \bar{F}(x, y; E) \, Q(\epsilon, E) \, dE. \tag{224} $$

Next we introduce spatial frequencies u and v in the x- and y-directions, respectively, and define the count visibility spectrum V(u, v; q) (counts cm−2 s−1 keV−1) as the two-dimensional spatial Fourier transform of the count spectrum image J(x, y; q) (counts cm−2 s−1 keV−1 arcsec−2):

$$ V(u, v; q) = \int_x \int_y J(x, y; q) \, e^{2\pi i (ux + vy)} \, dx \, dy. \tag{225} $$

The count spectrum and photon spectrum images are related by the instrument’s detector response matrix. Hence we may write

$$ V(u, v; q) \, dq = \int_x \int_y \int_{q}^{\infty} D(q, \epsilon) \, I(x, y; \epsilon) \, e^{2\pi i (ux + vy)} \, d\epsilon \, dx \, dy, \tag{226} $$

where the dimensionless quantity D(q, ε) is the differential element of the detector response matrix⁵ corresponding to the generation of a count with energy in the energy range [q, q + dq] from a photon in the energy range [ε, ε + dε].

⁵ The range of ε corresponding to a count of energy q is taken to be [q, ∞); only photons of energy ε ≥ q can generate a count of energy q. This (linear) formalism therefore ignores the possibility of the creation of a count of energy q from the arrival at the same detector of two (or more) photons of energy < q within a very short time interval. This “pulse pileup” process is intrinsically nonlinear (the detector response matrix depends on the incoming photon flux) and so cannot be readily accommodated within the present formalism. Our analysis will therefore be restricted to medium-flux events for which pileup is not likely to be significant.

Combining (224) and (226) gives the rather formidable expression

$$ V(u, v; q) \, dq = \frac{a^2}{4\pi R^2} \int_x \int_y \int_{q}^{\infty} \int_{\epsilon}^{\infty} [N(x, y) \, \bar{F}(x, y; E)] \, D(q, \epsilon) \, Q(\epsilon, E) \, e^{2\pi i (ux + vy)} \, dE \, d\epsilon \, dx \, dy, \tag{227} $$

which provides the formal relationship between [N(x, y) F̄(x, y; E)], the quantity of most direct physical interest, and the observed count visibility spectra V(u, v; q). We now introduce the count cross section K(q, E) (cm2 keV−1) through the expression

$$ K(q, E) \, dq = \int_{q}^{\infty} D(q, \epsilon) \, Q(\epsilon, E) \, d\epsilon \tag{228} $$

and the electron flux visibility spectrum (electrons cm−2 s−1 keV−1)

$$ W(u, v; E) = a^2 \int_x \int_y N(x, y) \, \bar{F}(x, y; E) \, e^{2\pi i (ux + vy)} \, dx \, dy. \tag{229} $$

With these definitions, and reversing the order of integration with respect to ε and E, (227) can be written as the straightforward integral equation

$$ V(u, v; q) = \frac{1}{4\pi R^2} \int_{q}^{\infty} W(u, v; E) \, K(q, E) \, dE. \tag{230} $$

Equation (230) is formally identical to the relation (101) between the spatially-integrated photon spectrum and the source-integrated electron spectrum and so can be solved for the visibilities W(u, v; E) from the observed count visibility spectra V(u, v; q) by applying Tikhonov regularization, which has proven so effective in the solution of (101) for n̄VF̄(E) given I(ε) (see Sects. 3 and 4).
To briefly summarize the Tikhonov methodology in the visibility case, (230) is first discretized in both count and electron energy spaces to yield, at each sampled point (u, v) in the spatial frequency domain, the data visibility vector V[u,v] (the elements of which depend on count energy q) and the source visibility vector W[u,v] (the elements of which depend on electron energy E). These are related through the matrix equation

$$ \mathbf{V}_{[u,v]} = \mathbf{K} \cdot \mathbf{W}_{[u,v]}, \tag{231} $$

where K is the kernel matrix, the elements of which are formed from the values of K(q, E) at the discretized count and electron energy points. Then the zero-order regularization problem

$$ \left\| \mathbf{V}_{[u,v]} - \mathbf{K} \cdot \mathbf{W}_{[u,v]} \right\|^2 + \lambda_{[u,v]} \left\| \mathbf{W}_{[u,v]} \right\|^2 = \text{minimum} \tag{232} $$

is solved for W[u,v] given the prescribed visibility vector V[u,v] at each sampled point in (u, v) space, using an appropriate value (see below) of the regularization parameter λ[u,v]. This results in electron visibility spectra that are “smooth” in the sense that the large variations in W(u, v; E) from energy bin to energy bin are suppressed. This technique therefore enhances spatial features (Fourier components) that persist over a wide range of energies, and suppresses (noise) features that exist over only a limited subset of energy bins. Once the electron visibility spectra have been determined, the electron flux spectral image may be determined through the inverse Fourier transform of (229).

7.3 Application to Data

We illustrate the method by applying it to data obtained near the peak of the C7.5 flare of February 20, 2002 (11:06:02–11:06:34 UT), using visibilities from RHESSI RMCs 3 through 9, corresponding to spatial resolutions from ∼ 7 to ∼ 183 arcseconds. For comparison purposes, we first construct count images by means of the visibility technique and apply the “traditional” imaging spectroscopy approach, in which count images in different energy bands are compared. After discussing the drawbacks of this “traditional” method, we use our new method to obtain more physically useful electron flux maps of the flare.
The “traditional” method begins by converting the X-ray count rate data to a set of visibilities. This requires preselecting the number of angular intervals (roll bins) per rotation. The number of roll bins should be maximized to avoid degradation of sensitivity near the edge of the field of view. However, for this application, each roll bin must contain at least one complete modulation cycle to enable the visibility to be well-measured. Using an iterative technique, we maximized the number of roll bins for each detector subject to this constraint and then used a χ² analysis to determine statistically acceptable visibilities. Then, since V(u, v) and V(−u, −v) are complex conjugates (see (225)), the visibilities measured at angles separated by 180 degrees are combined to improve the signal-to-noise ratio. Finally, the error bars on the real and imaginary parts of each visibility for each energy channel are computed by propagating the statistical error in the counts through to the calculation of each visibility. The resulting visibilities are used as input to the Maximum Entropy (“MEM-NJIT”) algorithm [10] as implemented in the Solar SoftWare (SSW) package, to produce 80 arcsec × 80 arcsec maps with 0.4 arcsec pixels. This was done for sixteen 4-keV-wide energy intervals from 10 to 74 keV.
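The combination of visibilities measured 180 degrees apart can be illustrated with a minimal sketch. The text states only that the two measurements are combined; the inverse-variance weighting used here, and the function name `combine_conjugate_pair`, are assumptions made for illustration rather than the procedure actually implemented in the RHESSI software.

```python
import numpy as np

def combine_conjugate_pair(v_a, sigma_a, v_b, sigma_b):
    """Combine a measurement of V(u, v) with one of V(-u, -v).

    v_a estimates V(u, v); v_b estimates V(-u, -v), whose complex
    conjugate is a second estimate of V(u, v).
    Inverse-variance weighting is an assumption of this sketch.
    """
    w_a = 1.0 / sigma_a**2
    w_b = 1.0 / sigma_b**2
    v = (w_a * v_a + w_b * np.conj(v_b)) / (w_a + w_b)
    sigma = np.sqrt(1.0 / (w_a + w_b))
    return v, sigma

# Two noise-free measurements of the same Fourier component (toy values)
v_true = 3.0 - 4.0j
v, sigma = combine_conjugate_pair(v_true, 0.5, np.conj(v_true), 0.5)
```

With equal uncertainties the combined error is smaller by a factor of √2, which is the signal-to-noise gain referred to above.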
Fig. 23 Top panels: Count images for the 20 February, 2002 (11:06:02–11:06:34 UT) event, for the energy intervals shown, produced using the MEM-NJIT algorithm. Lower panels: Areally-averaged count spectrum (counts cm−2 s−1 keV−1 arcsec−2; left) and electron flux confidence strip spectra (electrons cm−2 s−1 keV−1 arcsec−2; right) for the footpoint region highlighted in the images

Figure 23 shows some of these count-based images. Two bright features, which we interpret as emission from chromospheric footpoints, are apparent. In addition, there is some evidence for a “strand” of emission linking the two bright features; this we interpret as emission from the coronal region of the magnetic loop linking the footpoints. The lower left panel of Fig. 23 shows the areally-averaged count spectrum (counts cm−2 s−1 keV−1 arcsec−2) for the northern footpoint region highlighted by the square in each image; this spectrum has been constructed by averaging, for each energy channel, the intensities of the pixels that constitute the highlighted region (to get the total count spectrum for the region [counts cm−2 s−1 keV−1], simply multiply by the area of the region, in this case 14.4 × 14.4 = 207.36 arcsec²). The error bars have been computed as the combination of a count (Poisson) error plus background noise. The lower right panel shows the recovered electron flux spectrum confidence strip (i.e., a series of realizations of the electron flux spectrum, each based on a different noisy realization of the data; see Sect. 4.3) for this feature. Each electron spectrum realization was obtained by inverting the count spectrum using the zero-order regularization method applied in Sects. 3 and 4 for spatially integrated spectroscopy.
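The idea of a confidence strip can be sketched as follows. This is a toy illustration under stated assumptions, not the analysis code used for Fig. 23: the kernel `K` is a hypothetical upper-triangular matrix standing in for the discretized count cross section, the noise is taken as Gaussian rather than Poisson plus background, and the value of `lam` is arbitrary.

```python
import numpy as np

def tikhonov_lstsq(K, g, lam):
    """Zero-order Tikhonov solution via the stacked least-squares system
    [K; sqrt(lam) I] f ~ [g; 0], which minimizes ||g - K f||^2 + lam ||f||^2."""
    n = K.shape[1]
    A = np.vstack([K, np.sqrt(lam) * np.eye(n)])
    b = np.concatenate([g, np.zeros(n)])
    return np.linalg.lstsq(A, b, rcond=None)[0]

def confidence_strip(K, g, sigma, lam, n_real=20, seed=0):
    """Invert n_real noisy realizations of the data vector g."""
    rng = np.random.default_rng(seed)
    return np.array([
        tikhonov_lstsq(K, g + sigma * rng.standard_normal(g.shape), lam)
        for _ in range(n_real)
    ])

# Toy problem: power-law electron spectrum and a Volterra-like kernel
# (counts at energy q receive contributions only from electrons with E >= q).
E = np.linspace(10.0, 80.0, 8)           # keV, hypothetical grid
f_true = (E / 10.0) ** -3
K = 0.1 * np.triu(np.ones((8, 8)))       # hypothetical kernel matrix
g = K @ f_true
strip = confidence_strip(K, g, sigma=0.02 * g, lam=1e-4)
```

Each row of `strip` is one realization of the recovered spectrum; the spread of the rows at a given energy gives a visual measure of the uncertainty at that energy.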
The count spectrum is conspicuously weak in the 26–30 keV image. This leads to a relatively flat count spectrum in this range (lower left panel of Fig. 23) and hence (since the electron spectrum is, crudely, related to the derivative of the photon spectrum [12]) to a dip in the recovered electron spectrum (lower right panel) for this feature. Although such spectral dips have been inferred for spatially-integrated electron spectra (e.g., [67]), the spatially-integrated count spectra on which such spatially-integrated electron spectra are based are not subject to the imaging artifacts that render suspect the count spectra determined for a particular spatial region. Therefore it is possible, or indeed likely, that the feature in Fig. 23 is not real, but rather an artifact imposed by isolating attention on a limited range of spatial coordinates, rather than on the overall patterns (Fourier components) of emission present in the spatially-integrated emission.
As discussed in Sect. 7.1, because spatial information is fundamentally encoded by RHESSI in Fourier components, rather than in “pixels,” a more cogent approach to imaging spectroscopy involves performing the count to electron inversion step in the spatial frequency domain, i.e., on the visibility data. By focusing on the information in distinct Fourier components, we remove the deleterious effects of imaging artifacts that are evident in the more “traditional” approach to imaging spectroscopy.
Figure 24 shows the amplitude (upper panels) and phase (lower panels) of the count visibilities V(u, v; q) (counts cm−2 s−1 keV−1) for the same event and time interval as Fig. 23, for three count energy ranges. The amplitude of the visibilities generally increases with increasing grid pitch (decreasing value of the corresponding spatial frequency √(u² + v²)). Highlighted by a red star in each plot in Fig. 24 is the (somewhat arbitrary) point (u∗ = −0.0042 arcsec−1, v∗ = −0.0422 arcsec−1); this point corresponds to a spatial periodicity 1/√(u∗² + v∗²) = 23.6 arcsec, which is the spatial periodicity corresponding to (i.e., twice the angular resolution of) RHESSI grid 4. The top panels of Fig. 25 show the amplitude |V(u∗, v∗; q)| and phase Arg(V[u∗, v∗; q]) of the differential count visibility spectrum for this representative point in the spatial frequency domain.
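The quoted spatial periodicity follows directly from the coordinates of the highlighted point, as this short check shows (variable names are illustrative only):

```python
import math

u_star = -0.0042   # arcsec^-1
v_star = -0.0422   # arcsec^-1

spatial_frequency = math.hypot(u_star, v_star)   # sqrt(u*^2 + v*^2), arcsec^-1
spatial_period = 1.0 / spatial_frequency         # arcsec
angular_resolution = spatial_period / 2.0        # arcsec, half the period

print(f"period = {spatial_period:.1f} arcsec")
```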

Fig. 24 (Color online) The observed count visibilities (amplitude and phase) for three representative energy bands. In each panel, the region between each pair of dotted vertical lines represents measurements with a single RMC at a fixed spatial period, with the orientation of the measurement increasing from 0 to 180 degrees; successive regions correspond to different RMCs. The angular resolutions for the regions increase in a geometric progression (with a ratio of √3 between successive regions), and span 7 arcseconds on the left to 183 arcseconds on the right. The point (u∗ = −0.0042 arcsec−1, v∗ = −0.0422 arcsec−1), highlighted with a red star in each plot, is used in the illustrative spectral plots of Fig. 25

In order to preserve the inherent linearity of the process, a polar to rectangular transformation was performed to convert the amplitude and phase information into real and imaginary components Re{V (u∗, v∗; q)} and Im{V (u∗, v∗; q)}. Each of these components was then subjected to the regularized inversion analysis of (232) to obtain the real and imaginary parts of the corresponding (regularized) electron visibility spectrum W (u∗, v∗; E) at the point (u∗, v∗). Through an inverse rectangular to polar transform, we then recover the amplitude and phase of the electron visibility spectrum W (u∗, v∗, E) at this particular point, as shown in the bottom panels of Fig. 25.
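The procedure just described can be sketched as follows: the real and imaginary parts of the count visibility spectrum are inverted separately with the zero-order Tikhonov functional (232), and the result is converted back to amplitude and phase. This is a minimal illustration under stated assumptions: the kernel matrix `K`, the energy grid, the toy spectrum and the value of `lam` are hypothetical, and the normal-equations solver is one of several equivalent ways to minimize (232).

```python
import numpy as np

def tikhonov_zero_order(K, v, lam):
    """Minimize ||v - K w||^2 + lam ||w||^2 through the normal equations."""
    n = K.shape[1]
    return np.linalg.solve(K.T @ K + lam * np.eye(n), K.T @ v)

def invert_visibility_spectrum(K, V, lam):
    """Regularized inversion of a complex count visibility spectrum V
    at a single (u, v) point; returns amplitude, phase (degrees) and W."""
    w_re = tikhonov_zero_order(K, V.real, lam)   # real part
    w_im = tikhonov_zero_order(K, V.imag, lam)   # imaginary part
    W = w_re + 1j * w_im                         # rectangular form
    return np.abs(W), np.degrees(np.angle(W)), W

# Toy data: power-law amplitude with a constant 30 degree phase
E = np.linspace(15.0, 65.0, 6)                   # keV, hypothetical grid
W_true = (E / 15.0) ** -2 * np.exp(1j * np.radians(30.0))
K = 0.5 * np.triu(np.ones((6, 6)))               # hypothetical kernel matrix
V = K @ W_true
lam = 1e-6
amp, phase_deg, W = invert_visibility_spectrum(K, V, lam)
```

Because K is real and the inversion is linear, treating the two components separately gives the same result as solving the complex system in one step; working in rectangular rather than polar form is what keeps the problem linear.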
Repeating this regularized inversion process for each sampled point in the (u, v) plane (using a value of the regularization parameter λ[u,v] appropriate⁶ to each sampled (u, v) point), we arrive at complete information on the electron flux visibility spectrum. This information is presented in Fig. 26 in the same format as Fig. 24.

⁶ The value of the regularization parameter λ[u∗,v∗] was chosen using the “3σ cumulative residual criterion” approach discussed in detail in Sect. 6.2; in general, such a procedure for determining λ[u,v] results in more faithful representations of electron flux spectra than the commonly-used “discrepancy principle”.
We can now use the set of electron ﬂux visibility spectra to construct electron spectral ﬂux images in each energy range. Images of the electron ﬂux spectrum F (x, y; E), recovered by applying the MEM-NJIT algorithm, are shown in Fig. 27. These images represent the quantity of key physical interest.
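To make the step from visibilities to an image concrete, the sketch below uses a direct Fourier inversion (a "dirty map") of sparsely sampled visibilities. This is deliberately a simpler technique than the MEM-NJIT algorithm used for Fig. 27, and it is shown only to illustrate the sign convention of (225) and (229). The sampling (seven pitches in a ratio of √3, twelve orientations over 0 to 180 degrees), the pixel grid and the point-source test case are assumptions of the example.

```python
import numpy as np

def dirty_map(u, v, vis, x, y):
    """Direct inversion of sparsely sampled visibilities.

    With V = integral of J exp(+2 pi i (u x + v y)), the inverse carries
    exp(-2 pi i (u x + v y)). Taking the real part accounts for the
    conjugate points (-u, -v), which are not sampled separately.
    """
    X, Y = np.meshgrid(x, y)                     # shape (len(y), len(x))
    phase = -2j * np.pi * (u[:, None, None] * X + v[:, None, None] * Y)
    return np.real(np.sum(vis[:, None, None] * np.exp(phase), axis=0)) / len(vis)

# Hypothetical sampling loosely modelled on RMCs 3 through 9
resolutions = 7.0 * np.sqrt(3.0) ** np.arange(7)        # arcsec, approximate
periods = 2.0 * resolutions                             # arcsec
angles = np.linspace(0.0, np.pi, 12, endpoint=False)
u = (np.cos(angles)[None, :] / periods[:, None]).ravel()
v = (np.sin(angles)[None, :] / periods[:, None]).ravel()

# Visibilities of a unit point source at (x0, y0)
x0, y0 = 8.0, -4.0
vis = np.exp(2j * np.pi * (u * x0 + v * y0))

x = np.arange(-40.0, 40.0, 1.0)                         # arcsec
y = np.arange(-40.0, 40.0, 1.0)
image = dirty_map(u, v, vis, x, y)
iy, ix = np.unravel_index(np.argmax(image), image.shape)
```

The map peaks at the true source position, surrounded by the sidelobes that incomplete (u, v) sampling produces; suppressing those sidelobes is the reason a reconstruction method such as maximum entropy is used in practice.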
Figure 27, like Fig. 23, shows evidence for two footpoints, again connected by a “strand” of coronal ﬂux. To the extent that variations in count intensity are a consequence of data noise, the regularization algorithm used to develop the electron ﬂux images of Fig. 27 removes such irregularities, resulting in a more coherent variation of source structure with energy. Consequently, the electron ﬂux images vary much more smoothly with energy, and the coronal “strand” is more persistent at low energies.
Fig. 25 Top panels: Amplitude (left; counts cm−2 s−1 keV−1) and phase (right; degrees) of the count visibility spectrum V(u∗, v∗; q) at the point (u∗ = −0.0042, v∗ = −0.0422) in the spatial frequency domain. Bottom panels: Amplitude (left; in units of 10⁵⁰ electrons cm−2 s−1 keV−1) and phase (right; degrees) of the corresponding electron spectrum visibilities W(u∗, v∗; E) at the same (u, v) point, obtained through regularized inversion of (230) using the zero-order Tikhonov method

The footpoints in the electron images are seen to persist up to electron energies ∼ 75 keV, an energy significantly greater than the maximum photon energy used. As pointed out by Kontar et al. [48], information on the electron spectrum at high energies is indeed contained in the photon spectrum at lower energies, and can be faithfully extracted using the Tikhonov regularization procedure.
It is instructive to reconstruct the count images corresponding to the regularized electron spectral flux images in Fig. 27 and compare them with the original spatial images obtained through processing of the raw count visibility data using the MEM algorithm. This comparison is presented in Fig. 28. The top row of figures shows the recovered count images at the energies shown, while the bottom row reproduces the original count-based images from Fig. 23. The original count-based images (lower row of images in Fig. 28) show evidence principally of a double-footpoint structure, with some additional evidence for an extension of the emission into the region between the footpoints (see, e.g., the 18–22 keV and 42–46 keV images). However, there is no clear systematic variation with count energy q, either of the intensity of this “strand” emission or of the relative intensity of the two footpoints. By contrast, the count images deduced from the regularized electron flux images (upper row of images in Fig. 28) show much more clearly the evolution of the spatial structure with energy. The “strand” of emission between the footpoints is clearly evident up to 30 keV, but diminishes rapidly at higher energies, and the relative intensity of the two footpoint sources is more independent of count energy q. These physically plausible enhancements in the image structure are recovered through use of the visibility-based regularized inversion technique, because of its inherent requirement that the source structure vary smoothly from one electron energy E to the next. This requirement in turn forces the count images to change more smoothly with count energy q than do the images deduced directly from the (noisy) data.
7.4 Physical Implications of the Results
The electron spectral ﬂux images of Fig. 27 are quite plausibly interpreted in terms of the collisional thick target model [12] of hard X-ray emission in solar ﬂares.
Fig. 26 Recovered regularized electron flux visibilities (scaled by 10⁻⁵⁰), presented as in Fig. 24. The sharp minima in the amplitude plots are real, and reflect the two-component nature of the source. The y-intercept to the right is determined by the total flux, while the rate at which the amplitudes fall off (to the left) is determined by the size of the sources, and reflects, for example, the larger source size at 26–30 keV. The broad similarity of the phase plots reflects the broadly similar location of the sources at various electron energies

Consider three different spatial subregions in the source, labelled in Fig. 29. Two of these regions correspond to the footpoint sources visible at higher energies and the third to a similarly-sized region located approximately midway between the two footpoints. The lower panel of Fig. 29 shows the areally-averaged⁷ electron flux spectra (electrons cm−2 s−1 keV−1 arcsec−2) for each of these three subregions. The region labeled “Footpoint 1” is identical to the region highlighted in Fig. 23; comparison of the electron flux spectra for this region (Figs. 23 and 29) shows that the “dip” at ∼ 26–30 keV obtained using the “traditional” approach to imaging spectroscopy is indeed an artifact of the data truncation and overspill issues associated with identification of the flux in a local spatial region; the real electron flux spectrum in this region is smooth and monotonically decreasing with energy E.

⁷ To get the total count spectrum for each region [counts cm−2 s−1 keV−1], simply multiply the areally-averaged spectrum by the area of that region, viz. 14.4 × 14.4 = 207.36 arcsec² (Footpoint 1), 22.8 × 9 = 205.2 arcsec² (Middle), and 14.4 × 14.4 = 207.36 arcsec² (Footpoint 2), respectively.

At low energies E ≈ 60 keV, the electron flux at the more southern footpoint (Footpoint 2) is much smaller than that at the more northern footpoint (Footpoint 1). However, the spectrum of Footpoint 2 is very hard (δ ≃ 1) and by ∼ 60 keV the electron flux at each footpoint has become roughly equal, as is apparent from the spatial images.

Above E ∼ 40 keV, the spectra corresponding to the two footpoint regions are visibly flatter (harder) than that corresponding to the region between these footpoints. Such a result is qualitatively consistent with the acceleration of electrons in a source midway between the footpoints, and the subsequent propagation of these electrons to the footpoints. Concentrating the observed degree of spectral hardening in the footpoints constrains the intervening column density to an upper limit N < E²/2K ∼ 2 × 10¹⁷ [E(keV)]² ∼ 3 × 10²⁰ cm−2 (here K = 2πe⁴Λ ≃ 2.6 × 10⁻¹⁸ cm2 keV2, where e is the electronic charge and Λ the Coulomb logarithm). This in turn establishes an upper limit on the coronal density n ∼ N/d, where d is the distance between the coronal source and the footpoint parallel to the guiding magnetic field. The plane-of-sky projected distance between the “middle” and “footpoint” sources is ∼ 10 arcsec ∼ 7 × 10⁸ cm. Assuming a semicircular geometry for the loop connecting the footpoints, d ∼ (π/2) times this projected distance, i.e., ∼ 10⁹ cm. We hence infer that the coronal density n < 3 × 10¹¹ cm−3, an entirely reasonable constraint.
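The order-of-magnitude estimate above can be reproduced in a few lines. The electron energy E = 40 keV is taken from the energy above which the hardening is reported, and the arcsecond-to-centimetre conversion is the value of a quoted with (222); both choices are assumptions of this check rather than values stated alongside the estimate itself.

```python
import math

K = 2.6e-18        # cm^2 keV^2, value quoted in the text
E = 40.0           # keV, assumed: energy above which hardening is seen
a = 7.25e7         # cm per arcsec at 1 AU, from the definition following (222)

N_max = E**2 / (2.0 * K)              # upper limit on column density, cm^-2
projected = 10.0 * a                  # ~10 arcsec projected separation, cm
d = (math.pi / 2.0) * projected       # semicircular loop geometry, cm
n_max = N_max / d                     # upper limit on coronal density, cm^-3

print(f"N < {N_max:.1e} cm^-2, d ~ {d:.1e} cm, n < {n_max:.1e} cm^-3")
```

The result agrees with the rounded figures in the text (N ∼ 3 × 10²⁰ cm−2, d ∼ 10⁹ cm, n < 3 × 10¹¹ cm−3) to within the precision of the estimate.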
7.5 Summary
Fig. 27 Electron flux spectral images corresponding to the regularized electron flux spectral visibilities of Fig. 26, obtained through application of the MEM-NJIT algorithm [10]

Fig. 28 Top panels: Regularized count-based images corresponding to the electron flux spectral images of Fig. 27, compared with (bottom panels) the original images from Fig. 23

Fig. 29 Top panels: Electron images in the energy ranges 22–26 keV and 42–46 keV, respectively. Three sub-regions of interest are labeled on each image. Two of these correspond to bright footpoint-like sources and one to a region midway between the footpoints. Bottom panel: Areally-averaged electron flux spectra (electrons cm−2 s−1 keV−1 arcsec−2) for each of the three subregions shown

We have developed, and illustrated the effectiveness of, a new approach to solar hard X-ray imaging spectroscopy. In this approach, two-dimensional Fourier transforms of the image in the count domain are transformed, through a regularized inversion procedure that enhances features that persist over a range of energy channels, into Fourier transforms of the electron flux maps. A final image reconstruction based on an inverse Fourier transform then gives the electron flux maps themselves. Because data obtained through rotating modulation collimator instruments such as RHESSI are concentrated into a relatively small number of discrete Fourier components (“visibilities”), this approach is highly effective at analyzing such data, and results in recovered spectra that are determined more precisely than with a method that involves regularized inversion of the count spectrum within a spatial subregion of the source (which necessarily involves a combination of spatial Fourier components). Application of the method to a flare on February 20, 2002 yielded a series of electron flux images. Varying smoothly with energy, these images in turn permitted recovery of smooth, regularized, electron flux spectra at different regions in the source. Such smooth, regularized, electron flux spectra contrast with those obtained using the more “traditional” approach to imaging spectroscopy (e.g., Fig. 23), in which unphysical features may result from focusing on a particular spatial region.
For the illustrative event studied, the electron spectra at the two bright chromospheric footpoints evident in the images were systematically harder than the spectrum obtained at a similarly-sized region between the footpoints. Such a spectral hardening is broadly consistent with collisional modification of an accelerated electron beam if the intervening density is less than 3 × 10¹¹ cm−3.
In future papers, we intend to apply our new technique to a variety of ﬂare events. The resulting sample of electron ﬂux spectrum images provides the required input to the next stage of inquiry, wherein the nature of the physical processes affecting the bremsstrahlung-producing electrons is determined through analysis (e.g., [29]) of variation in the electron ﬂux spectrum throughout the source.

8 Conclusions
This paper collects recent results showing how the regularization theory for linear inverse problems can be applied to recover stable and physically meaningful solutions of real problems in solar X-ray and imaging spectroscopy. The common thread of these problems is their ill-posedness and the consequent ineffectiveness of any mathematical tool that does not account for this pathology. The regularization algorithms considered proved to be an effective way to counteract the combined effect of the ill-posed nature of the models and the noise affecting the data measurements.
The opportunity to study the problems addressed in this paper was provided by the RHESSI mission, launched by NASA in 2002 with the aim of providing X-ray imaging and spectroscopy with unprecedented spatial and spectral resolution. The X-ray spectra provided by the spacecraft and the electron distribution in the Sun from which they are produced via a collisional bremsstrahlung process are linked through an integral equation of Volterra type. Several probabilistic kernels have been analyzed, also accounting for the influence of anisotropies in the emission mechanism, and the differences in the corresponding reconstructed electron spectra have been discussed in both simulated and real cases to investigate the effects of a more or less realistic (and complicated) cross section.
The output of this model-independent problem has been used as input for a subsequent Fredholm integral equation which, under a purely thermal interpretation of the bremsstrahlung process, describes the electron distribution in the source as a combination of Maxwellian functions weighted by the differential emission measure. This problem can be traced to the inversion of the Laplace transform, which is extremely challenging owing to the strong filtering effect of the exponential kernel. In the paper we proposed splitting the input electron spectrum into two parts and applying to each a different regularization algorithm, chosen to perform better in recovering the corresponding solution. As in the previous case, tests on both synthetic and real spectra have been provided.
In the last part of the paper we discussed imaging spectroscopy, which is one of the main goals of the RHESSI mission as it allows detailed energy spectra to be obtained for different portions of the solar chromosphere. While traditional imaging spectroscopy methods typically require notable computational effort and introduce unphysical artifacts into the reconstructed local electron spectra, we proposed a new method which avoids both drawbacks. The algorithm involves the regularized inversion of spectra of count visibilities, which are calibrated measurements of spatial Fourier components of the source distribution, followed by the application of a Fourier-based imaging reconstruction technique.
The computational methods described in the present paper, all based on the regularization theory for ill-posed problems, represent one possible approach to data analysis in solar X-ray imaging and spectroscopy. However, other techniques can also be applied. Inversion by means of statistical methods, for example, may explicitly account for the statistical properties of the noise affecting the data; or, more generally, the use of a Bayesian framework may allow a priori information on the solution or on the model to be encoded in the probability density functions involved. In the specific case of RHESSI measurements, a realistic strategy for imaging could be the use of multiple deconvolution approaches, whereby the data provided by the nine RHESSI detectors are simultaneously processed to produce reconstructions of the source with enhanced spatial resolution. Finally, from the physical viewpoint, RHESSI data, characterized by such a notable observational quality, can be used to investigate more sophisticated astrophysical processes like free-bound recombination or electron propagation in the plasma during flares. The results of these investigations, based on the application of computational tools like the ones described in this paper, may provide new fundamental insights toward the comprehension of the physical processes in the solar atmosphere and may represent the conceptual basis of new solar missions like Solar Orbiter, in preparation by ESA for the near future.
Acknowledgements The author is most grateful to Prof. Michele Piana and Dr. Anna Maria Massone for the scientiﬁc collaboration. This work is partially supported by the Italian national research project Inverse Problems in Medicine and Astronomy, under contract PRIN 2006018748, the grant I/015/07/0 of the Italian ASI/INAF, the Gruppo Nazionale di Calcolo Scientiﬁco (GNCS) and the International Space Science Institute (ISSI) in Bern, Switzerland.
References
1. Abramowitz M, Stegun IA (1965) Handbook of mathematical functions. Dover, New York
2. Balakrishnan AV (1976) Applied functional analysis. Springer, New York
3. Bertero M (1989) Linear inverse and ill-posed problems. In: Hawkes PW (ed) Advances in electronics and electron physics. Academic, New York
4. Bertero M, De Mol C (1996) Super-resolution by data inversion. In: Wolf E (ed) Prog optics XXXVI. Elsevier, Amsterdam
5. Bertero M, De Mol C, Viano GA (1980) The stability of inverse problems. In: Baltes HP (ed) Inverse scattering problems in optics. Topics in current physics, vol 20. Springer, Berlin, pp 161–214
6. Bertero M, Boccacci P, Pike ER (1982) On the recovery and resolution of exponential relaxation rates from experimental data: a singular-value analysis of the Laplace transform inversion in the presence of noise. Proc R Soc Lond A Mat 383:15–29
7. Bertero M, Brianzi P, Pike ER (1985) On the recovery and resolution of exponential relaxation rates from experimental data: Laplace transform inversion in weighted spaces. Inverse Probl 1:1–15
8. Bertero M, De Mol C, Pike ER (1985) Linear inverse problem with discrete data. I: General formulation and singular system analysis. Inverse Probl 1:301–330
9. Bertero M, De Mol C, Pike ER (1988) Linear inverse problem with discrete data. II: Stability and regularisation. Inverse Probl 4:573–594
10. Bong SC, Lee J, Gary DE, Yun HS (2006) Spatio-spectral maximum entropy method I: Formulation and test. Astrophys J 636:1159–1165
11. Brianzi P, Frontini M (1991) On the regularized inversion of the Laplace transform. Inverse Probl 7:355–368
12. Brown JC (1971) The deduction of energy spectra of non-thermal electrons in ﬂares from the observed dynamic spectra of X-ray bursts. Sol Phys 18:489–502
13. Brown JC (1972) The directivity and polarisation of thick target X-ray bremsstrahlung from solar ﬂares. Sol Phys 26:441–459
14. Brown JC (1974) On the thermal interpretation of hard X-ray bursts from solar ﬂares. In: Newkirk G (ed) Proc IAU symp on coronal disturbances, vol 57, pp 395–412
15. Brown JC, Emslie AG (1988) Analytic limits on the forms of spectra possible from optically thin collisional bremsstrahlung source models. Astrophys J 331:554–564
16. Brown JC, Melrose DB, Spicer DS (1979) Production of a collisionless conduction front by rapid coronal heating and its role in solar hard X-ray bursts. Astrophys J 228:592–597
17. Brown JC, Emslie AG, Kontar EP (2003) The determination and use of mean electron ﬂux spectra in solar ﬂares. Astrophys J 595:L115–L117
18. Brown JC, Emslie AG, Holman GD, Johns-Krull CM, Kontar EP, Lin RP, Massone AM, Piana M (2006) Evaluation of algorithms for reconstructing electron spectra from their bremsstrahlung hard X-ray spectra. Astrophys J 643:523–531


19. Conway AJ, Brown JC, Eves BA, Kontar EP (2003) Implications of solar ﬂare hard X-ray “knee” spectra observed by RHESSI. Astron Astrophys 407:725–734
20. Craig IJD, Brown JC (1986) Inverse problems in astronomy: a guide to inversion strategies for remotely sensed data. Adam Hilger, Bristol
21. Davies AM (1992) Optimality in regularization. In: Bertero M, Pike ER (eds) Inverse problems in scattering and imaging. Adam Hilger, Bristol, pp 393–410
22. Davies B, Martin B (1979) Numerical inversion of the Laplace transform: a survey and comparison of methods. J Comput Phys 33:1–32
23. Eicke B (1992) Iteration methods for convexly constrained ill-posed problems in Hilbert spaces. Numer Funct Anal Opt 13:413–429
24. Elwert G (1939) Accurate calculation of intensity and polarisation in continued X-ray spectra. Ann Phys-Berlin 34:178–208
25. Elwert G, Haug E (1970) On the polarization and anisotropy of solar X-radiation during ﬂares. Sol Phys 15:234–248
26. Elwert G, Haug E (1971) Anisotropy of solar hard X-radiation during ﬂares. Sol Phys 20:413–421
27. Emslie AG (1992) Overview of solar ﬂares. In: Schmelz JT, Brown JC (eds) The Sun: a laboratory for astrophysics. Dordrecht
28. Emslie AG, Coffey VN, Schwartz RA (1989) Is the ‘superhot’ hard X-ray component in solar ﬂares consistent with a thermal source? Sol Phys 122:313–317
29. Emslie AG, Barrett RK, Brown JC (2001) An empirical method to determine electron energy. Astrophys J 557:921–929
30. Emslie AG, Kontar EP, Krucker S, Lin RP (2003) RHESSI hard X-ray imaging spectroscopy of the large Gamma-ray ﬂare of 2002 July 23. Astrophys J 595:L107–L110
31. Essah WA, Delves LM (1988) On the numerical inversion of Laplace transform. Inverse Probl 4:705–724
32. Gluckstern RL, Hull MH (1953) Polarization dependence of the integrated bremsstrahlung cross section. Phys Rev 90:1030–1035
33. Gluckstern RL, Hull MH, Breit G (1953) Polarization of bremsstrahlung radiation. Phys Rev 90:1026–1029
34. Golub G, van Loan C (1996) Matrix computations. Johns Hopkins, London
35. Hadamard J (1923) Lectures on Cauchy’s problem in linear partial differential equations. Yale University Press, New Haven
36. Haug E (1972) Polarization of hard X-rays from solar ﬂares. Sol Phys 25:425–434
37. Haug E (1997) On the use of nonrelativistic bremsstrahlung cross sections in astrophysics. Astron Astrophys 326:417–418
38. Hénoux JC (1975) Anisotropy and polarization of solar X-ray bursts. Sol Phys 42:219–233
39. Holman GD, Sui L, Schwartz RA, Emslie AG (2003) Electron bremsstrahlung hard X-ray spectra, electron distributions, and energetics in the 2002 July 23 solar ﬂare. Astrophys J 595:L97–L102
40. Hurford GJ, Schmahl EJ, Schwartz RA, Conway AJ, Aschwanden MJ, Csillaghy A, Dennis BR, Johns-Krull C, Krucker S, Lin RP, McTiernan J, Metcalf TR, Sato J, Smith DM (2002) The RHESSI imaging concept. Sol Phys 210:61–86
41. Johns CM, Lin RP (1992) The derivation of parent electron spectra from bremsstrahlung hard X-ray spectra. Sol Phys 137:121–140
42. Kato T (1980) Perturbation theory for linear operators. Springer, New York
43. Keller J (1976) Inverse problems. Am Math Mon 83:107–118
44. Koch HW, Motz JW (1959) Bremsstrahlung cross section formulas and related data. Rev Mod Phys 31:920–955
45. Kontar EP (2001) Dynamics of electron beams in the inhomogeneous solar corona plasma. Sol Phys 202:131–149
46. Kontar EP, Pecseli HL (2002) Nonlinear development of electron beam driven weak turbulence in an inhomogeneous plasma. Phys Rev E 65:066408

47. Kontar EP, Brown JC, Emslie AG, Schwartz RA, Smith DM, Alexander RC (2003) An explanation for nonpower-law behavior in the hard X-ray spectrum of the 2002 July 23 solar ﬂare. Astrophys J 595:L123–L126
48. Kontar EP, Piana M, Massone AM, Emslie AG, Brown JC (2004) Generalized regularization techniques with constraints for the analysis of solar bremsstrahlung X-ray spectra. Sol Phys 225:293–309
49. Kontar EP, Emslie AG, Piana M, Massone AM, Brown JC (2005) Determination of electron ﬂux spectra in a solar ﬂare with an augmented regularization method: application to RHESSI data. Sol Phys 226:317–325
50. Kontar EP, Emslie AG, Massone AM, Piana M, Brown JC, Prato M (2007) Electron-electron bremsstrahlung emission and the inference of electron ﬂux spectra in solar ﬂares. Astrophys J 670:857–861
51. Kress R (1989) Linear integral equations. Springer, New York
52. Lagendijk R, Biemond J, Boeckee D (1988) Regularized iterative image restoration with ringing reduction. IEEE Trans Acoust Speech Signal Process 36:1874–1888
53. Langer SH, Petrosian V (1977) Impulsive solar X-ray bursts. III—Polarization, directivity, and spectrum of the reﬂected and total bremsstrahlung radiation from a beam of electrons directed toward the photosphere. Astrophys J 215:666–676
54. Leach J, Petrosian V (1983) The impulsive phase of solar ﬂares. II—Characteristics of the hard X-rays. Astrophys J 269:715–727
55. Lin RP (1974) The ﬂash phase of solar ﬂares—Satellite observations of electrons. Space Sci Rev 16:201–221
56. Lin RP, Schwartz RA (1987) High spectral resolution measurements of a solar ﬂare hard X-ray burst. Astrophys J 312:462–474
57. Lin RP, Dennis BR, Hurford GJ, Smith DM, Zehnder A, Harvey PR, Curtis DW, Pankow D, Turin P, Bester M, Csillaghy A, Lewis M, Madden N, van Beek HF, Appleby M, Raudorf T, McTiernan J, Ramaty R, Schmahl E, Schwartz R, Krucker S, Abiad R, Quinn T, Berg P, Hashii M, Sterling R, Jackson R, Pratt R, Campbell RD, Malone D, Landis D, Barrington-Leigh CP, Slassi-Sennou S, Cork C, Clark D, Amato D, Orwig L, Boyle R, Banks IS, Shirey K, Tolbert AK, Zarro D, Snow F, Thomsen K, Henneck R, Mchedlishvili A, Ming P, Fivian M, Jordan J, Wanner R, Crubb J, Preble J, Matranga M, Benz A, Hudson H, Canﬁeld RC, Holman GD, Crannell C, Kosugi T, Emslie AG, Vilmer N, Brown JC, Johns-Krull C, Aschwanden M, Metcalf T, Conway A (2002) The Reuven Ramaty High-Energy Solar Spectroscopic Imager (RHESSI). Sol Phys 210:3–32
58. Massone AM, Piana M, Conway AJ, Eves B (2003) A regularization approach for the analysis of RHESSI X-ray spectra. Astron Astrophys 405:325–330
59. Massone AM, Emslie AG, Kontar EP, Piana M, Prato M, Brown JC (2004) Anisotropic bremsstrahlung emission and the form of regularized electron ﬂux spectra in solar ﬂares. Astrophys J 613:1233–1240
60. Massone AM, Piana M, Prato M (2008) Regularized solution of the solar bremsstrahlung inverse problem: model dependence and implementation issues. Inverse Probl Sci Eng 16:523–545
61. McWhirter JG, Pike ER (1978) On the numerical inversion of the Laplace transform and similar Fredholm integral equations of the ﬁrst kind. J Phys A, Math Gen 11:1729–1745
62. Mertz LN (1967) A dilute image transform with application to an X-ray star camera. In: Fox J (ed) Modern optics, proc symp modern optics, March 22–24 1967, pp 787–791
63. Piana M (1994) Inversion of bremsstrahlung spectra emitted by a solar plasma. Astron Astrophys 288:949–959
64. Piana M, Bertero M (1996) Regularized deconvolution of multiple images of the same object. J Opt Soc Am A 13:1516–1523
65. Piana M, Bertero M (1997) Projected Landweber method and preconditioning. Inverse Probl 13:441–463

66. Piana M, Brown JC, Thompson AM (1995) Thermal bremsstrahlung hard X-rays and primary energy release in ﬂares. Sol Phys 156:315–335
67. Piana M, Massone AM, Kontar EP, Emslie AG, Brown JC, Schwartz RA (2003) Regularized electron ﬂux spectra in the July 23, 2002 solar ﬂare. Astrophys J 595:L127–L130
68. Piana M, Massone AM, Hurford GJ, Prato M, Emslie AG, Kontar EP, Schwartz RA (2007) Electron ﬂux spectral imaging of solar ﬂares through regularized analysis of hard x-ray source visibilities. Astrophys J 665:846–855
69. Prato M, Piana M, Brown JC, Emslie AG, Kontar EP, Massone AM (2006) Regularized reconstruction of the differential emission measure from solar hard X-ray spectra. Sol Phys 237:61–83
70. Prince TA, Hurford GJ, Hudson HS, Crannell CJ (1988) Gamma-ray and hard X-ray imaging of solar ﬂares. Sol Phys 118:269–290
71. Ramaty R, Paizis C, Colgate SA, Dulk GA, Hoyng P, Knight JW, Lin RP, Melrose DB, Orrall F, Shapiro PR (1980) Energetic particles in solar ﬂares. In: Solar ﬂares: a monograph from Skylab Solar Workshop II. Colorado Associated University Press, Boulder, pp 117–185
72. Reed M, Simon B (1972) Methods of modern mathematical physics, vol I: Functional analysis. Academic, New York

73. Rudin W (1991) Functional analysis, 2nd edn. McGraw-Hill, New York
74. Schnopper HW, Thompson RI, Watt S (1968) Predicted performance of a rotating modulation collimator for local celestial X-ray sources. Space Sci Rev 8:534–542
75. Smith DM, Share GH, Murphy RJ, Schwartz RA, Shih AY, Lin RP (2003) High-resolution spectroscopy of gamma-ray lines from the X-class solar ﬂare of 2002 July 23. Astrophys J 595:L81–L84
76. Sneddon IN (1951) Fourier transforms. McGraw-Hill, New York
77. Sweet PA (1969) Mechanisms of solar ﬂares. Annu Rev Astron Astrophys 7:149–176
78. Thompson AM, Brown JC, Craig IJD, Fulber C (1992) Inference of non-thermal electron energy distributions from hard X-ray spectra. Astron Astrophys 265:278–288
79. Tikhonov AN (1963) Regularization of incorrectly posed problems. Sov Math Dokl 4:1624–1627
80. Tikhonov AN (1963) Solution of incorrectly formulated problems and the regularization method. Sov Math Dokl 4:1035–1038
81. Varah JM (1983) Pitfalls in the numerical solution of linear ill-posed problems. SIAM J Sci Comput 4:164–176