calculate_AIC Command Manual

Last Update: 2023/2/16

◆Purpose

Computes the Akaike Information Criterion (AIC; Akaike, 1974) from the residual between observed data and the values predicted by a model. The command is not limited to waveform inversion; it applies to inverse problems in general.

◆Source code

$YMAEDA_OPENTOOL_DIR/winv/src/calculate_AIC.c

◆Usage

Run the command without arguments and enter the required parameters interactively when prompted.

◆Behaviour

When executed, the command prompts in turn for the residual, the number of data samples, and the number of model parameters. Once all three have been entered, the AIC value is written to standard output.

◆Example

In the session below, the value following the colon on each prompt line (0.5, 100, 10) is entered by the user.

calculate_AIC
Enter the residual defined as the square root of the square summation of the residual vector to that of the observed data vector. Make sure to enter it as a decimal value (do not convert to percentage) : 0.5
Enter the number of data samples : 100
Enter the number of model parameters : 10
AIC = -1.186294e+02
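
The value printed in this session can be checked by hand against the formula given in the Formula section, AIC = 2N ln E + 2M. The following is a minimal Python sketch of that arithmetic, not the actual C source; the function name `aic` and its argument names are hypothetical and chosen only for illustration.

```python
import math

def aic(residual, n_data, n_params):
    """Return 2*N*ln(E) + 2*M for residual E, N data samples, M model parameters."""
    if residual <= 0.0:
        raise ValueError("residual must be positive so that its logarithm exists")
    return 2.0 * n_data * math.log(residual) + 2.0 * n_params

# Same inputs as the example session: E = 0.5, N = 100, M = 10
value = aic(0.5, 100, 10)
print("AIC = %e" % value)  # prints "AIC = -1.186294e+02"
```

With E = 0.5, N = 100 and M = 10 this evaluates 200 ln(0.5) + 20, which agrees with the session output.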

◆Formula

Let $\myvector{d^{obs}}$ and $\myvector{d^{syn}}$ be the observed and synthetic data vectors, respectively, and define the residual $E$ as
\[\begin{equation} E\equiv\sqrt{\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2} {(\myvector{d^{obs}})^2}} \label{eq.E} \end{equation}\]
where the square of a vector denotes the sum of the squares of its components (note that the square root $\sqrt{\hspace{0.5em}}$ is taken). Let $N$ and $M$ be the numbers of data samples and model parameters, respectively. AIC is then calculated as
\[\begin{equation} AIC=2N\ln E+2M \label{eq.AIC.use} \end{equation}\]
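
When the residual is not yet known, it can be computed from the data vectors before being passed to the command. The sketch below assumes plain Python lists for the two vectors; the helper names `relative_residual` and `aic`, the sample values, and the choice M = 2 are all made up for illustration and do not come from the source code.

```python
import math

def relative_residual(d_obs, d_syn):
    """E = sqrt( sum (d_obs - d_syn)^2 / sum d_obs^2 ), kept as a fraction (not %)."""
    if len(d_obs) != len(d_syn):
        raise ValueError("observed and synthetic vectors must have equal length")
    misfit = sum((o - s) ** 2 for o, s in zip(d_obs, d_syn))
    power = sum(o ** 2 for o in d_obs)
    return math.sqrt(misfit / power)

def aic(residual, n_data, n_params):
    """AIC = 2*N*ln(E) + 2*M."""
    return 2.0 * n_data * math.log(residual) + 2.0 * n_params

# Hypothetical data: four samples, fitted by a model with two parameters
d_obs = [1.0, 2.0, 3.0, 4.0]
d_syn = [1.1, 1.9, 3.2, 3.9]
E = relative_residual(d_obs, d_syn)
print("E =", E)
print("AIC =", aic(E, len(d_obs), 2))
```

The value of `E` computed this way is what the command expects at its first prompt.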

●Derivation

AIC is defined in terms of the maximum likelihood $L_{max}$ as
\[\begin{equation} AIC\equiv -2\ln L_{max}+2M \label{eq.AIC.definition} \end{equation}\]
Let $d_i^{obs}$ and $d_i^{syn}$ be the $i$th components of the observed and synthetic data vectors, respectively, and assume that each residual $d_i^{obs}-d_i^{syn}$ independently obeys a normal distribution with mean 0 and standard deviation $\sigma$:
\[\begin{equation} P(d_i^{obs}-d_i^{syn}) =\frac{1}{\sqrt{2\pi}\sigma} \exp\left[-\frac{(d_i^{obs}-d_i^{syn})^2}{2\sigma^2}\right] \label{eq.P} \end{equation}\]
The likelihood is then
\[\begin{eqnarray} L &=& \frac{1}{(2\pi)^{N/2}\sigma^N} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^N(d_i^{obs}-d_i^{syn})^2\right] \nonumber \\ &=& \frac{1}{(2\pi)^{N/2}\sigma^N} \exp\left[-\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2} {2\sigma^2}\right] \label{eq.L} \end{eqnarray}\]
and the log likelihood is
\[\begin{eqnarray} \ln L &=& -\frac{N}{2}\ln(2\pi)-N\ln\sigma -\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2}{2\sigma^2} \label{eq.lnL} \end{eqnarray}\]
The value of $\sigma$ that maximizes the likelihood is obtained from
\[\begin{equation} \PartialDiff{(\ln L)}{\sigma} =-\frac{N}{\sigma}+\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2}{\sigma^3} =0 \label{eq.Lmax.condition} \end{equation}\]
which gives
\[\begin{equation} \sigma^2=\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2}{N} \label{eq.sigma2} \end{equation}\]
Inserting eq. (\ref{eq.sigma2}) into eq. (\ref{eq.lnL}), the maximum log likelihood becomes
\[\begin{eqnarray} \ln L_{max} &=& -\frac{N}{2}\ln(2\pi) -N\ln\sqrt{\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2}{N}} -\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2} {2\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2}{N}} \nonumber \\ &=& -\frac{N}{2}\ln(2\pi) -\frac{N}{2}\ln\frac{(\myvector{d^{obs}}-\myvector{d^{syn}})^2}{N} -\frac{N}{2} \nonumber \\ &=& -\frac{N}{2}\ln(2\pi) -\frac{N}{2}\ln\left[(\myvector{d^{obs}}-\myvector{d^{syn}})^2\right] +\frac{N}{2}\ln N -\frac{N}{2} \label{eq.lnLmax.derive} \end{eqnarray}\]
and inserting eq. (\ref{eq.E}) yields
\[\begin{eqnarray} \ln L_{max} &=& -\frac{N}{2}\ln(2\pi) -\frac{N}{2}\ln\left[(\myvector{d^{obs}})^2E^2\right] +\frac{N}{2}\ln N -\frac{N}{2} \nonumber \\ &=& -\frac{N}{2}\ln(2\pi) -\frac{N}{2}\ln\left[(\myvector{d^{obs}})^2\right] -\frac{N}{2}\ln(E^2) +\frac{N}{2}\ln N -\frac{N}{2} \nonumber \\ &=& -\frac{N}{2}\ln(2\pi) -\frac{N}{2}\ln\left[(\myvector{d^{obs}})^2\right] -N\ln E +\frac{N}{2}\ln N -\frac{N}{2} \nonumber \\ &=& -\frac{N}{2}\ln\frac{2\pi(\myvector{d^{obs}})^2}{N} -\frac{N}{2} -N\ln E \label{eq.lnLmax} \end{eqnarray}\]
Inserting eq. (\ref{eq.lnLmax}) into eq. (\ref{eq.AIC.definition}), we obtain
\[\begin{equation} AIC=N\ln\frac{2\pi(\myvector{d^{obs}})^2}{N}+N+2N\ln E+2M \label{eq.AIC.exact} \end{equation}\]
The first two terms depend only on the observed data and $N$, not on the model or the residual. Ignoring them gives eq. (\ref{eq.AIC.use}).
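
Two steps of this derivation can be checked numerically: that $\sigma^2$ equal to the squared misfit divided by $N$ maximizes the log likelihood, and that $-2\ln L_{max}+2M$ differs from $2N\ln E+2M$ only by an amount that is the same for every model fitted to the same data. The following Python sketch does this under stated assumptions; the data values, the two synthetic vectors `model_a` and `model_b`, and all function names are hypothetical.

```python
import math

def squared_norm(v):
    """Sum of squared components of a vector."""
    return sum(x * x for x in v)

def log_likelihood(misfit, n, sigma):
    """Gaussian log likelihood for squared misfit sum, n samples, std. dev. sigma."""
    return (-0.5 * n * math.log(2.0 * math.pi)
            - n * math.log(sigma)
            - misfit / (2.0 * sigma ** 2))

def offset_between_forms(d_obs, d_syn, m):
    """Return (-2 ln L_max + 2M) minus (2N ln E + 2M) for one synthetic vector."""
    n = len(d_obs)
    misfit = squared_norm([o - s for o, s in zip(d_obs, d_syn)])
    sigma_hat = math.sqrt(misfit / n)          # maximum-likelihood sigma
    ll_max = log_likelihood(misfit, n, sigma_hat)
    # The likelihood should drop when sigma moves away from sigma_hat
    assert ll_max > log_likelihood(misfit, n, 0.9 * sigma_hat)
    assert ll_max > log_likelihood(misfit, n, 1.1 * sigma_hat)
    residual = math.sqrt(misfit / squared_norm(d_obs))
    aic_from_likelihood = -2.0 * ll_max + 2.0 * m
    aic_from_residual = 2.0 * n * math.log(residual) + 2.0 * m
    return aic_from_likelihood - aic_from_residual

# Hypothetical data and two hypothetical synthetic vectors
d_obs = [1.0, 2.0, 3.0, 4.0]
model_a = [1.1, 1.9, 3.2, 3.9]
model_b = [0.8, 2.3, 2.7, 4.4]
offset_a = offset_between_forms(d_obs, model_a, 2)
offset_b = offset_between_forms(d_obs, model_b, 2)
print(offset_a, offset_b)  # the two offsets agree to rounding error
```

Because the offset is identical for both synthetic vectors, dropping it does not change the ranking of models fitted to the same data.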

At first glance, the exact expression, eq. (\ref{eq.AIC.exact}), may seem preferable. However, the data vector usually carries a physical unit, and the value of eq. (\ref{eq.AIC.exact}) changes with the choice of that unit. For this reason the program adopts eq. (\ref{eq.AIC.use}), which does not depend on the unit system.
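
The unit dependence can be illustrated by rescaling the same data, for example from metres to millimetres. The Python sketch below is only an illustration: the data values, the scale factor of 1000, and the function names are assumptions. It evaluates the exact form directly as $-2\ln L_{max}+2M$ from the Gaussian likelihood.

```python
import math

def squared_norm(v):
    """Sum of squared components of a vector."""
    return sum(x * x for x in v)

def aic_unit_free(d_obs, d_syn, m):
    """2*N*ln(E) + 2*M, with E the relative residual."""
    n = len(d_obs)
    misfit = squared_norm([o - s for o, s in zip(d_obs, d_syn)])
    residual = math.sqrt(misfit / squared_norm(d_obs))
    return 2.0 * n * math.log(residual) + 2.0 * m

def aic_exact(d_obs, d_syn, m):
    """-2 ln L_max + 2*M, with L_max evaluated at sigma^2 = misfit / N."""
    n = len(d_obs)
    misfit = squared_norm([o - s for o, s in zip(d_obs, d_syn)])
    sigma2 = misfit / n
    ln_l_max = -0.5 * n * math.log(2.0 * math.pi * sigma2) - misfit / (2.0 * sigma2)
    return -2.0 * ln_l_max + 2.0 * m

# Hypothetical data, first in one unit and then multiplied by 1000
d_obs = [1.0, 2.0, 3.0, 4.0]
d_syn = [1.1, 1.9, 3.2, 3.9]
scale = 1000.0
d_obs_scaled = [scale * x for x in d_obs]
d_syn_scaled = [scale * x for x in d_syn]

shift_unit_free = aic_unit_free(d_obs_scaled, d_syn_scaled, 2) - aic_unit_free(d_obs, d_syn, 2)
shift_exact = aic_exact(d_obs_scaled, d_syn_scaled, 2) - aic_exact(d_obs, d_syn, 2)
print("change in unit-free form:", shift_unit_free)
print("change in exact form:", shift_exact)
```

In this sketch the unit-free form is unchanged up to rounding error, while the exact form shifts by $2N\ln(1000)$.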

◆References

Akaike H (1974) A new look at the statistical model identification, IEEE Trans Autom Control 19(6), 716-723. https://doi.org/10.1109/TAC.1974.1100705
