- All of the datasets are undirected; only one dataset (Wiki) is directed.
Model | Node Feature | Edge Index | Edge Weight | Edge Type |
---|---|---|---|---|
GConvGRU | O, Node features | O | O | X |
GConvLSTM | O, Node features | O | O | X |
GCLSTM | O, Node features | O | O | X |
LRGCN | O, Node features | O | O | O |
DyGrEncoder | O, Node features | O | O | X |
EvolveGCNH | O, Node embedding | O | O | X |
EvolveGCNO | O, Node embedding | O | O | X |
TGCN | O, Node features | O | O | X |
DCRNN | O, Node features | O | O | X |
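As a quick sanity check of the table, the sketch below instantiates GConvGRU from PyTorch Geometric Temporal and feeds it node features, an edge index, and edge weights. The sizes (3 nodes, 1 feature, 32 hidden channels, `K=2`) are illustrative assumptions, not the experimental setting.

```python
import torch
from torch_geometric_temporal.nn.recurrent import GConvGRU

num_nodes, in_channels, hidden_channels = 3, 1, 32            # illustrative sizes
model = GConvGRU(in_channels=in_channels, out_channels=hidden_channels, K=2)

x = torch.randn(num_nodes, in_channels)                       # node features at one time step
edge_index = torch.tensor([[0, 0, 1, 2],
                           [1, 2, 0, 0]])                     # undirected edges (0,1), (0,2), both directions
edge_weight = torch.ones(edge_index.size(1))                  # edge weights

h = model(x, edge_index, edge_weight)                         # hidden node embeddings, shape [3, 32]
```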
Define Spatio-Temporal Graph Data
Notation | Definition or Description |
---|---|
\({\cal G}\) | Input Graph |
\(T\) | the length of the time interval |
\(\cal{V}\) | the set of vertices |
\(V\) | the set of node indices |
\(N\) | the number of vertices |
\(\cal{E}\) | the set of undirected edges |
\(\textbf{y}\) | a graph signal, a function defined on the vertices of the graph \(\cal{G}\) |
\(\textbf{W}\) | a weighted adjacency matrix |
- Graph \({\cal G} = (\cup_{t \in {\cal T}} {\cal V}_t, \cup_{t \in {\cal T}} {\cal E}_t, \textbf{W})\), \({\cal T}:=\{1,\dots,T\}\)
- \(T\) is the length of the time interval.
- \(\cal{V}\) is the set of vertices, \({\cal V} = \{v_0,v_1,\cdots,v_{N-1}\}\)
- \(V\) is the set of node indices, \(V = \{ 0, 1, \dots, N-1 \}\)
- \(N\) is the number of vertices, \(|{\cal V}| = N\)
- \(\cal{E}\) is the set of undirected edges
- \({\cal E} = \{ (x,y) \mid x, y \in {\cal V} \text{ and } x \ne y \}\)
- a graph signal \(\textbf{y}\): \({\cal V}_t \to \mathbb{R}^N\)
- \(\textbf{y}\) is a function defined on the vertices of the graph \(\cal{G}\)
- \(\textbf{y} = [y_{0,1},y_{0,2}, \cdots, y_{1,1},y_{1,2}, \cdots , y_{N-1,T}]^T\), stacking \(y_{v,t}\) over \(v \in V\) and \(t \in {\cal T}:=\{1,\dots,T\}\)
- \(\textbf{W}_{T \times T}\) is a weighted adjacency matrix, interpreted as a graph shift operator.
- Suppose the graph \(\cal{G}\) is an undirected cyclic graph carrying periodic time-series data, so the graph adjacency matrix can be taken as \(\begin{bmatrix} 0 & 1 & \cdots & 0 \\ 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \cdots & 1 & 0 \end{bmatrix}\)
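A minimal numpy sketch of the \(T \times T\) shift operator described above, assuming consecutive time points are connected; the periodic closure between the two endpoints is left as a commented-out option, since the matrix written above is a simple chain.

```python
import numpy as np

T = 10                               # illustrative length of the time interval
W = np.zeros((T, T))
i = np.arange(T - 1)
W[i, i + 1] = 1                      # connect time t and t+1
W[i + 1, i] = 1                      # symmetric, i.e. undirected
# For strictly periodic data the two endpoints could also be joined:
# W[0, T - 1] = W[T - 1, 0] = 1
```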
Define Complete Data
- Define \({\cal Y} = ({\cal O} , {\cal M})\)
- The incomplete data \({\cal Y}\)
- \({\cal Y} = \{ y_{v,t}: t \in {\cal T}, v \in V\}\)
- \({\cal Y} = \textbf{Y}_{N \times T}\)
- \({\cal Y} = \textbf{y}\)
- The missing data \({\cal M} = \{ y_{v,t_v}: t_v \in {\cal M}_v, v \in V \}\)
- \({\cal M}_{v_1}\): the set of time points \(t\) at which node \(v_1\) is missing
- The observed data \({\cal O} = \{ y_{v,t_v}: t_v \in {\cal O}_v, v \in V \}\)
- \({\cal O}_{v_1}\): the set of time points \(t\) at which node \(v_1\) is observed
- Conduct linear interpolation and update \({\cal M}\), giving \(\tilde{{\cal Y}} = ( {\cal O} , \tilde{ {\cal M} } )\)
- \({\cal M}\) is imputed by \(\tilde{{\cal M}}\).
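A minimal sketch of this initialization step, assuming the incomplete data is an \(N \times T\) array with NaN marking the missing set \({\cal M}\); the simulated values and the 50% missing rate are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
N, T = 3, 10
Y = rng.normal(size=(N, T))                        # stands in for the complete data
mask = rng.random((N, T)) < 0.5                    # True where y_{v,t} is missing
Y_incomplete = np.where(mask, np.nan, Y)           # calY = (O, M)

# linear interpolation along the time axis, node by node, giving tilde(Y)
Y_tilde = (pd.DataFrame(Y_incomplete.T)            # rows: time, columns: nodes
             .interpolate(method="linear", limit_direction="both")
             .to_numpy().T)
```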
Repeat the following steps: Laplacian, Eigenvalue Decomposition, Graph Fourier Transform, EbayesThresh, Inverse Graph Fourier Transform (a numpy sketch of one pass is given after this list)
- Define the normalized Laplacian matrix \(\textbf{L}\) from the graph shift operator \(\textbf{W}\).
- \(\textbf{L} = \textbf{D}^{-1/2} (\textbf{D} - \textbf{W}) \textbf{D}^{-1/2} = \textbf{I} - \bar{\textbf{W}}\), where \(\bar{\textbf{W}} = \textbf{D}^{-1/2} \textbf{W} \textbf{D}^{-1/2}\)
- Calculate Eigenvalue Decomposition
- \(\textbf{L} = \mathbf{V} \Lambda \mathbf{V}^H\)
- Apply Graph Fourier Transform
- \({{\mathbf V}}^H {\tilde {\textbf Y}} = \bar{\textbf{Y}}\)
- This transforms the signal from the time domain to the graph frequency domain.
- \(p = E([{{\mathbf V}}^H {\tilde {\textbf Y}}]^2) = E(\bar{\textbf{Y}}^2)\)
- the power of \(\bar{\textbf{Y}}\)
- \(\hat{p} := \frac{1}{{\cal R}} \sum^{\cal R}_{r=1} |\bar{y}_r|^2\)
- an estimator of \(p\)
- \(\bar{\textbf{Y}}\) is a set of vectors \(\bar{\textbf{y}}_i \in \mathbb{R}^N\), \(i \in {\cal T}\)
- \({\cal R}\): a finite set of realizations of the process \(\bar{y}\)
- Get \(\hat{\textbf{p}}_{tr}\) by thresholding \(\hat{p}\) with EbayesThresh
- Estimate \({\hat p}_{tr}\) as the median of the posterior distribution of \({\hat p}\)
- Calculate \(\hat{\textbf{Y}}_{tr}\) by the inverse graph Fourier transform
- Get \(\hat{\cal{M}}\) from the values of \(\hat{\textbf{Y}}_{tr}\) at the missing-data indices
- Define \(\hat{{\cal Y}} = ({\cal O}, \hat{{\cal M}})\)
- Learn the STGCN with \(\hat{{\cal Y}}\) and get \(\hat{{\cal Y}}_{stgcn}\)
- STGCN function \(:= f(\cdot)\)
- Get \(\hat{{\cal M}}^{(1)}\) from the values of \(\hat{{\cal Y}}_{stgcn}\) at the missing-data indices
- \(\hat{{\cal M}}\) is re-imputed by \(\hat{{\cal M}}^{(1)}\)
- \(\hat{{\cal Y}}^{(1)} = ({\cal O}, \hat{{\cal M}}^{(1)})\)
- Repeat all of the steps above
- Get \(\hat{{\cal Y}}^{({\cal L})} = ({\cal O}, \hat{{\cal M}}^{({\cal L})})\)
- \({\cal L}\) is the set of iteration indices.
- \({\cal L} :=\{ 2,\cdots , L\}\)
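A minimal numpy sketch of one pass through the steps above (Laplacian, eigendecomposition, GFT, thresholding, inverse GFT, re-imputation of \({\cal M}\)). The simple power threshold `tau` is only a stand-in for EbayesThresh, the STGCN update is omitted, and `W`, `Y_tilde`, `mask` are assumed to come from the sketches above (transposed to \(T \times N\) so the signal matches the \(T \times T\) shift operator).

```python
import numpy as np

def one_pass(Y_tilde, W, mask, tau=0.1):
    """Y_tilde: T x N signal, W: T x T shift operator, mask: T x N missing indicator."""
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    L = np.eye(len(W)) - D_inv_sqrt @ W @ D_inv_sqrt      # normalized Laplacian  L = I - D^{-1/2} W D^{-1/2}

    lam, V = np.linalg.eigh(L)                            # L = V diag(lam) V^H
    Y_bar = V.T @ Y_tilde                                 # graph Fourier transform
    p_hat = np.abs(Y_bar) ** 2                            # estimated power of the GFT coefficients

    keep = p_hat > tau                                    # placeholder for EbayesThresh (posterior-median thresholding)
    Y_bar_tr = Y_bar * keep                               # thresholded coefficients

    Y_hat_tr = V @ Y_bar_tr                               # inverse graph Fourier transform
    return np.where(mask, Y_hat_tr, Y_tilde)              # update only the missing entries

# e.g. Y_next = one_pass(Y_tilde.T, W, mask.T).T          # using the arrays from the earlier sketches
```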
Calculate MSE
Calculate MSE = \(\frac{1}{NT} \sum^N_{v=1} \sum^T_{t=1} (y_{v,t} - \hat{y}_{v,t}^{(L)}) ^2\)
- \(\hat{{\cal Y}}^{(L)} = ({\cal O}, \hat{{\cal M}}^{(L)})\)
- \({\cal Y}_{com} = \{ y_{v,t}: v \in {\cal V}, t \in {\cal T} \}\)
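A small numpy version of the MSE above, assuming `Y_com` is the complete \(N \times T\) ground truth \({\cal Y}_{com}\) and `Y_hat_L` is the final estimate \(\hat{\textbf{Y}}^{(L)}\).

```python
import numpy as np

def mse(Y_com, Y_hat_L):
    # mean squared error over all N * T entries
    N, T = Y_com.shape
    return np.sum((Y_com - Y_hat_L) ** 2) / (N * T)
```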
Algorithm 1
: GFT, EbayesThresh and STGCN
Input \(\tilde{\cal{Y}} = ( {\cal O} , \tilde{ {\cal M} } )\)(\(={\tilde{\textbf Y}}\))
Output \(\hat{{\cal Y}}^{(L)}\)
for \(l \in {\cal L} :=\{ 1,2,\cdots , L\}\) do
Apply GFT \({\mathbf V}^H {\tilde {\textbf Y}} = \bar{\textbf{Y}}\)
\(\hat{p} \leftarrow \frac{1}{{\cal R}} \sum^{\cal R}_{r=1} |\bar{y}_r|^2\)
EbayesThresh \(\breve{\textbf{p}} \leftarrow \hat{\textbf{p}}\)
\({\breve {\textbf Y}} = \sqrt{\breve{\textbf{p}}}\)
Apply Inverse GFT \({\mathbf V} {\breve {\textbf Y}} = \breve{\textbf{Y}}_{tr}\)
\({\cal{\tilde{M}}} \leftarrow \breve{ {\cal M}_{tr} }\)
\(\bar{\cal{Y}} = ( {\cal O} , \breve{ {\cal M}_{tr} } )\)
\(\bar{{\cal Y}}^{(l)}_{gcn} \leftarrow \text{ STGCN } (\bar{\cal{Y}})\)
\(\hat{ {\cal M} } \leftarrow {\bar {\cal M} }^{(l)}_{gcn}\)
end for
Algorithm 2
: Self Consistency
Input \(\hat{g}_{obs}(\cdot) = \tilde{\cal{Y}}\)
Output \(\hat{g}_{com}(\cdot) = {\cal \bar{Y}}^{(l)}\)
\(E(\bar{g}_{com} |{\cal O}; g = \bar{g}_{obs}) = \bar{g}_{obs}\)
\({\cal \bar{Y}}^{(l)} \leftarrow \tilde{\cal{Y}}\)
Algorithm 3
: EbayesThresh
Input \(\hat{\textbf{p}}\)
Output \(\breve{\textbf{p}}\)
\(\hat{\textbf{p}} = \textbf{p}_{pp} + \textbf{p}_{ac}\)
\(\breve{\textbf{p}} \leftarrow \textbf{p}_{pp}\)
Toy Example
Define data
Graph \({\cal G} = ({\cal V}, {\cal E}, \textbf{W})\)
- \(\cal{V}\) is the set of vertices, \({\cal V} = \{ v_0, v_1, v_2 \}\), \(|{\cal V}| = 3\), \(V = \{ 0,1,2\}\)
- \(\cal{E}\) is the set of undirected edges, \({\cal E} =\{ ( 0, 1), ( 0, 2) \}\)
- a graph signal \(\textbf{y}\): \({\cal V}_t \to \mathbb{R}^3\)
- \(T = 10\)
- \({\cal Y} = \{y_{0,1},y_{0,2},\cdots,y_{2,9},y_{2,10} \}\)
- \({\cal Y} = \textbf{y}\)
- \(\textbf{W}_{10 \times 10}\) is a weighted adjacency matrix.
\(\textbf{W}_{10 \times 10} = \begin{bmatrix} 0 & 1 & 0 &0 & 0 & 0 &0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 &0&0 \\ 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 &0&0\\ 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{bmatrix}\)
- Assuming each \(y_{v,t}\) is connected to \(y_{v,t-1}\), \(\textbf{W}\) is defined as above.
- Within each node \(v\), the observations are assumed to be linked as periodic time-series data.
Incomplete data
\({\cal Y} = ( {\cal O}, {\cal M} )\), 3 nodes (0, 1, 2), 10 time points, \(\therefore 3 \times 10 = 30\) values
- The incomplete data \({\cal Y} = \{ y_{0,0},y_{0,1},\cdots, y_{1,0},y_{1,1},\cdots, y_{2,0},y_{2,1},\cdots \}\)
- \({\cal Y} =\) \(\textbf{Y}_{3 \times 10}\)
- The missing data \({\cal M} = \{ y_{0,3}, y_{0,4},\cdots, y_{1,3},y_{1,8},\cdots,y_{2,4},y_{2,7},\cdots \}\): assume that 50% of each node's values are missing at random
- The observed data \({\cal O} = \{ y_{0,0},\cdots,y_{1,0},\cdots,y_{2,0}\cdots \}\): all values that are not missing
- Conduct linear interpolation, giving \(\tilde{{\cal Y}} = ( {\cal O} , \tilde{ {\cal M} } )\)
- \({\cal M}\) is imputed by \(\tilde{{\cal M}}\).
Repeat
To check the matrix shapes, the (rows × columns) dimensions are written as subscripts in this example only.
Graph Laplacian
- \(\textbf{L}_{10 \times 10} = \textbf{D}^{-1/2}_{10 \times 10} (\textbf{D}_{10 \times 10} - \textbf{W}_{10 \times 10}) \textbf{D}^{-1/2}_{10 \times 10} = \textbf{I}_{10 \times 10} - \bar{\textbf{W}}_{10 \times 10}\)
Eigenvalue Decomposition
- \(\textbf{L}_{10 \times 10} = \mathbf{V}_{10 \times 10} \Lambda_{10 \times 10} \mathbf{V}^H_{10 \times 10}\)
Graph Fourier Transform
- \({\mathbf{V}}^H_{10 \times 10}\) \(\tilde{\textbf{Y}}_{10 \times 3}\) \(= \bar{\textbf{Y}}_{10 \times 3}\)
EbayesThresh
- \(\hat{p} = \bar{y}^2\); \(\hat{p}\) and \(\hat{p}_{tr}\) are \(10 \times 3\) matrices
- Repeated 3 times, once per node
Inverse Graph Fourier Transform
\(\textbf{V}_{10 \times 10} \bar{\textbf{Y}}^{'}_{10 \times 3} = \hat{\textbf{Y}}_{tr,10 \times 3}\), where \(\bar{\textbf{Y}}^{'}\) holds the thresholded coefficients
\(\hat{{\cal Y}}_{tr} = \hat{\textbf{Y}}_{tr,10 \times 3}\)
The \(\hat{\cal{Y}}_{tr} = (\hat{\cal{O}},\hat{\cal{M}})\) obtained here corresponds to the previously obtained \(\tilde{{\cal Y}} = ( {\cal O} , \tilde{ {\cal M} } )\)
Only the missing part is imputed: \(\hat{{\cal Y}} = ({\cal O}, \hat{{\cal M}})\)
Train with the Model
Train the TGNN with \(\hat{{\cal Y}}\), obtain \(\hat{{\cal Y}}_{tgnn}\), and from it get \(\hat{{\cal M}}^{(1)}\) (a minimal training sketch is given at the end of this example)
Replace \(\hat{{\cal M}}\) with \(\hat{{\cal M}}^{(1)}\)
- \(\hat{{\cal Y}}^{(1)} = ({\cal O}, \hat{{\cal M}}^{(1)})\)
Assume epoch = 20
Obtain \(\hat{{\cal Y}}^{(20)} = ({\cal O}, \hat{{\cal M}}^{(20)})\)
Compute the error using \(\hat{{\cal Y}}^{(20)} = ({\cal O}, \hat{{\cal M}}^{(20)})\)
MSE = \(\frac{1}{30} \sum^3_{v=1} \sum^{10}_{t=1} (y_{v,t} - \hat{y}_{v,t}^{(20)}) ^2\)
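A minimal sketch of the "train a TGNN on \(\hat{{\cal Y}}\) and re-impute" step for this toy example, using GConvGRU from the model table as the recurrent layer. The one-step-ahead training target, the linear readout, and all hyperparameters are assumptions for illustration, not the exact setup of the method.

```python
import torch
from torch_geometric_temporal.nn.recurrent import GConvGRU

N, T, hidden = 3, 10, 16
edge_index = torch.tensor([[0, 0, 1, 2],
                           [1, 2, 0, 0]])                 # undirected edges (0,1), (0,2)
Y_hat = torch.randn(N, T)                                 # stands in for hat(Y) = (O, hat(M))

recurrent = GConvGRU(in_channels=1, out_channels=hidden, K=2)
readout = torch.nn.Linear(hidden, 1)
optim = torch.optim.Adam(list(recurrent.parameters()) + list(readout.parameters()), lr=1e-2)

for epoch in range(20):                                   # epoch = 20, as assumed above
    h, loss = None, 0.0
    for t in range(T - 1):
        x_t = Y_hat[:, t].unsqueeze(1)                    # signal at time t as the node feature
        h = recurrent(x_t, edge_index, H=h)               # hidden node embeddings
        y_pred = readout(h).squeeze(1)                    # predict the signal at time t+1
        loss = loss + torch.mean((y_pred - Y_hat[:, t + 1]) ** 2)
    optim.zero_grad()
    loss.backward()
    optim.step()
# The fitted values at the missing positions would then replace hat(M),
# giving hat(M)^(1), and the whole procedure is iterated.
```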