1. intro

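- The green part is not bad, but the points below should be strengthened.

As examples of application fields, the draft lists neuroscience, environmental data, and traffic data; find papers that use the datasets we actually analyzed (Chickenpox, …) and cite them as examples. Give such examples even if we do not use those datasets.
Explain why analyzing this kind of data is difficult. That is, briefly describe (1~2 sentences) what problems arise if the data are treated simply as time series or simply as spatial data. Find references. (Use the introduction of torch_geometric_temporal.)

Original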

In recent years, the field of spatiotemporal datasets has emerged, enabling the simultaneous consideration of both the time and space dimensions. The examples include neuroscience(Atluri et al., 2016), environmental data(Thompson et al., 2014), traffic dynamics(Castro et al., 2013), and more. Specifically, traffic dynamics is a prevalent spatiotemporal dataset and is crucial because examining traffic data from both spatial and temporal perspectives can lead to advancements in traffic control. The incorporation of both spatial and temporal aspects enables a comprehensive understanding of complex phenomena, making spatiotemporal datasets invaluable for various applications and enhancing the accuracy of predictive models.

Reference

PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models

At the same time the existing geometric deep learning frameworks operate on graphs which have a fixed topology and it is also assumed that the node features and labels are static. Besides limiting assumptions about the input data, these off-the-shelf libraries are not designed to operate on spatiotemporal data.

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting

Classic statistical and machine learning models are two major representatives of data-driven methods. In time-series analysis, autoregressive integrated moving average (ARIMA) and its variants are one of the most consolidated approaches based on classical statistics [Ahmed and Cook, 1979; Williams and Hoel, 2003]. However, this type of model is limited by the stationary assumption of time sequences and fails to take the spatio-temporal correlation into account. Therefore, these approaches have constrained representability of highly nonlinear traffic flow. Recently, classic statistical models have been vigorously challenged by machine learning methods on traffic prediction tasks.
Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies.

Revised

In recent years, the field of spatiotemporal datasets has emerged, enabling the simultaneous consideration of both the time and space dimensions. Examples include health data (Rozemberczki et al., 2021b), customer data (Rozemberczki et al., 2021a), energy data (Rozemberczki et al., 2021a), neuroscience (Atluri et al., 2016), environmental data (Thompson et al., 2014), traffic dynamics (Castro et al., 2013), and more. Specifically, traffic dynamics is a prevalent spatiotemporal dataset and is crucial because examining traffic data from both spatial and temporal perspectives can lead to advancements in traffic control. Classic time-series statistical methods for analyzing this kind of data already exist, but they are limited by certain conditions, such as assumptions about the data. Specifically, these classic methods cannot account for spatiotemporal correlations and are not designed to work with spatiotemporal data (Yu et al., 2017; Rozemberczki et al., 2021a). As a result, when we analyze spatiotemporal data with appropriate geometric deep learning frameworks that make full use of this information, we can improve accuracy.

- Red part

The intent is good, but "sparse data" is not the right term. Describe it as missing data, irregularly observed data, or similar.
Explain why such data arise. (This part needs references.) Explain why such data are hard to handle.[1]
Our idea is “non-homogeneous graph -> make it homogeneous”, and this approach is not unusual. Yu et al. (2017), Guo et al. (2019), Bai et al. (2020), Li et al. (2019), and Zhao et al. (2019) did work similar to ours.

- Original

However, when dealing with spatiotemporal datasets, sparse data is a common occurrence, which is unpredictable. For example, the sensor data from machines representing a spatiotemporal dataset may contain missing values due to unexpected events like sensor malfunction or temporal factors such as distance or time delay. It is a simple way to use interpolation methods like linear, nearest, etc. However, these methods can occasionally be imprecise in producing estimates. Moreover, in a method of learning spatiotemporal data Yu et al. (2017) and Guo et al. (2019) try to learn data after making it to be complete, i.e., allocate to other values from missing data with linear interpolation. Graph Convolution Network(GCN) is also a needed interpolation method before learning. Furthermore, Bai et al. (2020), Li et al. (2019), Zhao et al. (2019) tried to fill missing values by linear interpolation.

- Reference

Traffic Speed Prediction with Missing Data based on TGCN

In addition, there usually contains missing values in the collected data of traffic sensors due to the electronics unit failure. As is shown in Fig.1, There exist a lot of missing values during 22:00-24:00. This can decrease the prediction accuracy of aforementioned prediction models.
For the proposed model, if the input time series contains missing values, the model will produce failure because of the missing values can not be computed during the training process.

Missing Data: Our View of the State of the Art

Why do missing data create such difficulty in scientific research? Because most data analysis procedures were not designed for them. Missingness is usually a nuisance, not the main focus of inquiry, but handling it in a principled manner raises conceptual difficulties and computational challenges

LSTM-based traffic flow prediction with missing data

Nevertheless, due to missing data, irregular sampling, and varying length, the data remain difficult to explore with high efficiency. In a traffic environment, this problem becomes even worse because the traffic sensors are often controlled manually.

Graph neural networks: A review of methods and applications

Homogeneous/Heterogeneous Graphs. Nodes and edges in homogeneous graphs have same types, while nodes and edges have different types in heterogeneous graphs. Types for nodes and edges play important roles in heterogeneous graphs and should be further considered.

T-GCN: A Temporal Graph Convolutional Network for Traffic Prediction

Since the Los-loop dataset contained some missing data, we used the linear interpolation method to fill missing values.

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting

The linear interpolation method is used to fill missing values after data cleaning. In addition, data input are normalized by Z-Score method.

Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting

Data Preprocess: The missing values in the datasets are filled by linear interpolation. Then, both datasets are aggregated into 5-minute windows, resulting in 288 data points per day.

- Revised

Dealing with spatiotemporal datasets often presents a common challenge: the frequent occurrence of irregularly observed data. For instance, as highlighted by Ge et al. (2019), traffic sensor data commonly suffer from missing observations due to electronic unit failures, which can significantly impact prediction accuracy. The difficulty in handling irregular data arises for several reasons. First, many traditional data analysis procedures were designed for datasets with complete observations (Schafer & Graham, 2002). Second, learning from time-series datasets that contain missing data is problematic because the model may fail to capture certain time points (Ge et al., 2019; Tian et al., 2018). This is why it is important to transform incomplete data into complete data before conducting any learning or analysis.

Before stating the purpose of this paper, we need to define a graph signal. Graphs are well known as generic data representation forms for describing the geometric structure of data domains (Shuman et al., 2013). In this paper, we therefore interpret the data as G_t = (V_t, E_t), where V_t denotes the vertices and E_t the edges. A finite collection of samples on a specific G_t is called a graph signal (Shuman et al., 2013). The purpose of this paper is to construct complete data when the data are irregularly observed. To this end, we use interpolation to treat the data as a homogeneous graph rather than a heterogeneous graph. Homogeneous graphs have the same types of nodes and edges, while heterogeneous graphs have different types (Zhou et al., 2020). Among methods for learning spatiotemporal data, Bai et al. (2020), Zhao et al. (2019), Yu et al. (2017), and Guo et al. (2019) learn from the data after completing them, i.e., after replacing missing values by linear interpolation. In our proposed method, it is crucial to correctly estimate the underlying function when training on a spatiotemporal dataset, because this function defines the expected pattern of the data, and that pattern in turn affects how the trend of the dataset is read. However, the function can be hard to estimate when a large percentage of the data is missing.
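The linear-interpolation preprocessing that the cited works describe can be sketched as follows. This is a minimal illustration, not the code of any cited paper: the function names `linear_interpolate` and `complete_signal`, the use of `None` to mark a missing observation, and the rule of copying the nearest observed value into leading and trailing gaps are all assumptions made for the example.

```python
from typing import List, Optional


def linear_interpolate(series: List[Optional[float]]) -> List[float]:
    """Fill None gaps in one node's time series by linear interpolation.

    Assumption: gaps before the first or after the last observation
    are filled with the nearest observed value.
    """
    observed = [i for i, v in enumerate(series) if v is not None]
    if not observed:
        raise ValueError("series has no observed values")

    filled = list(series)
    first, last = observed[0], observed[-1]

    # Leading and trailing gaps: copy the nearest observed value.
    for i in range(first):
        filled[i] = series[first]
    for i in range(last + 1, len(series)):
        filled[i] = series[last]

    # Interior gaps: straight line between the two neighboring observations.
    for left, right in zip(observed, observed[1:]):
        step = (series[right] - series[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = series[left] + step * (i - left)
    return filled


def complete_signal(signal: List[List[Optional[float]]]) -> List[List[float]]:
    """Apply the interpolation to every node (row) of a node-by-time signal."""
    return [linear_interpolate(node_series) for node_series in signal]


# Hypothetical toy signal: 2 nodes observed at 4 time points.
signal = [
    [1.0, None, 3.0, None],
    [None, 2.0, None, 6.0],
]
completed = complete_signal(signal)
print(completed)
```

After this step every node has a value at every time point, which is the "complete data" the paragraph above refers to; how accurate the filled values are depends on how much of the series is missing.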

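- The equation below is wrong. This is not a regression model. Try re-expressing the model using the GNAR notation.

This part must be very clear.
Most of the notation we use should be summarized.
If putting it in the intro feels too heavy, it can be left out.
Also take into account the notation that will be used later for the self consistence estimator.

- Delete the red part and rewrite it (or study it first).

- The green part is not bad.

Original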

After interpolation to learn dataset, we can write a model as \[y_i = f(x_i) + ε_i,\] f(x_i) represents the underlying function, and ε_i is thought to follow a normal distribution. In this paper, we try to train y_i as eliminate sparse strong signal of ε_i to get lower mean square error between test data and predicted data. In other words, we study to remain ε_i without points which can consider heavier tails. In our proposed method, it is crucial to rightly estimate the underlying function f when training spatiotemporal dataset because the functions define the expected pattern of the data. And that pattern would affect to read the trend of datasets. However it can be hard to estimate f when it has many percentage of missing data.

Revised

Deleted.

2. Related works

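- An explanation of why 2.1 and 2.2 are reviewed is needed.

- 2.1 needs to explain why it focuses on the Convolution Operator.

- 2.2 needs to explain why it focuses on Dynamic graphs.

- Overall, the section is titled related works, but it is not clear what is related to our work such that these methods are introduced. (Honestly, I am not sure how those methods relate to ours either.)

- Original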

RELATED WORK

2.1 PROPAGATION MODULE

To start building the model with a simple graph structure, we can use computational modules: the propagation module, the sampling module, and the pooling module. In particular, the propagation module is a commonly used computational module. It utilizes convolution and recurrent operators to aggregate information about neighbors. The skip operation is a rule of gathering information from past representations and mitigating the over-smoothing problem. It can be divided into two types: convolutional and recurrent operator (Zhou et al., 2020), and we focus on the convolutional operator.

Under review as a conference paper at ICLR 2024

Convolutional Operator The convolutional operator can be considered a combination of spectral and spatial methods. First, there are a few classic models, which are spectral approaches: Spectral Network, ChebNet, and Adaptive Graph Convolution Network (AGCN). The spectral network, proposed by Bruna et al. (2013), defines convolution in the Fourier domain through the eigendecomposition of the graph Laplacian. ChebNet, suggested by Defferrard et al. (2016), employed the K-localized convolution to construct a convolutional neural network that could avoid calculating eigenvectors of the Laplacian. AGCN (Li et al., 2018) follows the relationship of the spatial aspect and, at the same time, uses the residual graph Laplacian, which Li et al. (2018) called an adaptive graph. Next are the spatial approaches. Neural Fingerprints (Neural FPs) are introduced by Duvenaud et al. (2015). They utilize different weight matrices for nodes with different degrees, but this approach may not scale to large-scale data. Patchy-san is a model proposed by Niepert et al. (2016). In the first step of this model, k neighbors are selected for each node. After normalization, this neighborhood of k nodes serves as the receptive field. The Diffusion-Convolutional Neural Networks (DCNNs) of Atwood & Towsley (2016) also consider the neighborhood of nodes and can be used in classification by changing edges and adjacency matrix. DCNN uses transition matrices to define the neighborhood of nodes. The Dual Graph Convolutional Networks (DGCN) proposed by Zhuang & Ma (2018) consider local and global consistency. Gao et al. (2018) proposed the Learnable Graph Convolutional Networks (LGCN), which is based on the Learnable Graph Convolutional Layer, and the layer transforms the graph into a 1-D format, taking into account the number of nodes for definition.

2.2 GRAPH TYPE AND SCALE

It is important to consider that graphs are not only of a simple type. We therefore have to deal with various graph types for complex real-world data. Graphs can be classified as directed/undirected, homogeneous/heterogeneous, and static/dynamic (Zhou et al., 2020). A graph is called directed when its edges have a direction, and undirected otherwise. A directed graph carries more information than an undirected one. The homogeneous graph has the same types of nodes and edges; however, the heterogeneous graph has different types. That means that information on nodes and edges is important when we analyze the heterogeneous graph. We can call a graph dynamic if the input features or graph topology change over time. It is reasonable that time points should be considered more carefully there than in a static graph. Zhou et al. (2020) also propose a classification of graphs based on their scale and type, which includes directed, heterogeneous, dynamic, hypergraph, signed, and large graphs.

Dynamic graphs Among them, we focus on the dynamic graph. Spatial and temporal information is collected in DCRNN (Diffusion Convolutional Recurrent Neural Network) (Li et al., 2017) and STGCN (Spatio-Temporal Graph Convolutional Networks) (Yu et al., 2017). In detail, DCRNN obtains the spatial information with a GNN and then passes the output to a sequence-to-sequence or sequence model such as an RNN to capture temporal dependency, while STGCN stacks multiple spatio-temporal convolutional blocks, each consisting of one spatial graph convolutional layer and two temporal gated convolutional layers. On the other hand, Structural-RNN (Jain et al., 2016) and ST-GCN (Yan et al., 2018) simultaneously capture spatial and temporal messages. To enable the application of traditional GNNs on the extended graphs, both Structural-RNN and ST-GCN expand the static graph structure by incorporating temporal connections. Structural-RNN adds edges between consecutive time steps, representing nodes and edges with nodeRNNs and edgeRNNs in a bipartite graph. ST-GCN involves constructing spatiotemporal graphs by stacking graph frames from each time step. However, Pareja et al. (2020) argue that using node features in learning can impact the model’s performance and propose EvolveGCN, a method designed for dynamic graphs.

- Reference

The snapshots are not homogeneous, but the missing parts are filled in so that the data are interpreted and analyzed as a homogeneous graph.

Enter the authors mentioned above.

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting(Bing Yu, Haoteng Yin, Zhanxing Zhu)

The linear interpolation method is used to fill missing values after data cleaning. In addition, data input are normalized by Z-Score method.

Graph Markov network for traffic forecasting with missing data

We denote the completed state by, in which all missing values are filled based on historical data

Methodology proposed under the assumption that the data are fully observed, so that the data are viewed as a homogeneous graph.

Scalable Spatiotemporal Graph Neural Networks(Andrea Cini, Ivan Marisca, Filippo Maria Bianchi, Cesare Alippi)

The first dataset contains data coming from the Irish Commission for Energy Regulation Smart Metering Project (CER-E; Commission for Energy Regulation 2016), which has been previously used for benchmarking spatiotemporal imputation methods (Cini, Marisca, and Alippi 2022); however, differently from previous works, we consider the full sensor network consisting of 6435 smart meters measuring energy consumption every 30 minutes at both residential and commercial/industrial premises.

Methodology built on the assumption that the input data form a heterogeneous graph from the outset.

Learning Deep Representation from Big and Heterogeneous Data for Traffic Accident Inference(Quanjun Chen, Xuan Song, Harutoshi Yamada, Ryosuke Shibasaki)

By mining big and heterogeneous data, we aim to understand and develop a general model to estimate traffic accident risk. With the input of real-time GPS data, our model can simulate traffic accident risk on a large scale.
We extract hierarchical feature representation of meshed human mobility data from Stack denoise Autoencoder (SdAE), for a more efficient and precise prediction of risk levels in supervised learning.

ISTD-GCN: Iterative Spatial-Temporal Diffusion Graph Convolutional Network for Traffic Speed Forecasting(Yi Xie, Yun Xiong, Yangyong Zhu)

Therefore, we can model such heterogeneous spatial-temporal structures as a homogeneous process of diffusion

- Revised

As we mentioned, irregular spatiotemporal data is often encountered in the real world. It is well known that neural networks are better suited for regular data. Therefore, many attempts have been made to transform data with different structures into the same structure through snapshots. If we interpret this as a graph, we can distinguish homogeneous graphs from heterogeneous graphs. Homogeneous graphs have the same types of nodes and edges, while heterogeneous graphs do not (Zhou et al., 2020). There are numerous methods for addressing this issue. For instance, to fill missing values, Bai et al. (2020), Yu et al. (2017), and Guo et al. (2019) employ linear interpolation, while Cui et al. (2020) utilize historical data. All of them tried to convert a heterogeneous graph into a homogeneous graph. However, if we create regular data using interpolation methods, the result may have low accuracy. Additionally, Cini et al. (2023) assume that the input data is originally complete, which is equivalent to interpreting the data as a homogeneous graph from the beginning. Furthermore, Chen et al. (2016) and Xie et al. (2020) proposed a general model that treats input data as a heterogeneous graph, assuming a lack of supported sensing data. Handling data with a heterogeneous structure in each snapshot might be efficient. However, real data often form a homogeneous graph that missing values turn into a heterogeneous one, which means that the underlying structure does not actually differ across snapshots.
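The distinction drawn above, where snapshots differ only because of missing values and become structurally identical once those values are filled, can be illustrated with a small sketch. Everything here is hypothetical: the snapshot layout (a dict from node name to value, with `None` for a missing observation), the helper names, and the fill rule. The fill rule is simple last-observation-carried-forward with a backward pass for leading gaps; it stands in for whichever imputation method is actually used and is not the method proposed in the paper.

```python
from typing import Dict, List, Optional, Set

Snapshot = Dict[str, Optional[float]]


def observed_nodes(snapshot: Snapshot) -> Set[str]:
    """Nodes that actually carry a value in this snapshot."""
    return {node for node, value in snapshot.items() if value is not None}


def share_node_set(snapshots: List[Snapshot]) -> bool:
    """True if every snapshot is observed on exactly the same nodes."""
    node_sets = [observed_nodes(s) for s in snapshots]
    return all(ns == node_sets[0] for ns in node_sets)


def carry_forward(snapshots: List[Snapshot]) -> List[Snapshot]:
    """Fill each node's gaps with its last observed value.

    Leading gaps (no earlier observation) take the next observed value.
    """
    filled = [dict(s) for s in snapshots]
    nodes = sorted(set().union(*(s.keys() for s in snapshots)))
    for node in nodes:
        last = None
        for snap in filled:  # forward pass
            if snap.get(node) is not None:
                last = snap[node]
            elif last is not None:
                snap[node] = last
        nxt = None
        for snap in reversed(filled):  # backward pass for leading gaps
            if snap.get(node) is not None:
                nxt = snap[node]
            elif nxt is not None:
                snap[node] = nxt
    return filled


# Hypothetical data: three snapshots over the same three sensors.
snapshots = [
    {"a": 1.0, "b": None, "c": 3.0},
    {"a": None, "b": 2.0, "c": 3.5},
    {"a": 1.5, "b": 2.5, "c": None},
]
before = share_node_set(snapshots)
filled = carry_forward(snapshots)
after = share_node_set(filled)
print(before, after)
```

In this toy case the observed node set differs from snapshot to snapshot only because of the gaps, so filling them restores a common structure across snapshots.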

3. Backgrounds

- Good.

- I can fix the minor issues myself.

4.

- The content seems to need more polished writing.

- The part below needs to be organized into an algorithm.

- Original

5. Experiments

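- Not fully read yet.

- It is better to put the data description in the Appendix and the experimental results and figures in the main text.

[1] Models usually work well when everything is observed without missing values, and most spatio-temporal data analyses assume that every snapshot has the same graph structure. There are also models that assume the graph structure differs from snapshot to snapshot; examples of such models include A, B, C, …. Such approaches can be efficient when the data are genuinely non-homogeneous across snapshots, but they may be inefficient when the true model is believed to have the same graph structure in every snapshot and homogeneity is broken across snapshots only because of missing values. We focus on this case. Our idea is that, rather than processing the non-homogeneous graph as it is using A, B, C, etc., we handle the missing values to force the graph to be homogeneous and then analyze the resulting data.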