VIETNAM NATIONAL UNIVERSITY
UNIVERSITY OF SCIENCE
----------
PHẠM VŨ ĐÔNG
APPLICATION OF REMOTE SENSING, GIS AND
DEEP LEARNING IN OPEN DATA CUBE
PLATFORM FOR LANDSLIDE SUSCEPTIBILITY
MAPPING FOR LARGE MOUNTAINOUS REGIONS
IN VIETNAM
(Ứng dụng viễn thám, GIS và học sâu trên dữ liệu mở để
thành lập bản đồ trượt lở đất đối với vùng đồi núi rộng lớn ở
Việt Nam)
Field: Cartography, Remote Sensing and Geographic
Information System
Code: 8440211.01
MASTER THESIS
HANOI – 2020
Scientific supervisor: Assoc. Prof. Dr. Bùi Quang Thành
CONFIRMATION THAT THE STUDENT HAS REVISED THE THESIS ACCORDING TO THE
COMMITTEE'S COMMENTS
Supervisor: Assoc. Prof. Dr. Bùi Quang Thành
Chair of the Master of Science thesis examination committee: Assoc. Prof. Dr. Đinh Thị Bảo Hoa
Hanoi, 2020
STATUTORY DECLARATION
I herewith formally declare that I myself have written the submitted Master’s Thesis
independently. I did not use any outside support except for the quoted literature and
other sources mentioned at the end of this paper.
Author
Pham Vu Dong
ACKNOWLEDGEMENTS
This work was supported by the Domestic Master/PhD Scholarship Programme of
Vingroup Innovation Foundation.
This work was supported by the Asia Research Center, Vietnam National
University - Hanoi and the Korea Foundation for Advanced Studies under Grant
CA.19.8A.
I would also like to express my thanks to my thesis supervisor, Associate
Professor Dr. Bui Quang Thanh, for creating all the conditions for this work and
for wholeheartedly guiding and helping me to complete this thesis. His deep
understanding of science, as well as his experience, was the prerequisite for me to
gain valuable achievements and experience.
Author
Pham Vu Dong
CONTENTS
STATUTORY DECLARATION
ACKNOWLEDGEMENTS
LIST OF TABLES
INTRODUCTION
1. The need of research
2. Research objectives
3. Proposing research procedure
4. Approaches and methods
CHAPTER I. OVERVIEW OF METHODS USING DEEP LEARNING TO
ANALYZE LANDSLIDE WITH REMOTE SENSING DATA
1.1. Landslide research and data overview
1.1.1. Landslide controlling factors
1.1.2. Geospatial data sets and digital elevation model data
1.2. Landslide susceptibility research in the world
1.3. Landslide susceptibility research in Vietnam
1.4. Landslide analysis with remote sensing data and deep learning
1.5. Study area
1.5.1. Geographic location
1.5.2. Terrain characteristics
1.5.3. Geology and terrain
1.5.4. Provinces in the Northwest region
1.5.5. Residential
1.6. Landslide susceptibility model
1.6.2. Landslide susceptibility data processing pipeline
1.6.3. Landsat 8 surface reflectance data
CHAPTER II. MULTISPECTRAL REMOTE SENSING IMAGERY AND DEEP
LEARNING METHOD FOR IMAGE ANALYSIS
2.1. Data collection
2.1.1. Landsat 5
2.1.2. Landsat 7
2.1.3. Landsat 8
2.2. Deep learning method
2.2.1. Supervised machine learning algorithm
2.2.2. Convolutional Neural Network Architecture
2.2.3. Learning in Convolutional Neural Network
2.2.4. Backward run
2.2.5. Parameter updates
CHAPTER III. INTEGRATING REMOTE SENSING, GIS AND DEEP
LEARNING IN OPEN DATACUBE PLATFORM FOR LANDSLIDE
SUSCEPTIBILITY MAPPING
3.1. Building Deep Convolutional Neural Network for land cover classification
3.1.1. Landsat imagery normalization
3.1.2. Data processing
3.1.3. Deep learning architecture
3.1.4. Training and testing phase
3.2. Landslide susceptibility mapping
3.2.1. Open data cube data ingestion
3.2.2. Land cover classification
3.3. Results analysis
3.4. Field surveying and evaluating
CHAPTER IV. CONCLUSION
REFERENCES
LIST OF TABLES
Table 1.1. Landslide factors according to [10]
Table 1.2. Provinces in the Northwest region
Table 2.1. Landsat 5 (TM) reflectance band’s features
Table 2.2. Landsat 7 (ETM+) reflectance band’s features
Table 2.3. Landsat 8 (OLI and TIRS) reflectance band’s features
Table 2.4. Activation functions
Table 2.5. Different types of loss functions
Table 3.1. Different wave-length (µm) concerning reflectance bands among
different Landsat types
Table 3.2. Land cover classes by pixel values
Table 3.3. Scene accuracy
Table 3.4. Landslide susceptible spots by provinces
Table 3.5. Landslide levels proportion by provinces
LIST OF FIGURES
Figure 1.1. Geographic location of north-west mountainous region in Vietnam
Figure 1.2. Landslide susceptibility areas detection model
Figure 1.3. Landsat surface reflectance path-row specification
Figure 2.1. Artificial Neural Network (Source: VIASAT)
Figure 2.2. A CNN sequence to classify handwritten digits (source:
Towardsdatascience.com)
Figure 2.3. Convoluting a 5x5x1 image with a 3x3x1 kernel to get a 3x3x1
convolved feature
Figure 2.4. Forward run in fully connected layer
Figure 2.5. Forward run in a neuron at layer
Figure 3.1. Normalization across reflectance bands
Figure 3.2. Landsat 5 Surface Reflectance training data with data for training (pink)
and testing (green)
Figure 3.3. Skip connection in Resnet
Figure 3.4. Proposed deep learning architecture
Figure 3.5. Loss function graph
Figure 3.6. Accuracy graph
Figure 3.7. Visualization of land cover classification on 2 test images path-row:
127-051, 128-048
Figure 3.8. Visualization of land cover classification on 2 test images path-row:
129-050, 130-049
Figure 3.9. Data cube ingesting process
Figure 3.10. Data in study area in ODC environment
Figure 3.11. Data processing in ODC
Figure 3.12. Adding padding pixel value of 0
Figure 3.13. Land cover map of northwest region in Vietnam 2017
Figure 3.14. Land cover map of northwest region in Vietnam 2019
Figure 3.15. Slope map of northwest region in Vietnam
Figure 3.16. Landslide susceptibility map for northwest region in Vietnam
Figure 3.18. Susceptible landslide spots distribution by provinces
Figure 3.19. Field surveying at Tan Uyen District (Lai Chau, Vietnam)
Figure 3.20. Field surveying at Sin Ho (Lai Chau, Vietnam)
Figure 3.21. Field surveying at Muong Te (Lai Chau, Vietnam)
Figure 3.22. Highly susceptible landslide spots in Lao Cai (Vietnam)
Figure 3.23. Highly susceptible landslide spots in Hoa Binh (Vietnam)
LIST OF SYMBOLS
DCNN    Deep Convolutional Neural Network
ODC     Open Data Cube
RS      Remote Sensing
UAV     Unmanned Aerial Vehicle
LS      Landslide Susceptibility
DEM     Digital Elevation Model
NASA    National Aeronautics and Space Administration
SRTM    Shuttle Radar Topography Mission
GIS     Geographic Information System
ANN     Artificial Neural Network
SVM     Support Vector Machine
AHP     Analytical Hierarchy Process
INTRODUCTION
1. The need of research
Landslides are among the most common hazards in the world and in Vietnam. Over
75% of Vietnam's total area is mountainous terrain with steep slopes. Because of
mismanagement in economic and social planning, natural hazards such as
landslides and floods occur frequently. In recent years, these disasters have become
more frequent and have caused serious damage, typically in provinces such as
Son La, Lai Chau, Dien Bien, Yen Bai, Lao Cai, Ha Giang, Cao Bang, Thanh Hoa and
Nghe An. Worldwide, research on geological disasters began early, and a number of
advanced technologies have been applied to monitoring landslides. In Vietnam, this
kind of study has attracted researchers' attention only for about the last 15 years,
as natural hazards have occurred more frequently. However, landslide studies in
Vietnam have normally been carried out over small areas and with qualitative
forecasts. There is therefore a need for research that can be applied to large-area
analysis and that effectively supports the government in planning for, warning of
and handling natural hazards in the context of global climate change.
One popular application is to establish landslide susceptibility maps for the
research area. This technique is effective for monitoring landslide processes that
occurred in the past and for predicting vulnerable areas that are prone to
landslides in the future. Landslides can be detected in satellite images by analyzing
the features and shapes of objects; however, traditional image analysis requires
expert knowledge and field surveys, which can be extremely time consuming. In
recent years, machine learning techniques have been widely applied to many
automation tasks. In particular, deep learning algorithms such as the Deep
Convolutional Neural Network (DCNN) have gained huge popularity because of their
strong performance in computer vision tasks in general and image analysis in
particular.
With the rapid development of remote sensing technology, multi-temporal satellite
imagery has become a significant data source for machine learning algorithms that
identify landslide spots in satellite images. Besides image analysis, data collection
for a large region is a challenging task. The Open Data Cube (ODC) platform was
developed to solve this problem. The platform is an ecosystem for accessing,
managing and analyzing big remote sensing data. ODC allows researchers to access,
manage and analyze pre-processed remote sensing data without having to handle the
storage of the original data. This is pioneering research in Vietnam that integrates
a deep learning algorithm with the ODC platform to analyze big data automatically
and enhance the potential of landslide monitoring.
In recent years, machine learning techniques have been widely used to detect
specific objects in satellite images. These techniques can be applied to land cover
classification [1] [2] and forest fire monitoring [3]. They are also applied in
landslide research to detect landslide objects quickly and accurately. In studies of
Hoa Binh province, [4] [5] [6] experimented with probabilistic and Bayesian
statistical models, decision trees and machine learning models to estimate landslide
probability with high accuracy. As can be seen, machine learning techniques have
shown strong performance in remote sensing analysis.
Besides image processing, data collection for large-area analysis is a challenging
task. The Open Data Cube (ODC) is an open source solution for accessing,
managing, and analyzing large quantities of Geographic Information System (GIS)
data, namely Earth observation (EO) data. It presents a common analytical
framework composed of a series of data structures and tools which facilitate the
organization and analysis of large gridded data collections.
2. Research objectives
The research integrates remote sensing, GIS and deep learning algorithms with the
Open Data Cube platform to establish landslide susceptibility maps for large
mountainous regions in Vietnam. This objective can be divided into the following
specific objectives:
- Analyzing and evaluating the features and properties of landslides in the
research area
- Collecting and formatting training data to support supervised machine
learning algorithms
- Integrating the deep learning model with the ODC platform to establish landslide
maps
3. Proposing research procedure
- Field surveying in the research area
- Collecting data in three categories: natural conditions, economy and society
- Collecting landslide data in the research area
- Collecting and pre-processing remote sensing data
- Researching supervised machine learning and building a deep learning
architecture
- Integrating the deep learning model with the ODC platform
- Optimizing the deep learning model
- Evaluating model performance
4. Approaches and methods
Approaches:
- System approach: landslides are considered as a natural whole, a phenomenon
influenced by a set of natural factors.
- Spatial approach: allows the integration of deep learning, remote sensing and GIS
data in analyzing and modeling landslide susceptibility spots in remote sensing data.
Research methods:
- Remote sensing method:
  - Collecting remote sensing imagery for the study: multispectral data
    and digital elevation model data
  - Image calibration and band combination
  - Mosaicking images and extracting rasters by study region
  - Image reclassification for land cover
- Cartography method:
  - Projecting map tiles
  - Map design and symbolization for the landslide susceptibility map
- Statistical methods:
  - Imagery data statistics
  - Land cover change statistics
- Modeling method:
  - Integrating the deep learning algorithm to build an end-to-end model
  - Land cover change statistics
- Verification method of field survey:
  - Evaluating susceptible landslide spots in the field
  - Estimating the accuracy
- Methods of spatial analysis using GIS:
  - GIS database for data management
  - Mapping results to the study regions
- Professional solution
CHAPTER I. OVERVIEW OF METHODS USING DEEP LEARNING TO
ANALYZE LANDSLIDE WITH REMOTE SENSING DATA
1.1. Landslide research and data overview
Landslides are considered one of the most serious types of natural disaster around
the world. Triggered landslides frequently threaten the safety of local residents and
destroy property [7]. Landslide occurrence depends on complex interactions
among a large number of partially interrelated factors. According to [8], these
factors can be grouped into two categories: (1) preparatory variables, including
slope, soil properties, elevation, aspect, land cover, lithology, etc.; and (2)
triggering variables, such as heavy rainfall and glacier outbursts. A field survey is
conventionally the most exact method to assess landslide susceptibility (LS).
However, analyzing the landslide potential of a large area in this way is very
difficult and expensive in terms of time and money. This is especially true in
developing countries, where ground observation networks are prohibitively
expensive, and in mountainous areas, where access is difficult. In many countries,
remote sensing information may be the only source available for such studies.
Currently available satellite data may provide useful and accurate information on
the earth surface features and dynamic processes involved in landslide occurrence.
1.1.1. Landslide controlling factors
Landslide occurrence depends on complex interactions among a large number of
factors such as geologic setting, geomorphic features, soil properties, land cover
characteristics, and hydrological and human impacts. According to [8], these factors
can also be broken down into two interacting categories: static and dynamic factors.
Factors that trigger mass movements, mainly rainfall and earthquakes, are called
dynamic factors. Basic surface characteristics that are related to sliding are
called static factors or primary factors [9]. Static factors are the determinants of
landslide susceptibility and can be derived from surface characteristics.
According to [10], in every slope there are forces which tend to promote downslope
movement and opposing forces which tend to resist movement. A general definition
of the factor of safety of a slope results from comparing the downslope shear stress
with the shear strength of the soil along an assumed or known rupture surface. The
great variety of slope movements reflects the diversity of conditions that cause the
slope to become unstable and of processes that trigger the movement. It is more
appropriate to discuss causal factors (including both “conditions” and “processes”)
than “causes” per se. Thus ground conditions (weak strength, sensitive fabric,
degree of weathering and fracturing) are influential criteria but are not causes. They
are part of the conditions necessary for an unstable slope to develop. It does not
matter if the ground is weak as such: failure will only occur if there is also an
effective causal process at work. A particular causal factor may perform either or
both of two functions: 1) Preparatory causal factors make the slope susceptible to
movement without actually initiating it, thereby tending to place the slope in a
marginally stable state. 2) Triggering causal factors initiate movement; they shift
the slope from a marginally stable to an actively unstable state. Although it may be
possible to identify a single triggering process, an explanation of the ultimate
causes of a landslide invariably involves a number of preparatory conditions and
processes. Based on their temporal variability, the destabilizing processes may be
grouped into slow-changing processes (e.g. weathering, erosion) and fast-changing
processes (e.g. earthquake, drawdown).
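As a compact restatement of the factor-of-safety definition above, the comparison of
shear strength with shear stress can be written as a ratio. The symbols F, \tau_f and
\tau are notation assumed here for illustration and are not taken from [10].

```latex
F = \frac{\tau_f}{\tau}
```

Here \tau_f denotes the shear strength of the soil along the assumed or known rupture
surface and \tau the downslope shear stress acting on it. In this notation, a slope
with F well above 1 is stable, preparatory causal factors move F toward 1 (the
marginally stable state), and a triggering causal factor pushes F below 1, at which
point movement is initiated.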
Table 1.1. Landslide factors according to [10]
1. GROUND CONDITIONS
(1) Plastic weak material
(2) Sensitive material
(3) Collapsible material
(4) Weathered material
(5) Sheared material
(6) Jointed or fissured material
(7) Adversely oriented mass discontinuities (including bedding, schistosity,
cleavage)
(8) Adversely oriented structural discontinuities (including faults, unconformities,
flexural shears, sedimentary contacts)
(9) Contrast in permeability and its effects on ground water
(10) Contrast in stiffness (stiff, dense material over plastic material)
2. GEOMORPHOLOGICAL PROCESSES
(1) Tectonic uplift
(2) Volcanic uplift
(3) Glacial rebound
(4) Fluvial erosion of the slope toe
(5) Wave erosion of the slope toe
(6) Glacial erosion of the slope toe
(7) Erosion of the lateral margins
(8) Subterranean erosion (solution, piping)
(9) Deposition loading of the slope or its crest
(10) Vegetation removal (by erosion, forest fire, drought)
3. PHYSICAL PROCESSES
(1) Intense, short period rainfall
(2) Rapid melt of deep snow
(3) Prolonged high precipitation
(4) Rapid drawdown following floods, high tides or breaching of natural dams
(5) Earthquake
(6) Volcanic eruption
(7) Breaching of crater lakes
(8) Thawing of permafrost
(9) Freeze and thaw weathering
(10) Shrink and swell weathering of expansive soils
4. MAN-MADE PROCESSES
(1) Excavation of the slope or its toe
(2) Loading of the slope or its crest
(3) Drawdown (of reservoirs)
(4) Irrigation
(5) Defective maintenance of drainage systems
(6) Water leakage from services (water supplies, sewers, stormwater drains)
(7) Vegetation removal (deforestation)
(8) Mining and quarrying (open pits or underground galleries)
(9) Creation of dumps of very loose waste
(10) Artificial vibration (including traffic, pile driving, heavy machinery)
1.1.2. Geospatial data sets and digital elevation model data
Remote sensing products can be used to derive various parameters related to
landslide controlling factors. Several geospatial data sets were used in this study.
The basic digital elevation model (DEM) used in this study is the National
Aeronautics and Space Administration (NASA) Shuttle Radar Topography
Mission (SRTM) dataset. The SRTM data are a major breakthrough in digital
mapping of the world (with 30 m horizontal spatial resolution and vertical error of
less than 16 m) and provide a major advance in the availability of high-quality
elevation data for large portions of the tropics and other areas of the developing
world. SRTM data are distributed at two levels: SRTM1 (for the U.S. and its
territories and possessions), sampled at one arc-second intervals in latitude
and longitude, and SRTM3, sampled at three arc-seconds. At the equator, this
corresponds to a horizontal resolution of about 30 meters for SRTM1 and about
90 meters for SRTM3.
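The relation between arc-second sampling and ground distance can be checked with a
few lines of arithmetic. The sketch below is illustrative only: it assumes a spherical
Earth with the WGS84 equatorial radius, the function name arcsec_to_metres is
hypothetical, and the latitude of 21.5 degrees north is an assumed representative
value for the study area rather than a figure from this thesis.

```python
import math

WGS84_EQUATORIAL_RADIUS_M = 6378137.0  # semi-major axis of the WGS84 ellipsoid


def arcsec_to_metres(arcsec, latitude_deg=0.0):
    """Approximate east-west ground length of `arcsec` arc-seconds of longitude.

    Spherical approximation: the spacing shrinks with cos(latitude).
    """
    metres_per_arcsec_at_equator = (
        2 * math.pi * WGS84_EQUATORIAL_RADIUS_M / (360 * 3600)
    )
    return arcsec * metres_per_arcsec_at_equator * math.cos(math.radians(latitude_deg))


# Assumed latitude (21.5 N) is for illustration only.
for name, arcsec in (("SRTM1", 1), ("SRTM3", 3)):
    print(
        name,
        round(arcsec_to_metres(arcsec), 1), "m at the equator,",
        round(arcsec_to_metres(arcsec, 21.5), 1), "m east-west at 21.5 deg N",
    )
```

Under these assumptions one arc-second is a little under 31 m at the equator, which
is why SRTM1 and SRTM3 are usually quoted as 30 m and 90 m products; the east-west
spacing becomes somewhat smaller away from the equator.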
DEM data can be used to derive topographic factors other than simple elevation,
including slope, aspect, hill shading, slope curvature, slope roughness, slope area
and qualitative classification of landforms [11]. DEM data can also be used to
derive hydrological parameters (flow direction, flow path, basin and river
network).
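To illustrate how slope and aspect can be derived from a DEM, the following is a
minimal sketch using finite differences from NumPy. It is not the processing chain
used in this thesis: the function name slope_aspect, the synthetic DEM and the
convention that row 0 is the northern edge of the raster are all assumptions made
for the example.

```python
import numpy as np


def slope_aspect(dem, cell_size):
    """Return slope (degrees) and aspect (degrees clockwise from north).

    `dem` is a 2-D array of elevations in metres with row 0 as the northern
    edge (assumed convention); `cell_size` is the pixel spacing in metres.
    Aspect is the downslope direction and is not meaningful on flat cells.
    """
    # np.gradient returns the derivative along rows (southward) first,
    # then along columns (eastward).
    dz_dsouth, dz_deast = np.gradient(dem.astype(float), cell_size)
    slope = np.degrees(np.arctan(np.hypot(dz_deast, dz_dsouth)))
    # Downslope vector is the negative gradient:
    # east component = -dz_deast, north component = +dz_dsouth.
    aspect = np.degrees(np.arctan2(-dz_deast, dz_dsouth)) % 360.0
    return slope, aspect


# Synthetic plane rising 30 m per 30 m cell toward the east (hypothetical data).
dem = np.tile(np.arange(5) * 30.0, (5, 1))
slope, aspect = slope_aspect(dem, cell_size=30.0)
print(slope[2, 2], aspect[2, 2])
```

For a real SRTM tile the cell size would have to be expressed in metres (for example
after reprojection to a metric coordinate system) before the slope angles are
meaningful.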
1.2. Landslide susceptibility research in the world
In [12], the authors present a landslide susceptibility model for the Collazzone
area, central Italy. The model was obtained through discriminant analysis of 46
thematic environmental variables, using the presence of shallow landslides obtained
from a multi-temporal inventory map as the dependent variable for statistical
analysis. By comparing the number of correctly and incorrectly classified mapping
units, it was established that the model classifies 77% of 894 mapping units
correctly. In other research, the importance of geomorphological expert knowledge
in generating a landslide susceptibility map is evaluated using GIS-supported
indirect bivariate statistical analysis [13]. The analysis indicated that the use of
detailed geomorphological information in the bivariate statistical analysis raised
the overall accuracy of the final susceptibility map considerably. In [14], the
authors evaluate landslide susceptibility and the effect of landslide-related factors
at Penang, Malaysia, using a geographic information system (GIS) and remote
sensing data. Landslide hazard areas were analyzed and mapped from all the
landslide-occurrence factors using the probability-frequency ratio method. When it
comes to learning-based techniques, [15] used a one-dimensional convolutional
network and Bayesian optimization to assess landslides in South Korea. The
study shows that the convolutional network could outperform ANN and SVM owing to
its more complex architecture and its handling of spatial correlations through
convolution and pooling operations. In situations where some variables make a
non-linear contribution to the occurrence of landslides, the suggested method could
thus help develop landslide susceptibility maps. In [16], the authors investigate a
convolutional neural network framework for landslide susceptibility mapping in
China. The experimental results demonstrated that the proportions of highly
susceptible zones in all of the CNN landslide susceptibility maps are highly similar
and lower than 30%, which indicates that these CNNs are more practical for
landslide prevention and management than conventional methods.
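As an illustration of the frequency ratio idea mentioned above, the sketch below
computes, for each class of one conditioning factor, the share of landslide pixels in
that class divided by the share of all pixels in that class. This is a generic form of
the method and not necessarily the exact formulation of [14]; the function name
frequency_ratio and the toy rasters are hypothetical.

```python
import numpy as np


def frequency_ratio(factor_classes, landslide_mask):
    """Frequency ratio per class of one conditioning factor.

    FR(class) = (share of landslide pixels in the class)
                / (share of all pixels in the class).
    Values above 1 suggest the class is more landslide-prone than average.
    """
    factor_classes = np.asarray(factor_classes)
    landslide_mask = np.asarray(landslide_mask, dtype=bool)
    total_pixels = factor_classes.size
    total_landslides = landslide_mask.sum()
    ratios = {}
    for cls in np.unique(factor_classes):
        in_class = factor_classes == cls
        landslide_share = (landslide_mask & in_class).sum() / total_landslides
        area_share = in_class.sum() / total_pixels
        ratios[int(cls)] = float(landslide_share / area_share)
    return ratios


# Hypothetical toy rasters: slope classes 1 (gentle) to 3 (steep)
# and a landslide inventory mask on the same grid.
slope_classes = np.array([[1, 1, 2, 3],
                          [1, 2, 2, 3],
                          [1, 1, 2, 3]])
landslides = np.array([[0, 0, 0, 1],
                       [0, 0, 1, 1],
                       [0, 0, 0, 0]])
fr = frequency_ratio(slope_classes, landslides)
print(fr)
```

In a full workflow the ratios of all conditioning factors would typically be summed
per pixel to obtain a susceptibility index; that step is omitted here.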
1.3. Landslide susceptibility research in Vietnam
Over the years, many landslide susceptibility studies have been carried out in
Vietnam. For example, in [17], the authors investigate a potential application of the
Adaptive Neuro-Fuzzy Inference System (ANFIS) and the Geographic Information
System (GIS) as a relatively new approach for landslide susceptibility mapping in
Hoa Binh province, Vietnam. The results show that landslide susceptibility
mapping in Hoa Binh province using the ANFIS approach is viable.
As far as the performance of the ANFIS approach is concerned, the results appeared
to be quite satisfactory, the zones determined on the map being zones of relative
susceptibility. In other research, the authors evaluate the susceptibility of landslides
in Lai Chau province, Vietnam, using GIS and remote sensing data, focusing on the
relationship between tectonic fractures and landslides [18]. Among machine
learning approaches, in [19] the authors develop a method that hybridizes the
Support Vector Machine (SVM) and the Bat Algorithm (BA) for spatial prediction of
shallow landslides. To construct and verify the hybrid method, a GIS database for
the study area of Lang Son province was employed. The method is used to separate
data samples in the GIS database into two categories: non-landslide (negative class)
and landslide (positive class). The Bat Algorithm metaheuristic is employed to
assist the LSSVC model selection process by fine-tuning its hyper-parameters: the
regularization coefficient and the kernel function parameter. Experimental results
indicate that the hybrid BA-LSSVC can achieve a prediction accuracy of more than
90%. As another method, in [20], the authors combine data such as extreme rainfall
(obtained from Regional Frequency Analysis), soil, land cover, vegetation density
(NDVI) and terrain slope using a GIS-based toolkit (SAGA), with weights estimated
by the Analytical Hierarchy Process (AHP). The results identify areas of landslide
susceptibility at different levels corresponding to distinct scenarios.
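The AHP weighting described above can be sketched as follows. The example derives
factor weights from a pairwise comparison matrix through its principal eigenvector
and applies them in a weighted overlay. It is only an illustration under assumed
inputs: the pairwise judgements, the three factors and the function names
ahp_weights and weighted_overlay are hypothetical and are not taken from [20].

```python
import numpy as np


def ahp_weights(pairwise):
    """Priority weights from an AHP pairwise comparison matrix.

    Uses the principal eigenvector, normalised so the weights sum to 1.
    """
    pairwise = np.asarray(pairwise, dtype=float)
    eigenvalues, eigenvectors = np.linalg.eig(pairwise)
    principal = np.argmax(eigenvalues.real)
    weights = np.abs(eigenvectors[:, principal].real)
    return weights / weights.sum()


def weighted_overlay(layers, weights):
    """Weighted sum of factor rasters sharing one grid and one rating scale."""
    return sum(w * np.asarray(layer, dtype=float) for w, layer in zip(weights, layers))


# Hypothetical judgements: rainfall is 2x as important as slope
# and 4x as important as land cover.
pairwise = [[1.0, 2.0, 4.0],
            [0.5, 1.0, 2.0],
            [0.25, 0.5, 1.0]]
weights = ahp_weights(pairwise)

# Hypothetical 2x2 factor rasters already rated on a common 1-5 scale.
rainfall = np.array([[5, 4], [2, 1]])
slope = np.array([[4, 4], [3, 1]])
land_cover = np.array([[3, 5], [1, 1]])
susceptibility = weighted_overlay([rainfall, slope, land_cover], weights)
print(weights, susceptibility)
```

A complete AHP application would also check the consistency ratio of the pairwise
matrix before using the weights; that check is left out of this sketch.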
From the perspective of deep learning, [21] describes the development and validation
of a spatially explicit deep learning neural network model for the prediction of
landslide susceptibility. A geospatial database was generated based on 217 landslide
events from the Muong Lay district (Vietnam), for which a suite of nine landslide
conditioning factors was derived. The Relief-F feature selection method was
employed to quantify the utility of the conditioning factors for developing the
landslide predictive model. A comparative analysis using Wilcoxon signed-rank
tests revealed a significant improvement in landslide prediction using the spatially
explicit deep learning model over other models.
1.4. Landslide analysis with remote sensing data and deep learning
Remote sensing (RS) has helped humans to gain knowledge about the Earth [22] [23].
Over the years, hyperspectral remote sensing data have been widely used by
researchers to perform different tasks such as vegetation analysis [24], water
assessment [25], image classification [26], etc.
When it comes to landslide analysis, many studies have been carried out over the years
with different approaches and a variety of data, such as aerial photographs,
multi-temporal aerial photographs, color aerial photographs [27], etc. In [28], the
authors used unmanned aerial vehicles (UAV) to monitor landslides in a mountainous
region in France. In other research [29], a medium-resolution satellite image is used
to detect landslides using the Maximum Likelihood classifier. In addition, [30]
presents an effective method that combines GIS with landslide factors such as slope and soil type
to generate a global landslide susceptibility map.
However, traditional landslide analysis methods require expert knowledge and
fieldwork, and they are mostly applied to a particular research region. To
automate the analysis, some studies have adopted machine learning (ML)
techniques [31] [32] [33] [34] [35]. These studies show that ML
techniques have high potential for analyzing remote sensing data. Their
limitation, however, is that the developed ML techniques focus heavily
on improving accuracy over traditional remote sensing and GIS-based
methods; the reported improvements are marginal and lack
generalization.
1.5. Study area
1.5.1. Geographic location
The Northwest mountainous region of Vietnam shares borders with Laos and
China. It is one of the three natural sub-regions of northern Vietnam (the other two are
the Northeast and the Red River Delta).
The geographic extent of the Northwest region has not been agreed upon. Some
argue that it is the area on the south bank of the Red River. Geographer
Le Ba Thao argued that the Northwest region is bounded to the east by the Hoang Lien
Son mountain range and to the west by the Song Ma mountain range [36].
Figure 1.1. Geographic location of north-west mountainous region in Vietnam
1.5.2. Terrain characteristics
The Northwest terrain is high and deeply dissected, with many massifs and high
mountain ranges running from the northwest to the southeast. The Hoang Lien Son mountain
range is up to 180 km long and 30 km wide, with some peaks from 2800 to
3000 m high. The Ma River mountain range is 500 km long, with peaks above 1800 m.
Between these two mountainous areas lies the low area of the Da River basin (also known
as the Da River trough). Apart from the large Da River, the Northwest has only
small rivers and streams, including the upper Ma River. In the Da River trough
there is also a line of limestone plateaus running from Phong Tho to Thanh
Hoa, which can be subdivided into the plateaus of Ta Phinh, Moc Chau and Na San.
There are also basins such as Dien Bien, Nghia Lo and Muong Thanh.
1.5.3. Geology and terrain
Although the general climate does not differ much between regions, its expression
varies both horizontally and vertically. The Hoang Lien Son mountain range
runs as a single block in a northwest-southeast direction, acting as a bulwark that
prevents the winter winds (northeast-southwest) from passing into Northwestern
territory without being greatly weakened. In the Northeast, by contrast, there is a system of
arcs extending in a fan shape so that cold waves can flow down to the Red River
Delta and further south. Therefore, apart from the influence of elevation, the
Northwestern climate is generally warmer than that of the Northeast; the difference can be
2-3 °C. In the mountains, the aspect of the slopes plays an important role
in the heat and humidity regime: the windward (east) side receives heavy rainfall, while
the west side favors the formation of the foehn wind (commonly called the "Lao wind"),
which forms when air blows down into the valleys and is most pronounced in the Northwest. In general,
climate study is very important for midland and mountainous areas because
the climate varies over each small area. Climatic events in the
mountains are extreme, especially in the context of reduced forest cover and
soil degradation. Heavy and concentrated rains cause floods and, combined
with certain conditions, flash floods; droughts often occur in the dry season and
are sometimes prolonged beyond the tolerance of plants.
Administratively, the Northwest region currently consists of 6 provinces with an
area of over 5.645 million ha (10.5% of the total area of the country) and 4,713,048
people (15.5% of the total national population), an average of about
88 people per square kilometer.
1.5.4. Provinces in the Northwest region
The population and area entries are based on statistics from the General Statistics
Office of Vietnam, as reported on the Wikipedia pages of the provinces of Vietnam [37].
Table 1.2. Provinces in the Northwest region
Index  Province   Provincial capital    Cities  Towns  Districts  Population  Area      Population density
       name                                                       (people)    (km2)     (people/km2)
1      Hoa Binh   Hoa Binh city         1       -      10         808,200     4,608.7   175
2      Son La     Son La city           1       -      11         1,195,107   14,174.4  84
3      Dien Bien  Dien Bien Phu city    1       1      8          557,400     9,541     58
4      Lai Chau   Lai Chau city         1       -      7          436,000     9,069.5   48
5      Lao Cai    Lao Cai city          1       1      7          684,300     6,364     108
6      Yen Bai    Yen Bai city          1       1      7          800,100     6,887.6   116
Parts of Phu Tho and the two provinces of Lao Cai and Yen Bai lie on the
right bank of the Red River, because the river flows through these provinces. However,
the administrative area of the Northwest does not include Phu Tho, and Lao
Cai and Yen Bai provinces are sometimes classified in the Northeast. At present,
the headquarters of the Northwest Steering Committee is located in Yen Bai city,
the provincial capital of Yen Bai province.
1.5.5. Residential
The Northwest is essentially the cultural space of the Thai people, famous for a
typical dance that is very popular. The Muong are the ethnic group with the largest
population in the region. In addition, there are about 20 other ethnic groups such as
the H'Mong, Dao, Tay, Kinh and Nung. Those who have been to the Northwest cannot
forget the image of Thai girls in the colorful dresses that represent the Northwest.
The distribution of the population in the Northwest varies markedly with elevation. The high
ridges (the mountain tops) are home to ethnic groups of the Mong-Dao and
Tibeto-Burman language groups, whose main mode of production is shifting
cultivation, which depends heavily on nature. The middle zone (the mountain slopes) is the
residence of ethnic groups of the Mon-Khmer language group, whose main
production methods are dry rice cultivation, animal husbandry and some
handicrafts. The valleys and foothills are inhabited by ethnic groups of the Viet-Muong
and Tai-Kadai language groups, where more favorable natural conditions allow
agriculture and other occupations to develop. These differences in living conditions and production
methods also give rise to great cultural differences, although the dominant and
characteristic culture is that of the Muong people.
1.6. Landslide susceptibility model
1.6.2. Landslide susceptibility data processing pipeline
To detect probable landslide areas over a period of time, this study collected multi-temporal Landsat 8 surface reflectance data for both pre-landslide and post-landslide
periods. The data are then ingested into the Open Data Cube environment in order
to be processed in parallel on a cluster of computers. Next, both pre-landslide
and post-landslide data are classified using a pre-trained deep CNN model to detect
vegetation areas in the pre-landslide data and bare land (probable vegetation loss) in
the post-landslide data.
DEM (digital elevation model) data are collected at the same time to generate a slope
map of the study area. In [38], the authors showed that areas with steep slope
angles are more likely to experience landslide hazards. Combining the vegetation
loss and slope factors gives the landslide susceptibility area detection model
illustrated in Figure 1.2.
Figure 1.2. Landslide susceptibility areas detection model
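The combination of vegetation loss and slope described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this study: the function names, the nested-list raster representation and the 30-degree slope threshold are hypothetical choices made for illustration.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope angle (degrees) per cell from a DEM given as a list of rows.

    Uses central differences in the interior and one-sided differences at
    the edges. Assumes at least 2 rows and 2 columns and a square cell of
    `cell_size` metres.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dz_dy = (dem[r1][c] - dem[r0][c]) / ((r1 - r0) * cell_size)
            dz_dx = (dem[r][c1] - dem[r][c0]) / ((c1 - c0) * cell_size)
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


def susceptible_cells(pre_vegetation, post_bare, slope, slope_threshold_deg=30.0):
    """Flag cells that were vegetated before, are bare after, and are steep.

    `pre_vegetation` and `post_bare` are 0/1 masks, such as those a land cover
    classifier would produce. The default threshold is an illustrative
    assumption, not a value from this study.
    """
    return [
        [
            bool(veg and bare and s >= slope_threshold_deg)
            for veg, bare, s in zip(veg_row, bare_row, slope_row)
        ]
        for veg_row, bare_row, slope_row in zip(pre_vegetation, post_bare, slope)
    ]
```

For example, on a DEM that rises 30 m per 30 m cell in one direction, every cell has a slope of about 45 degrees, so the output mask reduces to the cells where vegetation was replaced by bare land.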
The proposed method is adapted from two main lines of landslide research.
- Firstly, in the study of landslides by monitoring land cover change in [39], the
processes of land cover change caused changes in the geological environment,
mainly in three aspects: (i) steepening of slopes by undercutting and backfilling
during construction of infrastructure and residential structures on the hill slopes, (ii)
destruction of cultivated and forest lands due to local mining activities and (iii)
construction of a hydropower facility near the urban area. In this study, we adapt the
results of that study to exploit the relationship between land cover change and
landslide susceptibility. As an indication, land cover changes are responsible for
landslide susceptibility, so preventive measures can be implemented from the
beginning.
- Secondly, the proposed method also draws on the slope analysis for
landslides in [40], where Slope Units offer an ideal representation of space, in
which a Slope Unit approximates the morphodynamic response of a slope to a
landslide. As a result, the slope angle contributes largely to landslide
susceptibility.
1.6.3. Landsat 8 surface reflectance data
Figure 1.3. Landsat surface reflectance path-row specification
In this study, a collection of Landsat 8 surface reflectance data acquired between the years 2017 and
2019 (freely available online) was used. For each period, 7 scenes of
Landsat 8 surface reflectance were combined to cover the whole study area.
The collected scenes are ingested into the ODC environment for further
analysis.