
Internship topic: Combining pixel-level and object-level information from Sentinel-2 time series data to classify land use land cover in the Dordogne region of France.

Supervisors: Dino Lenco, Yagowan Jean Eudes

Institute: National Research Institute for Agriculture, Food and the Environment (INRAE), France

Introduction

This report summarizes the aims, results, and conclusions of an internship conducted remotely with the National Research Institute for Agriculture, Food and the Environment (INRAE), France. The internship lasted three months, from July to September 2020, and was co-supervised by Dr. Dino Lenco and Yagowan Jean Eudes.

The task of the internship was to perform land use land cover classification with the Random Forest algorithm, taking into account both pixel-level and object-level information.

Objectives

To perform exploratory data analysis of the time series satellite data
To compute spectral indices such as MSAVI, NDVI, and EVI
To perform land use land cover classification taking into account both pixel-level and object-level information.

Study area

Figure 1 Study Area

The study area is the Dordogne area in France, shown in Figure 1. It covers 418 000 hectares of land with varied landscapes such as vineyards, cliffs, meadows, valleys, conifers, caves, and numerous hills.

Data

The data used for this study are a time series of Sentinel-2 images acquired on 8 different dates, with a spatial resolution of 10 m. The Sentinel-2 bands used are red, green, blue, and near-infrared. The images are in the EPSG:2154 (RGF93 / Lambert-93) projection.

Methods

Figure 2 shows an overview of the methodology.

Figure 2 Methodology

Data Processing

The data were first explored by checking the number of samples for each land use class. Orfeo Toolbox was then used to perform linear gap filling on the time series: invalid pixels are replaced by values interpolated from the valid dates in the series.
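The idea behind linear gap filling can be sketched for a single pixel as follows. This is a minimal NumPy illustration, not Orfeo Toolbox's implementation; the function name `linear_gapfill`, the day numbers, and the sample values are hypothetical.

```python
import numpy as np

def linear_gapfill(series, valid, days):
    """Linearly interpolate the invalid samples of one pixel's time series.

    series: one value per acquisition date
    valid:  boolean mask, True where the value is usable
    days:   acquisition day numbers, strictly increasing
    """
    series = np.asarray(series, dtype=float)
    valid = np.asarray(valid, dtype=bool)
    days = np.asarray(days, dtype=float)
    if not valid.any():
        return series.copy()  # no valid date to interpolate from
    filled = series.copy()
    # np.interp interpolates between the neighbouring valid dates and holds
    # the nearest valid value constant outside the valid range.
    filled[~valid] = np.interp(days[~valid], days[valid], series[valid])
    return filled

# Hypothetical series with one invalid acquisition (the second date).
days = [0, 10, 20, 30]
values = [0.2, 0.0, 0.6, 0.8]
mask = [True, False, True, True]
print(linear_gapfill(values, mask, days))
```

For a full image the same operation would be applied to every pixel and band, which is the part a dedicated tool handles efficiently.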

Computation of Indices

Several indices were computed, including the Normalized Difference Vegetation Index (NDVI), the Soil Adjusted Vegetation Index (SAVI), and the Enhanced Vegetation Index (EVI), using the formulas below.

The NDVI was calculated from the red (RED) and near-infrared (NIR) bands, and SAVI and EVI additionally use the constants defined after the formulas:

NDVI = (NIR - RED) / (NIR + RED)

SAVI = (NIR - RED) / (NIR + RED + L) * (1 + L)

EVI = G * (NIR - RED) / (NIR + C1 * RED - C2 * BLUE + L)

where L = 1, C1 = 6, C2 = 7.5, and G is the gain factor.
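As a sketch, the three indices can be written as small Python functions that work on single reflectance values and, unchanged, on NumPy arrays of whole bands. The report does not state the value of G; the default G = 2.5 below is the value used in the standard EVI formulation and is an assumption here, as are the function names and the sample reflectances (assumed to be scaled to 0..1).

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def savi(nir, red, L=1.0):
    """Soil Adjusted Vegetation Index with soil adjustment factor L."""
    return (nir - red) / (nir + red + L) * (1.0 + L)

def evi(nir, red, blue, L=1.0, C1=6.0, C2=7.5, G=2.5):
    """Enhanced Vegetation Index; G = 2.5 is assumed, not taken from the report."""
    return G * (nir - red) / (nir + C1 * red - C2 * blue + L)

# Hypothetical reflectances for one vegetated pixel.
nir, red, blue = 0.5, 0.1, 0.05
print(ndvi(nir, red), savi(nir, red), evi(nir, red, blue))
```

Pixels where a denominator is zero (for example NIR + RED = 0 in NDVI) would need to be masked before applying these functions to real imagery.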

Classification

For pixel-based classification, the Random Forest (RF) algorithm was trained on the spectral features of the different bands and the vegetation indices. RF uses the statistical properties of the training data, labeled with ground reference data, to estimate the probability of each class.
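A pixel-based Random Forest workflow of this kind can be sketched with scikit-learn. The data below are synthetic stand-ins (random values with one artificially separable feature), and the number of trees, the train/test split, and the three classes are assumptions for illustration, not the settings used in the internship.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_pixels, n_dates, n_bands = 600, 8, 4

# One row per labeled pixel, one column per (date, band) value.
# Index columns (NDVI, SAVI, EVI per date) would be appended the same way.
X = rng.random((n_pixels, n_dates * n_bands))
y = rng.integers(0, 3, size=n_pixels)  # hypothetical class labels 0..2
X[:, 0] += y                           # make the synthetic classes separable

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

proba = clf.predict_proba(X_test)  # estimated class probabilities per pixel
pred = clf.predict(X_test)         # most probable class per pixel
print("test accuracy:", (pred == y_test).mean())
```

With real data, X would hold the gap-filled band values and indices of the pixels covered by ground reference polygons, and y their land cover labels.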

In the object-based approach, the image is first segmented into objects. The segmentation was done with the Simple Linear Iterative Clustering (SLIC) algorithm (Zhang et al., 2017). After various experiments, we decided to use 10000 segments and a default scale of 0.01 to appropriately segment all the land cover classes. The training objects from the segmentation and their associated land cover labels, together with the pixel-level information, were then passed to the Random Forest classifier, which was trained on these data and used to make predictions on the test data. Afterward, the predictions were saved in raster TIFF format and viewed in QGIS.
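The report does not detail how pixel-level and object-level information were merged. One common option, assumed here purely for illustration, is to append to each pixel's feature vector the mean feature vector of the segment it belongs to. The sketch below takes an existing segment label image (such as one produced by a SLIC implementation); the function name `add_object_features` and the toy arrays are hypothetical.

```python
import numpy as np

def add_object_features(pixels, segments):
    """Append per-segment mean features to every pixel.

    pixels:   array of shape (H, W, F) with F features per pixel
    segments: integer array of shape (H, W) with one segment id per pixel
    returns:  array of shape (H * W, 2 * F): pixel features, then segment means
    """
    h, w, f = pixels.shape
    flat = pixels.reshape(-1, f).astype(float)
    # Relabel segment ids to 0..n-1 so they can be used as array indices.
    _, inverse = np.unique(segments.ravel(), return_inverse=True)
    n_segments = inverse.max() + 1
    counts = np.bincount(inverse, minlength=n_segments)
    sums = np.zeros((n_segments, f))
    np.add.at(sums, inverse, flat)          # accumulate features per segment
    means = sums / counts[:, None]
    return np.hstack([flat, means[inverse]])

# Toy 2x2 image with one feature and two segments (top row, bottom row).
pixels = np.array([[[1.0], [3.0]],
                   [[5.0], [7.0]]])
segments = np.array([[0, 0],
                     [1, 1]])
features = add_object_features(pixels, segments)
print(features)
```

The resulting matrix can be passed to a classifier in place of the pixel-only features, so each pixel is classified with knowledge of the object that contains it.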

Results and discussion

From the results, the pixel-based classification had an overall accuracy of 0.90, a kappa value of 0.85, and an F1 score of 0.78. The combined pixel and object information classification had an overall accuracy of 0.98, a kappa value of 0.97, and an F1 score of 0.87. The combined classification therefore scored higher than the pixel-based classification on all three measures. Table 1 shows the performance per class: the built-up class had a much lower F1 score than the other classes, because it had far fewer samples. This suggests that the Random Forest algorithm is sensitive to class imbalance. Figures 3 and 4 show the resulting maps for the combined and pixel-based classifications, respectively.
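The three measures reported above can all be derived from a confusion matrix, as in the following sketch. The matrix is invented for illustration (it is not the internship's data), and the function name `scores` is hypothetical. It shows how a small class can have a low F1 score while overall accuracy stays high.

```python
def scores(cm):
    """Overall accuracy, Cohen's kappa, and per-class F1 from a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(k))
    row_tot = [sum(cm[i]) for i in range(k)]                        # true counts
    col_tot = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted counts

    oa = correct / total
    # Agreement expected by chance, from the row and column marginals.
    pe = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    kappa = (oa - pe) / (1 - pe)

    # F1 = 2TP / (2TP + FP + FN) = 2TP / (row total + column total)
    f1 = []
    for i in range(k):
        denom = row_tot[i] + col_tot[i]
        f1.append(2 * cm[i][i] / denom if denom else 0.0)
    return oa, kappa, f1

# Hypothetical matrix: two large classes and one small class (last row).
cm = [[50, 0, 0],
      [0, 45, 5],
      [3, 2, 5]]
oa, kappa, f1 = scores(cm)
print(oa, kappa, f1, sum(f1) / len(f1))  # last value: macro-averaged F1
```

Whether the single F1 values quoted in the report are macro-averaged or weighted is not stated; the macro average printed here is only one possible convention.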

Table 1 F1 score per class

Figure 3 Combined pixel and object-level classification

Figure 4 Pixel based classification

Conclusion

This study illustrates land use land cover classification using both pixel-level and object-level information from Sentinel-2 time series data. The results show that the combined pixel and object-level classification had higher accuracy than classification using pixel-level information only. We also saw that Random Forest is sensitive to imbalanced classes: the class-specific accuracy was poor for the built-up class, which had fewer samples than the other classes. We conclude that pixel-based image classification can be improved by adding object-level information. The objects can be used as a basis to develop a structural level grouping of proportional values of various components for a better classification output.

Reference

Zhang, Y., Li, X., Gao, X., & Zhang, C. (2017). A Simple Algorithm of Superpixel Segmentation with Boundary Constraint. IEEE Transactions on Circuits and Systems for Video Technology, 27(7), 1502–1514. https://doi.org/10.1109/TCSVT.2016.2539839