Delve into the inaugural release of SWOT satellite data, including a quick assessment of the water mask provided by Level-2 products from the KaRIn instrument
Introduction
The Surface Water and Ocean Topography (SWOT) mission was launched on Dec 16, 2022, with the promise of being a game-changer for water level monitoring, both at sea and inland. For the first time, a space mission is designed to survey nearly all of the water on the Earth's surface, providing water height measurements with greater precision than ever before.
Although SWOT was launched in 2022, its First Public Release was only announced on Dec 5, 2023, almost one year later, at a "very early stage" and with "known limitations", as mentioned in the release note. So far, I have not seen any assessment of these first products. Therefore, the objective of this post is to provide a preliminary examination, as well as a quick tutorial on assessing this data.
To streamline environment configuration, this post was developed entirely in a Google Colab notebook, which will be made available at the end.
Area of Interest
The first step is to define an Area of Interest (AOI) for our analysis. In this example, we will focus on a region of the Amazon Forest known as Anavilhanas, located within a Brazilian national park along the Rio Negro. A quick way to obtain the desired coordinates is the site geojson.io. This platform offers an interactive map interface where users can draw their region of interest directly on the map. The corresponding GeoJSON data is then displayed in the side panel (refer to Figure 1).
The coordinates can then be converted to a shapely geometry through the following code:
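A minimal sketch of this conversion, assuming the geometry object copied from the geojson.io side panel is pasted as a string. The polygon below is a placeholder roughly around Anavilhanas, not the AOI used in this post, and `aoi_bounds` and `aoi_to_shapely` are hypothetical helper names.

```python
import json

# Placeholder polygon (lon, lat); replace it with the geometry copied
# from the geojson.io side panel. These are NOT the post's coordinates.
AOI_GEOJSON = """
{
  "type": "Polygon",
  "coordinates": [[
    [-60.85, -2.75], [-60.40, -2.75], [-60.40, -2.25],
    [-60.85, -2.25], [-60.85, -2.75]
  ]]
}
"""


def aoi_bounds(geojson_text):
    """Return (min_lon, min_lat, max_lon, max_lat) of a GeoJSON Polygon."""
    ring = json.loads(geojson_text)["coordinates"][0]
    lons = [lon for lon, _ in ring]
    lats = [lat for _, lat in ring]
    return (min(lons), min(lats), max(lons), max(lats))


def aoi_to_shapely(geojson_text):
    """Convert the GeoJSON geometry into a shapely geometry."""
    # Imported here so that aoi_bounds can be used without shapely installed.
    from shapely.geometry import shape

    return shape(json.loads(geojson_text))


print(aoi_bounds(AOI_GEOJSON))
```

The bounding box is kept as a plain tuple because search APIs usually expect it in that form, while the shapely geometry is convenient for clipping and plotting.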
Downloading SWOT Data
To download the SWOT products, we rely on the earthaccess package, which searches and downloads data from NASA Earthdata, including the SWOT products distributed by PO.DAAC. It automates product discovery and manages the download programmatically, which is more efficient than manually downloading bulky files through the interactive portal.
earthaccess is not preinstalled in Colab, so let's start by installing it.
%pip install earthaccess --quiet --no-cache-dir
After installation, we need a NASA Earthdata Login account (username and password). These credentials are required for the next step, where we search for the products within our Area of Interest (AOI) and desired time frame. Here the time frame is November 2023, chosen because of a significant drought in the region.
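A hedged sketch of the search step. The collection short name, the bounding box, and the helper names `search_pixc` and `granule_names` are assumptions for illustration; check the current short name of the pixel cloud collection in Earthdata Search before running it.

```python
# Assumptions: placeholder AOI bounds and a guessed collection short name.
SHORT_NAME = "SWOT_L2_HR_PIXC_2.0"
BBOX = (-60.85, -2.75, -60.40, -2.25)  # (min_lon, min_lat, max_lon, max_lat)
TEMPORAL = ("2023-11-01", "2023-11-30")


def search_pixc(bbox=BBOX, temporal=TEMPORAL, short_name=SHORT_NAME):
    """Log in to Earthdata and search for pixel cloud granules over the AOI."""
    import earthaccess  # imported here because it needs a separate install

    earthaccess.login()  # prompts for the Earthdata credentials if needed
    return earthaccess.search_data(
        short_name=short_name,
        bounding_box=bbox,
        temporal=temporal,
    )


def granule_names(results):
    """Extract the granule name (GranuleUR) from each search result."""
    return [granule["umm"]["GranuleUR"] for granule in results]
```

Calling `granule_names(search_pixc())` should return a list of names similar to the output shown next, although the exact granules depend on the collection version and the AOI.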
Code output:
['SWOT_L2_HR_PIXC_005_548_159R_20231101T060159_20231101T060210_PGC0_01',
'SWOT_L2_HR_PIXC_005_548_159L_20231101T060159_20231101T060210_PGC0_01',
'SWOT_L2_HR_PIXC_005_548_160R_20231101T060209_20231101T060220_PGC0_01',
'SWOT_L2_HR_PIXC_006_395_149R_20231116T153403_20231116T153414_PGC0_01',
'SWOT_L2_HR_PIXC_006_395_149L_20231116T153403_20231116T153414_PGC0_01',
'SWOT_L2_HR_PIXC_006_395_150R_20231116T153413_20231116T153424_PGC0_01',
'SWOT_L2_HR_PIXC_006_395_150L_20231116T153413_20231116T153424_PGC0_01',
'SWOT_L2_HR_PIXC_006_548_159R_20231122T024704_20231122T024715_PGC0_01',
'SWOT_L2_HR_PIXC_006_548_159L_20231122T024704_20231122T024715_PGC0_01',
'SWOT_L2_HR_PIXC_006_548_160R_20231122T024714_20231122T024725_PGC0_01']
Now, let's download the first granule to the /tmp folder and make sure it is saved correctly:
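A minimal sketch of the download and the sanity check, assuming `results` is the list returned by an earthaccess search and that `earthaccess.login()` has already been called. The helper names `download_first` and `is_saved` are hypothetical.

```python
from pathlib import Path


def download_first(results, folder="/tmp"):
    """Download only the first granule of a search result into `folder`."""
    import earthaccess  # imported here because it needs a separate install

    paths = earthaccess.download(results[:1], local_path=folder)
    return Path(paths[0])


def is_saved(path):
    """Return True when the file exists and is not empty."""
    path = Path(path)
    return path.is_file() and path.stat().st_size > 0
```

Checking the file size as well as its existence helps to catch interrupted downloads, which matter here because each pixel cloud file is large.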
Opening the Pixel Cloud File
To open the pixel cloud file, we are going to use the xarray package. Fortunately, it's already available in Colab. The file is loaded as a Dataset, where each variable is stored separately as a flattened vector, along with the latitude and longitude of each sampled point. This data structure is common in pixel cloud data.
Once the file is loaded into an xarray Dataset, we transform the points into a GeoPandas GeoDataFrame with just three variables (height, classification and coherent power), to make them easier to manipulate. Here is the snippet for it:
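A sketch of this conversion under a few assumptions: the variables live in the `pixel_cloud` group of the NetCDF file, the coordinates are geographic (EPSG:4326), and the helper names `pixc_to_dataframe` and `load_pixc` are hypothetical.

```python
import pandas as pd

# Variables kept from the pixel cloud, plus the point coordinates.
PIXC_VARS = ["height", "classification", "coherent_power", "latitude", "longitude"]


def pixc_to_dataframe(ds, variables=PIXC_VARS):
    """Copy the selected 1-D variables of the pixel cloud into a DataFrame."""
    return pd.DataFrame({name: ds[name].values for name in variables})


def load_pixc(path):
    """Open the 'pixel_cloud' group and return it as a GeoDataFrame of points."""
    # Imported here so pixc_to_dataframe works without the geospatial stack.
    import geopandas as gpd
    import xarray as xr

    ds = xr.open_dataset(path, group="pixel_cloud")
    df = pixc_to_dataframe(ds)
    points = gpd.points_from_xy(df["longitude"], df["latitude"])
    return gpd.GeoDataFrame(df, geometry=points, crs="EPSG:4326")
```

Calling `load_pixc(path).head()` on the downloaded granule should give a table with the same columns as the one below.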
|   | height | classification | coherent_power | latitude | longitude | geometry |
| 0 | 40.027229 | 1 | 116655.476562 | -2.294394 | -60.529366 | POINT (-60.52937 -2.29439) |
| 1 | 39.953144 | 1 | 398273.156250 | -2.294673 | -60.531261 | POINT (-60.53126 -2.29467) |
| 2 | 40.459209 | 1 | 799950.062500 | -2.295139 | -60.534424 | POINT (-60.53442 -2.29514) |
| 3 | 39.948395 | 1 | 195623.703125 | -2.295220 | -60.534981 | POINT (-60.53498 -2.29522) |
| 4 | 40.965137 | 1 | 99289.179688 | -2.295783 | -60.538799 | POINT (-60.53880 -2.29578) |
Now it's time to take a first look at the contents of the pixel cloud. According to the SWOT documentation, the classification legend is as follows:
1. Land
2. Land near water
3. Water near land
4. Open water
5. Dark water
6. Low coherence water near land
7. Open low coherence water
Since the file has more than 4 million pixels, we are going to display just a random subset of 300,000 points to provide a manageable overview:
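One way to draw and plot such a subset, sketched under the assumption that the points are in a GeoDataFrame with a `classification` column. The helper names and the plotting options (figure size, marker size, seed) are illustrative choices, not the ones used for the figure in this post.

```python
def random_subset(frame, n=300_000, seed=42):
    """Return up to `n` randomly chosen rows, reproducibly."""
    return frame.sample(n=min(n, len(frame)), random_state=seed)


def plot_classification(gdf, n=300_000):
    """Plot a random subset of the pixel cloud, colored by SWOT class."""
    import matplotlib.pyplot as plt  # imported here to keep random_subset light

    fig, ax = plt.subplots(figsize=(10, 10))
    random_subset(gdf, n).plot(
        column="classification",
        categorical=True,
        legend=True,
        markersize=0.5,
        ax=ax,
    )
    return ax
```

Fixing the seed keeps the displayed subset stable between runs, which makes it easier to compare figures while tuning the plot.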
Visual Assessment
Now that we have displayed the contents of the SWOT pixel cloud, let's conduct a quick visual assessment, comparing a small portion around the main lake in the northwest of the scene with an RGB Sentinel-2 image from the same date.
To begin, let's zoom in to our new area of interest:
To grab a Sentinel-2 RGB image of this same area, we avoid downloading the entire tile, which is normally zipped and includes several bands we are not going to use. Instead, we use a cloud-based approach to download just this small portion from the Microsoft Planetary Computer using STAC. If you are not familiar with STAC catalogs and COG (Cloud Optimized GeoTIFF) images, I strongly suggest following the series "Are Still Downloading Satellite Images? STOP and STAC", available here on GeoCorner.net.
In the next snippet we start by installing the three necessary packages that unfortunately are not available off-the-shelf on Colab:
pystac_client
stackstac
planetary_computer
In the first part (lines 8 to 23), we search for Sentinel-2 items in the Planetary Computer catalog using the AOI bounds and our time of interest. The returned item is "S2A_MSIL2A_20231101T142711_R053_T20MQC_20231101T214423", which was acquired on the same day as our SWOT image.
Then, in lines 26 to 34, we use stackstac to download just the RGB bands within the AOI and rescale them (line 34). The resulting image shows some areas that appear to be sand banks, exposed by the drought affecting the region on this date, which SWOT classified as dark water (red box in the output figure). In the next section, we make a more quantitative comparison.
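The two steps can be sketched as follows; the line numbers cited above refer to the notebook snippet, not to this sketch. Assumptions: the `sentinel-2-l2a` collection with assets `B04`, `B03` and `B02`, an output grid in EPSG:32720 (UTM zone 20S, matching tile T20MQC), and a simple scale-and-gain stretch for display that ignores any radiometric offset of the processing baseline. All helper names are hypothetical.

```python
import numpy as np

STAC_URL = "https://planetarycomputer.microsoft.com/api/stac/v1"


def search_s2(bbox, date="2023-11-01"):
    """Search Sentinel-2 L2A items intersecting `bbox` on the given date."""
    import planetary_computer
    import pystac_client

    catalog = pystac_client.Client.open(
        STAC_URL, modifier=planetary_computer.sign_inplace
    )
    search = catalog.search(
        collections=["sentinel-2-l2a"], bbox=bbox, datetime=date
    )
    return search.item_collection()


def load_rgb(items, bbox, epsg=32720, resolution=10):
    """Read only the RGB bands inside `bbox` as a (band, y, x) array."""
    import stackstac

    cube = stackstac.stack(
        items,
        assets=["B04", "B03", "B02"],
        bounds_latlon=bbox,
        epsg=epsg,
        resolution=resolution,
    )
    return cube.isel(time=0).compute()


def to_display_rgb(dn, scale=1e-4, gain=3.0):
    """Rescale digital numbers to the [0, 1] range expected for plotting."""
    values = np.asarray(dn, dtype="float32") * scale * gain
    return np.clip(values, 0.0, 1.0)
```

Because the assets are Cloud Optimized GeoTIFFs, only the blocks covering the bounding box are read, which is what makes this faster than downloading the full tile.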
Quantitative Assessment
To conduct a more meaningful comparison, we will use the Modified Normalized Difference Water Index (MNDWI), proposed by Xu (2006), to compute an expedited water mask for this region. More reliable water masking algorithms exist, such as waterdetect [Cordeiro et al., 2021] (project available at: https://github.com/cordmaur/WaterDetect), which uses multidimensional clustering with bands and indices. However, for the purposes of this first look, the expedited approach will suffice.
Since our image contains some cloud cover, we need to address this issue. For that, we will use the SCL (Scene Classification Layer) that is included by default in the Sentinel-2 L2A products generated by the Sen2Cor processor. The SCL is far from perfect but, again, this is an expedited assessment.
The first step (lines 3 to 16) involves downloading the necessary bands from the Planetary Computer. Note that we cannot apply the scaling to the SCL band, because it is of unsigned integer type. Then, in line 22, we compute the cloud mask by masking medium and high probability clouds and cloud shadows. Next, the MNDWI is computed (lines 25 and 26) and the water mask is derived from the MNDWI raster by selecting the pixels with values above 0.0 (lines 29 and 30).
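The core of this step can be sketched with NumPy alone; the line numbers cited above refer to the notebook snippet, not to this sketch. It assumes the green (B03), SWIR (B11) and SCL arrays are already on the same grid, and it uses 255 as the no-data value for clouded or undefined pixels. The function name `water_mask` is hypothetical.

```python
import numpy as np

# SCL classes to discard:
# 3 = cloud shadows, 8 = cloud medium probability, 9 = cloud high probability
CLOUD_CLASSES = (3, 8, 9)
NODATA = 255


def water_mask(green, swir, scl, threshold=0.0):
    """Return a uint8 mask: 1 = water, 0 = land, 255 = cloud or undefined."""
    green = np.asarray(green, dtype="float32")
    swir = np.asarray(swir, dtype="float32")
    total = green + swir

    with np.errstate(divide="ignore", invalid="ignore"):
        # MNDWI = (green - SWIR) / (green + SWIR); undefined where the sum is 0
        mndwi = np.where(total != 0, (green - swir) / total, np.nan)
        mask = (mndwi > threshold).astype("uint8")

    invalid = np.isin(scl, CLOUD_CLASSES) | np.isnan(mndwi)
    mask[invalid] = NODATA
    return mask
```

Keeping the clouded pixels as a separate no-data value, instead of treating them as land, prevents them from being counted as disagreements later on.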
Assign Reference Mask to the Pixels
To complete our quantitative assessment, we need to compare the resulting water mask with the water pixels from the SWOT pixel cloud. While comparing two regular raster grids would be straightforward, our SWOT data is given as a cloud of points and the mask is a raster grid. To quantify the differences, the easiest way here is to copy the water mask classes to the pixel cloud points. This operation can be done using the sample method from rasterio.
So, in the first part of our code, from lines 9 to 21, we save the water mask as uint8 to a temporary file and re-open it as a rasterio dataset.
Then, from lines 23 to 28, we perform a sampling operation to load the water mask value at each point coordinate. Finally, in lines 30 to 32, we copy the water mask values to our GeoDataFrame and convert the column to category type.
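A sketch of the save-and-sample round trip; the line numbers cited above refer to the notebook snippet, not to this sketch. It assumes the mask comes with its affine transform and CRS, and that the point coordinates passed to `sample_mask` are expressed in the raster's CRS (for a UTM raster, the GeoDataFrame would first need `to_crs`). The helper names are hypothetical.

```python
import numpy as np


def save_mask(mask, transform, crs, path):
    """Write a uint8 mask to a single-band GeoTIFF and return its path."""
    import rasterio  # imported here because it may need a separate install

    mask = np.asarray(mask, dtype="uint8")
    with rasterio.open(
        path,
        "w",
        driver="GTiff",
        height=mask.shape[0],
        width=mask.shape[1],
        count=1,
        dtype="uint8",
        crs=crs,
        transform=transform,
        nodata=255,
    ) as dst:
        dst.write(mask, 1)
    return path


def sample_mask(path, xs, ys):
    """Read the mask value under each (x, y) point, in the raster's CRS."""
    import rasterio

    with rasterio.open(path) as src:
        values = [band_values[0] for band_values in src.sample(zip(xs, ys))]
    return np.array(values, dtype="uint8")
```

The sampled array has one value per point, so it can be assigned directly as a new column of the GeoDataFrame and then converted with `astype("category")`.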
SWOT water mask performance
To evaluate the performance of the SWOT classification, let's assume the water mask produced by MNDWI is our ground truth. With this assumption, we can compute the confusion matrix and calculate true positives, negatives, recall and precision metrics. First, let's display the Omission and Commission errors in the map.
In lines 1 to 13, we start by mapping the multiple SWOT classes to just land (0) and water (1), and then we filter out the no-data (255) values (line 16).
Next, we create a status column with the label for each point.
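A possible implementation of these two steps; the line numbers cited above refer to the notebook snippet, not to this sketch. The class grouping is an assumption: classes 1 and 2 are treated as land and classes 3 to 7 as water. The column names `water_mask`, `swot_water` and `status`, as well as the label strings, are illustrative.

```python
import numpy as np
import pandas as pd

# Assumption: SWOT classes 3 to 7 count as water, classes 1 and 2 as land.
WATER_CLASSES = (3, 4, 5, 6, 7)
NODATA = 255


def label_status(points):
    """Binarize the SWOT classes and label each point against the reference."""
    valid = points[points["water_mask"] != NODATA].copy()
    valid["swot_water"] = valid["classification"].isin(WATER_CLASSES).astype("uint8")

    swot = valid["swot_water"] == 1
    ref = valid["water_mask"] == 1
    valid["status"] = np.select(
        [swot & ref, ~swot & ~ref, swot & ~ref, ~swot & ref],
        ["true positive", "true negative", "commission", "omission"],
        default="unknown",
    )
    return valid
```

Here a commission error is a point that SWOT flags as water while the reference mask says land, and an omission error is the opposite case.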
From the image, we can observe several commission errors highlighted in red, indicating instances where SWOT classified a pixel as water, but it is likely land according to the optical classification using MNDWI. To quantify this discrepancy, we can calculate precision, recall and F1-score, based on the metrics computed above, like so:
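A minimal sketch of these metrics, assuming each point carries one of the labels "true positive", "true negative", "commission" (false positive) or "omission" (false negative). The label strings and the function name are illustrative.

```python
from collections import Counter


def precision_recall_f1(statuses):
    """Compute precision, recall and F1-score from per-point status labels."""
    counts = Counter(statuses)
    tp = counts["true positive"]
    fp = counts["commission"]  # SWOT says water, reference says land
    fn = counts["omission"]    # SWOT says land, reference says water

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}
```

Many commission errors lower the precision while leaving the recall untouched, so a map dominated by red points should translate into a precision noticeably below the recall.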
Conclusion
In this exploration, we provided an initial assessment of the SWOT pixel cloud product, released to the public in December 2023, focusing on its internal classification. We demonstrated the process of automatically searching and downloading the SWOT file using the earthaccess package, and converting its contents to a GeoPandas GeoDataFrame.
A simple visual comparison was made against a Sentinel-2 image acquired on the same date. To quantify the result, a water mask was derived from the optical S2 imagery using the Modified Normalized Difference Water Index (MNDWI). It's important to acknowledge all the simplifications and limitations involved in this expedited approach. However, this post is not meant to be a peer-reviewed publication, but rather to give us a first "feeling" about the product.
Based on our findings, it appears (subject to further verification through robust research) that the SWOT algorithm tends to overestimate water pixels, resulting in a low precision rate in this regard. As the SWOT products continue to evolve and undergo refinement, driven by ongoing updates to core algorithms, it is foreseeable that future iterations will offer enhanced accuracy and reliability. This highlights the dynamic nature of remote sensing technology and underscores the importance of continued research and development efforts.
References
[Xu, H., 2006] Xu, H. (2006). Modification of normalized difference water index (NDWI) to enhance open water features in remotely sensed imagery. International Journal of Remote Sensing, 27(14), 3025–3033. https://doi.org/10.1080/01431160600589179
[Cordeiro et al., 2021] Cordeiro, M. C. R.; Martinez, J.-M.; Peña-Luque, S. Automatic Water Detection from Multidimensional Hierarchical Clustering for Sentinel-2 Images and a Comparison with Level 2A Processors. Remote Sensing of Environment 2021, 253, 112209. https://doi.org/10.1016/j.rse.2020.112209.