Evaluating resampled and fused Sentinel-2 data and machine-learning algorithms for mangrove mapping in the northern coast of Qeshm island, Iran

Mangrove forests, as an essential component of the coastal zones in tropical and subtropical areas, provide a wide range of goods and ecosystem services that play a vital role in ecology. Mangroves are globally threatened, disappearing, and degraded. Consequently, knowledge on mangroves distribution and change is important for effective conservation and making protection policies. Developing remote sensing data and classification methods have proven to be suitable tools for mapping mangrove forests over a regional scale. Here, we scrutinized and compared the performance of pixel-based and object-based methods under Support Vector Machine (SVM) and Random Forest (RF) algorithms in mapping a mangrove ecosystem into four main classes (Mangrove tree, mudflat, water, and sand spit) using resampled and fused Sentinel-2 images. Additionally, landscape metrics were used to identify the differences between spatial patterns obtained from different classification methods. Results showed that pixel-based classifications were influenced heavily by the effect of salt and pepper noise, whereas in object-based classifications, boundaries of land use land cover (LULC) polygons were smoother and visually more appealing. Object-based classifications, with an excellent level of kappa, distinguished mudflat and sand spit from each other and from mangrove better than the pixel-based classifications which obtained a fair-to-good level of kappa. RF and SVM performed differently under comparable circumstances. The results of landscape metrics comparison presented that the classification methods can be affected on quantifying area and size metrics. Although the results supported the idea that fused Sentinel images may provide better results in mangrove LULC Nature Conservation 52: 1–22 (2023) doi: 10.3897/natureconservation.52.89639 https://natureconservation.pensoft.net Copyright Ali Reza Soffianian et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. RESEARCH ARTICLE Launched to accelerate biodiversity conservation A peer-reviewed open-access journal


Introduction
Mangroves offer a considerable array of ecosystem goods and services including habitat and nursery for plant and animal, biodiversity, water quality maintenance, storm buffering, flood and flow control, fisheries, recreation, tourism, and so forth (Kuenzer et al. 2011;Salem and Mercer 2012;Savari et al. 2022). With an area of about 140,000 km 2 , they account for less than 1% of all tropical forests and less than 0.4% of the total global forest estate (Giri et al. 2011). Conservation and management of mangrove ecosystems ties significantly with quantifying and valuing their ecosystem services using a wide variety of information, especially remotely sensed-derived land use land cover (LULC) data (Grêt-Regamey et al. 2015). One of the most frequently raised questions in this research area is to find out which combinations of image(s) and processing technique(s) can likely elicit the desired results for a specific issue (Ho et al. 2016;Olmanson et al. 2016;Sanli et al. 2017), especially for LULC classification, as one of the most primary and important applications of remote sensing (Lillesand et al. 2014). The majority of studies dealing with this concern focused their attention on certain landscapes such as impervious urbanized areas (Li et al. 2014;Poursanidis et al. 2015), while this agenda has not been well addressed yet in other highly dynamic and heterogeneous landscapes such as mangrove ecosystems, facing researchers with the challenge of selecting the apt imagery data and classification techniques in their research. A range of low-to high-resolution aerial images (Brown et al. 2018), hyperspectral images (Kuenzer et al. 2011), Synthetic Aperture Radar (SAR) data (Zhang et al. 2018), and Light Detection and Ranging (LiDAR) data (Olagoke et al. 2016) have been used to map the extent and distribution of mangrove cover classes. In the past decade, data have become available from Very High-Resolution (VHR) satellites, such as Worldview-2 and Pléiades-1, leading to improved mapping of mangrove cover classes (Proisy et al. 2018). However, the main limiting factor is the high cost of data acquisition.
As our literature review on English articles published since 2011 ( Fig. 1) indicates, most studies on LULC classification of mangrove ecosystems narrowed their image selection down to Landsat mainly because it offers free and open access to its data archive (Toosi et al. 2019). Such a policy recently applied to some space missions, especially the new Sentinel program by the European Space Agency (ESA) (Wulder et al. 2012), enabled scholars to access a broader range, and even better, free image resources. The Multi-spectral Instrument (MSI) onboard Sentinel-2 is equipped with the same push-broom instrument as Landsat 8 but has a higher spatial, temporal and spectral resolution (Sothe et al. 2017). In addition to the type of sensor, the selection of the right classification method is another challenging task influencing the classification outcomes. These methods are broadly categorized into pixel-based and object-based methods which are envisaged to perform differently under similar circumstances (Hussain et al. 2013). In pixel-based methods, each pixel is considered as the numerical basis for image classification, while objectbased methods rely on a number of contiguous pixels with homogeneous spectral and spatial attributes as the basic classification element (known as objects or segments) to perform a categorization task (Blaschke et al. 2008;Blaschke 2010). Object-based methods were found to be more effective in classifying high spatial resolution images such as IKONOS, GeoEye, QuickBird, and SPOT 6/7, however, recent literature showed their promising utility for classifying medium resolution images such as Landsat and Sentinel (Blaschke et al. 2008). Fig. 1 displays the classification methods for the mapping of mangrove ecosystems. In this Figure, we divided the classification methods into main types: (1) classification technique, (2) classification algorithms. As given in fig. 1, the majority of studies on mangrove LULC classification relied on pixel-based methods and a very limited, but increasingly growing, attention has been given to object-based methods (Wang et al. 2004;Vo et al. 2013).
Furthermore, the algorithm by which a classifier is developed is also another key factor affecting the classification results. These algorithms are divided into parametric and nonparametric groups. Nonparametric methods allow training data to more robustly participate in the process of image classification and have a higher potency than nonparametric ones (Mountrakis et al. 2011;Ranaie et al. 2018). According to the literature review ( Fig. 1), Maximum Likelihood (ML) from the group of parametric algorithms and Support Vector Machine (SVM) and RF from the group of nonparametric algorithms have been more interested in mapping mangrove ecosystems. Recently, neural networks and deep learning as machine learning algorithms have been developed in the research field (Pham et al. 2019a). Landscape metrics are powerful tools for quantifying spatial patterns that can help better understand the relationships between patterns and ecosystem processes. According to many studies, landscape metrics have various applications which can be categorized into the following six groups: a) biodiversity and habitat assessment, b) Water quality estimation, c) analysis of spatial pattern and its changes d) evaluation of urban landscape pattern and road network e) aesthetics of landscape f ) management (Bozorgi et al. 2020). They are also useful for sustainable landscape planning and monitoring (Bihamta Toosi et al. 2012). Thematic maps resulting from image processing are commonly used to evaluate spatial patterns (Forman and Godron 1986).
Differences in spatial patterns identified in thematic maps affect the results of land use metrics (Uuemaa et al. 2013). This study aims to add to the small but growing body of research currently available in this area by comparing the performance of a combination of methods, algorithms, and imagery data in classifying mangrove ecosystems of Qeshm Island in the Persian Gulf, Iran. Additionally, to identify the differences between spatial patterns that were obtained from pixel-based and object-oriented classification methods, we used landscape metrics at different levels for the classification maps with the highest overall accuracy.

Study area
The study area of this research encompasses part of mangrove ecosystems located in the northwest of Qeshm Island in the Persian Gulf, Iran. This island is situated 20 km from the Bandar-Abbas Port in the Strait of Hormuz (Hormozgan Province) and spans over 26°40'N-27°00'N longitude and 55°20'E-55°50'E latitude. With an area of approximately 1490 km 2 , Qeshm is the largest island in the Persian Gulf (2.5 times larger than the second biggest Island: Bahrain Country). About 90% of Iranian mangrove forests are located in the margins of the island's northwestern estuaries (Zahed et al. 2010). In addition to patchy mangrove trees, muddy areas which are formed by the deposition of alluvium along the estuaries (mudflat), groundwater seepage-impacted bare soils (sand spit) and shallow waters are another important land cover type found in these mangrove ecosystems. Mangrove ecosystems of the region are designated as a biosphere reservoir (Hara Protected Area) by UNESCO's Man and Biosphere Program (MAB) and as an international wetland according to Ramsar Convention (Zarezadeh et al. 2017;Bihamta Toosi et al. 2020). Fig. 2 shows the layout of the study area.

Data
The workflow of this study is designed to utilize two Sentinel-2 MIS images (table 1). The onboard MSI consists of 13 spectral bands with four bands at 10 m, six bands at 20 m and three bands at 60 m spatial resolution (Wang et al. 2016).
One Sentinel-2 MIS image acquired at the time of the highest tidal level (December 10, 2017) was used to produce a radiometric mask distinguishing the study area from the surrounding bare lands and the other one obtained at the time of the lowest tidal level (November 15, 2017) was used for mangrove LULC classification. In this study 10 m and 20 m resolution bands were utilized for classification. In addition to imagery data, a total of 170 GPS reference points (Garmin 629sc) were obtained close to the image acquisition time to evaluate the classification accuracy during the field survey. In doing so, a special attention was given to the positional error which was frequently regarded as one of the main sources of uncertainty in assessing the classification accuracy (Lunetta and Lyon 2004;Congalton and Green 2008). Additionally, some points were selected from Spot 6/7 and Worldview-2 data based on visual image analysis to raise the number of reference points for different types of LULC.

Image preprocessing
To enhance the quality of images and perform more accurate image classifications, atmospheric and radiometric correction was conducted by the Fast Line-of-sight Atmospheric Analysis of Hypercubes (FLAASH) module in ENVI 5.4. This module relies on MODTRAN (Moderate resolution atmospheric Transmission) which is a radiative atmospheric propagation model for the spectral range of 200-100,000 nm (Ballestrín and Marzo 2012). A radiometric mask was then produced using the Normalized Difference Water Index (NDWI) to distinguish the mangrove ecosystem areas from the surrounding bare soils. This index was proposed by McFeeters (1996) and has been successfully applied in detecting changes in water bodies by integrating the reflectance of objects in the green ( ρ green ) and near infrared ( ρ NIR ) spectrum (Eq. 1). This index is also sensitive to the water content of vegetation canopies (Xu 2006). Given the reflective behavior of water in the infrared spectrum, NDWI obtains negative values in dry unvegetated bare soils (minimum of -1) and increases with the rise in the amount of water and vegetation intensity (maximum of +1) (Xu 2006). In order to define the classification extent, all of the four main LULCs constituting the mangrove ecosystems (mangrove tree, mudflat, sand spit, and water) were well discriminated from the surrounding bare soils by assigning a threshold NDWI value of 0.00.

Image fusion and resampling
This task enhances the quality and interpretation capability of the images and, in turn, may produce more accurate classification results (Jiang et al. 2013;Qadri et al. 2017). A variety of data fusion techniques have been developed to date which can be broadly classified as Component Substitution (CS) and Multi-Resolution Analysis (MRA) methods (Wang et al. 2016). These methods (known as PAN-sharpening methods) rely on a single FW image and are more applicable for fusion of, for example, Landsat which collects a single FW band (known as the panchromatic). Sentinel-2, however, obtains four fine pixel size bands and requires a different fusion approach to PAN-sharpening techniques. In this study, the method of Gašparović and Jogun (2018) was employed for this purpose. In doing so, the 60 m resolution bands were first excluded from the analysis. 20 m bands were then placed into two groups in terms of spectral range. Band 8a, 11, and 12 fell into one group and band 5, 6, and 7 into another. Band 8 and the linear combination of band 4 and 8 were considered as the FW band for the first and second group, respectively (for a more detailed description of the intuition behind this method, readers are referred to Gašparović and Jogun (2018)). The Intensity-Hue-Saturation (HIS) method was then used to perform data fusion. According to this method, CN bands were first transformed from RGB (Red-Green-Blue) to HIS color space. The hue and saturation bands were then resampled to FW band and, ultimately, the resulting bands were re-transformed to the RGB space (Choi et al. 2006). In addition to fusion, image resampling was also carried out on the original bands using the nearest neighbor method and the second-order polynomial transformation to downscale 20 m resolution bands to 10 m using 28 control points which achieved an acceptable RMSE value of 0.2. Hereafter, these two sets were referred to as the fused and resampled images.

Spectral indices
NDWI (see section 2.3.1), Normalized Difference Vegetation Index (NDVI), and Normalized Difference Built-up Index (NDBI) were produced to support the image bands through mangrove LULC classification. Several studies used the NDVI to improve monitoring of the mangroves (Pham et al. 2019b). A sufficient number of studies suggest that a combination of image bands with spectral indices can improve the accuracy of image classification, especially in object-based classifications (Tong Yang et al. 2015). The results of studies have shown that the use of NDVI for discriminating mangrove species using different remote sensing data is affected by spatial resolution. The higher spatial resolutions produce higher accuracy (Lee and Yeh 2009;Pham et al. 2019b). This index is computed using Eq. 2 where ( ρ red ) and ( ρ NIR ) are the spectral reflectance acquired in the red and near-infrared spectrum, respectively. NDBI has been used frequently for identifying bare and semi-bare soils and built-up lands in various studies (As-Syakur et al. 2012;Bihamta et al. 2019) by substituting the spectral reflectance of objects in short wave infrared ( ρ SWIR ) and near infrared spectrum in Eq. 3.

Image segmentation
Image segmentation splits the image into small polygons in alignment with the objects of interest in the study area (Blaschke et al. 2008). It performs by defining three tuning parameters: shape, compactness, and scale (Srivastava et al. 2015). Image segmentation was carried out using the Multiresolution Segmentation Algorithm (MRS) embedded in the eCognition software. MRS is the most popular and important algorithm for this purpose (Drăguţ et al. 2014) which basically relies on a key controlling factor known as Scale Parameter, allowing users to select the best-performing values for the abovementioned three parameters (shape, compactness, and scale) by trial-and-error (Benz et al. 2004;Drăguţ et al. 2014).

Image classification
A number of 356 polygons were sampled through visual interpretation of the color composites of the Sentinel images and a Spot image scene acquired on December 20, 2016. In doing so, a special attention was also given to the mean reflectance of each segment whereby those which most correspond to the pre-defined spectral signature of the respective LULC class were chosen as sample polygons. Sample polygons of each LULC class were randomly split into two groups, each containing 78 samples for mangrove tree, 54 for mudflat, 25 for sand spit, and 21 for water. One group which was used to train the algorithms was evaluated in terms of the divergence between the classes and the other group was used for accuracy assessment. Given the promising results of non-parametric algorithms (see the introduction section), RF and SVM as the two well-known and commonly used classifiers in this case (Mountrakis et al. 2011;Belgiu and Drăguţ 2016) were employed for mangrove LULC classification. These algorithms are briefly described in the following paragraphs. The classification results were finally evaluated and compared by calculating accuracy indices including kappa coefficient, overall accuracy, and user's and producer's accuracies (see Congalton and Green (2008)) from confusion matrices. Fig. 3 shows the steps undertaken in this research.

Support Vector Machine (SVM)
SVM was pioneered by Fisher in 1936 to reduce the error rate of discriminating training data based on the Statistical Learning Theory and then expanded by Cortes and Vapnik (1995) as a classification algorithm. This algorithm has broad applications in regression analysis, classification, and clustering. The basic principle of SVM is to draw an optimal separating linear line in which the distance between the marginal distribution of features from each class and the hyperplane is maximized (Cristianini and Shawe-Taylor 2000;Wang 2005). In image classification, SVM distributes each training sample as a point in an n-dimensional space where each dimension corresponds to an image band and uses an optimization algorithm to find surfaces (known as hyperplane) which best distinguish between the points with the same LULC labels (Mountrakis et al. 2011).
One of the most remarkable characteristics of this algorithm is that it concurrently maximizes the geometric margin and minimizes the experimental classification error and thus is known as Maximal Margin classifier (Cristianini and Shawe-Taylor 2000).  RF (also known as random decision forests) was first introduced by Breiman (2001) to solve the problems of classification. This algorithm consists of a collection of decision trees (a forest). Trees in this algorithm are built using a random subset of training data through bootstrap sampling with replacement (known as bagging). The trees are then grown independently and those having a high variance and low bias are chosen as the best performing trees while the others are pruned (Breiman 2001;Belgiu and Drăguţ 2016). Through this process, each tree casts a vote to attribute the most frequent class to an input vector and the votes of all trees are finally combined, counted and the class containing the majority of votes are labeled as a new classification (Angadi et al. 2013).
Two-thirds of training data are used for generating trees (known as in bag samples) and the remaining one-third for cross-validation which evaluates the performance of the RF model (Belgiu and Drăguţ 2016).

Comparison of spatial pattern analysis
A review of the literature shows that the metrics most commonly used to study forest ecosystems include metrics of area, shape, proximity, and diversity (Tran and Fischer 2017). In this study, four metrics of number of patches (NP), Patch Density (PD), Large Shape Index (LSI) and Mean Patch Area (AREA-MN) were selected for comparison. At the class level, the metrics Percentage of Landscape (PLAND), Number of patches (NP), Area-Weighted Mean Fractal Dimension Index (FRAC-AM) and Mean Euclidean Nearest Neighbor Distance (ENN-MN) were selected. Seven metrics including Total Area (TA), Edge Density (ED), Mean Patch Area (AREA-MN), Number of patches (NP), Patch Density (PD), Contagion (CONTAG) and Shannon's Diversity Index (SHDI) were selected at the landscape level. Table 2 shows the selected landscape metrics calculated using Fragstats 4.2 software.

Results
The accuracy of object-based classification depends significantly on the quality of image segmentation. In this research, segments were generated through frequent trialand-error and visual interpretation of the outputs. Accordingly, the scale factor was assigned to a value of 5. The influence weight of NDVI was considered twice greater than other bands (2:1). The shape and compactness factors were also kept constant as 0.1. The resampled and fused Sentinel-2 images were then classified using pixel-and objectoriented RF and SVM algorithms. Table 3 shows the area of LULC classes. According to these results, the total area specified to each class varied significantly among the classifications. The least difference between the areas devoted to each LULC class was observed for water class with the maximum area difference of 1007 ha (3.6% difference) and the biggest one with the maximum area difference of approximately 3,150 ha (20.2% difference) was obtained for sand spit class. One of the most intriguing results derived from table 2 is the difference between object-based and pixel-based classifications in exploring mangrove trees, according to which the area (the number of pixels) specified to this class through pixel-based classifications was relatively larger than that of delineated under object-based classifications. The spatial difference between objectbased and pixel-based classifications is shown in a close-up of the central part of the study area in fig. 4, which illustrates parts of all four LULC classes (mangrove, water, sand spit, and mudflat). As seen from fig. 4, a glaring difference was found between the spatial pattern of LULC classes derived from object-based and pixel-based classifications. Pixel-based classifications were influenced heavily by the effect of so-called salt and pepper noise (for example see the LULC map produced by classifying the fused image using pixel-oriented RF in fig. 4), whereas in object-based classifications boundaries of LULC polygons were smoother and visually more appealing.  The results of accuracy assessment for each classification are summarized in fig. 5 (due to limited space and for easier comparison of the results, confusion matrices are not shown). In this study, object-based classifications produced more satisfactory results than pixel-based classifications. Kappa coefficient and overall accuracy values obtained from pixel-based classifications were relatively low, below 72.2% and 80.8%, respectively, whereas these values were estimated to be much higher for object-based classifications. None of the object-based classifications represented kappa coefficient and overall accuracy values of less than 85% and 90%, respectively. According to the kappa classification scheme proposed by Monserud and Leemans (1992), pixel-based classifications represented a fair to good level of kappa coefficient (kappa coefficient of between 40% and 75%) while it was categorized as excellent (kappa coefficient of above 75%) for object-based classifications. In terms of the type of imagery data, the fused image was classified more accurately than the resampled image. Furthermore, regardless of the type of imagery data, the accuracy of RF in object-based classifications was higher than the accuracy of SVM while in pixel-based classifications, SVM performed better than RF. Among all classifications, the object-oriented RF classifier applied on the fused Sentinel image outperformed other classifications in terms of kappa coefficient and overall accuracy with values of 93.0% and 95.2%, respectively.
The producer's and user's accuracy statistics were also calculated to find out whether the classifications succeeded in mapping each LULC class. The producer's accuracy, as an indicator depicting how well a class is categorized (the accuracy of classification) (Pelcat 2006), obtained high values for water and mangrove classes under all classifications. Mudflat and sand spit classes were misclassified with each other as well as categorized frequently as mangrove class under pixel-based classifications and yielded relatively low producer's accuracy values, below 73%. From the user's accuracy perspective (as a measure of the reliability of classification (Congalton and Green 2008)), a relatively large proportion of mudflat and sand spit classes pixels were mistakenly labeled as mangrove under pixel-based classifications. In total, object-based classifications achieved higher user's and producer's accuracy values, while in pixel-based classifications, accurate and reliable results were only obtained for water class.
At the patch level, mangrove patches extracted from the pixel-based and objectoriented methods were investigated for both RF and SVM algorithms. The results of the metrics of NP and PD showed that in the pixel-based method the total number of patches and mangrove patch density was higher than in the object-oriented method and a significant difference was observed between the two algorithms ( Fig. 6 (A) and (B)). The results of LSI metric showed that this metric for the pixel-based method is less than the object-oriented method and there is a significant difference between the value of this metric for the pixel-based method between RF and SVM algorithms ( Fig. 6 (C)). The AREA-MN metric results showed that the metric value is higher for the pixel-based method with SVM algorithm and the object-oriented method with RF algorithm; however, there was not a significant difference between the classification methods. At the class and landscape level, metrics were compared for classification maps obtained from the object-oriented method with the RF algorithm and the pixel-based method with the SVM algorithm, which had the highest accuracy ( fig. 7). The results of PLAND metric showed that the percentage of mangrove and water class area was similar in both methods, but the mudflat class area was larger for the pixel-based method with SVM algorithm than the object-oriented method with RF algorithm and vice versa for barren class. The results of the NP metric showed that the number of spots for all classes in the pixel-based method with the SVM algorithm was much higher than the object-oriented method. In both methods, the highest number of patches was related to the mudflat class and the lowest was attributed to the water class. The fractal dimension metric was related to the complexity of the patches, which approaches two if the shape is very complex. The results of fractal dimension metric showed that the value of this metric for the object-oriented method with RF algorithm was less than the pixel-based method with SVM algorithm and the highest value of fractal dimension in both methods was related to the water class. The lowest values for the object-oriented and pixel-based methods were related to the barren and mangrove methods, respectively. The results of ENN-MN metric showed that the amount of this metric for the pixel-based method was less than the object-oriented method. The lowest ENN-MN values in the object-oriented method belonged to the mudflat class and in the pixel-based method to the barren class.
At the landscape level, the results of the metrics are displayed in Table 4. The total number of patches in the object-oriented method with RF algorithm was 1365 and in Table 4. Comparison of landscape metrics for two classification maps using the object-oriented and pixel-based methods at the landscape level.

FRAC-AM
SVM Pixel-based RF Object-based the base pixel method with SVM algorithm was 26982. Comparison of the results of metrics at the landscape level also showed that in the pixel-based method with SVM algorithm, the PD and ED increased significantly compared to the object-oriented method with RF algorithm. In the object-oriented method, the AREA-MN metric and the closest distance metric increased. The results of the variability and continuity metrics for the object-oriented method were greater than for the pixel-based method.

Discussion
This study compared the performance of different combinations of satellite images and classification techniques to provide insights for future studies on mangrove LULC classification. In terms of image selection, a growing propensity has emerged recently in using free Sentinel 1/2 images (Liu et al. 2017).
In Sentinel images, fusion can provide an opportunity to take full advantage of both spectral and spatial power of the image bands (Amarsaikhan et al. 2012). Regardless of the type of classification method and algorithm, the fused Sentinel images led to more reliable results. In line with the results of this study, a review by Joshi et al. (2016) showed that 87% of studies which conducted image classification experiments using both resampled and fused images acknowledged the superiority of fused images over the resampled ones in LULC classification. While the results of this study supported the idea that fused Sentinel images may provide better results in mangrove LULC classification, further research still needs to evaluate various image fusion approaches presented for Sentinel images as well as to develop new approaches to make use of all Sentinel's fine resolution (10 m) images. Progress in this research area, alongside conducting further research on comparison of the performance of resampled and fused Sentinel images (especially in mangrove LULC classification), will more effectively help practitioners in utilizing Sentinel images.
Another challenging issue which has increasingly caught the attention of remote sensing experts is the difference between the performance of pixel-based and object-based methods in LULC classification. Studies in this case showed that there is a glaring disparity between the classification results achieved under these methods and that the superiority of them over each other may depend on a number of factors such as the classification extent, spatial pattern and number of LULC classes (Dingle Robertson and King 2011). Moreover, selecting each method depends on the type of LULC data required; for example, if we need to detect objects at the size of image pixels, pixel-based-classification would be a better choice. Regardless of these considerations, the results of this study showed that object-based classification produced more accurate results. In pixel-based classifications, although water areas were accurately identified, sand spit and mudflat pixels were constantly misclassified as mangrove trees and thus obtained low user's and producer's accuracy values. Such a misclassification substantially raised the area estimated for mangrove class. In the same vein, Belgiu and Csillik (2018) found that the accuracy of object-based classifications is relatively higher than pixel-based in classifying Sentinel-2 images.
As given in the introduction section, the majority of studies on LULC classification concluded that non-parametric algorithms can provide better classification results than the parametric ones. Drawing on previous research carried out in this research area, the performances of RF and SVM as two well-known classification algorithms were evaluated in this research. As shown in fig. 5, none of the algorithms scored best under all circumstances. In object-based classifications, RF was found to be the foremost classifier while SVM performed better in pixel-based classifications. Similar to these results, Thanh Noi and Kappas (2017) compared the performance of a set non-parametric algorithms, especially RF and SVM, in classifying Sentinel-2 images and obtained contradictory results under comparable circumstances. Comparison of landscape metrics at the three levels showed that the type of method and algorithm used to prepare thematic maps can affect the results of composite and structural metrics. Area and size dependent metrics are more influenced by the type of classification method. Object-oriented methods reduced NP and PD due to the absence or lower noise in the processing of the images with greater continuity of patches, and as a result, the distance between the patches was greater. Comparison of shape-dependent metrics such as the fractal dimension showed that the pixel-based method reflects the structure of the patches with more complexity. In addition, changes in land cover have a direct impact on the spatial distribution of mangrove forests and often lead to fragmentation at the landscape scale (Toosi et al. 2022). Besides these disturbances, other studies have displayed that land cover change can cause biodiversity losses of animals and plants (Pedrinho et al. 2019). The land cover change can affect ecological functions by modifying their structure and composition. Thus, with the approach applied in this study, the required information on the trend of changes in land cover that affect the successful development and management of mangrove ecosystems can be provided, supporting better planning and decision-making (Toosi et al. 2022).

Conclusion
Mangroves are globally threatened, disappearing and degraded. Monitoring and evaluating changes in the trends of mangrove distributions and dynamics is urgent for the conservation of these ecosystems. The results of this research showed that object-based classification can better distinguish mangrove trees from mudflat and sand spit, while no solid conclusion was drawn in terms of the application of classification algorithms. Moreover, although further investigations are necessary to determine the potential of fused Sentinel images in LULC classification, our results showed that the accuracy of mangrove LULC classification can be well improved by object-oriented classification of fused Sentinel images. Mangrove ecosystems generally consist of the same LULC classes as those categorized in this research. Therefore, further research in our study area or other mangrove ecosystems is needed to strengthen or support the results obtained by this research and, in turn, inform future research on mangrove ecosystems. The findings of this research can contribute to the improvement of management and conservation strategies for these ecosystems being impacted by human activities. Introduction Due to fluctuating numbers of individuals within populations, short individual life spans, vulnerability to stochastic mortality factors and close specialisation to highly exacting resources (e.g. Dennis (2010)), conservation of declining insects typically relies on conserving their habitats (Settele et al. 2009;van Swaay et al. 2012;Warren et al. 2021). This may require active management interventions, if persistence of a habitat depends on a specific disturbance regime, historically supplied by now outdated land use patterns (Öckinger et al. 2006a, b;Bonari et al. 2017;Sienkiewicz-Paderewska et al. 2020). The latter is often the case of the "semi-natural grasslands" of Europe, land-cover types derived from early-Holocene and perhaps even older vegetation forms (Thomas 1993;Feurdean et al. 2018), hosting an outstanding diversity of endangered species (van Swaay et al. 2006), but suffering a continent-wide decline due to land-use changes (Nilsson et al. 2013) and so increasingly dependent on targeted conservation management. The marsh fritillary, Euphydryas aurinia (Rottenburg, 1775) (Nymphalidae, Nymphalinae) is a polymorphic butterfly, distributed across the Palaearctic temperate zones (Junker et al. 2015;Korb et al. 2016). Within its wide range, it occupies various habitats and displays regional specialisations on various larval host plants (Singer et al. 2002;Liu et al. 2006;Meister et al. 2015). Its distribution has declined seriously in Western and Central Europe, where it is restricted to oligotrophic conditions and develops on Gentiana asclepiadea L. (Anthes et al. 2003), Knautia spp. (Anthes and Nunner 2006), Scabiosa spp. (Anthes and Nunner 2006;Scherer and Fartmann 2022) or Succisa pratensis Moench (Wahlberg et al. 2002;Konvicka et al. 2003;Schtickzelle et al. 2005). The butterfly is listed in the EU Habitats Directive and its regional persistence increasingly depends on active vegetation management (e.g. Bulman et al. (2007), Scherer and Fartmann (2022)), including reintroductions (Davis et al. 2021).
The currently occupied grasslands were historically utilised as non-intensive pastures and litter meadows. A relatively large area of such grasslands (≈ 5% of the landscape; Junker et al. (2021)) was regionally spared the twin threats of woody encroachment and intensification owing to the existence of a military range, sanitation zones of two freshwater reservoirs and numerous mineral water zones with low agrochemical use. Following the Czech Republic's entry to the EU (2004), selected E. aurinia localities became Sites of (European) Community Importance (SCI), purposefully managed for the species. The prevailing management is light machine-mowing with temporary retention of unmown strips and host plant patches. Conservation grazing, applied for E. aurinia, for example, in Britain and Sweden (Smee et al. 2011;Johansson et al. 2020), is rarely used, because local beef farms operate with too large herds in relatively intensive regimes. In the wider environs, changes during the last two decades involved re-seeding former arable fields by grass mixtures, concurrent with a shift from wheat or dairy farming to beef ranching.
The targeted management of valuable sites, combined with non-intensive use of the surrounding landscape, should spare the submontane grasslands from further degradation. However, these efforts may not buffer the grasslands from multiple deteriorating factors, operating at both large and local scales. The large-scale threats include eutrophication by increased nitrogen and phosphorus inputs from the atmosphere (Stevens et al. 2010;Roth et al. 2021), known to impair the species' composition of oligotrophic grasslands (Bollens et al. 2001) and climatic effects, manifested by periods of drought and rising temperatures (Filz et al. 2013). The years 2015-2018 were the driest in recorded history in Central Europe (Buras et al. 2020;John et al. 2020)  and, while the average temperatures rose by 1.1 °C compared to pre-industrial situation globally (World Meteorological Organization 2019), they regionally increased by 2.1 °C compared to mid-20 th century (Czech Hydrometeorological Institute 2022). The small-scale threats stem from the fact that currently practised interventions differ from those that maintained the biotopes in the past (Valkó et al. 2018;Almásy et al. 2021;Kuhn et al. 2021). Light machine-mowing effects differ from those of manual mowing or low intensity grazing. For instance, mowing may support grasses at the expense of forbs (Stammel et al. 2003;Mládková et al. 2015) and homogenises vegetation due to the absence of small-scale sod disturbance (Tälle et al. 2016).
In this paper, we compare vegetation changes at E. aurinia colonies over a 15-year period and relate them to vegetation management of the sites and occupancy patterns by the butterfly. We analyse vegetation records from a selection of the occupied sites in 2005-7, i.e. in the time when a conservation regime was established and replication of the records from identical spots in 2020-21. In the intervening time, the sites were annually monitored for the presence of E. aurinia larval nests. Interpreting the changes in the plant species composition using readily available information on individual plants' ecological requirements (i.e. their indicator values: Ellenberg et al. (1992)) provides insights into drivers of the plant community change. Specifically, we asked: (1) How did the vegetation change between the two surveys? (2) Was the vegetation change linked to the management of the sites? (3) Were the vegetation changes linked to utilising individual sites by the butterfly? Further, we test the specific hypotheses: (4) that the interventions applied changed the vegetation composition (4a), that occupancy of the sites by E. aurinia (4b) and abundance of its larval nests (4c) changed between the two surveys and that management changed occupancy (4d) and nest counts (4e).

Study system and data collection
Euphydryas aurinia is a univoltine butterfly, with adult generation from late May to late June. Succisa pratensis, its locally used host plant, is a late-season richly blooming perennial, forming prostate leaf rosettes in early summer and growing to ≈ 70 cm in August to September. Pre-hibernation larvae form conspicuous silk-woven communal nests on the plants and finish development solitarily in the following spring.
Annual counts of the pre-hibernation larval groups at all known E. aurinia sites, i.e. habitat patches of varying size occupied by the butterfly and utilised for its larval development began in 2001. This was combined with searching for hitherto unknown sites in the wider environs of the occupied ones (Hula et al. 2004;Junker et al. 2021), in connection with vegetation mapping by the Czech Nature Conservation Agency (Härtel et al. 2009). The number of known sites gradually increased from 17 in 2001 to 97 in 2022. Their summed area is 295 ha (mean area 3.0 ± 7.47, range 0.08-70). Following discovery of a site, it was visited annually in September for larval nest counts and overall assessment.

Statistical analyses
As all the analyses are based on past-present comparisons from identical relevés; we used Wilcoxon-matched pairs tests to compare species richness, covers of vegetation layers, the representation of the host plant S. pratensis and covers of grasses and forbs, between the present and past surveys.
Changes in the species' composition of vegetation relevés were analysed using multivariate ordination techniques, always separately for vascular plants, mosses and all plants, in CANOCO for Windows 5.00 (Ter Braak and Smilauer 2012). Given the rather long gradients in the samples' species composition (all plants: 3.5, vascular plants: 3.6, mosses: 4.2), we used unimodal methods and, due to the horseshoe effects apparent in unconstrained ordination biplots, we used detrended correspondence analysis (DCA) for unconstrained and detrended canonical correspondence analysis (DCCA) for constrained calculations, always detrending by segments. DCA ordinates the species according to their distribution in samples. DCCA constrains the ordination by the predictor(s) of interest, testing the significance of species composition ~ predictors relationships via a Monte-Carlo test (999 permutations).
'Survey' (past vs. present, two states) was one of the explanatory variables used to compare plants species composition. Further explanatory variables were: 'Management', factorially coded as the intervention applied during the three years centred by the year of survey for the section of the site containing the relevé; we distinguished Neglect (past/present: 59/25 relevés), Conventional mowing (standard farm machinery, cutting the entire locality at once) (6/1) and Conservation mowing (light machinery, proceeding by strips or a chequer, intentionally leaving aside some S. pratensis patches) (1/40). 'Nests count' denotes the mean annual number of nests per site in the years 2005-9 (past) and 2017-21 (present). Finally, 'Occupancy' (two states, 1/0) denotes whether a ≈ 20 m diameter circle centred in the centre of the relevé contained E. aurinia larval nests(s) in the year of vegetation survey; this was assessed during the first week of September following the surveys.
As the numbers of relevés were unbalanced amongst sites and the sites varied, for example, in altitude, soil and moisture conditions, we relied on covariables-controlled partial ordinations to filter out the background variation amongst sites. We used two options. First, we defined a geography model explaining the plant species composition of the relevés by a linear combination of latitude, longitude and altitude; and their 2 nd -order polynomials and interactions, defining the most appropriate covariate model using the CANOCO forward selection procedure. Second, we entered site identities as a 12-state factor.
In the DCCAs, we reflected the time-replicated sampling by a split-plot permutation design, in which the two temporal replications per relevé represented whole plots, which were permuted as time series and the 66 relevés represented split plots, permuted randomly.
To interpret the ordination results by Ellenberg's bioindication values, we followed the fourth-corner approach (Legendre et al. 1997;Dray and Legendre 2008), which relates two tables, that of species composition of samples and that of relevant predictors, to a table of constituent species' traits. Specifically, we related the ordination axes returned by DCAs/DCCAs to a table of Ellenberg's indicator values using redundancy analysis (RDA), an ordination method analogous to linear regression. We selected significant traits via forward selection procedures, again with 999 permutations.
Testing the specific hypotheses 4a -4e relied on testing interactions amongst predictors. If predictors a and b, for example, 'survey' and 'management', both influence the species composition y, their effect can be additive, y ~ a + b, or multiplicative, y ~ a + b + a*b, the latter implying that 'management' exerted different effects during the first and second 'survey'. Then, setting the linear terms as covariates, y ~ a*b | a+b tests for the separate effect of the interaction, i.e. whether 'management' affected the vegetation differently during the first and second 'survey'. In cases when the interaction terms were significant, we again interpreted the results using plants' indicator values.
In the partial indirect DCA ordinations of species composition of samples, site always fitted more variation than geography (Table 1). We, therefore, used site as the covariate in all subsequent analyses. In the indirect DCA analyses for vascular plants (Fig. 2a, c), the primary gradients differentiated samples from wet waterlogged sites, with vascular plants, such as Filipendula ulmaria, Equisetum fluviatile, Potentilla palustre or Viola palustris, from drier vegetation containing, for example, Agrostis capillaris, Nardus stricta or Hypericum maculatum. Secondary gradients run from tall-growing, nutrients-demanding vascular plants, such as Descampsia caespitosa, Poa pratensis, Cirsium heterophylum, towards short-growing species characteristic for nutrient-poor or frequently disturbed substrates, such as Potentilla erecta, Valeriana dioica, Carex nigra. Succisa pratensis, the host plant of E. aurinia, was located within the latter group. For mosses (Fig. 2b, d), the primary gradient differentiated species requiring low nutrient sites, such as Sphagnum warnstorfii, S. teres, Tomentypnum nitens, from species tolerating high nutrients, such as Hygroamblystegium humile or Brachythecium mildeanum. The secondary gradient differentiated hygrophilous (e.g. Hygroamblystegium humile) from relatively xerophilous (e.g. Pleurozium schreiberi) species. Interpreting the indirect ordinations by indicator values (Table 1) corroborated that in vascular plants, the primary gradients were moisture-driven and secondary gradients nutrients-, light-or soil reaction-driven. This differed from mosses, in which nutrients drove the primary gradient. See Suppl. material 1 for ordination scores of all species.
Constraining the ordinations by the predictors of interest returned consistent results for vascular plants, mosses and all plants ( Table 2). The amounts of variation attributable to the predictors were always low, but the composition of relevés was significantly related to the predictors even when controlled for site identity, except for 'occupancy' and 'nest count' in the case of mosses. Whereas in vascular plants and all plants, results of almost all partial DCCAs were attributable to the plants' indicator values, only the partial ordination constrained by 'survey' was explicable by indicator values for mosses.
'Survey' exerted significant effects for both vascular plants and mosses, revealing that vegetation changed from past to present consistently across the sites. Interpretation by indicator values (Table 2, Fig. 3a, e)

showed lower values for nitrogen (vascular plants, mosses, all plants) and higher values for light (vascular plants and all plants)
in the past. Inspection of the ordination diagrams revealed a higher representation of vascular plants, such as Dactylorhiza majalis, Achillea ptarmica or Festuca ovina and mosses, such as Plagiomnium cuspidatum or Thuidium tamariscunum in the past and a higher representation of vascular plants, such as Cirsium arvensis or Festuca pratense and mosses, such as Drepanocladus aduncus or Polytrichum formosum at present. Succisa pratensis inclined towards the past, likely due to intervening loss from some relevés.
In the ordinations for 'management', primary gradients differentiated conservation and conventional mowing from neglect, whereas secondary gradients differentiated conventional and conservation mowing. In both vascular plants and all plants ordinations, Succisa pratensis ended up near the centres of ordination spaces. Interpretation of the gradients, significant only for vascular and all plants, showed an association of low indicator values for nitrogen and high indicator values for moisture, with neglect. Contrarily, conservation mowing was associated with higher indicator values for temperature and nitrogen (Fig. 3b).
The ordinations for 'occupancy', again significant for vascular plants and all plants, showed that presence of E. aurinia was associated with low nitrogen and high temperature. Succisa pratensis and multiple nitrogen-intolerant species, including Redlisted ones, were associated with occupied sites (e.g. Parnassia palustris, Carex rostrata, Trientalis europaea, Scorzonera humilis, Dactylorhiza fuchsii, Oxycoccus palustris). Nitrophilous plants, such as Urtica dioica, Heracleum sphondylium or Plantago major, were associated with unoccupied conditions (Fig. 3c).
For 'nest count', the explained variation after controlling for site identity was very low, but still indicated that sites with high nest counts were drier and displayed higher soil reaction than those with low nest counts. This was reflected by association of high nest counts with plants, such as Potentilla erecta, Holcus lanatus or Nardus stricta, as opposed to, for example, Juncus articulatus or Equisetum fluviatile (Fig. 3d). Succisa pratensis was situated centrally in ordination space.  0.055 0.035 0.031 0.001 11.4 3.2  Tests of the specific hypotheses regarding interactions about predictors (Table 3) detected no significant relationship for mosses, but several relationships for vascular plants and all plants, implying that vascular plants drove the patterns for all plants. We reject the hypotheses (4a) that effects of 'management' on vegetation differed between the 'surveys', (4b) that 'occupancy' systematically changed between the 'surveys' and  (4e) that 'management' influenced 'nest count'. The significant interaction 'survey*nest count' (4c) documented that nest numbers decreased at some occupied sites and increased at others, but these changes were not related to the plants' indicator values. The interaction 'management*occupancy' (4d) showed that conservation mowing of high moisture sites increased the chance of occupancy, while conservation-mowed dry sites tended to become unoccupied. This was apparent (Fig. 4) from association of such hygropilous plants as Filipendula ulmaria, Scutellaria galericulata or Potentilla palustris with the occupied conservation mown situations.

Discussion
Recent resampling of 66 vegetation relevés, taken from 12 sites occupied by the endangered butterfly Euphydryas aurinia in the Czech Republic 15 years ago and mostly managed for the benefit of the butterfly and its host plant during the interim period, revealed vegetation changes explicable by ecological requirements of the constituent plant species and demonstrably connected to the prosperity of E. aurinia colonies. The changes included both structural properties of vegetation, i.e. increased cover of trees, forbs and grasses and decreased cover of mosses and more subtle changes at the level of floristic composition. The most evident change was the increased representation of plants tolerating high nitrogen, accompanied by decrease in plants preferring high values of light. This corresponded with the increase in tree layer and increased cover of both forbs and grasses. Increased nitrogen load due to increased atmospheric deposition (Nijssen et al. 2017) and runoffs from close environs (Stoate et al. 2001;Butler et al. 2008) is a commonly acknowledged problem, causing declines of poorly-competitive plants from grasslands across Europe (Bollens et al. 2001;Stevens et al. 2010;Payne et al. 2013), including the Czech Republic (Novotný et al. 2016). Plants utilising high nitrogen tend to be bulky and competitively dominant, outcompeting light-demanding plants by shading their habitats. It follows that E. aurinia sites in western Czech Republic are not spared the general threat of eutrophication reported for this species (e.g. Brunbjerg et al. (2017)) and its host plant (Vergeer et al. 2003;Holder et al. 2020) from other parts of Europe.
The increased indicator values for nitrogen correspond with decreasing cover of mosses. Although the mechanisms are still disputed, decreasing bryophytes representation and species richness from temperate grasslands are associated with increased nutrient levels and accompanying increase of grasses and forbs (Arróniz-Crespo et al. 2008;Duprè et al. 2010;Müller et al. 2012).
Nutrient loads of grasslands are amenable by management interventions, such as removal of the biomass by mowing (Pecháčková et al. 2010;Ziaja et al. 2017;Swacha et al. 2018;Yang et al. 2019;Sienkiewicz-Paderewska et al. 2020). In our results, however, the 'management' effect on plant species' composition was counterintuitive, with nitrogen-demanding plants inclining towards mowing (both conventional or conservation-minded) and relatively thermophilous plants inclining towards either conservation mowing or neglect. Before interpreting this paradox, we should admit that our categorisation of interventions affecting the diverse sites over a decade and a half necessarily oversimplifies the matter. Whereas "neglect" is an unequivocal category, "conventional mowing" may vary amongst years in timing and numbers of cuts. The same applies for "conservation mowing", which is practised by concerned officers and volunteers, who adaptively react to the momentary situation at the sites (and E. aurinia populations), varying the timing, intensity and spatial extent of the interventions. Consequently, no intervention ("neglect") is more likely applied at nutrient-poor sites, such as dry heathlands or waterlogged bogs, whereas mowing intensifies in response to successional overgrowth. This probably generated the contradictory pattern for nitrogen and the expected pattern for thermal conditions. The absence of a significant effect  Table 3 for details. 1 -Agrostis canina, 2 -Agrostis stolonifera, 3 -Achillea millefolium, 4 -Alnus glutinosa juv., 5 -Angelica sylvestris, 6 -Bistorta major, 7 -Caltha palustris, 8 -Cardamine pratensis, 9 -Carex echinata, 10 -Carex nigra, 11 -Carex panicea, 12 -Carex rostrata, 13 -Cirsium palustre, 14 -Crepis paludosa, 15 -Epilobium palustre, 16 - for the 'survey*management' interaction (specific hypothesis 4a) corroborated that the effects of management did not change between surveys in a systematic way.
Two predictors related to utilisation of the sites by the butterfly were 'occupancy' and 'nest count'. Although values of the predictors refer to broader areas than to specific relevés ('occupancy') and even to entire sites ('nest count'), our results point to the affinity of E. aurinia towards warmer and/or drier conditions with relatively high soil reaction. This agrees with findings from E. aurinia populations developing on S. pratensis elsewhere: a preference for sparser sward easily penetrated by light in Poland (Pielech et al. 2017); a unimodal response to moisture in Denmark (Brunbjerg et al. 2017); the preferred oviposition at plants surrounded by low sward in Wales (Pschera and Warren 2018); and occurrence of larval webs amidst intermediate sward height in Britain (Botham et al. 2011). Regarding populations utilising other host plants, Scherer and Fartmann (2022) reported a preference for relatively warmer microhabitats from Scabiosa lucida developing populations in hummocky meadows in the German pre-Alps. Although E. aurinia is often described as a moist grasslands/fens/bogs species, it arguably prefers drier and warmer patches amidst such grasslands. A common denominator of all these requirements appears to be low soil nutrients. In the mostly intensively farmed landscapes of temperate Europe, low-nutrients conditions were more likely preserved in remote landscapes and poorly-accessible localities. Table 3. Results of DCCA tests of specific hypotheses of interactions amongst predictors. In all models, significance of the interaction between variables a*b was tested by setting a, b and Site as covariables, y ~ a*b | a + b + Site. The resulting ordination model, if significant, was further interpreted by the constituent plants' Ellenberg's indicator values. "n.a." stands for situations when no relationship to Ellenberg's values was found. It is, thus, tempting to view E. aurinia as a "refugee species", surviving in suboptimal areas or habitats due to human pressure on the optimal ones (cf. Kerley et al. (2012Kerley et al. ( , 2020). Whereas all the above observations are correlational, the tests relating the interactions between the butterfly-related ('nest count', 'occupancy') and sites-related ('management', 'survey') predictors to the relevés' species composition (Table 3) directly assess the vegetation effects on E. aurinia. Amongst them, we view the significant 'occupancy*management' interaction as the most interesting result. An identical intervention type -in this case, conservation mowing exporting biomass and increasing thermal intake -supports E. aurinia at moister sites, but not at drier sites. Cases when site conditions modulate the impacts of conservation interventions might be rather common (e.g. Morris (2000), Helden et al. (2020), Dumont et al. (2020), Bussan (2022)) and if ignored, they may result in undesirable outcomes of well-intended activities. This highlights the necessity to flexibly adapt vegetation management to both local variation amongst sites and interannual variation in such aspects as rainfall or phenology. In our study system, within the Slavkovský les PLA, management of E. aurinia sites is administered by a single administrative unit, the practitioners have first-hand experience with the system and routinely adapt the interventions according to the momentary circumstances. More generally in the Czech Republic, however, "management plans" for nature reserves are issued for 10-15 years' duration and lack the necessary flexibility (Ministry of the Environment of the Czech Republic 2018). Still less flexible may be generic management prescriptions for non-protected lands, such as provisions for the EU agri-environmental payments. The lack of flexibility may explain why effects of such incentives are sometimes disappointing (Batary et al. 2011;Concepción et al. 2012;Merckx and Pereira 2015).
A more general problem with conserving E. aurinia sites in the Czech Republic stems from the fact that it targets only the pre-defined localities known to be occupied by the species at some time interval during the last two decades. Although so far successful -the butterfly is still there and most of the sites (except for a few at the margins of national distribution) are still occupied (Junker et al. 2021 and unpublished data), this approach does not cover all potentially habitable sites in the wider environs. Within the Slavkovský les PLO, management of the farmland "matrix" underwent substantial transformation during the last decades, from marginal and unprofitable arable fields, through seeding of species-poor grass mixture, to the current prevalence of beef ranching accompanied by hay production at large scales. Within this matrix, new patches with abundant presence of the S. pratensis host plant occasionally appear, calling for a more dynamic, landscape-orientated conservation strategy for E. aurinia, which would presumably be more robust with regard to recently changing moisture and temperature conditions.

Conclusions
A vegetation survey of sites inhabited by a butterfly of European conservation concern, Euphydryas aurinia, its replication after 15 years and interpretation of the vegetation changes using plants' indication values contributed to understanding both the vegetation development and larval ecology of the butterfly. The vegetation changes observed may be characterised as increased nutrient load, causing shifts towards less heliophilous vegetation. This represents a potential problem for long-term existence of the butterfly, which prefers warmer, drier and less acidic conditions within otherwise cold, moist and base-poor sites for its larval development. Follow-up surveys that would document future vegetation development of the sites with respect to conservation interventions applied and changing climatic conditions are highly desirable. Even more fascinating results could be produced by periodic vegetation surveys of E. aurinia sites across the entire diversity of its habitats across Europe and beyond.

Introduction
China plays an important role in regulating legal wildlife trade and combating illegal wildlife trade for global biodiversity conservation and further research is required to understand the sustainability of demand and trade in wildlife products within the country (Esmail et al. 2020). In particular, frequent seizures of large quantities of pangolin products have attracted the attention of conservationists worldwide. All eight pangolin species (family Manidae), which occur in Asia and Africa, are threatened with extinction due to illegal trade and threats from trade are not a recent phenomenon. The Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) has listed most pangolin species on its appendices under regulation since 1975, the year the convention came into effect. A stricter zero-trade quota was applied to the four Asian pangolin species in 2000 to further reduce pressures from trade (CITES 2000). In 2016, all eight pangolin species were uplisted to CITES Appendix I, banning all wild-sourced commercial trade (CITES 2016). However, this legislation has not been able to prevent ongoing trafficking in pangolin products (Cheng et al. 2017;Heinrich et al. 2017;Challender et al. 2020).
Pangolin products in trade can be broadly grouped into three types: foods, medicines and ornaments (Challender et al. 2014;Heinrich et al. 2017). Confiscation reports show that each product type is supplied through different trading chains and sold in different end-markets. The current trade of pangolin products is truly international, with over 70 countries across four continents identified as being involved in the trade in various ways (Heinrich et al. 2017). Amongst these countries, China is under the spotlight as one of the main demand markets. All three types of pangolin products are traded in China, with meat and medicinal products reported most frequently (Yang et al. 2007;Xu et al. 2016). Traded medicinal pangolin products are commonly associated with Traditional Chinese Medicine (TCM) (Yin et al. 2015).
TCM has been developed and used in China for many centuries and is still widely practised today (Chow and Liu 2013). In 2016, TCM hospitals and clinics treated 962 million patients, while medical services using TCM treatment accounted for 15.8% of all medical services provided in China (National Health and Family Planning Commission of the PRC 2017). The use of Chinese pangolin (Manis pentadactyla) scales in TCM can be traced back to AD 480 in a medical text entitled "Bencaojing jizhu", which was later cited in the "Compendium of Materia Medica" (Li 1578). Chinese TCM practitioners generally approve the perceived medicinal value of pangolin scales and use them in prescriptions targeting a wide range of symptoms or illnesses . Although pangolin scales have recently been removed from the "Pharmacopoeia of the People's Republic of China", some patented medicines or 'zhongchengyao' in the Pharmacopoeia still contain pangolin scales as an ingredient (National Pharmacopoeia Committee 2020).
The Chinese pangolin has been listed as a Second-Class Protected Animal in China and protected under the Law on the Protection of Wildlife since 1989(SFA of China 1989. It was uplisted to First-Class protection in 2020, with an aim to provide more deterrence and resources for combating illegal trade (SFGA of China 2020). However, the Chinese government still permits a pangolin scale market to operate nationally for medicinal usage, with specific requirements on sources of supply, manufacturers, sellers, annual quotas allowed for sale and a product labelling system (Wang et al. 2022). Market requirements were released by China's State Forestry Administration (SFA, now State Forestry and Grassland Administration) and came into effect in 2008. These requirements specify 711 hospitals in China that are allowed to sell medicines containing pangolin scales (e.g. patent drugs or 'zhongchengyao') and pangolin scale medicines (individual ingredients to be used in prescriptions) (SFA of China 2007, 2008a. All manufacturers and traders are required to hold corresponding permits. There are 209 certified manufacturers in China, which are allowed to produce over 60 types of medicines containing pangolin scales . All pangolin scale medicines must carry certificates on the smallest packages, to differentiate clearly between legal and illegal products in end-consumer markets. Therefore, illegal trade or illegal products can be identified if: (1) trade participants, such as manufacturers or hospitals, do not hold corresponding permits and/or (2) traded products do not have certificates on the smallest packages.
Trade volume in China is regulated through an annual quota, which has a mean of 26.58 ± 1.58 tonnes, based on released data from 2008 to 2014 (range;25.09-29.23 tonnes;SFA of China (2008b). The detailed policy process that determines the annual quota amount is unclear (Wang et al. 2022) and these regulations also do not specify that scales should be from Chinese pangolins. The source of the legal quota is stockpiles held by individuals or companies, which could contain legal imports of pangolin scales from Africa or other parts of Asia when it was allowed. Seized pangolin products may also be auctioned to permitted buyers without species-level identification to supply the legal market. TCM demand for pangolin products within China therefore poses potential impacts on all pangolin species.
As China constitutes one of the largest demand markets for pangolin products, it is essential to understand the nature of TCM-related demand in order to provide information for management options. This approach is also important beyond pangolin conservation, as other threatened species are involved in the TCM trade in other contexts (e.g. saiga horn, bear bile; Doughty et al. (2019), Wang et al. (2022)). Key questions about pangolin trade in China include the trade volume, the level of demand from the legal market and whether the legal supply is the main source for traded products. These questions are difficult to answer, as the geographic scale over which trade takes place is vast and illegal trade is a sensitive topic to investigate. This study aims to provide preliminary answers to these questions using data from two Chinese provinces. Since our data were collected before the COVID-19 outbreak, our study also provides a baseline for understanding the pangolin scale trade from a quantitative perspective prior to the impact of the pandemic on the use of wildlife products in China.

Study area
The lead author conducted face-to-face semi-structured interviews in Mandarin between October 2016 and April 2017 across eight Chinese administrative regions in two provinces, Henan (Kaifeng and Zhengzhou Municipalities) and Hainan (Baisha, Haikou, Ledong, Qiongzhong, Sanya and Wuzhishan Counties) (Fig. 1). Chinese pangolin (M. pentadactyla) historically occurred in both provinces (Zhang 1997). The current status of the species across China is unclear; it has likely been extirpated from Henan, whereas Hainan still supports a highly-threatened remnant population (Nash et al. 2016).
These study regions were chosen for several reasons. First, we had readily available local networks in both regions that could help us gain access to respondents. Second, Henan and Hainan differ greatly in terms of geographic location, local ecology and biodiversity, culture, economy and human population (National Bureau of Statistics 2020). Henan has a high human population density and a long history of TCM utilisation. In contrast, Hainan is less economically developed and is culturally distinct; local Li and Miao ethnic groups do not have a history of using pangolin scales for medicine and consumption of pangolin products is a recently developed behaviour ). These regional variations provide increased site diversity, with the potential to indicate more general patterns of pangolin product usage across other areas of China that were not logistically feasible to cover in this study. The resulting dataset also allowed us to statistically investigate determinants of geographic variation in patterns of pangolin scale trade.

Interviews
Interviews were carried out in a semi-structured and discursive manner and included questions about annual sales of pangolin scale products and other trade-related information (Suppl. material 1). Some respondents did not answer all questions, either because they did not know the answers or were not willing to provide answers. Questions were not read directly from a questionnaire, but instead were asked in a more nuanced way according to the context of the specific interview conversation to gain more trust and connection with respondents.
Before each interview, we explained the aim of the study, how data would be used and other relevant information. All respondents remained anonymous and provided informed oral consent for participating. We did not ask direct questions about the legality of types of trade behaviour or medicinal products, but instead addressed these topics indirectly, for example, through questions about respondents' knowledge of pangolin trade-related regulations. Illegal trade was further assessed by checking whether hospitals were certified and by looking for trade certificates on products where these were available for examination. Interview methods and questions were piloted before the main study in five pharmaceutical shops in the town centre of Longlou, Hainan; no changes were subsequently made to the study protocols.
Respondents in this study included doctors from hospitals (either TCM hospitals or hospitals containing TCM departments) and shop owners or assistants from TCM shops. One respondent per hospital/shop was interviewed, but respondents sometimes asked other people within their establishment for further information about specific questions. We accessed TCM shops in Henan and doctors in Henan and Hainan through social connections (primarily through introductions to potential respondents by friends or family members); therefore, these respondents could not be specifically selected. All hospitals where interviews were conducted had TCM departments or sold TCM medicines. No target sample size was set for hospitals or Henan shop surveys; instead, we aimed to interview as many respondents as possible in the time available. A relatively low sample size was thus expected and unavoidable due to limited access to potential respondents. We located pharmaceutical shops in Hainan through cluster sampling, which involved using an online map (https://map.baidu.com/) to identify areas within each county town/city centre where pharmaceutical shops were centralised. We then conducted interviews in all shops in those areas. Respondents from shops were approached, based on their availability at that moment and their willingness to participate. We sampled a maximum of 30 shops for cities in Hainan where many pharmaceutical shops were located and surveyed all shops in smaller counties that had fewer than 30 shops.
We investigated three main types of pangolin scale products: raw scales, roasted scales (scales treated using standardised TCM procedures, including 'paoshanjia' and 'cushanjia') and scale powder (fine powder produced by blending and sieving roasted scales). Some patent drugs also contain pangolin scales as ingredients, but these medicines were not included in this study as the quantity of pangolin scales they contain was difficult to estimate. Therefore, only three types of pangolin scale medicines were considered, rather than all medicines containing pangolin scales. To calculate the total quantity of scales being traded, we used numerical conversions between these three types of products following Hu and Li (2007), who specified that weight loss of raw scales after the traditional processing method was 20-25%. The size and weight of one scale differs considerably across different pangolin species, animals of different ages and different parts of the body. Therefore, we used the lower boundary of this range (20%) to provide a conservative estimate of the final quantity and assumed 1 kg of raw scales would produce 0.8 kg of roasted scales. We also assumed 1 kg of raw scales would produce 0.53 kg of scale powder through the process of blending and sieving roasted scales, based upon information obtained from two TCM practitioners who independently reported the percentage loss as around one-third due to sieving; this ratio is supported by written descriptions on packages of pangolin scale powder medicine observed during our study, which stated "1.4 g of powder = 6 g of roasted scales". We again used the lower boundary of these estimations (33.3%) as a conservative ratio. Reported quantities were all converted to roasted scales for data processing.

Analytical methods
Generalised linear models (GLMs) were used to investigate potential relationships between annual sale quantities of pangolin scale products (normally distributed response variable) and potential predictor variables (see Table 1). We assumed that product price, local human population size, local economic development, hospital size and pharmaceutical shop type (chain shop versus private shop) might all potentially influence the annual sales quantity: we expected that product price and private shops might be negatively correlated with higher sales, whereas other predictors might be positively correlated. Separate analyses were conducted for results from hospitals and shops. All predictors were first included in maximal models and stepwise selection (R function stepAIC) was then used to find the best-performing models with the lowest Akaike Information Criterion corrected for small sample size (AICc). Additional GLMs were conducted to investigate which variables could predict whether hospitals or shops reported selling pangolin scale products (binary response variable). One-sample t-tests were also conducted to compare means of sales between years. All statistical analyses were performed using R version 3.5.2 (RStudio Team 2015). Study methods were reviewed and approved by the Department of Geography Ethics Review Group, University of Cambridge (#1503).

Results
We interviewed doctors or sellers from 41 hospitals and 134 pharmaceutical shops in eight municipalities/counties. Pangolin scale products were sometimes on display and we observed both legal and illegal products being sold (i.e. products with and without trade certificates) by both legal and illegal sellers (i.e. certified hospitals and noncertified hospitals and all pharmaceutical shops). Hospitals with legal trading permits were also found selling products without the legally required certificates, while legal products were also found in uncertified hospitals.

Hospital results
A high proportion of interviewed hospitals sold pangolin products, with sales quantities of considerable size. Of the 41 hospitals, 27 (65.9%) were found to sell pangolin scale products and 20 reported data on sales quantity. In total, these 20 hospitals sold 1905.8 kg roasted scale (2382.3 kg raw scale) in one year, with a mean of 95.3 ± 193.9 kg per hospital per year. Eight of the 27 hospitals that sold pangolin scale products held a legal permit. Seven of these eight permitted hospitals reported sales data; in total, these seven hospitals had sold 423.0 kg of roasted scales during the previous year, equivalent to 528.8 kg of raw scales. In addition, 13 of the 19 unpermitted hospitals that sold pangolin scale products also provided sales quantity data; in total, these 13 hospitals had sold 1482.9 kg of processed scale products during the previous year, with a mean of 114.1 ± 226.5 kg per hospital per year, equivalent to 1853.6 kg of raw scales.
Reported annual sales also varied considerably across hospitals (Fig. 2). Ten hospitals reported selling less than 20 kg per year, while four hospitals reported annual sales quantities of over 200 kg of roasted scales; three of these four hospitals did not hold trade permits. The four hospitals with the highest annual sales quantities were all located in Henan. Sales quantity was not explained by any predictor variables included in GLMs (P > 0.1 or NA, N hospital = 19), including price (Fig. 2). Hospitals in cities with larger populations were more likely to report selling scale products (P = 0.03, N hospital = 41).
Two hospitals in Zhengzhou also provided longer-term sales data on the total amounts of roasted scales purchased each year from 2012 to 2016, in addition to sales data for 2017 (Fig. 3). Both hospitals purchased pangolin scale medicines multiple Table 1. Predictors included in GLMs investigating sale of pangolin scale products in hospitals and pharmaceutical shops, with variable type specified. "3A-grade" refers to the Chinese system for evaluating hospitals (3A is the highest grade).

Predictors
In both  times per year depending on demand, so that purchase quantities indicate annual sale quantities. For both hospitals, the mean of sales across the previous five-year period did not differ statistically from sales data estimated for 2017, based on sales during the first four months of 2017 (one-sample t-test, P hospital 1 = 0.208; P hospital 2 = 0.998), indicating that sales data for 2017 are representative of the general market situation.

Shop results
Over a third (n = 46, 34.3%) of the 134 surveyed pharmaceutical shops reported selling pangolin scale products. Sales quantity again varied substantially across shops (Fig. 2). A total of 25 shops reported sales quantity data. These shops sold a total of 53.3 kg of processed scales in one year (66.6 kg of raw scales), with a mean of 2.1 ± 4.0 kg per shop per year. Reported sales quantity again varied greatly amongst shops, with most  shops reporting very low sales quantities and a few shops reporting high sales quantities. Sales quantity was not explained by any predictor variables included in GLMs (P > 0.1 or NA, N shop = 25), including price (Fig. 2). Chain shops were more likely to report sales compared to private shops (P = 0.003, N = 134). A total of 11 shop respondents provided information on variation in demand for pangolin scales over the previous few years. Eight of these 11 respondents said that demand was stable because TCM products were typically purchased by relatively wealthy consumers, whereas the remaining three respondents said that demand fluctuated with price. Two respondents (one hospital respondent and one shop respondent) provided information on seasonal variation in demand; both respondents considered that TCM sales in general were lower during summer months, because TCM "tasted bad" (referring to the bitter taste of most TCM products) and people did not like to take it when their appetite was already reduced due to hot weather.

Discussion
Our study reveals that pangolin scale products were widely available in hospitals and pharmaceutical shops across Henan and Hainan Provinces during the time of our survey. At a policy level, government regulations specify that only 711 listed hospitals in China are legally allowed to sell pangolin scale products directly to consumers (SFA of China 2007, 2008a. However, we found that 46% of surveyed hospitals and 34% of surveyed pharmaceutical shops were selling pangolin scale products illegally. Some legal sellers (i.e. permitted hospitals) were also observed selling illegal pangolin scale products, as evidenced by the absence of trade certificates on product packages. Our study, therefore, highlights the urgent need for managing legal and illegal pangolin product trade across China.
The presence of illegal pangolin scale products in pharmaceutical shops has been previously reported (Yin et al. 2015;Xu et al. 2016). However, little research has previously been conducted on trade in pangolin products in Chinese hospitals. The cooccurrence of legal and illegal trade makes the trade in hospitals more complex than for other types of sellers. Our results show that many hospitals sell pangolin scale products illegally and annual sales quantities reported by these hospitals are also high and of significant conservation concern. These findings together suggest that illegal products have penetrated the legal pangolin product sale system.
The quantity of pangolin scale products involved in the three types of sales channels varies substantially, both between different groups of sellers and within the same seller group. None of the variables investigated in this study was found to explain the observed variation. Qualitative observations during interviews suggest that this variation might instead be more related to individual doctors' preferences and specialities rather than wider external factors, posing difficulties in estimating accurate provinciallevel or national-level trade volumes across China using regional data.
However, it is still meaningful to estimate market size on a larger geographic scale, to provide some understanding of the possible level of medicinal trade in pangolin products across China. Extrapolating from the sales data that we obtained from permitted hospitals included within our study, we provisionally estimate that annual sales across the 711 permitted hospitals in China could be 53.7 tonnes, twice the mean legal quota of 26.58 tonnes per year. Although we recognise that this extrapolation is only approximate, it suggests that the demand on medicinal pangolin products in China is likely to be high and above legally permitted levels. We also note that our survey did not collect data on sales levels for patent drugs, another group of medicines that can legally contain pangolin scale products and share the annual national quota with prescription medicines. As our estimates are based only on data for prescription medicines, actual trade volumes of medicinal pangolin products could, therefore, be even higher, further widening the gap between legal supply capacity and market demand.
Our results show that sale of medicinal pangolin products is more likely to be reported by hospitals in cities with higher populations, suggesting a potential greater demand for pangolin products across more heavily urbanised areas of China and, thus, raising further potential concerns about sustainability of levels of demand. Our shop survey results also demonstrated that sale of pangolin products was more likely to be reported by chain shops rather than by private non-chain shops. This finding might indicate either genuine variation in sales between different types of shops or that owners of private non-chain shops know more about the illegal nature of pangolin scale trade and are, thus, less likely to report sales .
Reducing illegal pangolin trade within China is recognised as a global conservation concern (Wang et al. 2022). Our results thus demonstrate that illegal trade in pharmaceutical shops and uncertified hospitals within China represents a management priority for pangolin conservation. Understanding and reducing consumer demand for pangolin products and collaborating with the TCM sector to encourage the use of substitutes and raise awareness about legislation, are identified as potential mitigation strategies (Luo et al. 2011;Wang et al. 2020). Another proposed option is to increase the supply of medicinal pangolin products through farming. However, although conservation breeding has been carried out for some pangolin species (Parker and Luz 2020;Yan et al. 2021), establishing pangolin farming at a commercial scale is still a long way off, due to numerous well-recognised technical difficulties, such as designing suitable feed (Wicker et al. 2020). The two current problems facing pangolin conservation that we highlight in China, widespread illegal trade and very limited legal supply capacity compared to market demand, instead both need urgent mitigation rather than relying on solutions that are not feasible in the near future. More generally, the effectiveness of farming wildlife to relieve pressure on wild populations requires comprehensive and critical evaluation to ensure that it does not have unforeseen negative consequences, such as increasing demand or providing a cover for laundering of illegal products (e.g. Kirkpatrick and Emerton (2010); Lyons and Natusch (2011);). More research is, therefore, needed to understand the complex dynamics of pangolin trade, to ensure that commercial farming would not pose additional threats to wild pangolin populations (Challender et al. 2019;Chen and 't Sas-Rolfes 2021).
Collecting accurate data using social science research methods can be challenging for sensitive topics such as illegal activities. However, illegality of pangolin product trade was not well recognised by most respondents during interviews , which led to greater willingness to discuss the trade openly. Moreover, we aimed to build trust with respondents by providing maximum anonymity and maintaining a neutral position during interviews (Cunliffe and Alcadipani 2016). All hospital respondents were approached through social network connections, which further helped to establish trust (van Uhm 2016). Respondents were not pushed to answer questions and were free to withdraw at any point during the interview. However, despite these protocols, we consider it more likely that respondents will have under-reported rather than over-reported sales behaviour due to the sensitive nature of the subject and our direct questioning technique (Olmedo et al. 2021). This consideration thus suggests that actual sales levels of medicinal pangolin products might be greater than indicated in our study, raising even further concerns about the sustainability of this trade and its wider impact on wild pangolin populations.

Conclusion
Our study shows that the illegal pangolin scale trade is widespread in China and that pharmaceutical shops and hospitals are both contributors. Legal products were observed in illegal sale channels and hospitals holding legal permits also sold illegal products. The amount of scale products sold by certified hospitals might greatly exceed the legal supply capacity. Our findings, thus, highlight the need to better regulate and rethink the current legal market. More detailed regulations might be needed to ensure a close legal trade framework that will prevent legal products leaking out of allowed trade routes, while preventing the sale of illegal products. Redesign of existing legal market management strategies, combined with large-scale demand-reduction interventions and stronger enforcement of existing regulations, is, therefore, an urgent policy priority in China for both global pangolin conservation and public health concerns.
Although this study only focused on pangolins, our methods and insights are also useful for studying and conserving other threatened species involved in the TCM trade. For example, the saiga horn trade is regulated within a similar legal framework in China, with only certified hospitals allowed to sell products to patients (SFA of China 2007). Conservation issues identified in pangolin trade might, therefore, also be concerns in saiga horn trade and other animal-product trades. This highlights the need for more research on the dynamics and sustainability of TCM trade across different species, with a focus on understanding markets and assessing enforcement effectiveness. We also recognise that this study was conducted before the COVID-19 pandemic, which has had a major impact on wildlife trade management and markets within China. We, therefore, provide a baseline for future studies to investigate markets and trade dynamics before and after COVID-19.