Benthic studies in LTER sites: the use of taxonomy surrogates in the detection of long-term changes in lagoonal benthic assemblages

In benthic studies, the identification of organisms at the species level is known to be the best source for ecological and biological information even if time-consuming and expensive. However, taxonomic sufficiency (TS) has been proposed as a short-cut method for quantifying changes in biological assemblages in environmental monitoring. In this paper, we set out to determine whether and how the taxonomic complexity of a benthic assemblage influences the results of TS at two different long-term ecological research (LTER) sites in the Po delta region (north-eastern Italy). Specifically, we investigated whether TS can be used to detect natural and human-driven patterns of variation in benthic assemblages from lagoonal soft bottoms. The first benthic dataset was collected from 1996 to 2015 in a “choked” lagoon, the Valli di Comacchio, a lagoon characterised by long water residence times and heavy eutrophication, while the second was collected from 2004 to 2010 in a “leaky” lagoon, the Sacca di Goro, a coastal area with human pressure limited to aquaculture. Univariate and multivariate statistical analyses were used to assess differences in the taxonomic structure of benthic assemblages and to test TS on the two different datasets. TS seemed to work from species to family level at both sites, despite a higher natural variability of environmental conditions combined with multiple anthropogenic stressors. Therefore, TS at the family level may represent effective taxonomic surrogates across a range of environmental contexts in lagoon
Nature Conservation 34: 247–272 (2019), doi: 10.3897/natureconservation.34.27610, http://natureconservation.pensoft.net. Copyright Valentina Pitacco et al. This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction
Taxonomic sufficiency (TS) is an analysis technique developed in light of the current need for rapid and reliable procedures in marine impact assessment and monitoring.
The basic concept behind TS (Ellis 1985) is that the identification of taxa at taxonomic levels higher than the species level enables the detection of changes in assemblages exposed to environmental stress without significant loss of information. In marine systems, TS is supported by a number of studies suggesting that using higher-level taxa (e.g. genera or families) is an affordable approach to depicting changes in the structure of macrobenthic assemblages. Such studies have been performed worldwide and have proven the efficiency of TS in different habitat types, from soft bottom (e.g. Vanderklift et al. 1996, Włodarska-Kowalczuk and Kędra 2007) to hard bottom (e.g. Mistri and Rossi 2000, Terlizzi et al. 2002), from high latitudes (e.g. Gray et al. 1990, Wlodarska-Kowalczuk et al. 2005) to tropical areas (e.g. Guzmán and García 1996, Warwick et al. 1990). Indeed, TS saves both time and costs, potentially obviating the need for arduous precise taxonomic identification and has, therefore, received increasing attention in recent years (Olsgard et al. 1998).
In particular, TS is mandatory when non-destructive sampling techniques have to be used and taxonomic resolution is low (Roberts et al. 1994, Terlizzi et al. 2003). It has been applied in the assessment of both environmental impacts (focusing on large pollution gradients, mostly including oil and heavy metal pollution; Dauvin et al. 2003) and assemblage variation along natural gradients (Terlizzi et al. 2009). It is also useful for comparing data from different habitats or geographical regions (Warwick and Clarke 1993) when data at the species level could introduce noise into the analysis, given their higher dependence on natural variations and biogeographic areas (Warwick 1988b). In addition, TS has been tested as a tool for selecting Marine Protected Areas (Vanderklift et al. 1998).
That being said, the applications of TS, particularly in conservation studies, have been criticised (Terlizzi et al. 2003). First and foremost, the proliferation of TS studies has led to a fragmented knowledge base, since most attempts have been focused on finding a general taxonomic-sufficiency level, rather than on building a general theory of TS. Hence, factors influencing the effectiveness of TS still require a great deal of clarification (Dethier and Schoch 2006). Indeed, TS has been mainly used to describe the spatial patterns of macrobenthic communities, usually in response to heavy disturbances such as oil spills (see Dauvin et al. 2003), or to compare areas with different levels of anthropogenic disturbance (Arvanitidis et al. 2009). In contrast, data on issues such as using taxonomic surrogates to investigate long-term temporal patterns of marine assemblages are still scant (Fraser et al. 2006, Musco et al. 2011). This is mainly due to the lack of long-term monitoring programmes based on the description of variables at the species level (Musco et al. 2011), an issue we set out to address in this study. Specifically, we tested the efficiency of TS at two long-term ecological research (LTER) sites in the Po delta region of north-eastern Italy (Mediterranean Sea). The first benthic dataset was collected from 1996 to 2015 from a "choked" lagoon, the Valli di Comacchio, a lagoon characterised by long water residence times, while the second was collected over a period from 2004 to 2010 from a "leaky" lagoon (defined here as a lagoon which receives water and discharges it from one or more points), the Sacca di Goro. According to the Ramsar Convention, both LTER sites are wetlands of international importance and they constitute part of the Regional Park of the Po River Delta, one of the largest Mediterranean deltaic systems. They are also classified as Special Protection Areas (SPA) under the Birds Directive (2009/147/EC) in the Natura 2000 Italian network of protected areas.
At the same time, both sites are heavily affected by human pressures, mainly related to agricultural and aquacultural activities. The Valli di Comacchio is mostly affected by eel aquaculture and Sacca di Goro is subjected to intense bivalve fishing. Moreover, both are sites of eutrophication caused by excessive nutrient loads (Mistri et al. 2001) and contamination by pollutants of mainly agricultural origin (Pitacco et al. in press a, b). Given the ecological and economic importance of these sites, long-term monitoring is fundamental for understanding the effect of anthropogenic stress on the macrobenthic community and, therefore, for designing efficient management plans for the conservation of their ecosystems. Since such projects usually benefit from limited financial support, TS could be a useful tool for improving the cost/benefit ratio of environmental monitoring and enabling more efficient use of available resources (Chapman 1998).
The aim of the present study was, therefore, to test the efficiency of TS in long-term monitoring at each of the two LTER sites. Information loss was calculated for (i) different levels of taxonomic aggregation (from species to phylum) and (ii) different data transformations (raw data, square root, logarithm, presence/absence), in order to understand whether and how the structure and taxonomic complexity of benthic assemblages influence TS results and whether TS can be used to detect natural and human-driven patterns of variation in long-term environmental monitoring.

Study area
The Valli di Comacchio (Figure 1) is the largest lagoonal system in the Po River delta and consists of choked lagoons with an average depth of 1 m. The Valli is characterised by limited water renewal (Table 1) and high nutrient load (Mistri et al. 2000), being surrounded by earthen dykes and separated from the Adriatic Sea by the 2.5 km wide Spina spit. There are, however, two marine channels directly connected to the lagoons and freshwater input comes from the Reno River and a few drainage canals. Both marine and freshwater inflows are regulated by sluice gates and dams.
Sacca di Goro (Figure 1), on the other hand, is a leaky coastal lagoon, located in the southern part of the Po River delta, with an average depth of about 1.2-1.5 m. Salinity varies in relation to river and seawater inflows (Simeoni et al. 2007). The majority of its freshwater input is from the Po di Volano canal (about 3.5 × 10⁸ m³/y), which flows directly into the lagoon, and from the artificially regulated deltaic branch of the Po di Goro. Secondary freshwater sources are three irrigation canals with similar flow rates (2.0-5.5 × 10⁷ m³/y) called Giralda, Romanina and Canal Bianco. Marine inflows, which vary according to tidal dynamics, originate from two estuarial mouths connecting the lagoon to the northern Adriatic Sea (Natali et al. 2016). Hence, the Sacca di Goro is characterised by large daily variations, depending on the height of the tides, as well as seasonal fluctuations in the physicochemical composition of its waters (Table 1) (Corbau et al. 2016). Most of its lagoon bed consists of silty-clay sediments (Table 1), carried to the sea by the rivers, but there are also areas of mainly sandy sediments, in particular near the lagoon mouths and behind the spit (Simeoni et al. 2000).

Sampling
In the Valli di Comacchio lagoons (COM), sampling was performed from 1996 to 2015 at four sampling stations, COM1 to COM4 (Figure 1). Samples were taken seasonally throughout most of the study period. Sites COM1 (44°36.95'N, 12°07.38'E) and COM2 (44°38.47'N, 12°09.25'E), located in the northern part of the lagoon system, were subjected to both marine inflows and continental input and site COM3 (44°33.88'N, 12°10.25'E), in the southern part, to seasonal freshwater inflows from the Reno River. Site COM4 (44°36.10'N, 12°12.73'E), on the other hand, was located in the central, most confined area, which was only occasionally influenced by marine inflow when the Bellocchio drain was opened.
In the Sacca di Goro lagoon (GOR), sampling was performed from 2004 to 2010 at three sampling sites, GOR1, GOR3 and GOR4 (Figure 1). Samples were taken seasonally throughout most of the study period. Site GOR1 (44°49.65'N, 12°16.91'E), located in the western part of the lagoon, was influenced mainly by freshwater discharged from the Po di Volano and Giralda and was, therefore, characterised by variable salinity. Site GOR3 (44°48.73'N, 12°20.33'E) was located in the eastern part of the lagoon and site GOR4 (44°49.78'N, 12°18.31'E) in the central area and was influenced more by tidal exchange.
Three replicates were collected seasonally for macrofaunal community analysis using a 4 l Van Veen grab. For the scope of the present work, averages of the replicates were considered and sampling stations were used as replicates in order to obtain a global picture of the general status of each lagoon. These samples were sieved at 0.5 mm and preserved in 8% formalin. Animals were carefully sorted, identification was performed up to the species level in most cases (exceptions were due to the poor condition of the animals) and all specimens were counted.

Data analysis
At each of the LTER sites, the annual averages of the following structural indices were calculated: species richness (S), Shannon diversity index on a natural-logarithm basis (H') and Pielou index (J'). A chi-square test, applied to Kruskal-Wallis (KW) ranks (Kruskal and Wallis 1952), was run to check for significant differences between years. Those calculations were performed using R version 2.4.0 (R Development Core Team 2008). For each of the two LTER sites, 'loss of information α' (α) was determined for each year according to Bacci et al. (2009). Specifically, the difference $NT_x - NT_{x+1}$ was calculated, in which $NT_x$ expresses the number of taxa identified at the taxonomic level x and x+1 denotes the taxonomic level above level x. 'α' values give general information about the taxonomic heterogeneity-complexity within each level considered (higher values correspond to a greater information loss and vice versa). In order to describe the information loss along the taxonomic structure, the 'α' value was calculated stepwise from the lowest to the highest taxonomic level considered (species-genus, genus-family, family-order, order-class, class-phylum). The percentage was then calculated as follows: $(NT_x - NT_{x+1}) / NT_x \times 100$.
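The structural indices and the stepwise 'α' calculation described above can be illustrated with a minimal Python sketch. It is not the authors' R code; the function names and the taxa counts in the example are hypothetical and serve only to show the arithmetic.

```python
import math

def shannon_pielou(counts):
    """Return (S, H', J') for a list of per-taxon abundances.

    H' uses natural logarithms; J' = H' / ln(S). J' is set to 0.0 when
    fewer than two taxa are present (a convention chosen for this sketch).
    """
    present = [c for c in counts if c > 0]
    s = len(present)
    total = sum(present)
    h = -sum((c / total) * math.log(c / total) for c in present)
    j = h / math.log(s) if s > 1 else 0.0
    return s, h, j

def information_loss(taxa_counts):
    """Stepwise loss of information between consecutive taxonomic levels.

    taxa_counts lists the number of taxa from the lowest level (species)
    to the highest (phylum). Each step returns (alpha, percentage), with
    alpha = NT_x - NT_(x+1) and percentage = alpha / NT_x * 100.
    """
    steps = []
    for lower, upper in zip(taxa_counts, taxa_counts[1:]):
        alpha = lower - upper
        steps.append((alpha, 100.0 * alpha / lower))
    return steps

# Hypothetical numbers of taxa for one year, species to phylum.
counts_by_level = [40, 36, 27, 18, 9, 5]
step_names = ["species-genus", "genus-family", "family-order",
              "order-class", "class-phylum"]
for name, (alpha, pct) in zip(step_names, information_loss(counts_by_level)):
    print(f"{name}: alpha = {alpha}, loss = {pct:.1f}%")
```

A step with α = 0 means that every taxon at the higher level contains a single taxon of the lower level, so aggregation at that step loses no taxonomic information.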
For each of the two LTER sites, abundance matrices were produced for each of the six taxonomic levels (species, genus, family, order, class and phylum) and for each of four different transformations (none, square-root, logarithm and presence/absence). Affinities between years were established using the Bray-Curtis similarity. For each dataset, a second-stage non-metric multi-dimensional scaling (MDS) ordination was plotted to visualise differences between similarity matrices at different levels of taxonomic aggregation and data transformation. Spearman's rank correlation coefficient (r_s) was calculated between matrices at the species and higher taxonomic levels. The stress of the two-dimensional plot was calculated using Kruskal's stress Formula 1 (Clarke and Green 1988). Stress is a measure of the reliability of the representation, a value < 0.1 being considered a good result (Clarke and Warwick 2001).
To test the effect of different data transformations on the effectiveness of taxonomic sufficiency, a third-stage resemblance matrix was built for each dataset. Following Arvanitidis et al. (2009), this matrix is a second-stage matrix of second-stage matrices: it was constructed from the rank correlations between corresponding elements of the second-stage matrices obtained for each data transformation. Spearman's rank correlations (r_s) between resemblance matrices were tested using RELATE, a non-parametric analogue to the Mantel test, using 9999 permutations. All these calculations were performed using the PRIMER v6 + PERMANOVA software package (Anderson et al. 2008, Clarke and Gorley 2006).
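The logic of a RELATE-type test can be sketched as follows: the rank correlation between two resemblance matrices is compared with the correlations obtained after randomly permuting the sample labels of one matrix. This is a simplified Python illustration of a Mantel-type permutation test, not the PRIMER implementation; the function names and the example matrices are hypothetical.

```python
import math
import random
from itertools import combinations

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            out[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return out

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

def triangle(matrix, order):
    """Off-diagonal elements of a symmetric matrix after relabelling samples."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]

def relate(m1, m2, n_perm=9999, seed=0):
    """Return (r_s, p): matrix rank correlation and one-sided permutation p."""
    identity = list(range(len(m1)))
    fixed = triangle(m1, identity)
    observed = spearman(fixed, triangle(m2, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # rows and columns are permuted together
        if spearman(fixed, triangle(m2, perm)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical resemblance matrices for five samples.
a = [[0, 1, 2, 3, 4], [1, 0, 5, 6, 7], [2, 5, 0, 8, 9],
     [3, 6, 8, 0, 10], [4, 7, 9, 10, 0]]
b = [[0, 2, 3, 5, 4], [2, 0, 6, 7, 9], [3, 6, 0, 8, 11],
     [5, 7, 8, 0, 12], [4, 9, 11, 12, 0]]
r_s, p = relate(a, b, n_perm=999)
print(round(r_s, 3), round(p, 3))
```

Permuting sample labels, rather than individual matrix elements, preserves the dependence among the elements of each resemblance matrix, which is why ordinary significance tables for r_s are not used here.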
Identifying temporal changes in macrobenthic communities is fundamental for the efficiency of monitoring programmes. Therefore, in order to identify breakpoints in each multivariate dataset, "Constrained Clustering Analysis" was performed on each of the following six matrices: species, genus, family, order, class and phylum. This technique, originally developed for stratigraphic analysis, is more suitable for time-series analysis than ordinary unconstrained cluster analysis, since only adjacent clusters, according to sample order, are considered for merging. The Bray-Curtis similarity was calculated on the square-root transformed data and the CONISS algorithm, which relies on the incremental sum of squares (Grimm 1987), was used as an agglomeration method. The Broken Stick model (Bennett 1996) was applied to determine the number of significant groups identified by the cluster analysis. Calculations were performed using the 'vegan' and 'rioja' packages of R version 2.4.0 software (R Development Core Team 2008).
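The adjacency constraint and the incremental sum-of-squares criterion can be illustrated with a simplified Python sketch. As an assumption made for brevity, it works on Euclidean sums of squares of the sample vectors instead of the Bray-Curtis dissimilarities used in the study, and it does not reproduce the 'rioja' implementation; all names and data are hypothetical. The broken-stick function returns the expected proportions against which the variance accounted for by successive splits is usually compared.

```python
def within_ss(rows):
    """Sum of squared deviations of the rows from their centroid."""
    n = len(rows)
    centroid = [sum(col) / n for col in zip(*rows)]
    return sum((v - c) ** 2 for row in rows for v, c in zip(row, centroid))

def constrained_clusters(samples, k):
    """Merge adjacent clusters until k remain (samples in temporal order).

    At each step, the pair of neighbouring clusters whose fusion gives the
    smallest increase in within-cluster sum of squares is merged.
    """
    clusters = [[i] for i in range(len(samples))]
    while len(clusters) > k:
        best_index, best_increase = None, None
        for i in range(len(clusters) - 1):
            left, right = clusters[i], clusters[i + 1]
            increase = (within_ss([samples[j] for j in left + right])
                        - within_ss([samples[j] for j in left])
                        - within_ss([samples[j] for j in right]))
            if best_increase is None or increase < best_increase:
                best_index, best_increase = i, increase
        merged = clusters[best_index] + clusters[best_index + 1]
        clusters[best_index:best_index + 2] = [merged]
    return clusters

def broken_stick(n):
    """Expected proportions of a stick broken at random into n pieces,
    from the largest piece to the smallest."""
    return [sum(1.0 / i for i in range(k, n + 1)) / n for k in range(1, n + 1)]

# Hypothetical time series: five consecutive years, two taxa each.
series = [[1.0, 0.0], [1.1, 0.0], [5.0, 5.0], [5.2, 5.0], [9.0, 1.0]]
print(constrained_clusters(series, 3))
print([round(p, 3) for p in broken_stick(4)])
```

Because only neighbouring clusters can merge, each cluster is a run of consecutive years and the boundaries between clusters are the candidate breakpoints in the time series.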
In order to test the significance of variations in taxa-abundance matrices between identified clusters, permutational multivariate analysis of variance, PERMANOVA (Anderson et al. 2008), was carried out on the six matrices corresponding to the different taxonomic levels. A single-factor design (with the number of levels corresponding to the number of clusters) and the "unrestricted permutation of raw data" with 9999 permutations were applied, as recommended by Anderson et al. (2008). A multivariate t-statistic analogue was used for subsequent pairwise comparisons (Anderson et al. 2008) and, to test whether the between-group variation identified by cluster analysis was also due to a dispersion component, a dispersion homogeneity test (PERMDISP) was performed on the same matrices. The same procedure was followed for each type of data transformation and for each of the two LTER datasets. For all analyses, p < 0.05 was chosen as the significance threshold.
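For a single-factor design, the PERMANOVA pseudo-F statistic can be computed directly from a dissimilarity matrix by partitioning the sums of squared dissimilarities, and its significance assessed by permuting the group labels. The Python sketch below illustrates that principle only; it is not the PRIMER + PERMANOVA routine, and the function names and data are hypothetical.

```python
import random
from itertools import combinations

def pseudo_f(dist, labels):
    """One-way pseudo-F from a symmetric dissimilarity matrix.

    SS_total  = sum of squared dissimilarities over all pairs / N
    SS_within = sum over groups of (squared within-group pairs / n_g)
    """
    n = len(labels)
    groups = sorted(set(labels))
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist[i][j] ** 2
                         for i, j in combinations(members, 2)) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))

def permanova(dist, labels, n_perm=9999, seed=0):
    """Return (pseudo-F, p) using unrestricted permutation of the labels."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical example: four samples on a line, two groups of two.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(a - b) for b in points] for a in points]
f_value, p_value = permanova(dist, ["a", "a", "b", "b"], n_perm=999)
print(f_value, round(p_value, 3))
```

With so few samples the number of distinct permutations is very small, so the p-value cannot fall below a floor set by the design. A significant pseudo-F can also arise from differences in dispersion rather than location, which is what the separate PERMDISP test addresses.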
In contrast, the Sacca di Goro (GOR) dataset comprised a total of 88 taxa at the lowest taxonomic level, in this case belonging to 7 phyla: Annelida, Arthropoda, Mollusca, Cnidaria, Nemertea, Sipuncula and Platyhelminthes. The annual averages for S, H' and J' (Figure 2) displayed no significant differences throughout the study period (KW, p > 0.05), although S fell from 27 ± 5 SD in 2004 to 17 ± 4 SD in 2005; H' from 1.64 ± 0.81 SD in 2009 to 1.27 ± 0.92 SD in 2010; and J' from 0.54 ± 0.16 SD in 2009 to 0.42 ± 0.32 SD in 2010.

Taxonomic complexity and information loss
At the COM site, the taxonomic complexity was highly variable throughout the study period, with total 'loss of information α', from species to phylum level, showing the highest values (45) in 2001 and 2002 and the lowest (8) in 2011 (Figure 3A). Only one genus contained eight species, two genera contained three species and nine genera contained two species; the remainder contained only one species. At the higher level, two families contained six genera, two contained five genera, three contained three genera and six contained two genera; the remaining families contained only one genus. Each family contained from 1 to 9 species. The most speciose families were Spionidae and Phyllodocidae with 9 species, followed by Serpulidae with 7 species. Information loss (α) from the species to genus level was observed in 1998, 1999, 2000, 2001, 2002, 2003, 2009 and 2013. The highest loss (16%) was observed in 2001. Information loss (α) between genus and family levels was observed every year, with the exception of 2011, at percentages ranging from 7% in 2015 to 32% in 2002, indicating the presence of families with more than one genus. Information loss (α) was also observed every year at the family-order level (from 11% in 2003 to 31% in 2009), order-class level (from 14% in 2013 to 40% in 2011) and class-phylum level (from 6% in 2001 and 2002 to 20% in 2011).
At the GOR site, the taxonomic complexity was more or less constant throughout the study period, with information loss (α) ranging from 24 to 40% from the lowest to the highest level (Figure 3B). A simple taxonomic structure was recorded for this site: only two genera contained three species, ten genera contained two species and the rest contained only one species. At the higher level, only two families contained four genera, three families contained three genera, seven contained two genera and the rest contained only one genus. Each family contained from 1 to 7 species. The Spionidae family was the most speciose, with 7 species represented. Information loss (α) was recorded every year in the steps between all levels considered, from species to phylum. The greatest information loss between species and genus levels (13%) was observed in 2007 and the lowest (6%) in both 2005 and 2006.

Information loss in multivariate data structure
The ordination of similarity matrices in second-stage MDS plots (Figure 4) showed a typical 'fan' pattern for each investigated dataset, with a vertical and horizontal spread of points at increasing taxonomic aggregation and data transformation, respectively. Low stress values were obtained (< 0.1) in both cases. The effects of aggregation and transformation, however, varied between the two datasets.
For the COM dataset, there were good correlations between ordination plots at the species and genus levels (always r_s > 0.96; p < 0.05) and between the species and family levels (always r_s > 0.88; p < 0.05), whatever the type of transformation considered (Figure 4A). Data points derived from species and genus abundance matrices with the same data transformation tended to overlap, indicating that the similarity matrices were very closely related. Conversely, at higher levels, the similarity progressively decreased (order: r_s > 0.78; p < 0.05, class: r_s > 0.49; p < 0.05, phylum: r_s > 0.29; p < 0.05), in particular for untransformed and square-root transformed matrices (Figure 4A). The pattern of similarity between matrices with different levels of taxonomic aggregation was consistent between different data transformations, as indicated by the third-stage correlation matrix (Table 2); this shows high correlations, in particular between untransformed, square-root and log-transformed second-stage ordination matrices (always r_s > 0.9; p < 0.05). Information loss increased with increasing taxonomic aggregation, but the distances between ordination plots at different taxonomic levels varied with the strength of the data transformation. Correlations between ordination plots at species and genus levels ranged from r_s = 1 for untransformed data to r_s = 0.96 for species-abundance transformation; between species and family levels, these varied from r_s = 0.94 for untransformed data to r_s = 0.89 for species-abundance transformation, indicating a greater information loss as the transformation strength increased. Such a variation was more marked between order-, class- and phylum-level ordination plots, but did not follow a general trend. Correlation between similarity matrices at species and order aggregation varied from r_s = 0.92 (untransformed) to r_s = 0.78 (presence/absence); between species and class aggregation matrices, it varied from r_s = 0.65 (presence/absence) to r_s = 0.49 (square root); and between species and phylum aggregation, it varied from r_s = 0.63 (presence/absence) to r_s = 0.29 (square root). This suggests that, at higher levels of aggregation, information loss was lower at stronger transformations, in particular between class and phylum levels.
For the GOR dataset, the ordination plot showed a clear clustering pattern amongst untransformed similarity matrices at all different taxonomic levels (Figure 4B). The significantly high values of Spearman's correlation between the species similarity matrix and matrices at the higher taxonomic levels (always r_s > 0.95, p < 0.05) indicated that very little information about the general structure of the community was lost going from the species to higher taxonomic levels. However, for this dataset, this pattern was not consistent amongst matrices with different data transformations (Figure 4B), as shown by the low correlation values of the third-stage matrix (always r_s < 0.9; Table 2). The distance between matrices at different taxonomic aggregations increased with increasing data transformation strength. Furthermore, similarity matrices at the genus and family levels displayed a significantly high correlation with matrices at the species level (r_s > 0.90 and r_s > 0.80, respectively; p < 0.05), irrespective of the data transformation. This indicates that reasonably little information was lost between species and genus and family levels, even at the strongest transformation. Conversely, in similarity matrices aggregated at higher taxonomic levels (order, class and phylum), information about the structure of the benthic assemblages markedly decreased, as indicated by the low correlation values with species similarity matrices (order: r_s > 0.63; class: r_s > 0.47; phylum: r_s > 0.50; p < 0.05). Interestingly, for those higher taxonomic levels, the distances from similarity matrices at species levels did not always increase with decreasing taxonomic resolution. In fact, the phylum-level similarity matrices were closer to the species-level matrices than the class-level matrices were, especially with the square-root transformation (Figure 4B).
In the Valli di Comacchio, constrained cluster analysis, based on square-root transformation (Figure 5) at the lowest taxonomic level, yielded 7 main groups, separated by 6 breakpoints between the years: 1997-1998, 1998-1999, 1999-2000, 2002-2003, 2009-2011 and 2013-2014. In the Sacca di Goro LTER site, on the other hand, constrained cluster analysis (Figure 6) on square-root transformed data at the lowest taxonomic level showed 4 main groups, separated by 3 breakpoints between the years: 2005-2006, 2006-2007 and 2008-2009. The results of the PERMANOVAs and PERMDISPs applied at different levels of taxonomic resolution and data transformation on all datasets are summarised in Table 3. For the Valli di Comacchio dataset, PERMANOVA highlighted significant differences (p < 0.05) in macrobenthic assemblages between the groups identified by cluster analysis at the species level. Those differences were significant across all taxonomic levels and data transformations, with only one exception: such differences were not retained at the phylum level with presence/absence transformation. Nevertheless, pairwise comparisons revealed that not all possible combinations of clusters showed significant differences. The number of significant pairwise differences decreased from the species to higher taxonomic levels (Table 3), thereby indicating that the ability of PERMANOVA to discriminate amongst groups decreased. PERMDISP analyses showed that these differences between clusters were due to a combined effect of sample location and dispersion (p < 0.05), with the only exception being presence/absence data, in which no significant differences in dispersion between cluster groups were observed (p > 0.05). Significant differences in dispersion were retained across taxonomic levels up to the phylum level for untransformed data, the order level for square-root transformed data and the family level for log-transformed data.
Regarding the Sacca di Goro dataset, PERMANOVA highlighted significant differences (p < 0.05) in macrobenthic assemblages between groups identified by cluster analysis at the species level when each of the three transformations was used, but not when data remained untransformed. For each of the transformations (square-root, logarithm and presence/absence data), the significance of the differences decreased with increasing taxonomic level, with no significance being detected at the phylum level. Additionally, in this case too, pairwise comparisons revealed that not all possible pairwise combinations of clusters differed significantly and the number of significant pairwise differences decreased from the species to higher taxonomic levels (Table 3). Once again, therefore, the ability of PERMANOVA to discriminate between groups appeared to decrease. For untransformed data, PERMDISP analyses likewise revealed no significant differences between clusters in terms of dispersion (p > 0.05). For other transformations, significant differences between clusters (p < 0.05) were due to a combined effect of location and dispersion. Significant differences in dispersion were retained across taxonomic levels, up to the phylum level, for log-transformed data, the class level for square-root transformed data and the family level for presence/absence data.

Macrobenthic community characteristics at the two LTER sites
The macrobenthic communities at both LTER sites were characterised by reduced richness and diversity (low S and H') and were poorly structured (low J'), as is typical in transitional environments, in particular those of the Po River delta (e.g. Marchini et al. 2008, Munari et al. 2010, Pitacco et al. 2018). Nevertheless, the two LTER sites differed in terms of their hydromorphological features and anthropogenic impact, differences that were reflected in their respective macrobenthic communities. Specifically, at the Valli di Comacchio site, the community showed high variability in annual richness, diversity, equitability and taxonomic complexity, whereas at the Sacca di Goro site, the community, characterised by lower interannual variations, showed higher annual species richness and diversity (S and H'), but at the same time, reduced equi-distribution of taxa (low J') and low taxonomic complexity. From the COM dataset (Valli di Comacchio), a clear decreasing trend of structural indices (S, H' and J') was observed; changes at the community level at this site, in addition to the major factors driving those changes (mainly eel aquaculture and climate change), have been described thoroughly in previous papers (e.g. Mistri et al. 2000, Munari and Mistri 2014b, Munari et al. 2005, Pitacco et al. 2018). As for the GOR dataset (Sacca di Goro), structural indices failed to depict a temporal trend in the macrobenthic community, but changes were revealed through multivariate analyses. Sacca di Goro, as a leaky lagoon, is characterised by a high spatial variability in physicochemical parameters, related to the distance from the rivers and the sea, and by large daily fluctuations linked to the height of the tides (Corbau et al. 2016). Furthermore, in recent years, Sacca di Goro has been subjected to frequent sediment dredging related to fishery activities, as well as restoration and maintenance projects (such as macroalgal removal in the event of blooms) fundamental for the local clam fishery productivity (Corbau et al. 2016). Nonetheless, the macrobenthic population at this site showed high resilience, recovering rapidly (within months) after these types of disturbance (Munari and Mistri 2014a), which could explain the high intra-annual variability of structural indices. The effects of such stressors and restoration actions on the macrobenthic community have been exhaustively described in previous papers (e.g. Corbau et al. 2016, Munari et al. 2006, Munari and Mistri 2014a).

Information loss along taxonomic groups
Information loss, in terms of the percentage of 'α', was reasonably low for both datasets at both the species-to-genus (< 20%) and genus-to-family (< 40%) steps, despite the higher variability of the taxonomic complexity at the COM site. Indeed, the suitability of TS for taxonomically complex communities, as well as for simple, species-poor ones, has also been observed in different habitat types (Bacci et al. 2009, Mistri and Rossi 2001). In fact, the efficiency of TS is not dependent on the number of species belonging to the same genus or family, but instead on their response to disturbance (Dauvin et al. 2003). At the GOR site, the most speciose family, Spionidae, was the only one that could represent a limitation for TS. Although the family itself was classified as tolerant according to the AMBI library (EGIII), it comprised a number of species, not only tolerant (EGIII) but also opportunistic (EGIV and EGV). At the COM site, not only the family Spionidae, but also the family Serpulidae represented limitations for TS, comprising sensitive (EGI), indifferent (EGII) and tolerant species (according to the AMBI library; www.ambi.azti.es). Instead, the family Phyllodocidae, although responsible for high α values at the COM site, comprised only two genera, with almost exclusively similar species and did not, therefore, present a limitation for TS. In all cases, no variation in sensitivity (in terms of AMBI groups) was observed at the genus level, suggesting that the genus level provided a good representation of the response to disturbance. Variations in the general structure of the macrobenthic community (multivariate analyses) were maintained with reasonably low information loss, considering both location and dispersion components, from species to genus and from species to family levels, almost irrespective of the data transformation, in both datasets (as shown by MDS, Spearman's correlation, hierarchical clustering, PERMANOVA and PERMDISP analyses). However, the response to TS differed between the two datasets for higher levels of taxonomic aggregation. In fact, information loss, due to both location and dispersion components, at taxonomic levels higher than the family level was quite high with respect to the species level at both sites. Our results are consistent with investigations performed in a western Mediterranean lagoon, where the ordination models derived from species and family abundances were very similar both in terms of location and dispersion effects, while further aggregation to the class level altered the observed spatial patterns (Tataranni et al. 2009).
Although loss of information about the structure of the benthic assemblages increased with decreasing taxonomic resolution at the COM site, this was not the case for the GOR dataset. In fact, at the GOR site, aggregation at the phylum level yielded better results than aggregation at the class level. Hence, TS using family as a surrogate could be a good compromise between time/costs and efficiency, while genus remains the best surrogate for the identification of temporal variations in the macrobenthic community. That being said, the sufficient level of taxonomic resolution could be strongly context-dependent (Terlizzi et al. 2003) and could change according to the biogeographic background (Roy et al. 1996) or habitat type (Chapman 1998). Moreover, it could vary according to different relationships of abundance and redundancy between species. For instance, Vanderklift et al. (1998) found that, in an Australian marine bay, genus richness closely reflected species richness, whereas family richness described species richness well only for fish assemblages and not for plants. In addition, Warwick et al. (1990) found little information loss related to taxonomic aggregation at the family level in a macrobenthic community from Bermuda, but significant information loss in the meiobenthic community, in particular regarding nematodes.
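The comparison between a species-level matrix and its family-level surrogate can be sketched as follows: abundances are summed within families, Bray-Curtis dissimilarities are computed between samples at each level, and the two sets of dissimilarities are compared with a Spearman rank correlation. All abundances, species labels and family labels are hypothetical, and the helper functions are written here for illustration only; they are not the software used in the study.

```python
from itertools import combinations

# Hypothetical yearly samples (species -> abundance) and a species -> family lookup.
SAMPLES = [
    {"sp1": 10, "sp2": 6, "sp3": 1, "sp4": 0},
    {"sp1": 9, "sp2": 5, "sp3": 2, "sp4": 1},
    {"sp1": 3, "sp2": 2, "sp3": 8, "sp4": 5},
    {"sp1": 1, "sp2": 1, "sp3": 10, "sp4": 7},
]
FAMILY = {"sp1": "FamA", "sp2": "FamA", "sp3": "FamB", "sp4": "FamB"}

def aggregate(samples, taxonomy):
    """Sum species abundances within each higher taxon, sample by sample."""
    result = []
    for sample in samples:
        totals = {}
        for species, count in sample.items():
            taxon = taxonomy[species]
            totals[taxon] = totals.get(taxon, 0) + count
        result.append(totals)
    return result

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance dictionaries."""
    taxa = set(a) | set(b)
    num = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    den = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return num / den if den else 0.0

def pairwise(samples):
    """Dissimilarities for all sample pairs, in a fixed order."""
    pairs = combinations(range(len(samples)), 2)
    return [bray_curtis(samples[i], samples[j]) for i, j in pairs]

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

species_d = pairwise(SAMPLES)
family_d = pairwise(aggregate(SAMPLES, FAMILY))
print("Spearman rho, species vs family:", spearman(species_d, family_d))
```

In this toy dataset, species within each family change in the same direction over time, so the family-level dissimilarities preserve the species-level ranking. If species within a family replaced one another instead, the family totals would stay nearly constant and the correlation would drop.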
A review of the current literature on taxonomic sufficiency (Table 3), performed considering only papers testing the efficiency of higher taxa as surrogates for species in marine macrobenthic invertebrates, showed that, when the entire community was analysed, the family level was recognised as the best surrogate by many authors worldwide. This review revealed that analyses performed at the family level are considered suitable for assessing the response of macrobenthic invertebrates to both anthropogenic impact and natural gradients of environmental variation in most of the published papers. The suitability of the family level has gained consensus worldwide, from polar to tropical regions and from transitional (both estuaries and coastal lagoons) to marine habitats.

Information loss with data transformations
The choice of data transformation required particular attention. It is well known that data transformation can influence the results of subsequent analyses to a similar extent as the choice of taxonomic resolution (Olsgard et al. 1997), but only a small percentage of published papers on TS have analysed this aspect in detail (Table 4). As suggested by the 'fan' shape of the second-stage MDS plots, reported also in other habitat types (e.g. Olsgard et al. 1998; Olsgard and Somerfield 2000), the effects of transformation and taxonomic resolution operate largely independently. In particular, data transformation can critically affect data dispersion and is considered a common remedy for heterogeneity of variance in univariate analyses (Anderson et al. 2008).
In our case, irrespective of data transformation, differences in dispersion amongst clusters remained significant, at least to the family level, in accordance with Tataranni et al. (2009). At higher taxonomic levels, the effect was more evident and changed according to the strength of the data transformation and the dataset considered. The decreasing similarity of ordination matrices with increasing strength of data transformation observed at the two LTER sites clearly indicates that abundance was a key factor in characterising the macrobenthic community. The importance of abundance was further confirmed by the greater information loss with the use of presence/absence transformation at both LTER sites. In particular, at the COM site, the high correlation between second-stage similarity matrices (representing different transformations) indicated that transformation had little effect on the efficiency of TS, although loss of information regarding the general pattern of the community from one taxonomic level to the one above slightly increased with the strength of transformation. The only exception was the strongest transformation (presence/absence), in which the contribution of taxa abundance was null. Conversely, at the GOR site, the correlation between second-stage matrices (representing different transformations) was, in general, very low, indicating a stronger effect of transformation on the efficiency of TS. Interestingly, for the GOR dataset, weak data transformation (i.e. raw data) was sufficient to allow good correspondence amongst similarity matrices at the species level and all taxonomic surrogates up to the phylum level. This suggests that abundance was fundamental in determining temporal changes in the macrobenthic community at the GOR site. Nevertheless, both PERMANOVA and PERMDISP analyses on untransformed data failed to discriminate between groups identified by cluster analysis. For the GOR dataset, therefore, the choice of data transformation was more important to the correct evaluation of the efficiency of TS. Indeed, the populations within transitional waters typically show dramatic seasonal, annual and interannual variations, ranging from disappearance to complete dominance during periods of dystrophic crisis (Arvanitidis et al. 2009). Hence, changes in annual abundance could be biased by seasonal or occasional local variations, with changes in the abundance of the dominant species obscuring all other changes in the macrobenthic community.
Untransformed data are the most commonly used for TS in environmental monitoring (Musco et al. 2011), but in our case, the best results were obtained with square-root data transformation. Our results suggest caution in the use of both untransformed data and strong transformations, at least for transitional environments, particularly in the absence of strong pressures causing marked effects on all aspects of the macrobenthic community, as was the case with the GOR dataset. Indeed, recent publications (Table 4) have shown that the effect of transformations could vary across different environmental conditions, spatial scales and habitat types. For instance, data transformation affected differences between similarity matrices at varying taxonomic resolutions in hard-bottom natural gradients, while this effect was less pronounced in soft-bottom gradients (Bevilacqua et al. 2009). As a consequence, the choice of data transformation should not rely exclusively on the best statistical match between similarity matrices, but should be more of a biological question, as already suggested by previous investigations. A weak transformation gives a narrow view of the community, deeply influenced by the most abundant taxa, whereas a strong transformation yields a wider view of the community, in which all taxa have the same weight, regardless of their relative abundance (Karakassis and Hatziyanni 2000; Mistri and Rossi 2001). Therefore, if changes in the dominant species are considered important, using raw abundance data will provide just such a narrow view of community structure, whereas if all species are considered to be equally important, then a strong transformation is more appropriate (Olsgard and Somerfield 2000). In short, investigators should bear in mind that the choice of data transformation may be affected by the structure of the analysed community, the scale of the changes therein, the taxonomic level considered and the specific objective of each individual investigation.
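The contrast between weak and strong transformations can be made concrete with a small sketch that computes the share of a sample's total (transformed) abundance contributed by its dominant taxon. The abundances are hypothetical; the labels unt, sqr, log and pa follow the abbreviations used in Figure 4, and the use of log(x + 1) for the log transformation is an assumption made here, since the text does not specify the exact form.

```python
import math

# Transformations in order of increasing strength. The log(x + 1) form is an
# assumption for this sketch; the text does not state the exact formula.
TRANSFORMS = {
    "unt": lambda x: float(x),                # untransformed
    "sqr": lambda x: math.sqrt(x),            # square root
    "log": lambda x: math.log(x + 1),         # log(x + 1), assumed form
    "pa": lambda x: 1.0 if x > 0 else 0.0,    # presence/absence
}
ORDER = ["unt", "sqr", "log", "pa"]

# Hypothetical abundances of five taxa in one sample, one strongly dominant.
SAMPLE = [400, 25, 9, 4, 1]

def dominant_share(sample, name):
    """Fraction of the transformed total contributed by the most abundant taxon."""
    transform = TRANSFORMS[name]
    values = [transform(v) for v in sample]
    total = sum(values)
    return max(values) / total if total else 0.0

for name in ORDER:
    print(name, round(dominant_share(SAMPLE, name), 3))
```

In this example the weight of the dominant taxon falls steadily from untransformed data to presence/absence, where every taxon present counts equally. This is the sense in which a weak transformation gives a narrow, dominance-driven view and a strong one a wider view.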

Conclusions
Our results showed that TS could be an efficient tool for long-term monitoring programmes. Our results also showed how LTER observations are critical for detecting meaningful ecological shifts and assessing whether ecological changes are due to human or natural causes. LTER data are particularly important for the identification of temporal trends, as many ecological processes develop at temporal scales that are longer than have typically been considered in traditional short-term ecological research studies. The higher natural variability of environmental conditions, combined with multiple stressors of anthropogenic origin at the two LTER sites analysed, did not represent an impediment for TS in detecting temporal changes in the macrobenthic community. For the two LTER sites analysed, the solution providing the best compromise between time/cost and information loss was a square-root transformation using family data as the taxonomic surrogate. Given the rising importance of long-term data series for detecting trends and changes at community levels, in particular in greatly fluctuating environments such as transitional waters, TS could be a useful method of enabling an increase in sampling frequency, together with a higher spatial resolution, while still reducing costs. Increasing the available information on a temporal scale could also help reduce the bias of seasonal and local variations and, therefore, increase the efficiency of environmental management actions and biodiversity conservation measures.
At the same time, our results showed that the structure of the community and the magnitude of changes influence the efficiency of taxonomic surrogates and data transformations. Therefore, great care in the choice of those aspects of TS is necessary, in particular at sites where the effect of disturbance on community structure is not marked, as was the case for the GOR dataset. The best choice could be a function of environmental conditions, habitat types and biogeographic area. Therefore, care is needed when generalising outcomes in the field of TS, and pilot studies are required to identify the most suitable procedure on a case-by-case basis (Chapman 1998, Pagola-Carte and Saiz-Salinas 2000). Knowledge about the species present and their biology and ecology is still an indispensable prerequisite for defining the most suitable taxonomic level for TS (Terlizzi et al. 2003). Indeed, taxa including many abundant species might contain rare species with key roles in the structure of communities that only fine taxonomic analyses and manipulative experiments may reveal (Mistri et al. 2001). We therefore suggest that analyses at a finer taxonomic level should be performed periodically in routine monitoring programmes in order to provide, case by case, baseline knowledge for interpreting changes in communities and to check the efficiency of taxonomic substitutes and data transformation in monitoring programmes. Such baseline knowledge seems particularly important for long-term monitoring in lagoon systems affected by several natural and anthropogenic factors.

Figure 1. Map of the studied sites.

Figure 2. Variation in diversity indices across the study period at the two LTER sites. S: species richness, H': Shannon diversity index, J': Pielou evenness index.

Figure 4. Second-stage MDS ordination of resemblance matrices derived from species, genus, family, order and phylum abundance data at the two LTER sites: Valli di Comacchio lagoon (A) and Sacca di Goro lagoon (B). unt: untransformed data, sqr: square-root transformed, log: log-transformed, pa: presence/absence data.

Figure 5. Constrained cluster analysis of macrobenthic data on Valli di Comacchio lagoon, aggregated to different taxonomic levels.

Figure 6. Constrained cluster analysis of macrobenthic data on Sacca di Goro lagoon, aggregated to different taxonomic levels.

Table 1. Physicochemical composition of the two LTER sites.

Columns: Valli di Comacchio, Sacca di Goro. First row: Extension (km²).
The information loss between genus and family levels was highest (26%) in 2008 and lowest (14%) in 2005; at the family-order level, it was highest (35%) in 2009 and lowest (21%) in 2005; at the order-class level, it was highest (39%) in 2005 and lowest (16%) in 2009; and at the class-phylum level, it varied from 5% in 2008 to 8% in 2004 and 2007.

Table 2. Spearman correlations (r_s) resulting from the third-stage correlation matrix, showing the effect of data transformation on differences between aggregation matrices.

Table 3. Significance of cluster groups (PERMANOVA), percentage of significant pairwise combinations between those groups and the significance of differences in dispersion between cluster groups (PERMDISP).