Corresponding author: Tom J. M. Van Dooren (
Academic editor: K. Henle
Changes in pollinator abundances and diversity are of major concern. A recent study inferred that pollinator species richnesses are decreasing more slowly in recent decades in several taxa and European countries. A more careful interpretation of these results reveals that this conclusion cannot be drawn and that we can only infer that declines decelerate for bees (
Van Dooren TJM (2016) Pollinator species richness: Are the declines slowing down?. Nature Conservation 15: 11–22. doi:
In a recent paper investigating different plant and pollinator groups in three countries,
European Union
(EU) policy could have played a role in the effect they infer. Ambitions in terms of biodiversity management seem reduced in recentThe analysis in CA2013 is based on comparisons of species accumulation curves (
Species accumulation curves. Species richness is the asymptote of a species accumulation curve, which expresses the dependence on sampling effort of the number of species sampled from an assemblage. In CA2013, sampling effort is given by the number of records from which the number of species is calculated. For illustrative purposes, an example with three arbitrary samples (for 10000, 5000 and 2000 records, labeled from one to three) is drawn. For sample one, a predicted species accumulation curve is added that gradually increases from one species sampled to the predicted species richness for that assemblage (full line). Such curves are constructed on the basis of interpolation and extrapolation. For samples two and three only segments of extrapolated curves are drawn (dotted lines). For sample two, a curve that crosses the species accumulation curve of sample one is sketched. For samples one and three species accumulation curves are more or less proportional. The way in which the species richness differences between samples are assessed in CA2013 is illustrated by indicating on the species accumulation curves at which numbers of records pairwise comparisons would be made between two sample pairs (1 vs. 2 and 1 vs. 3). The number of species of the sample with the smallest number of records is extrapolated to the number expected at three times the number of records. When the number of records of the other sample is still larger than that, the number of species of the second sample is interpolated (rarefied), otherwise it is extrapolated as well.
Statements in CA2013 supporting a slowing down of species richness decline. The column with spatial scales to which a statement applies lists either the grid sizes (as length of a grid side in km) or the country level. $: The changes on Figure
Species group/Country  Statement on the change between the two most recent periods  Spatial scale 

Non 

Belgium  –  – 
Great Britain  Significant increase  10/20/40/80/160 
Netherlands  Significant increase  10/20/40/80 
Belgium  –  – 
Great Britain  Declines less accentuated  – 
Great Britain  Significant increase  Country level 
Netherlands  Declines less accentuated  
Belgium  –  – 
Great Britain  –  – 
Netherlands  Declines less accentuated  10/20 
Netherlands  Significant increase^{$}  10/20 
Belgium  Significant increase^{*}  – 
Great Britain  –  – 
Netherlands  –  – 
Belgium  –  – 
Great Britain  Recovery of species richness  10/20/40 
Netherlands  Recovery of species richness  Country level 
Note that species richness was nowhere estimated in CA2013, rather numbers of species at particular finite sampling efforts were used as proxies for species richness.
As Table
While appropriate tests for a slowing down of richness decline are lacking in CA2013, we can still check ourselves whether confidence intervals for changes overlap between interval pairs, and conclude on the significance of decelerations in species richness from the absence of overlaps.
I assess limits of confidence intervals in tables and figures of CA2013 to construct tests for a significant deceleration in richness decline. When these numbers are provided, limits of intervals are calculated from parameter estimates and their standard deviations using the normal approximation for 95% intervals (
I believe that the parallel analyses in CA2013 with different estimators of species number variance, for rarefaction and for extrapolation are to some extent a valid way of assessing the robustness of the inference. Comparing results when differences are predicted at different numbers of records implicitly checks whether crossing species accumulation curves might be present. With such crossings, the sign of species number differences will depend on the number of records where the difference is assessed (Fig.
Unfortunately, the robustness assessment in CA2013 is affected by anticonservative inference, duplications of statistics and errors as explained in the following sections, such that a number of assessments are removed from consideration. This makes the assessment of robustness more limited than originally intended.
In the reassessment, I will give less importance to results at smaller spatial scales than at country level. First of all, the conclusion reinvestigated here was formulated at country level. Second, for comparisons at smaller scales sample sizes are smaller. Hence the risk that predicted species numbers badly reflect species richness can be increased. Third, there is no guarantee and no evidence that the local grid cells compared between the first two and the last two periods are the same or samples from sets with identical properties. Fourth, I will show below that regression corrections carried out in CA2013 for the smaller spatial scales carry an additional risk of anticonservative inference and bias.
In part of their calculations, CA2013 have used a bootstrap estimate of the variance of predicted species number. After inspection of the R code used, it turns out that their estimate is neither bootstrap variance nor bootstrap accuracy (e.g., Walther and Moore 2005), so not a regular bootstrap estimate of variance. In their paper as published before correction, they sum the absolute value of the bootstrap estimated bias and the squared bootstrap average of
CA2013’s script contained a calculation error: the bias on species number in the earlier period is used in calculations where it should have been that of the later period. In a modified script distributed with their corrigendum (
The unweighted tests in Table S5 of CA2013, where we expect standard deviations to be calculated automatically from the data variances, all use averages of the bootstrap standard deviations as weights. They are therefore all weighted in an unexpected manner and should not be considered.
In Table S5 of CA2013, the three listed tests per taxon/country for effects at the national scale are in fact always the same test pasted in three times, namely the test based on a bootstrapped variance estimate. The R script of the authors does not contain any calculations for nonbootstrap weighted tests for national data, such that there is no robustness assessment possible. An unweighted test is impossible to carry out at the national level, as there are too few values to calculate data variances per species group/country. I am forced to rely solely on CA2013’s bootstrap statistics for the countrylevel comparisons. Statistics in Tables S2 and S5 of CA2013 and the supplementary figures have not been adapted to the new heuristic to estimate bootstrap standard deviation, we therefore need to inspect the corrected Figure
Species number statistics. CA2013 presents standard deviations of species number change estimated in different ways and at different sampling efforts (
: not available
). Each of these statistics available from the paper or its figures is calculated in an unexpected manner. Categories where remarks are in bold are used for the reassessment.Unweighted standard deviation  Analytical standard deviation  Bootstrapped standard deviation  

Rarefaction (Interpolation)  
Extrapolation + Interpolation  Bootstrapped standard deviation  
Extrapolation 
CA2013 assume that their response variable (logratio of predicted species numbers at a particular sampling effort) is directly comparable between different sampling efforts. This is only expected when nonlinear species accumulation curves for different periods are proportional and will often not hold good when these curves for example cross (Fig.
CA2013 call it a bias that their response variable often becomes larger with larger differences in data sample sizes between periods. To correct for this presumed bias, they included the difference of the logged numbers of records in the two time periods as covariate in the random effects models. That the difference in species number often becomes larger with larger differences in sample sizes is an absolutely normal pattern if species accumulation curves are not proportional and sample sizes where species numbers are interpolated are not randomly distributed (for example consistently smaller in one period). It is not difficult to sketch a pair of species accumulation curves where unbiased estimation and a nonrandom set of sample sizes per period would produce this pattern. Applying the proposed regression correction here would distort the pattern of real differences. On the other hand, we also need to check whether it removes estimation bias when differences are calculated between different samples from the same assemblage.
Simulating data from a single species accumulation curve based on the EIS bee data, I found that the “regression correction” approach CA2013 used in their analysis of species richness change patterns at spatial resolutions smaller than the national scale does not remove bias on the intercept. When the ratio of sample sizes between periods is on average different from one in these simulations, tests on the intercept or on the partial residuals used in this regression approach too often conclude that species accumulation curves have changed. These tests have anticonservative amounts of type I errors and also show estimation bias. When the sample in the second period is on average larger (smaller), the intercept is negative (resp. positive).
Additionally, the variances of partial residuals used for “corrected” pergrid tests in CA2013 were not taking variances of estimated regression parameters into account neither the random effect variance of the model fitted (
There are further irregularities in the analysis of CA2013. In the functions used to predict species accumulation curves, which I originally wrote, zero values have been replaced by positive numbers, and where functions should produce missing values or zeroes, shortcuts have been inserted that return other values. The threshold for the
From the comparisons of confidence intervals extracted from CA2013’s corrected Figure
A reassessment of results in CA2013. Confidence intervals are extracted and calculated from figures and tables in CA2013. Cells with value “1” indicate taxa and countries where a significant decline in number of species between the two first periods is followed by a change in species number between the two most recent periods, which is significantly less negative (with nonoverlapping 95% confidence intervals). Columns “nat” indicate comparisons at the national level and are in bold. I attach a larger importance to them as explained in the text. “
Spatial scale  Figure 
Table S5 NonBootstrap Weights  Supplementary Figures  

Extrapolation  Rarefaction  

160  80  40  20  10  160  80  40  20  10  nat  10  nat  10  
Non 

Belgium  0  0  0  0  0  0  0  0  0  
Great Britain  0  0  0  0  0  0  0  0  0  0  0  0  
Netherlands  0  1  1  1  0  1  1  1  1  1  
Bumblebees  
Belgium  1  0  0  0  0  
Great Britain  0  1  0  0  0  0  0  0  0  0  0  
Netherlands  1  1  1  1  0  1  1  1  1  
Butterflies  
Belgium  0  0  0  0  0  
Great Britain  0  0  0  0  0  0  0  0  0  0  0  0  
Netherlands  0  0  1  1  0  0  1  1  1  0  
Hoverflies  
Belgium  0  0  0  0  0  0  0  0  0  0  
Great Britain  0  0  0  0  0  0  0  0  0  0  0  0  
Netherlands  0  0  0  0  0  0  0  0  0  0  
Plants with Intermediate Dependence on Insects  
Belgium  0  0  0  0  0  0  0  0  0  1  
Great Britain  0  0  1  1  1  0  0  1  1  1  0  1  
Netherlands  0  0  0  0  0  0  0  0  0  0  
Plants Independent of Insects  
Belgium  0  0  0  0  0  0  0  0  0  1  
Great Britain  0  0  0  1  1  0  0  0  1  1  0  1  
Netherlands  0  0  0  0  0  0  0  0  0  0  
Plants Dependent on Insects  
Belgium  0  0  0  0  0  0  0  1  0  1  
Great Britain  0  0  1  1  1  0  0  1  1  1  0  1  
Netherlands  0  0  0  0  0  0  0  0  0  0 
For bumblebees and other bees in the Netherlands, the pattern is found in Figure
Taken together, the inference in CA2013 only provides robust inference of a slowing down of species richness decline in two out of fifteen taxon/country combinations. This is in fact one taxon, the bees
Table
Alternatively, we could forgo trying to draw conclusions on the separate species groups and countries altogether and just inspect estimates and the relative occurrence among them of species number decreases between the first two periods, and whether the last periods would often have increased values for species number change relative to the previous period. Such an approach would be akin to CA2013’s statements that estimates have become less accentuated. Please note that such a procedure would be sensitive to estimation bias. At the same time, we would assume that all data heterogeneity can be ignored. Note that this inspection can easily generate bias itself: We should not select a subset of data points with negative changes between the first periods to assess further. In the case of independent changes between pairs of periods, just inspecting groups with the smallest values between the first pair of periods would bias the estimates for the second pair of periods to be larger than the first more often. Neither should we use all estimates provided in CA2013, as estimates for different spatial scales on the same group are not independent but recalculations on the same data. If we count the changes at the national level from the corrected Figure
My reassessment and the brief discussion of what a simplified analysis and hypothesis testing scheme would provide do accept the inference method based on extrapolating and rarefying species numbers as valid. One can argue against that. Species richnesses were not estimated in CA2013, and the paper did not provide statistics that allowed tests at the country level without using bootstrap estimates of standard deviations. The time period was arbitrarily binned in three time intervals. If declines and decelerations occur, they don’t have to be synchronous across taxa and countries and match with the time intervals. Moreover,
I thank Menno Reemer for providing the EIS (Invertebrate Survey Netherlands) bee dataset. An anonymous peer provided extremely useful comments through Peerage of Science.