Data Paper
Corresponding author: Florencia Grattarola (flograttarola@gmail.com)
Academic editor: William Magnusson
© 2025 Florencia Grattarola, Kateřina Tschernosterová, Petr Keil.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Grattarola F, Tschernosterová K, Keil P (2025) MIAU: An analysis-ready dataset on presence-only and presence-absence data of Neotropical carnivores (Mammalia, Carnivora) from 2000 to 2021. Nature Conservation 58: 11-30. https://doi.org/10.3897/natureconservation.58.140644
In the last decade, databases of records of species observed at the same location at different points in time over large spatial extents have been made available. Unfortunately, these sources are scarce in regions such as Latin America. We present a dataset of 60,179 point occurrences (i.e. presence-only data, PO) and 45,468 camera-trap survey records (i.e. presence-absence data, PA) for 63 species of carnivores of the Neotropical Region from 2000 to 2021. We collated the data from various sources, including 64 newly-digitised bibliographic references. We cleaned, taxonomically harmonised and standardised the data following the Darwin Core and Humboldt Core standards and present them here as csv files. We have also made these data fit for analyses by aggregating the data into two time periods (time1: 2000–2013 and time2: 2014–2021), with PO grid cell counts of 100 × 100 km and PA polygons of varying size, presented as geopackage files. These data can be used for large-scale species distribution models, calculation of population trends, extinction risk analyses and educational purposes.
Camera trap, data deficiency, Latin America, point occurrence, species distribution models
To understand and monitor global biodiversity change over time, we need data on species distributions spanning long time periods and large spatial areas (
To study changes in the distribution of continental-wide species in such data-scarce areas, there are two options. First, we can gather data from scratch, but this is challenging at large scales. Alternatively, we can rescue and collate multiple already-published sources and digitise, clean, harmonise and standardise them for reuse (
An important lesson from these analyses and also from other authors (e.g.
Here, we present MIAU, a ready-to-use, cleaned, continental-wide dataset of presence-only and presence-absence data on Neotropical carnivores. We expect the dataset to be useful for many purposes, such as large-scale species distribution models, calculation of population trends and extinction risk analyses, ultimately providing information for large-scale conservation of one of the most charismatic taxonomic groups in the Neotropics.
We compiled data from different sources (Table
Data sources in our dataset, including the source type, number of datasets involved, data type, number of species and number of records they span.
Source | Source type | Datasets involved | Data type | Number of species | Number of records |
---|---|---|---|---|---|
| online database | 434 | presence-only | 59 | 56,413 |
| data paper | 105 | presence-only | 31 | 3,766 |
| data paper | 207 | presence-absence | 45 | 34,784 |
Literature sources processed in this study (Suppl. material | literature | 64 | presence-absence | 40 | 10,684 |
List of species covered by our dataset, including family and the number of presence-only (PO) and presence-absence (PA) records (only reported presences).
Species | Family | Number of PO records | Number of PA records |
---|---|---|---|
Atelocynus microtis | Canidae | 41 | 290 |
Canis latrans | Canidae | 2288 | 89 |
Canis lupus | Canidae | 241 | 0 |
Cerdocyon thous | Canidae | 4332 | 4065 |
Chrysocyon brachyurus | Canidae | 575 | 480 |
Lycalopex culpaeus | Canidae | 8561 | 41 |
Lycalopex fulvipes | Canidae | 72 | 0 |
Lycalopex grisea | Canidae | 80 | 11 |
Lycalopex gymnocerca | Canidae | 2087 | 508 |
Lycalopex sechurae | Canidae | 13 | 34 |
Lycalopex vetula | Canidae | 91 | 127 |
Speothos venaticus | Canidae | 60 | 64 |
Urocyon cinereoargenteus | Canidae | 2868 | 204 |
Vulpes macrotis | Canidae | 132 | 0 |
Herpailurus yagouaroundi | Felidae | 1171 | 787 |
Leopardus colocola | Felidae | 183 | 1 |
Leopardus geoffroyi | Felidae | 302 | 249 |
Leopardus guigna | Felidae | 509 | 6 |
Leopardus guttulus | Felidae | 34 | 742 |
Leopardus jacobitus | Felidae | 1 | 0 |
Leopardus pajeros | Felidae | 8 | 0 |
Leopardus pardalis | Felidae | 3371 | 4242 |
Leopardus tigrinus | Felidae | 379 | 224 |
Leopardus wiedii | Felidae | 471 | 918 |
Lynx rufus | Felidae | 1055 | 56 |
Panthera onca | Felidae | 1230 | 2074 |
Puma concolor | Felidae | 3999 | 2478 |
Conepatus chinga | Mephitidae | 629 | 189 |
Conepatus leuconotus | Mephitidae | 406 | 71 |
Conepatus semistriatus | Mephitidae | 578 | 380 |
Mephitis macroura | Mephitidae | 569 | 28 |
Mephitis mephitis | Mephitidae | 210 | 0 |
Spilogale angustifrons | Mephitidae | 161 | 6 |
Spilogale gracilis | Mephitidae | 106 | 11 |
Spilogale pygmaea | Mephitidae | 16 | 3 |
Eira barbara | Mustelidae | 2957 | 2223 |
Galictis cuja | Mustelidae | 374 | 101 |
Galictis vittata | Mustelidae | 196 | 44 |
Lontra canadensis | Mustelidae | 1 | 0 |
Lontra longicaudis | Mustelidae | 959 | 86 |
Lontra provocax | Mustelidae | 81 | 1 |
Neogale felipei | Mustelidae | 1 | 0 |
Neogale frenata | Mustelidae | 251 | 26 |
Pteronura brasiliensis | Mustelidae | 272 | 22 |
Taxidea taxus | Mustelidae | 154 | 0 |
Bassaricyon alleni | Procyonidae | 32 | 1 |
Bassaricyon gabbii | Procyonidae | 53 | 0 |
Bassaricyon medius | Procyonidae | 11 | 0 |
Bassaricyon neblina | Procyonidae | 26 | 0 |
Bassariscus astutus | Procyonidae | 1113 | 40 |
Bassariscus sumichrasti | Procyonidae | 63 | 1 |
Nasua narica | Procyonidae | 5127 | 303 |
Nasua nasua | Procyonidae | 2620 | 2908 |
Nasua olivacea | Procyonidae | 275 | 3 |
Potos flavus | Procyonidae | 815 | 17 |
Procyon cancrivorus | Procyonidae | 2681 | 1661 |
Procyon lotor | Procyonidae | 2098 | 120 |
Procyon pygmaeus | Procyonidae | 1 | 0 |
Tremarctos ornatus | Ursidae | 2753 | 26 |
Ursus americanus | Ursidae | 436 | 22 |
This dataset was generated to study the range dynamics of eight Neotropical carnivores (
We extracted the PA data from two main data sources. The first was the database of
The
For the literature data extraction, we explored 262 potential sources and kept 64 (see ‘Data availability: source data’). The retained sources were camera-trap studies in the Neotropics that were performed from the 2000s onwards, reported all surveyed species and stated the sampling effort and the study area. We excluded studies that focused exclusively on arboreal species, that reported only some focal species and discarded others, or that combined sampling methods in such a way that the effort in camera-trap days to detect a species could not be extracted or calculated. We further excluded 22 studies that were duplicated sources and did not digitise 18 studies located in areas where we already had sufficient data. For all studies, we report the presence/absence of the species under ‘presence’. For those that included an abundance metric, we report it under the ‘abundance’ column and report the abundance units used in ‘abundanceUnits’ (e.g. NOIR: number of individual records, RAI: relative abundance index, i.e. number of records per trap effort, AI/month: abundance index per month). Digitising the literature data was a major challenge because the sources reported the spatial information, sampling effort and sampling period of the studies in very heterogeneous and incomplete ways and often provided only aggregated information rather than the primary data. Therefore, as we often had to estimate or calculate these values, we report the origin of the information about effort, area of study and time span in specific columns. The column ‘areaOrigin’ refers to whether the area was given in the article, estimated from information provided in the article or calculated by manually georeferencing the study area or extracting the information from WDPA (
We extracted the PO data from two main data sources: the GBIF (2024) database (56,413 records) and the
We extracted occurrence records from the GBIF database (
We complemented the GBIF data with records from the
The main goal of creating this analysis-ready data was to use it in the studies
Summary of the analysis-ready data, including the data type, spatial features and spatial and temporal resolution.
Data type | Spatial features | Spatial resolution | Temporal resolution |
---|---|---|---|
Presence-only (PO) | 2,265 grid cells with counts per species | 100 × 100 km | 2 time periods (2000 to 2013 and 2014 to 2021) |
Presence-absence (PA) | 565 polygons with presence/absence values for each species | Varying sizes | 2 time periods (2000 to 2013 and 2014 to 2021) |
Spatial features (geometries) of the analysis-ready (a) presence-only data (100 × 100 km grid cells) and (b) presence-absence data (aggregated polygons of varying sizes). PA polygons were buffered by 20 km to improve visibility.
The PO data consist of 100 × 100 km grid cells with counts for each species in the two time periods. We provide the code to generate the grid-cells, which can be adapted to any other preferred size and temporal extension.
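The gridding step described above can be illustrated with a minimal Python sketch. It is not the authors' code: it assumes the point records are already in a projected coordinate system with units in metres, and the record layout `(species, x, y, year)` and the function names are hypothetical. The cell size and the two time periods follow the values stated in the text.

```python
from collections import Counter
from math import floor

CELL_SIZE_M = 100_000  # 100 km grid cells, as in the analysis-ready PO data


def time_period(year):
    """Map an observation year to the two periods used in the dataset."""
    if 2000 <= year <= 2013:
        return "time1"
    if 2014 <= year <= 2021:
        return "time2"
    return None  # outside the temporal coverage


def grid_counts(records, cell_size=CELL_SIZE_M):
    """Count records per (grid cell, species, period).

    `records` is an iterable of (species, x, y, year) tuples; x and y are
    assumed to be projected coordinates in metres (hypothetical layout).
    """
    counts = Counter()
    for species, x, y, year in records:
        period = time_period(year)
        if period is None:
            continue  # skip records outside 2000-2021
        # Integer column/row index of the cell that contains the point.
        cell = (floor(x / cell_size), floor(y / cell_size))
        counts[(cell, species, period)] += 1
    return counts


# Made-up example records, for illustration only.
records = [
    ("Puma concolor", 150_000.0, 20_000.0, 2005),
    ("Puma concolor", 199_999.0, 99_000.0, 2010),
    ("Puma concolor", 150_000.0, 20_000.0, 2018),
    ("Eira barbara", -50_000.0, 20_000.0, 2021),
    ("Eira barbara", -50_000.0, 20_000.0, 1999),
]
counts = grid_counts(records)
```

Changing `cell_size` or the year boundaries in `time_period` mirrors the adaptation to other grid sizes and temporal extents mentioned in the text.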
The PA data consist of polygons of varying sizes that aggregate camera-trap studies per time period. To create them, we generated a buffer polygon for each survey using the latitude and longitude of the survey as centroid and the study area as buffer. Then, all overlapping polygons were combined and absences were generated for each species in those polygons where the species was not recorded. For each polygon at each time period, we calculated the total surface area, the timespan, the effort in camera-trap days and the aggregated presence for each species. We also provide the code to generate the polygons, so that the data can be split into any other preferred time periods (e.g. biannual or every five years).
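The aggregation logic for a single time period can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: coordinates are treated as planar and in kilometres, each buffer is taken to be a circle whose area equals the reported study area, and the dictionary keys (`x`, `y`, `area_km2`, `effort`, `species`) are hypothetical. Real geometries would need a GIS library to compute the merged surface area, which this sketch omits.

```python
from math import hypot, pi, sqrt


def buffer_radius_km(area_km2):
    """Radius of a circle whose area equals the reported study area."""
    return sqrt(area_km2 / pi)


def merge_surveys(surveys):
    """Group surveys whose circular buffers overlap, using union-find."""
    parent = list(range(len(surveys)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, a in enumerate(surveys):
        for j in range(i + 1, len(surveys)):
            b = surveys[j]
            dist = hypot(a["x"] - b["x"], a["y"] - b["y"])
            # Two circles overlap when their centres are closer than the
            # sum of their radii.
            if dist < buffer_radius_km(a["area_km2"]) + buffer_radius_km(b["area_km2"]):
                parent[find(i)] = find(j)

    groups = {}
    for i, survey in enumerate(surveys):
        groups.setdefault(find(i), []).append(survey)
    return list(groups.values())


def presence_absence(groups, all_species):
    """Sum effort per group and record 0 for species that were not detected."""
    out = []
    for group in groups:
        detected = set().union(*(s["species"] for s in group))
        out.append({
            "effort": sum(s["effort"] for s in group),
            "presence": {sp: int(sp in detected) for sp in sorted(all_species)},
        })
    return out


# Made-up surveys, for illustration only.
surveys = [
    {"x": 0, "y": 0, "area_km2": 100, "effort": 500, "species": {"Puma concolor"}},
    {"x": 8, "y": 0, "area_km2": 100, "effort": 300, "species": {"Eira barbara"}},
    {"x": 100, "y": 100, "area_km2": 50, "effort": 200, "species": {"Puma concolor"}},
]
all_species = {"Puma concolor", "Eira barbara", "Panthera onca"}
result = presence_absence(merge_surveys(surveys), all_species)
```

In this example the first two surveys overlap and are merged into one unit with a combined effort of 800 camera-trap days, while the third remains separate; species not detected in a unit receive an absence.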
The data cover the entire Neotropical Region, spanning 26 countries from Central to South America (Fig.
The dataset includes a total of 63 species of carnivores native to the Neotropics (Fig.
Number of records (a) and species (b) for the presence-only (PO) data and number of records (c) and species (d) for the presence-absence (PA) data in our dataset. Greyscale shading indicates the carnivore family.
The following species are not included in our dataset: Leopardus braccatus, L. fasciatus and L. garleppi (from the Leopardus colocola complex), L. narinensis and L. emiliae (from the Leopardus tigrinus complex), Spilogale interrupta, S. leucoparia and S. yucatanensis and Enhydra lutris. The following species are poorly covered by our dataset (only a few records are included): Leopardus pajeros and L. colocola (from the Leopardus colocola complex), Leopardus jacobita, Lyncodon patagonicus, Lontra canadensis, Neogale felipei and N. africana. This is because most of these species are distributed in either Mexico or Argentina (the northernmost and southernmost countries of the Neotropics) and they are not exclusive to the Neotropics or abundant in this region. Other species, such as those in the pampas cat (Leopardus colocola) species complex and the tiger cat (Leopardus tigrinus) species complex, have gone through several recent taxonomic changes and rediscoveries (
The records were observed between 01-01-2000 and 31-12-2021, with a more intense effort over the last five years (Fig.
Some countries are poorly covered in our dataset (i.e. have fewer records than expected), while others are well-covered (Fig.
Most PO records come from the second time period. This is characteristic of the data made available on GBIF (i.e. an artefact) and not a real difference in species abundance over time. In
Although we cover 88.7% of the species recorded in Neotropical countries (63 out of 71), for those species that are not exclusively distributed there (i.e. they are primarily distributed in the Nearctic Region,
As the project has ended, we do not plan to update the dataset soon. However, with our data structure description, detailed data cleaning and standardisation workflow and code available, we encourage future users to update the dataset as needed.
Citation: Grattarola F, Tschernosterová K, Keil P (2024) MIAU: An analysis-ready dataset on presence-only and presence-absence data of Neotropical carnivores (Mammalia, Carnivora) from 2000 to 2021; Zenodo; https://doi.org/10.5281/zenodo.14278694. [Dataset].
If you use our underlying data, please cite the source data as well.
Licence: Data are available under the terms of the Creative Commons Attribution 4.0 International licence CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/legalcode.en).
Citation: Grattarola F, Tschernosterová K, Keil P (2024) MIAU: An analysis-ready dataset on presence-only and presence-absence data of Neotropical carnivores (Mammalia, Carnivora) from 2000 to 2021; Zenodo; https://doi.org/10.5281/zenodo.14278694. [Code].
Licence: Code is available under the terms of the GPL-3.0 licence (https://www.gnu.org/licenses/gpl-3.0.html).
Thanks to Diego Alejandro Torres (Universidad de Caldas, Colombia), Marcelo Magioli (Instituto Pró-Carnívoros and Instituto Chico Mendes de Conservação da Biodiversidade, Brazil), Daniel Renison (Universidad Nacional de Córdoba, Argentina), Alexandra Cravino (Universidad de la República, Uruguay), Paul E. Ouboter (Institute for Neotropical Wildlife and Environmental Studies, Suriname) and María Florencia Aranguren (Universidad Nacional del Centro de la Provincia de Buenos Aires, Argentina) for providing extra information and useful comments on their records published in the literature.
The authors have declared that no competing interests exist.
No ethical statement was reported.
This work was funded by the European Union (ERC, BEAST, 101044740). Views and opinions expressed are, however, those of the author(s) only and do not necessarily reflect those of the European Union or the European Research Council Executive Agency. Neither the European Union nor the granting authority can be held responsible for them. The funder had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
FG conceptualised the work, designed the methodology, implemented the computer code, did the data standardisation and validation, prepared the data visualisation, supervised the data digitisation work and wrote the original draft. Literature data collation and digitisation was led by KT. FG and PK did the project administration. PK acquired funding. FG, KT and PK reviewed and edited the manuscript. The authors’ contributions to the scholarly output followed the ‘Contributor Roles Taxonomy’ (CRediT; https://credit.niso.org/).
Florencia Grattarola https://orcid.org/0000-0001-8282-5732
Kateřina Tschernosterová https://orcid.org/0009-0002-8097-8836
Petr Keil https://orcid.org/0000-0003-3017-1858
All of the data that support the findings of this study are available in the main text or Supplementary Information.
Complete list of the 64 digitised literature sources
Data type: csv
Explanation note: See https://doi.org/10.5281/zenodo.14278694 for a BibTeX with the bibliographical database file.
Supplementary information
Data type: docx
Explanation note: Table S2. Density of presence-only (PO) and presence-absence (PA) surveys per country. Figure S1. Number of presence-only (PO) and presence-absence (PA) records per country and density of records and surveys per area (1/1,000 km²).