TY - JOUR
T1 - Sampling bias correction in species distribution models by quasi-linear Poisson point process
AU - Komori, Osamu
AU - Eguchi, Shinto
AU - Saigusa, Yusuke
AU - Kusumoto, Buntarou
AU - Kubota, Yasuhiro
N1 - Funding Information:
Financial support was provided by the Japan Society for the Promotion of Science KAKENHI Grant Numbers JP15H04424 and JP18H03211; the Environment Research and Technology Development Fund of the Ministry of the Environment, Japan (4-1501 and 4-1802); and the Program for Advancing Strategic International Networks to Accelerate the Circulation of Talented Researchers of the Japan Society for the Promotion of Science. Datasets for the representative species used in this analysis are available in the R package qPPP, which also includes all program code needed to reproduce the results of the data analysis. OK wrote the first draft of the manuscript; OK and YS analyzed the data; SE and OK proposed the methodology; and BK and YK edited the final version of the manuscript.
Publisher Copyright:
© 2019 The Author(s)
PY - 2020/1
Y1 - 2020/1
AB - Species distribution modeling plays an essential role in ecology for investigating habitat suitability based on the relationship between species occurrences and environmental conditions. Presence-only data for organisms are usually assumed to be sampled randomly from the region of interest; in practice, however, they are often biased toward easily accessed areas, which adversely affects prediction accuracy. To address this sampling bias problem in the prediction and model fitting of habitat distributions, we propose a new Poisson point process (PPP) model, called the quasi PPP, in which the environmental effect and the sampling bias are explicitly modeled in separate clusters within a framework of quasi-linear modeling. Quasi-linear modeling is designed to capture homogeneity within clusters and heterogeneity between clusters in order to improve the estimation accuracy of species distributions. The proposed model includes conventional models such as thinned and superposed PPPs as special cases. We found that the quasi PPP outperforms the other existing methods in terms of goodness of fit. We also propose a statistical index, based on quasi-linear modeling, that measures how strongly the presence-only data used to estimate a species habitat distribution are affected by sampling bias. The utility of the quasi PPP is illustrated using simulation studies and comprehensive vascular plant data from Japan. The proposed model flexibly incorporates the effect of sampling bias to improve the estimation accuracy of species distributions. The results of the data analysis are easily reproducible, and the method can readily be applied to other data sets using the R package qPPP.
UR - http://www.scopus.com/inward/record.url?scp=85074632758&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85074632758&partnerID=8YFLogxK
U2 - 10.1016/j.ecoinf.2019.101015
DO - 10.1016/j.ecoinf.2019.101015
M3 - Article
AN - SCOPUS:85074632758
SN - 1574-9541
VL - 55
JO - Ecological Informatics
JF - Ecological Informatics
M1 - 101015
ER -