# Stat-Sum

Here you can find lay summaries of various statistical issues that behavioral ecologists encounter. These materials are pointers to the primary literature (e.g. scientific papers or conferences).

- How to submit material to Stat-Sum?
- Just click on the "add new..." tab and select the most appropriate format...
- Courses by Paul D. Allison
- Missing Data (March 8-9, Lyon, France; April 23-24, Philadelphia); Longitudinal Data Analysis (May 7-8, Washington, DC, using Stata; May 21-22, Los Angeles, using SAS); Categorical Data Analysis (July 12-16, Philadelphia); Event History & Survival Analysis (July 19-23, Philadelphia).
- Carrascal et al. (2009). Partial least squares regression as an alternative to current regression methods used in ecology. Oikos, 118: 681-690.
- Behavioral ecologists must often deal with sets of predictor variables that are large relative to sample size, especially in experimental studies where sample size is limited by methodological difficulties. In addition, these predictor variables are often highly correlated with one another. These problems have traditionally been solved by applying multiple regression (MR) after variable-selection or multivariate reduction methods, or by performing best-subsets techniques. However, removing some variables prior to the analysis may lead to the loss of some variability that, although less important than that captured by other variables, could still be biologically significant. Multivariate reduction methods also present problems, as they reduce the initial variability in the predictor variables independently of their covariation with the response variable. Lastly, best-subsets techniques may conclude that several models are equally probable. Partial least squares regression (PLSR) is an alternative to these methods because: 1) PLSR models explain a similar amount of variance to MR and to MR after principal components analysis (a multivariate reduction method), and 2) PLSR is more reliable than other techniques at identifying relevant variables and their magnitudes of influence, especially in cases of small sample size and high correlation between predictor variables.
- Murtaugh (2009). Performance of several variable-selection methods applied to real ecological data. Ecology Letters, 12:1061-1068.
- The author evaluated the predictive ability of statistical models obtained by applying seven methods of variable selection to 12 ecological and environmental data sets. Cross-validation, involving repeated splits of each data set into training and validation subsets, was used to obtain honest estimates of predictive ability that could be fairly compared among methods. There was surprisingly little difference in predictive ability among five methods based on multiple linear regression. Stepwise methods performed similarly to exhaustive algorithms for subset selection, and the choice of criterion for comparing models (Akaike's information criterion, Schwarz's Bayesian information criterion or F statistics) had little effect on predictive ability. For most of the data sets, two methods based on regression trees yielded models with substantially lower predictive ability. The paper argues that there is no 'best' method of variable selection and that any of the regression-based approaches discussed is capable of yielding useful predictive models.
- Garamszegi et al. (2009). Changing philosophies and tools for statistical inferences in behavioral ecology. Behavioral Ecology, in press.
- Recent statistical developments have reached behavioral ecology, and more and more studies now apply analytical tools that incorporate novel philosophical concepts. However, these “new” approaches continue to receive mixed support in our field. Because our statistical choices can influence how we conduct science, there is an urgent need to reach consensus on statistical practice. The paper provides a brief overview of the recently proposed approaches and opens an online forum for future discussion (https://bestat.ecoinformatics.org/). For this review, the authors adopt the perspective of practicing behavioral ecologists relying on various kinds of data. They emphasize that researchers should recognize that uncertainty is an inherent feature of biological data, and that it is important to integrate previous knowledge into the current analysis. For these tasks, novel approaches offer a variety of tools. However, a pluralistic perspective is recommended for statistical decisions, in which researchers decide objectively on the most appropriate statistical method that the biological question and data at hand require. The paper highlights how these concepts could be made apparent in scientific publications.
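
The repeated-split cross-validation mentioned in the Murtaugh (2009) summary can be illustrated with a minimal sketch. This is not the paper's code or data: the data are synthetic, the function names `fit_simple_ols` and `repeated_split_mse` are hypothetical, and a single-predictor least-squares fit stands in for the seven variable-selection methods the paper compares. The idea it shows is only that a model's predictive ability is estimated on validation subsets that were not used for fitting, averaged over many random splits.

```python
import random
import statistics


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def repeated_split_mse(x, y, n_splits=200, train_fraction=0.7, seed=0):
    """Average validation mean squared error over repeated random splits."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    n_train = int(train_fraction * len(x))
    errors = []
    for _ in range(n_splits):
        # Each split uses a fresh random partition into training and validation.
        rng.shuffle(indices)
        train, valid = indices[:n_train], indices[n_train:]
        intercept, slope = fit_simple_ols(
            [x[i] for i in train], [y[i] for i in train]
        )
        # Error is measured only on observations the model has not seen.
        squared_errors = [(y[i] - (intercept + slope * x[i])) ** 2 for i in valid]
        errors.append(statistics.fmean(squared_errors))
    return statistics.fmean(errors)


# Synthetic data (an assumption for illustration): the response depends on
# `informative` but not on `irrelevant`.
data_rng = random.Random(42)
n = 60
informative = [data_rng.gauss(0, 1) for _ in range(n)]
irrelevant = [data_rng.gauss(0, 1) for _ in range(n)]
response = [2.0 * xi + data_rng.gauss(0, 1) for xi in informative]

mse_informative = repeated_split_mse(informative, response)
mse_irrelevant = repeated_split_mse(irrelevant, response)
print(f"validation MSE, informative predictor: {mse_informative:.2f}")
print(f"validation MSE, irrelevant predictor:  {mse_irrelevant:.2f}")
```

Because both candidate models are scored on the same kind of held-out data, their validation errors can be compared directly; in this synthetic setting the model using the informative predictor should show the lower error.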