Seminar in Psychometrics



The "Botanical Garden" of Machine Learning - Understanding the Ideas behind Decision Trees and Random Forests

Date and time: March 13, 2023 (4:00 PM CET)
Place: On Zoom, projected to ICS CAS room 318, Pod Vodárenskou věží 2, Prague 8.

Abstract: Classification and regression trees (also termed decision trees), model-based trees, bagging and random forests are powerful statistical methods from the field of machine learning. They have been shown to achieve high prediction accuracy, especially in big data applications with many predictor variables and complex association patterns involving nonlinear and interaction effects. However, while individual trees are easy to interpret, random forests are "black box" methods and their interpretation can be misleading. The aim of this presentation is to introduce the rationale behind tree-based methods and to illustrate their potential for exploratory analyses in psychological research, but also to point out limitations and potential pitfalls in their practical application, as well as fairness issues in machine learning in general.

References.
Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: Rationale, application and characteristics of classification and regression trees, bagging and random forests. Psychological Methods, 14(4), 323–348.
Henninger, M., Debelak, R., Rothacher, Y., & Strobl, C. (2022). Interpretable machine learning for psychological research: Opportunities and pitfalls. Psychological Methods (accepted). Preprint at: https://psyarxiv.com/xe83y/.

Carolin Strobl
University of Zurich, Switzerland

https://www.psychology.uzh.ch/en/areas/nec/methoden/team/carolinstrobl.html

Carolin Strobl is Professor of Psychological Methods at the University of Zurich (UZH), Switzerland. She holds degrees in psychology and statistics and graduated from the Ludwig-Maximilians-University of Munich (LMU), Germany, with a PhD and Habilitation in Statistics. She has been actively developing reliable and interpretable machine learning methods and promoting their application in psychology for over 15 years. Carolin and her group have contributed to several software packages related to machine learning and psychometrics in the free, open-source software R, and have broad experience teaching statistics and machine learning with R in BA, MA and PhD study programs as well as in their postgraduate and professional training program, the Zurich R courses.