Singular learning theory: connecting algebraic geometry and model selection in statistics
December 12 to December 16, 2011
at the
American Institute of Mathematics,
San Jose, California
organized by
Russell Steele,
Bernd Sturmfels,
and Sumio Watanabe
Original Announcement
This workshop will be devoted to singular
learning theory, the application of algebraic geometry to problems in
statistical model selection and machine learning. The intent of this workshop is
to connect algebraic geometers and statisticians specializing in Bayesian model
selection criteria in order to explore the relationship between analytical
asymptotic approximations from algebraic geometry and other commonly used
methods in statistics (including traditional asymptotic and Monte Carlo
approaches) for developing new model selection criteria. The hope is to
generate interest amongst both communities for collaborations that will spur new
topics of research in both algebraic geometry and statistics.
Singular
statistical learning is an approach for statistical learning and model selection
that can be applied to singular parameter spaces, i.e. can be used for
non-regular statistical models. The methodology uses the method of resolution of
singularities to generalize the criteria for regular statistical models to
non-regular models. Examples of non-regular statistical models that have been
studied as part of singular learning theory include hidden Markov models, finite
mixture models, and multi-layer neural network models. Although there exists a
large body of recent published work in this area, it has not yet been
integrated or even well-cited by the larger statistical community.
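As a rough illustration of what generalizing a regular criterion means here, the standard asymptotic expansions of the negative log marginal likelihood can be written side by side. The notation is an assumption of this sketch and does not appear in the announcement: Z_n is the marginal likelihood of n observations, L_n is the empirical negative log-likelihood per observation, w_0 is an optimal parameter, d is the parameter dimension, lambda is the learning coefficient (real log canonical threshold), and m is its multiplicity.

```latex
% Regular model (the expansion behind the BIC):
-\log Z_n = n L_n(w_0) + \frac{d}{2} \log n + O_p(1)

% Singular model (lambda and m are obtained via resolution of singularities):
-\log Z_n = n L_n(w_0) + \lambda \log n - (m - 1) \log\log n + O_p(1)
```

For a regular model, lambda = d/2 and m = 1, so the second expansion reduces to the first.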
The
workshop has three primary goals:
- To introduce statisticians and computer
scientists working in the area of model selection to the topic of singular learning
theory, in particular the application of the method of resolution of
singularities to model selection for non-regular statistical models.
- To
generate a list of open problems in algebraic geometry motivated by complex
statistical models that cannot be covered by current results.
- To
collaboratively develop a set of core materials that will define the area of
singular statistical learning and that will be accessible to both geometers and
statisticians.
The main open problems the workshop will consider:
-
Exploring connections of Widely Applicable Information Criteria (WAIC) from
singular learning theory to other model selection criteria, including the
Deviance Information Criterion (DIC), regular statistical versions of the AIC
and BIC, and other criteria specific to particular non-regular statistical
models (for example, the scan statistic from spatial statistics).
-
Identifying fundamental problems in algebraic geometry that are related to
generalizing these information criteria to model selection problems for
Generalized Estimating Equations (GEE), which use ideas from semi-parametric
inference to obtain estimates of parameters without assuming a parametric form
for the likelihood of the observed data.
- Generalizing the singular learning
theory information criteria to statistical problems where some observations
contain missing information and/or measurement error.
- Establishing the
finite sample properties of WAIC, in particular for problems where one can
incorporate prior knowledge in a fully Bayesian modelling approach.
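For readers unfamiliar with WAIC, the following minimal sketch shows one common way to estimate it from posterior draws. It is not taken from the workshop materials. The function name `waic`, the array layout (one row per posterior draw, one column per observation), and the toy normal model are all assumptions made for illustration. The value is reported on the per-observation loss scale (training loss plus functional variance divided by n), not on the deviance scale used by some software.

```python
import numpy as np


def waic(log_lik):
    """Estimate WAIC from a matrix of pointwise log-likelihoods.

    log_lik has shape (n_draws, n_obs); entry [s, i] is log p(x_i | w_s)
    for posterior draw w_s. At least two draws are required.
    (Hypothetical helper, written for this illustration only.)
    """
    log_lik = np.asarray(log_lik, dtype=float)
    n_draws, n_obs = log_lik.shape

    # Log of the posterior-averaged likelihood of each observation,
    # computed with a max shift for numerical stability.
    shift = log_lik.max(axis=0)
    log_mean_lik = shift + np.log(np.mean(np.exp(log_lik - shift), axis=0))

    # Training loss: minus the mean log predictive density.
    training_loss = -log_mean_lik.mean()

    # Functional variance: sum over observations of the posterior
    # variance of the log-likelihood (sample variance over draws).
    functional_variance = log_lik.var(axis=0, ddof=1).sum()

    return training_loss + functional_variance / n_obs


# Toy usage (assumed model): x_i ~ Normal(mu, 1) with a flat prior on mu,
# so the posterior of mu is Normal(mean(x), 1/n).
rng = np.random.default_rng(0)
x = rng.normal(size=50)
mu_draws = rng.normal(loc=x.mean(), scale=1.0 / np.sqrt(x.size), size=2000)
toy_log_lik = -0.5 * np.log(2.0 * np.pi) - 0.5 * (x[None, :] - mu_draws[:, None]) ** 2
toy_waic = waic(toy_log_lik)
```

The same function applies to a singular model as long as posterior draws and pointwise log-likelihoods are available. How well the estimate behaves at finite sample sizes is one of the open problems listed above.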
Material from the workshop
A list of participants.
The workshop schedule.
A report on the workshop activities.
Papers arising from the workshop: