MJH: JST PRESTO (Sakigake) project page

Background

One of the major funding bodies for scientific research in Japan is JST, the Japan Science and Technology Agency (en/ja). JST provides a wide variety of grants both to individual researchers and teams of researchers, but the flagship grant for individuals is called PRESTO (Sakigake in Japanese; en/ja). My application to PRESTO was accepted within the research area of "Trustworthy AI" (en/ja), and it runs from October 2021 to March 2025, with a total budget of 40 million yen.

On this page, I will give an overview of the goals and key ideas underlying my initial proposal, as well as provide a summary of the research papers, software, and presentations I make which are closely related to this project.

Overview of key concepts

The title of my project is "Machine learning with guarantees under diverse risk measures" and the most important underlying idea is that the current machine learning methodology rooted in performance on average needs a principled re-evaluation.

Put simply, the formal definition of "success" in most machine learning tasks can be formally expressed as minimizing the expected value (i.e., the average) of a random loss computed using some kind of loss function. Here the randomness is typically assumed to be over the random draw of a new data point at test time (i.e., after "training" is complete). This approach is perfectly natural, but in designing and evaluating learning systems (e.g., human workflows supported by machine learning software, automated systems running such software), the emphasis on the average leaves out other important properties of the random loss distribution (e.g., dispersion, heaviness of tails, symmetry, etc.). In the title of my project, I use the term "diverse risk measures" to emphasize that I want to develop new algorithms, theory, and methodologies for machine learning tasks characterized by the optimization of a wider variety of properties of the test loss distribution, including but not limited to the expected loss.

Some background reading

In the following paper, I discuss some of the key ideas underlying this project with a bit more formal notation, and also complement this with a brief historical review of statistical learning and the role played by the expected loss.

A Survey of Learning Criteria Going Beyond the Usual Risk
Matthew J. Holland and Kazuki Tanabe
Journal: Journal of Artificial Intelligence Research, 78:781-821, 2023.
Oral: AAAI 2024 (Journal track), Vancouver, Canada.
[journal, doi, arXiv]

Several works done by myself and colleagues can be considered precursors to the current PRESTO project.

Learning with risks based on M-location
Matthew J. Holland
Journal: Machine Learning, 111:4679-4718, 2022.
Oral: ECML-PKDD 2022, Grenoble, France.
[journal, doi, arXiv, code]

Spectral risk-based learning using unbounded losses
Matthew J. Holland and El Mehdi Haress
Presented at AISTATS 2022, online.
Proceedings of Machine Learning Research 151:1871-1886, 2022.
[proceedings, arXiv, code]

Making learning more transparent using conformalized performance prediction
Matthew J. Holland
Presented at ICML 2021, Workshop on Distribution-Free Uncertainty Quantification.
[arXiv]

Learning with risk-averse feedback under potentially heavy tails
Matthew J. Holland and El Mehdi Haress
Presented at AISTATS 2021, online.
Proceedings of Machine Learning Research 130:892-900, 2021.
[proceedings, arXiv, code]

With these works as technical and conceptual context, the following section summarizes key points regarding the progress made since starting this project.

New work since starting this project

The first substantive new results build upon the "M-location" notion considered in our previous work (the MLJ/ECML-PKDD2022 paper cited above), making a significant conceptual and technical expansion by placing the notion of "dispersion" at the forefront when designing off-sample generalization metrics (i.e., risk functions). This work was presented at AISTATS 2023, and I have since then continued to develop these ideas into several new papers (and software repositories). Here are the main representative works related to this project.

Soft ascent-descent as a stable and flexible alternative to flooding
Matthew J. Holland and Kosuke Nakatani
NeurIPS 2024, to appear.
[arXiv, code]

Making Robust Generalizers Less Rigid with Soft Ascent-Descent
Matthew J. Holland and Toma Hamada
Preprint.
[arXiv, code]

Criterion Collapse and Loss Distribution Control
Matthew J. Holland
Presented at ICML 2024, Vienna, Austria.
Proceedings of Machine Learning Research 235:18547-18567, 2024.
[proceedings, arXiv, code]

Robust variance-regularized risk minimization with concomitant scaling
Matthew J. Holland
Presented at AISTATS 2024, Valencia, Spain.
Proceedings of Machine Learning Research 238:1144-1152, 2024.
[proceedings, arXiv, code]

Flexible risk design using bi-directional dispersion
Matthew J. Holland
Presented at AISTATS 2023, Valencia, Spain.
Proceedings of Machine Learning Research 206:1586-1623, 2023.
[proceedings, arXiv, code]

In the first few months of the 2022 academic year (starting in April), in the vein of sharing the key ideas and initial results of this research work with a diverse audience, I gave several talks, both at universities and conferences here in Japan. In particular, the following two oral presentations were the first in-person presentations I have made since the start of the COVID outbreak.

JSAI 2022 (Kyoto): "Achieving desirable loss distributions by design"
NEURO 2022 (Okinawa): "Achieving desirable reward distributions by design" (received presentation award)

In addition to rigorous theoretical and experimental analysis aimed at experts in machine learning, I have also been making some effort to author an "explainer" article which breaks down the key concepts underlying this research project into a form that is congenial to a more diverse audience, inspired in part by the ICLR Blog Track introduced in 2022, currently stored in the following public GitHub repository.

offgen: A visual "explainer" for off-sample generalization metrics
Matthew J. Holland
Public GitHub Repository.
[code]

As additional progress is made, this article will be updated and expanded.