About
The ability to perceive and estimate temporal dynamics can be considered one of the central elements of intelligent biological agents that are equipped with a model of their environment. Similarly, if one takes the view that an agent’s (internal) model is its primary guide to behaviour, the ability to learn appropriate temporal representations and employ them for action selection is a crucial consideration in reinforcement learning (RL).
RL agents have now matured to the point that model-based unsupervised approaches achieve competitive and even state-of-the-art behaviours (for instance, MuZero - Schrittwieser et al. 2020, or DreamerV2 - Hafner et al. 2020). However, these models tend to operate over a physical timescale aligned with the shift in environment dynamics. Consequently, further considerations are required to mimic the types of spatio-temporal representations observed in neuronal responses, which operate at both subjective and objective (physical) timescales. Indeed, a large number of neuroimaging and modeling studies in cognitive science have focused on explaining temporal representations and how they influence human behaviour (using neural networks and Bayesian inference), e.g., Jazayeri & Shadlen (2010), Roseboom et al. (2019), Deverett et al. (2019), Fountas et al. (2022).
TRiRL will bring together experts in model-based RL and neuroscientists working on the brain’s ability to represent time, in order to exchange insights, brainstorm, and encourage a multi-angle discussion on this important topic.
Speakers
Richard Sutton
DeepMind, Amii and University of Alberta
"Some Foundations of Temporal Representations"
Marc Howard
Boston University
"Temporal memory in the brain and reinforcement learning"
Ida Momennejad
Microsoft Research
"Temporal Abstraction in Biological and Artificial RL"
Schedule
Time | Agenda | Details |
---|---|---|
1:00 - 1:15 | Introduction | A short introduction to the topic and the workshop structure by the organisers. |
1:15 - 3:00 | Keynote speakers | Three keynote presentations will last approximately 35 minutes each, including questions. |
3:00 - 3:15 | Break / Group allocation | During a coffee break, the participants will be divided into moderated groups with the aim of maintaining diversity in levels of seniority and field of expertise. |
3:15 - 4:20 | Group discussions | All groups will be given a list of open problems in the field to discuss and propose solutions. |
4:25 - 4:55 | Panel with questions | Each group will nominate a representative to present the outcome of the discussion and defend the group’s position to the rest of the workshop participants, including online participants. |
4:55 - 5:00 | Closing remarks | In closing remarks, the results of the panel discussion will be summarised. |
5:00 | Social event | Participants who are physically present will be encouraged to attend a social event organised by the workshop to continue the discussion. |
RLDM requires speakers and active participants to be physically present at the workshop. However, we plan to stream the whole programme, and we encourage online participants to submit questions online, which we will try to convey during the group and panel discussions. Finally, the most important outcomes of TRiRL will be formally described in opinion papers organised by the group moderators and written by willing participants after the workshop.
Organisers
Warrick Roseboom
University of Sussex
Panagiotis Tigas
University of Oxford
Mailing list
Sign up to receive the latest updates on the event (programme announcement and livestream):
References
- Deverett, B., Faulkner, R., Fortunato, M., Wayne, G. & Leibo, J. Z. (2019), ‘Interval timing in deep reinforcement learning agents’, Advances in Neural Information Processing Systems 32.
- Fountas, Z., Sylaidi, A., Nikiforou, K., Seth, A. K., Shanahan, M. & Roseboom, W. (2022), ‘A predictive processing model of episodic memory and time perception’, (in press) Neural Computation.
- Hafner, D., Lillicrap, T., Fischer, I., Villegas, R., Ha, D., Lee, H. & Davidson, J. (2019), Learning latent dynamics for planning from pixels, in ‘International conference on machine learning’, PMLR, pp. 2555–2565.
- Jazayeri, M. & Shadlen, M. N. (2010), ‘Temporal context calibrates interval timing’, Nature neuroscience 13(8), 1020–1026.
- Roseboom, W., Fountas, Z., Nikiforou, K., Bhowmik, D., Shanahan, M. & Seth, A. K. (2019), ‘Activity in perceptual classification networks as a basis for human subjective time perception’, Nature communications 10(1), 1–9.