Spotlight Seminars on AI – Winter 2025

30/01/2025 – 21/03/2025

The “Winter 2025” edition of the AI Spotlight Seminars (organized by AIxIA and AISB) features three talks:

  • Anthony Cohn, University of Leeds

    • January 30 – 5:00PM (CET)
    • Title: Evaluating Commonsense Reasoning in Large Language Models
    • Abstract: In this talk I will discuss the ability of LLMs to perform commonsense reasoning, particularly with regard to spatial reasoning. Across a wide range of LLMs, although they perform rather better than chance, they still struggle with many questions and tasks, for example when reasoning about directions or topological relations. I will also discuss issues arising from the fact that some of the most powerful language models are currently proprietary systems, accessible only via (typically restrictive) web or programming interfaces: the Language-Models-as-a-Service (LMaaS) paradigm. In contrast with scenarios where full model access is available, as in the case of open-source models, such closed-off language models present specific challenges for evaluation, benchmarking, and testing. (A minimal illustrative sketch of this evaluation setting appears after the seminar listings.)
    • Bio: Anthony (Tony) Cohn is Professor of Automated Reasoning in the School of Computer Science, University of Leeds. His current research interests range from theoretical work on spatial calculi (receiving a KR test-of-time classic paper award in 2020) and spatial ontologies, to cognitive vision, modelling spatial information in the hippocampus, and Decision Support Systems, particularly for the built environment, as well as robotics. He is Foundation Models lead at the Alan Turing Institute, where he is conducting research on evaluating the capabilities of large language models, in particular with respect to commonsense reasoning, and is also a co-investigator on a project combining LLMs and probabilistic answer set programming. He is Editor-in-Chief of Spatial Cognition and Computation and was previously Editor-in-Chief of the Artificial Intelligence journal. He has previously been President of IJCAI, EurAI, KR Inc., and AISB. He is the recipient of the 2021 Herbert A. Simon Cognitive Systems Prize, and is also (uniquely) the recipient of Distinguished Service Awards from the three main international AI societies, IJCAI, AAAI, and EurAI, as well as from KR Inc. He is a Fellow of the Royal Academy of Engineering, the Learned Society of Wales, the AI societies AAAI, AISB, EurAI, and AAIA, as well as the CORE Academy (International Core Academy of Sciences and Humanities) and the International AI Industry Alliance.
  • Bernardo Magnini, Fondazione Bruno Kessler
    • February 27 – 5:00PM (CET)
    • Title: Rethinking NLP Evaluation in the Age of LLMs: Lessons from Benchmarking Italian
    • Abstract: Large Language Models (LLMs) are now at the core of most NLP applications, mainly because of their strong performance and their adaptability to different tasks and languages. However, despite their widespread use, evaluating LLMs is still an active area of research, and a debate about methodologies is ongoing. Several issues are under discussion, including competence-oriented versus task-oriented approaches; how to balance prompt naturalness and effectiveness; the role of multiple prompts in evaluation; the use of both multiple-choice and generative tasks, along with the most appropriate metrics for each; and the comparison of zero-shot and few-shot settings, taking execution performance into account. To make this concrete, I will report examples and lessons learned from developing an LLM benchmark for the Italian language. (A small sketch of measuring prompt sensitivity appears after the seminar listings.)
    • Bio: Bernardo Magnini is a senior researcher at FBK (Trento, Italy), where he is responsible for the NLP research group. His interests are in the field of Computational Linguistics, particularly lexical semantics and lexical resources, question answering, textual entailment, and conversational agents, areas in which he has published more than 300 scientific papers. He has co-chaired several events, including EVALITA, the evaluation campaign for both NLP and speech tools for the Italian language; CLIC-it 2014, the first Italian conference on Computational Linguistics; AI*IA 2018, the 17th International Conference of the Italian Association for Artificial Intelligence; and ACL 2022, the 60th Annual Meeting of the Association for Computational Linguistics. He has been a contract professor at the Universities of Trento, Bolzano, and Pavia, and was President of the Italian Association for Computational Linguistics (AILC) from 2015 to 2022.
  • Alessio Lomuscio, Imperial College London
    • March 21 – 5:00PM (CET)
    • Title: Towards Verification of Neural Systems
    • Abstract: A major challenge in deploying ML-based systems, such as ML-based computer vision, is the inherent difficulty of ensuring their performance in the operational design domain. The standard approach consists of extensively testing models against a wide collection of inputs. However, testing is inherently limited in coverage, and it is expensive in several domains. Novel verification methods provide guarantees that a neural model meets its specifications in dense neighbourhoods of selected inputs. For example, by using verification methods we can establish whether a model is robust with respect to infinitely many re-illumination changes, or to particular noise patterns in the vicinity of an input. Verification methods can also be tailored to specifications in the latent space, to establish the robustness of models against semantic perturbations not definable in the input space (3D pose changes, background changes, etc.). Additionally, verification methods can be paired with learning to obtain robust learning methods capable of generating models inherently more robust than those derived with standard methods. In this presentation I will succinctly cover the key theoretical results leading to some of the present ML verification technology, illustrate the resulting toolsets and capabilities, and describe some of the use cases developed with our colleagues at Boeing Research, including centerline distance estimation, object detection, and runway detection. I will argue that verification and robust learning can be used to obtain models that are inherently more robust than those produced by present learning and testing approaches, thereby unlocking deployment in safety-critical applications. (A toy interval-bound-propagation sketch of such a guarantee appears after the seminar listings.)
    • Bio: Alessio Lomuscio (http://www.doc.ic.ac.uk/~alessio) is Professor of Safe Artificial Intelligence at Imperial College London (UK), where he leads the Safe AI Lab (http://sail.doc.ic.ac.uk/). He is an ACM Distinguished Member, a Fellow of the European Association for Artificial Intelligence, and currently holds a Royal Academy of Engineering Chair in Emerging Technologies. He is founding co-director of the UKRI Doctoral Training Centre in Safe and Trusted Artificial Intelligence. Alessio’s research interests concern the development of verification methods for artificial intelligence. Since 2000 he has pioneered the development of formal methods for the verification of autonomous systems and multi-agent systems, both symbolic and ML-based. He has published over 200 papers in leading AI and formal methods conferences and journals. He is the founder and CEO of Safe Intelligence, a VC-backed Imperial College London spinout helping users build and assure robust ML systems.
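
To make the Language-Models-as-a-Service setting of the first talk concrete, here is a minimal sketch of a spatial commonsense probe. Everything in it is illustrative: the three questions are toy examples of direction and containment reasoning, and query_model is a hypothetical stub standing in for whatever closed web or programming interface a proprietary model exposes; its random answers realise exactly the chance baseline the talk compares against.

```python
import random

# Tiny illustrative question set: direction transitivity, converse
# directions, and topological containment, each with a yes/no answer.
QUESTIONS = [
    ("If A is north of B and B is north of C, is A north of C? Answer yes or no.", "yes"),
    ("If X is east of Y, is Y west of X? Answer yes or no.", "yes"),
    ("If P is inside Q and Q is inside R, can P be outside R? Answer yes or no.", "no"),
]

def query_model(prompt: str) -> str:
    """Hypothetical placeholder for a closed LMaaS endpoint; answering at
    random realises the chance baseline that LLMs are compared against."""
    return random.choice(["yes", "no"])

def evaluate() -> float:
    correct = 0
    for prompt, gold in QUESTIONS:
        answer = query_model(prompt).strip().lower()
        correct += answer.startswith(gold)  # crude parsing of a free-form reply
    return correct / len(QUESTIONS)

if __name__ == "__main__":
    print(f"accuracy: {evaluate():.2f} (chance baseline for yes/no items: 0.50)")
```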
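
One methodological issue from the second talk, the role of multiple prompts, also lends itself to a small sketch: score the same multiple-choice item under several prompt templates and look at the spread. The Italian item, the templates, and the pick_choice stand-in for a real model call are all invented for illustration.

```python
from statistics import mean, pstdev

# One invented multiple-choice item ("What is the capital of Lombardy?").
ITEM = {
    "question": "Qual è il capoluogo della Lombardia?",
    "choices": ["Torino", "Milano", "Venezia", "Bologna"],
    "gold": "Milano",
}

# Three prompt templates for the same item, varying in naturalness.
TEMPLATES = [
    "Domanda: {q}\nOpzioni: {opts}\nRisposta:",
    "Scegli l'opzione corretta.\n{q}\n{opts}",
    "{q} Rispondi con una sola delle opzioni: {opts}.",
]

def pick_choice(prompt: str, choices: list[str]) -> str:
    """Deterministic stand-in for a model call, so the script runs offline;
    in a real benchmark this is where prompt sensitivity would show up."""
    return max(choices, key=lambda c: prompt.count(c[0]))  # arbitrary heuristic

def scores_per_template(item: dict) -> list[float]:
    scores = []
    for tpl in TEMPLATES:
        prompt = tpl.format(q=item["question"], opts=", ".join(item["choices"]))
        scores.append(float(pick_choice(prompt, item["choices"]) == item["gold"]))
    return scores

if __name__ == "__main__":
    s = scores_per_template(ITEM)
    print(f"per-template scores: {s}")
    print(f"mean={mean(s):.2f} std={pstdev(s):.2f}  # spread = prompt sensitivity")
```

Averaging over many items and templates, rather than trusting a single prompt, is one way such a benchmark can stay honest about prompt sensitivity.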
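
Finally, the guarantees described in the third talk can be illustrated with one of the simplest verification techniques, interval bound propagation (IBP): push an L-infinity ball around an input through a small ReLU network and check whether the predicted class provably wins everywhere in the ball. The two-layer network and input below are random stand-ins; production verifiers compute far tighter bounds, but the soundness argument is the same.

```python
import numpy as np

def ibp_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through the affine map x -> W @ x + b."""
    mid, rad = (lo + hi) / 2, (hi - lo) / 2
    centre = W @ mid + b
    spread = np.abs(W) @ rad  # worst-case growth of the box
    return centre - spread, centre + spread

def forward(x, layers):
    """Plain forward pass: affine layers with ReLU in between."""
    for i, (W, b) in enumerate(layers):
        x = W @ x + b
        if i < len(layers) - 1:
            x = np.maximum(x, 0)
    return x

def certify(x, eps, layers):
    """True if the predicted class provably holds on the whole eps-ball."""
    pred = int(np.argmax(forward(x, layers)))
    lo, hi = x - eps, x + eps
    for i, (W, b) in enumerate(layers):
        lo, hi = ibp_affine(lo, hi, W, b)
        if i < len(layers) - 1:
            lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)
    # Sound, conservative check: the predicted logit's lower bound must
    # beat every other logit's upper bound over the whole ball.
    return bool(lo[pred] > np.delete(hi, pred).max())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layers = [(rng.standard_normal((8, 4)), rng.standard_normal(8)),
              (rng.standard_normal((2, 8)), rng.standard_normal(2))]
    x = rng.standard_normal(4)
    for eps in (0.0, 0.01, 0.1):
        print(f"eps={eps:<4}  certified={certify(x, eps, layers)}")
```

Because the box is over-approximated at every layer, a True answer is a genuine guarantee over infinitely many inputs, while a False answer may only mean the bounds were too loose.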