Softskills Seminar

Content

This course is mandatory in the second year (M2) of the Master’s program “Data AI” of the Institut Polytechnique de Paris, and it is open to students of other programs as well. Its purpose is to train students to give scientific presentations.

Every student chooses one research paper from the list of proposed papers and prepares a 20-minute presentation about it. For this purpose, she or he can request the help of the advisor of the paper (by email and/or in a meeting). The student then gives the presentation in the allocated time slot of the Softskills seminar, in the presence of the lecturer. Students are strongly encouraged to follow the advice on giving good talks presented in the first session.

Each presentation is followed by a question-and-answer session, in which both the students and the lecturers can ask the presenter questions about the paper. To stimulate the discussion, each student is assigned to another paper as the “devil’s advocate”. In this role (which is not known to the other students), she or he prepares some questions for the presenter. All other students are nevertheless invited to take part in the question-and-answer session as well.

Grading

The course is graded by

Schedule

The course takes place on Monday mornings (9:00-12:15) at Telecom Paris.
24/11/2025: Introduction (Room 0D20)
Given by the lecturer, Fabian Suchanek
  1. Introduction
  2. How to give good talks
  3. How to do a PhD
1/12/2025 (Room 1D19)
  1. 9:00 Louis Jachiet 1: The case for learned index structures (David Nelischer)
  2. 9:45 Louis Jachiet 2: Automating string processing in spreadsheets using input-output examples (Mathieu Antonopoulos)
  3. 10:30 Louis Jachiet 3: How Good are Learned Cost Models, Really? Insights from Query Optimization Tasks (Saba Shahsavari)
  4. 11:30 Pietro Gori 1: Learning Transferable Visual Models From Natural Language Supervision (Ziyi Liu)
8/12/2025 (Room 1D19)
  1. 9:00 Nikola Simidjievski 1: Accurate predictions on small data with a tabular foundation model (Do Huu Quan)
  2. 9:45 Nikola Simidjievski 2: A visual-language foundation model for computational pathology (Leiheng Qin)
  3. 10:45 François Crespin 1: Toolformer: Language Models Can Teach Themselves to Use Tools (Anastasiia Karpova)
  4. 11:30 François Crespin 2: ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings (Arpi Hunanyan)
15/12/2025 (Room 0D20)
  1. 9:00 Matthieu Labeau 1: How to Compute the Probability of a Word (Zeinab Ghamlouch)
  2. 9:45 Matthieu Labeau 2: From Pretraining Data to Language Models to Downstream Tasks: Tracking the Trails of Political Biases Leading to Unfair NLP Models (Wanqiu Lei)
  3. 10:45 Matthieu Labeau 3: Balancing the Budget: Understanding Trade-offs Between Supervised and Preference-Based Finetuning (Yifei WANG)
  4. 11:30 Matthieu Labeau 4: Generative or Discriminative? Revisiting Text Classification in the Era of Transformers (Nader Sadek)
5/1/2026 (Room 1D19)
  1. 9:00 Fabian Suchanek 1: Enabling LLM Knowledge Analysis via Extensive Materialization (Jayamangalage Abeytunge)
  2. 9:45 Jean-Louis Dessalles 1: Emergent world models and latent variable estimation in chess-playing language models (Zhonghan WANG)
  3. 10:30 Jean-Louis Dessalles 2: Solving analogies on words based on minimal complexity transformation (Yixing Yang)
  4. 11:15 Jean-Louis Dessalles 3: Algorithmic complexity for short binary strings applied to psychology: a primer (Alessa MAYER)
12/1/2026 (Room 1D19)
  1. 9:00 Vicky Kalogeiton 1: MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency (Mindiiarova Renata)
  2. 9:45 Vicky Kalogeiton 2: How far can we go with ImageNet for Text-to-Image generation? (Georgios Margaritis)
  3. 10:45 Samy Haffoudhi 1: Scalable Zero-shot Entity Linking with Dense Entity Retrieval (Yixiang Wang)
  4. 11:30 Samy Haffoudhi 2: Autoregressive Entity Retrieval (Hai Thien Long Vu)
19/1/2026 (Room 1D19)
  1. 9:00 Maria Boritchev 1: On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? (Tristan Donzé)
  2. 9:45 Maria Boritchev 2: Tackling Language Modelling Bias in Support of Linguistic Diversity (Christelle Clervilsson)
  3. 10:45 Sao Mai Nguyen 1: LOTUS: Continual Imitation Learning for Robot Manipulation Through Unsupervised Skill Discovery (Hongsheng Ye)
  4. 11:30 Sao Mai Nguyen 2: TidyBot++: An Open-Source Holonomic Mobile Manipulator for Robot Learning (Yangtao FANG)
26/1/2026 (Room 1D19)
  1. 9:00 Ada Diaconescu 1: Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data (Baptiste Geisenberger)
  2. 9:45 Ada Diaconescu 2: Quantifying causal emergence shows that macro can beat micro (Hongxu Chen)
  3. 10:45 Laurent Decreusefond 1: Bit-Level Discrete Diffusion with Markov Probabilistic Models: An Improved ... (Ziqian Liu)
  4. 11:30 Laurent Decreusefond 2: Sampling Binary Data by Denoising through Score Functions (Ling Liu)
2/2/2026 (Room 1A242)
  1. 9:00 Fabian Suchanek 2: GRASP: Generic Reasoning And SPARQL (Enzo PINCHON)
  2. 9:45 Yanzhu Guo 1: Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs (Kuan SUN)
  3. 10:30 Luca Benedetto 2: Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs (Avrile Floro)
  4. 11:15 Luca Benedetto 3: RADAR: Reasoning-Ability and Difficulty-Aware Routing for Reasoning LLMs (Victor Micha)
9/2/2026 (Room 1C27)
  1. 9:00 Luca Benedetto 1: Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models (Nawel Zait)
  2. 9:45 Yanzhu Guo 2: Do Llamas Work in English? On the Latent Language of Multilingual Transformers (Pablo Bertaud-Velten)
  3. 10:30 Yanzhu Guo 3: Improving Diversity in Language Models: When Temperature Fails, Change the Loss (Stepan Svirin)
  4. 11:30 Nils Holzenberger 1: Constrained Language Models Yield Few-Shot Semantic Parsers (Marc Farah)
16/2/2026 (Room 1C27)
  1. 9:00 Mario Gleirscher 1: Safe Multiagent Learning With Soft Constrained Policy Optimization in Real Robot Control (Fares Boudelaa)
  2. 9:45 Mario Gleirscher 2: Model-free RL for motion planning of autonomous agents with complex tasks in partially observable environments (Liam Loughman)
  3. 10:45 Nils Holzenberger 2: Neural Module Networks for Reasoning over Text (Shashwat Sharma)
  4. 11:30 Nils Holzenberger 3: What About the Precedent: An Information-Theoretic Analysis of Common Law (Yuxuan Peng)
23/2/2026 (Room 1C27)
  1. 9:00 Fabian Suchanek 3: Robust discovery of positive and negative rules in knowledge bases (Yufei ZHOU)
  2. 9:45 Julien Alexandre dit Sandretto 1: Revising Hull and Box Consistency (Diego Fleury)
  3. 10:30 Julien Alexandre dit Sandretto 2: Interval-Based Sliding Mode Control Design for Solid Oxide Fuel Cells With State and Actuator Constraints (Vijay Venkatesh Murugan)
  4. 11:15 Andrea Araldo 1: Hindsight Learning for MDPs with Exogenous Inputs (only pages 1-9) (Cong-Vinh Dang)
  5. -- Julien Alexandre dit Sandretto 3: Runge–Kutta Theory And Constraint Programming