I build next-generation robots at Hugging Face. Previously, I was a research scientist at Tesla, working on Autopilot and Optimus. Before that, I completed postdoctoral studies at Brown University and my PhD at Sorbonne Université.

My scientific interest lies in understanding the underlying mechanisms of intelligence. My research focuses on learning human behaviors with neural networks, and I work on novel architectures, learning approaches, theoretical frameworks, and explainability methods. I also enjoy contributing to open-source projects and reading about neuroscience!

CV Google Scholar

Publications

A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation

T. Fel, V. Boutin, M. Moayeri, R. Cadene, L. Bethune, M. Chalvidal, T. Serre

NeurIPS (2023)

Arxiv

@inproceedings{fel2023holistic,
author = {Thomas Fel and Victor Boutin and Mazda Moayeri and Remi Cadene and Louis Bethune and Mathieu Chalvidal and Thomas Serre},
title = {A Holistic Approach to Unifying Automatic Concept Extraction and Concept Importance Estimation},
booktitle = {Advances in Neural Information Processing Systems 36},
year = {2023},
url = {https://arxiv.org/abs/2306.07304}
}

Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization

T. Fel, T. Boissin, V. Boutin, A. Picard, P. Novello, J. Colin, D. Linsley, T. Rousseau, R. Cadene, L. Gardes, T. Serre

NeurIPS (2023)

Arxiv

@inproceedings{fel2023unlocking,
author = {Thomas Fel and Thibaut Boissin and Victor Boutin and Agustin Picard and Paul Novello and Julien Colin and Drew Linsley and Tom Rousseau and Remi Cadene and Laurent Gardes and Thomas Serre},
title = {Unlocking Feature Visualization for Deeper Networks with MAgnitude Constrained Optimization},
booktitle = {Advances in Neural Information Processing Systems 36},
year = {2023},
url = {https://arxiv.org/abs/2306.06805}
}

CRAFT: Concept recursive activation factorization for explainability

T. Fel, A. Picard, L. Bethune, T. Boissin, D. Vigouroux, J. Colin, R. Cadene, T. Serre

CVPR (2023)

Arxiv

@inproceedings{fel2023craft,
author = {Thomas Fel and Agustin Picard and Louis Bethune and Thibaut Boissin and David Vigouroux and Julien Colin and Rémi Cadène and Thomas Serre},
title = {CRAFT: {C}oncept recursive activation factorization for explainability},
booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition {CVPR}},
year = {2023},
url = {https://openaccess.thecvf.com/content/CVPR2023/papers/Fel_CRAFT_Concept_Recursive_Activation_FacTorization_for_Explainability_CVPR_2023_paper.pdf}
}

Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis

T. Fel, M. Ducoffe, D. Vigouroux, R. Cadène, M. Capelle, C. Nicodeme, T. Serre

CVPR (2023)

Arxiv

@inproceedings{fel2023dontlieeva,
author = {Thomas Fel and Melanie Ducoffe and David Vigouroux and Remi Cadene and Mikael Capelle and Claire Nicodeme and Thomas Serre},
title = {Don't Lie to Me! Robust and Efficient Explainability with Verified Perturbation Analysis},
booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition {CVPR}},
year = {2023},
url = {https://openaccess.thecvf.com/content/CVPR2023/papers/Fel_Dont_Lie_to_Me_Robust_and_Efficient_Explainability_With_Verified_CVPR_2023_paper.pdf}
}

What I cannot predict, I do not understand: A human-centered evaluation framework for explainability methods

J. Colin, T. Fel, R. Cadene, T. Serre

NeurIPS (2022)

Arxiv

@inproceedings{fel2022whaticannotpredict,
author = {Julien Colin and Thomas Fel and Remi Cadene and Thomas Serre},
title = {What {I} cannot predict, {I} do not understand: {A} human-centered evaluation framework for explainability methods},
booktitle = {Advances in Neural Information Processing Systems 35},
year = {2022},
url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/13113e938f2957891c0c5e8df811dd01-Paper-Conference.pdf}
}

Xplique: A deep learning explainability toolbox

T. Fel, L. Hervier, D. Vigouroux, A. Poche, J. Plakoo, R. Cadene, M. Chalvidal, J. Colin, T. Boissin, L. Bethune, A. Picard, C. Nicodeme, L. Gardes, G. Flandin, T. Serre

Arxiv (2022)

Arxiv

@inproceedings{fel2022xplique,
author = {Thomas Fel and Lucas Hervier and David Vigouroux and Antonin Poche and Justin Plakoo and Remi Cadene and Mathieu Chalvidal and Julien Colin and Thibaut Boissin and Louis Bethune and Agustin Picard and Claire Nicodeme and Laurent Gardes and Gregory Flandin and Thomas Serre},
title = {Xplique: A deep learning explainability toolbox},
booktitle = {Arxiv},
year = {2022},
url = {https://arxiv.org/abs/2206.04394}
}

How good is your explanation? Algorithmic stability measures to assess the quality of explanations for deep neural networks

T. Fel, D. Vigouroux, R. Cadene, T. Serre

WACV (2022)

Arxiv

@inproceedings{fel2022howgood,
author = {Thomas Fel and David Vigouroux and Remi Cadene and Thomas Serre},
title = {How good is your explanation? algorithmic stability measures to assess the quality of explanations for deep neural networks},
booktitle = {{IEEE} Winter Conference on Applications of Computer Vision {WACV}},
year = {2022},
url = {https://openaccess.thecvf.com/content/WACV2022/papers/Fel_How_Good_Is_Your_Explanation_Algorithmic_Stability_Measures_To_Assess_WACV_2022_paper.pdf}
}

Understanding the computational demands underlying visual reasoning

M. Vaishnav, R. Cadene, A. Alamia, D. Linsley, R. VanRullen, T. Serre

Neural Computation (2022)

Paper

@article{vaishnav2022understanding,
author = {Mohit Vaishnav and Remi Cadene and Andrea Alamia and Drew Linsley and Rufin VanRullen and Thomas Serre},
year = {2022},
title = {Understanding the computational demands underlying visual reasoning},
journal = {Neural Computation},
volume = {34},
number = {5},
pages = {1075--1099},
doi = {https://doi.org/10.1162/neco_a_01485},
url = {https://direct.mit.edu/neco/article/34/5/1075/109662/Understanding-the-Computational-Demands-Underlying}
}

Look at the variance! Efficient black-box explanations with Sobol-based sensitivity analysis

T. Fel*, R. Cadene*, M. Chalvidal, M. Cord, D. Vigouroux, T. Serre

NeurIPS (2021)

Arxiv

@inproceedings{fel2021sobol,
author = {Thomas Fel and Remi Cadene and Mathieu Chalvidal and Matthieu Cord and David Vigouroux and Thomas Serre},
title = {Look at the variance! efficient black-box explanations with sobol-based sensitivity analysis},
booktitle = {Advances in Neural Information Processing Systems 34},
year = {2021},
url = {https://proceedings.neurips.cc/paper/2021/file/da94cbeff56cfda50785df477941308b-Paper.pdf}
}

Beyond Question-Based Biases: Assessing Multimodal Shortcut Learning in Visual Question Answering

C. Dancette*, R. Cadene*, D. Teney, M. Cord

ICCV (2021)

Arxiv Code

@inproceedings{cdancette2021beyond,
author = {Dancette, Corentin and Cadene, Remi and Teney, Damien and Cord, Matthieu},
title = {{B}eyond {Q}uestion-{B}ased {B}iases: {A}ssessing {M}ultimodal {S}hortcut {L}earning in {V}isual {Q}uestion {A}nswering},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2021},
url = {https://arxiv.org/abs/2104.03149}
}

Deep Multimodal Learning for Vision and Language Processing

R. Cadene

Sorbonne Université (2021)

Thesis

@phdthesis{cadene2021multimodal,
author={R. Cadene},
year = {2021},
title = {Deep Multimodal Learning for Vision and Language Processing},
school={Sorbonne Universit{\'e}, UPMC},
url = {https://hal.archives-ouvertes.fr/tel-03140942}
}

Same-different conceptualization: a machine vision perspective

M. Ricci, R. Cadene, T. Serre

Elsevier COBS (2021)

Paper

@article{ricci2020samedifferent,
author = {Matthew Ricci and Rémi Cadène and Thomas Serre},
year = {2021},
title = {Same-different conceptualization: a machine vision perspective},
journal = {Current Opinion in Behavioral Sciences},
volume = {37},
pages = {47--55},
issn = {2352-1546},
doi = {https://doi.org/10.1016/j.cobeha.2020.08.008},
url = {http://www.sciencedirect.com/science/article/pii/S2352154620301352}
}

Overcoming Statistical Shortcuts for Open-ended Visual Counting

C. Dancette*, R. Cadene*, X. Chen, M. Cord

Arxiv (2020)

Arxiv

@inproceedings{dancette2020counting,
author = {Dancette, Corentin and Cadene, Remi and Chen, Xinlei and Cord, Matthieu},
title = {Overcoming Statistical Shortcuts for Open-ended Visual Counting},
booktitle = {Arxiv},
year = {2020},
url = {https://arxiv.org/abs/2006.10079}
}

RUBi: Reducing Unimodal Biases for Visual Question Answering

R. Cadene*, C. Dancette*, H. Ben-Younes, M. Cord, D. Parikh

NeurIPS (2019)

Arxiv Code

@inproceedings{cadene2019rubi,
author = {Cadene, Remi and Dancette, Corentin and Ben-Younes, Hedi and Cord, Matthieu and Parikh, Devi},
title = {{RUB}i: {R}educing {U}nimodal {B}iases for {V}isual {Q}uestion {A}nswering},
booktitle = {Advances in Neural Information Processing Systems 32},
year = {2019},
url = {https://arxiv.org/abs/1906.10169}
}

MUREL: Multimodal Relational Reasoning for Visual Question Answering

R. Cadene*, H. Ben-Younes*, N. Thome, M. Cord

CVPR (2019)

Arxiv Code

@inproceedings{cadene2019murel,
author = {Cadene, Remi and Ben-Younes, Hedi and Thome, Nicolas and Cord, Matthieu},
title = {MUREL: {M}ultimodal {R}elational {R}easoning for {V}isual {Q}uestion {A}nswering},
booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition {CVPR}},
year = {2019},
url = {https://arxiv.org/abs/1902.09487}
}

BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection

H. Ben-Younes, R. Cadene, N. Thome, M. Cord

AAAI (2019)

Arxiv Poster Code

@inproceedings{benyounes2019block,
author = {Ben-Younes, Hedi and Cadene, Remi and Thome, Nicolas and Cord, Matthieu},
title = {BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection},
booktitle = {Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI)},
year = {2019},
url = {https://arxiv.org/abs/1902.00038}
}

Bootstrap.pytorch, a high-level framework for accelerating research

R. Cadene, M. Carvalho, H. Ben-Younes, T. Robert, M. Cord

PyTorch (2018)

Poster Code

Benchmark Analysis of Representative Deep Neural Network Architectures

S. Bianco, R. Cadene, L. Celona, P. Napoletano

IEEE Access (2018)

Arxiv Code

@article{bianco2018dnnsbench,
author = {Bianco, Simone and Cadene, Remi and Celona, Luigi and Napoletano, Paolo},
year = {2018},
title = {Benchmark Analysis of Representative Deep Neural Network Architectures},
journal = {IEEE Access},
volume = {6},
pages = {64270-64277},
doi = {10.1109/ACCESS.2018.2877890},
ISSN = {2169-3536},
}

VQA Challenge Workshop: Bilinear Superdiagonal Fusion

H. Ben-Younes, R. Cadene, N. Thome, M. Cord

VQA Workshop (CVPR) (2018)

Code

Cross-modal retrieval in the cooking context: Learning semantic text-image embeddings

M. Carvalho*, R. Cadene*, D. Picard, L. Soulier, N. Thome, M. Cord

SIGIR (2018)

Arxiv Code

@inproceedings{carvalho2018adamine,
author = {Carvalho, Micael and Cadene, Remi and Picard, David and Soulier, Laure and Thome, Nicolas and Cord, Matthieu},
title = {Cross-modal retrieval in the cooking context: {L}earning semantic text-image embeddings},
booktitle = {The ACM Conference on Research and Development in Information Retrieval (SIGIR)},
year = {2018},
url = {https://arxiv.org/abs/1804.11146}
}

Images & Recipes: Retrieval in the cooking context

M. Carvalho*, R. Cadene*, D. Picard, L. Soulier, M. Cord

DECOR Workshop (ICDE) (2018)

Arxiv Slides Code

MUTAN: Multimodal Tucker Fusion for Visual Question Answering

H. Ben-Younes*, R. Cadene*, N. Thome, M. Cord

ICCV (2017)

Arxiv Slides Code

@inproceedings{benyounes2017mutan,
author = {Ben-younes, Hedi and Cadene, Remi and Cord, Matthieu and Thome, Nicolas},
title = {{MUTAN}: {M}ultimodal {T}ucker {F}usion for {V}isual {Q}uestion {A}nswering},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017},
url = {http://arxiv.org/abs/1705.06676}
}

VQA Challenge Workshop: MUTAN 2.0

H. Ben-Younes*, R. Cadene*, N. Thome, M. Cord

VQA Workshop (CVPR) (2017)

Poster Code

Master's Thesis - Deep Learning for Visual Recognition

R. Cadene, N. Thome, M. Cord

(2016)

Arxiv Slides Code

@article{DBLP:journals/corr/CadeneTC16,
author = {R{\'{e}}mi Cad{\`{e}}ne and Nicolas Thome and Matthieu Cord},
title = {Master's Thesis: Deep Learning for Visual Recognition},
journal = {CoRR},
volume = {abs/1610.05567},
year = {2016},
url = {http://arxiv.org/abs/1610.05567}
}

M2CAI Workflow Challenge: Convolutional Neural Networks for Video Frames Classification

R. Cadene, T. Robert, N. Thome, M. Cord

M2CAI Workshop (MICCAI) (2016)

Arxiv Poster Code

@article{DBLP:journals/corr/CadeneRTC16,
author = {R{\'{e}}mi Cad{\`{e}}ne and Thomas Robert and Nicolas Thome and Matthieu Cord},
title = {{M2CAI} Workflow Challenge: Convolutional Neural Networks with Time Smoothing and Hidden Markov Model for Video Frames Classification},
journal = {CoRR},
volume = {abs/1610.05541},
year = {2016},
url = {http://arxiv.org/abs/1610.05541}
}
* equal contribution