Philipp Thölke, Gianni De Fabritiis
International Conference on Learning Representations (ICLR) | 2022
The prediction of quantum mechanical properties is historically plagued by a trade-off between accuracy and speed. Machine learning potentials have previously shown great success in this domain, achieving increasingly high accuracy while maintaining computational efficiency comparable with classical force fields. In this work we propose TorchMD-NET, a novel equivariant transformer (ET) architecture, which outperforms the state of the art on MD17, ANI-1, and many QM9 targets in both accuracy and computational efficiency. Through an extensive attention weight analysis, we gain valuable insights into the black-box predictor and show differences in the learned representation of conformers versus conformations sampled from molecular dynamics or normal modes. Furthermore, we highlight the importance of datasets that include off-equilibrium conformations for the evaluation of molecular potentials.
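As a rough, hedged illustration of the kind of distance-aware attention the ET builds on, the sketch below modulates standard self-attention weights between atoms with a cosine cutoff on interatomic distance. It is a toy example only: the actual TorchMD-NET architecture additionally carries equivariant vector features and learned radial-basis filters, and the layer name, dimensions, and cutoff value here are assumptions.

```python
import torch
import torch.nn as nn

class DistanceModulatedAttention(nn.Module):
    """Toy sketch of distance-aware self-attention over atoms (not the real ET)."""

    def __init__(self, hidden_dim=128, cutoff=5.0):
        super().__init__()
        self.q = nn.Linear(hidden_dim, hidden_dim)
        self.k = nn.Linear(hidden_dim, hidden_dim)
        self.v = nn.Linear(hidden_dim, hidden_dim)
        self.cutoff = cutoff

    def forward(self, x, pos):
        # x: (n_atoms, hidden_dim) scalar features, pos: (n_atoms, 3) coordinates
        dist = torch.cdist(pos, pos)                          # pairwise distances
        # smooth cosine cutoff: 1 at r = 0, 0 at r = cutoff
        envelope = 0.5 * (torch.cos(torch.pi * dist / self.cutoff) + 1.0)
        envelope = envelope * (dist < self.cutoff)
        attn = (self.q(x) @ self.k(x).T) / x.shape[-1] ** 0.5
        attn = torch.softmax(attn, dim=-1) * envelope         # modulate by distance
        return attn @ self.v(x)

# usage on a random 10-atom system
atoms = torch.randn(10, 128)
coords = torch.randn(10, 3) * 3.0
out = DistanceModulatedAttention()(atoms, coords)
```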
Philipp Thölke, Yorguin Jose Mantilla Ramos, Hamza Abdelhedi, Charlotte Maschke, Arthur Dehgan, Yann Harel, Anirudha Kemtur, Loubna Mekki Berrada, Myriam Sahraoui, Tammy Young, Vanessa Hadid, Etienne Combrissone, Jordan O’Byrne, Karim Jerbi
NeuroImage | 2023
Machine learning (ML) is increasingly used in cognitive, computational and clinical neuroscience. The reliable and efficient application of ML requires a sound understanding of its subtleties and limitations. Training ML models on datasets with imbalanced classes is a particularly common problem, and it can have severe consequences if not adequately addressed. With the neuroscience ML user in mind, this paper provides a didactic assessment of the class imbalance problem and illustrates its impact through systematic manipulation of data imbalance ratios in (i) simulated data and (ii) brain data recorded with electroencephalography (EEG), magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI). Our results illustrate how the widely-used Accuracy (Acc) metric, which measures the overall proportion of successful predictions, yields misleadingly high performances, as class imbalance …
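The point about the Accuracy metric can be reproduced in a few lines with scikit-learn (not the paper's code, just a minimal illustration of the effect): a classifier that always predicts the majority class of a 9:1 imbalanced dataset scores roughly 90% accuracy while sitting at chance level under balanced accuracy.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.model_selection import train_test_split

# Toy dataset with a 9:1 class imbalance (e.g. 900 "rest" vs 100 "task" trials)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))
y = np.r_[np.zeros(900, dtype=int), np.ones(100, dtype=int)]

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# A classifier that always predicts the majority class
clf = DummyClassifier(strategy="most_frequent").fit(X_tr, y_tr)
pred = clf.predict(X_te)

print("Accuracy:         ", accuracy_score(y_te, pred))           # ~0.90, misleading
print("Balanced accuracy:", balanced_accuracy_score(y_te, pred))  # 0.50, chance level
```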
Maciej Majewski, Adrià Pérez, Philipp Thölke, Stefan Doerr, Nicholas E Charron, Toni Giorgino, Brooke E Husic, Cecilia Clementi, Frank Noé, Gianni De Fabritiis
Nature Communications | 2023
A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential …
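A minimal sketch of the training idea, assuming variational force matching (the standard scheme for this family of neural-network coarse-grained potentials): the network predicts a free energy, forces are obtained by automatic differentiation, and the loss matches them to forces mapped from the all-atom reference. The tiny MLP and tensor shapes below are placeholders, not the architecture used in the paper.

```python
import torch
import torch.nn as nn

class CGPotential(nn.Module):
    """Placeholder coarse-grained potential: bead coordinates -> scalar energy."""

    def __init__(self, n_beads):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_beads * 3, 64), nn.Tanh(), nn.Linear(64, 1))

    def forward(self, pos):                              # pos: (batch, n_beads, 3)
        return self.net(pos.flatten(1)).squeeze(-1)      # energy per frame

n_beads = 10
model = CGPotential(n_beads)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

cg_pos = torch.randn(32, n_beads, 3, requires_grad=True)   # mapped MD frames
ref_forces = torch.randn(32, n_beads, 3)                    # mapped all-atom forces

energy = model(cg_pos).sum()
forces = -torch.autograd.grad(energy, cg_pos, create_graph=True)[0]
loss = ((forces - ref_forces) ** 2).mean()                  # force-matching loss
loss.backward()
opt.step()
```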
Wenhui Cui, Woojae Jeong, Philipp Thölke, Takfarinas Medani, Karim Jerbi, Anand A Joshi, Richard M Leahy
IEEE International Symposium on Biomedical Imaging | 2024
To handle the scarcity and heterogeneity of electroencephalography (EEG) data for Brain-Computer Interface (BCI) tasks, and to harness the power of large publicly available data sets, we propose Neuro-GPT, a foundation model consisting of an EEG encoder and a GPT model. The foundation model is pre-trained on a large-scale data set using a self-supervised task that learns how to reconstruct masked EEG segments. We then fine-tune the model on a Motor Imagery Classification task to validate its performance in a low-data regime (9 subjects). Our experiments demonstrate that applying a foundation model can significantly improve classification performance compared to a model trained from scratch, which provides evidence for the generalizability of the foundation model and its ability to address challenges of data scarcity and heterogeneity in EEG.
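A hedged sketch of the self-supervised objective described above, masked EEG-segment reconstruction: a fraction of segment tokens is zeroed out and the model is trained to reconstruct the original segments, with the loss evaluated only at masked positions. The encoder, transformer sizes, and masking ratio are illustrative assumptions, not the Neuro-GPT configuration.

```python
import torch
import torch.nn as nn

n_segments, seg_len, n_channels, d_model = 8, 250, 22, 128

encoder = nn.Linear(seg_len * n_channels, d_model)       # stand-in EEG encoder
decoder = nn.Linear(d_model, seg_len * n_channels)
gpt_like = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2
)

eeg = torch.randn(16, n_segments, n_channels, seg_len)    # (batch, segments, ch, time)
tokens = encoder(eeg.flatten(2))                          # (batch, segments, d_model)

mask = torch.rand(16, n_segments) < 0.25                  # mask ~25% of segments
masked_tokens = tokens.masked_fill(mask.unsqueeze(-1), 0.0)

recon = decoder(gpt_like(masked_tokens))                  # reconstruct all segments
target = eeg.flatten(2)
loss = ((recon - target) ** 2)[mask].mean()               # loss only on masked segments
loss.backward()
```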
Raul P Pelaez, Guillem Simeon, Raimondas Galvelis, Antonio Mirarchi, Peter Eastman, Stefan Doerr, Philipp Thölke, Thomas E Markland, Gianni De Fabritiis
Journal of Chemical Theory and Computation | 2024
Achieving a balance between computational speed, prediction accuracy, and universal applicability in molecular simulations has been a persistent challenge. This paper presents substantial advancements in the TorchMD-Net software, a pivotal step in the shift from conventional force fields to neural network-based potentials. The evolution of TorchMD-Net into a more comprehensive and versatile framework is highlighted, incorporating cutting-edge architectures such as TensorNet. This transformation is achieved through a modular design approach, encouraging customized applications within the scientific community. The most notable enhancement is a significant improvement in computational efficiency, with a remarkable acceleration in the computation of energies and forces for TensorNet models: performance gains range from 2× to 10× over previous, non-optimized iterations. Other …
Antoine Bellemare-Pepin, François Lespinasse, Philipp Thölke, Yann Harel, Kory Mathewson, Jay A Olson, Yoshua Bengio, Karim Jerbi
arXiv preprint arXiv:2405.13012 | 2024
The recent surge in the capabilities of Large Language Models (LLMs) has led to claims that they are approaching a level of creativity akin to human capabilities. This idea has sparked a blend of excitement and apprehension. However, a critical piece that has been missing in this discourse is a systematic evaluation of LLM creativity, particularly in comparison to human divergent thinking. To bridge this gap, we leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans. We found evidence suggesting that LLMs can indeed surpass human capabilities in specific creative tasks such as divergent association and creative writing. Our quantitative benchmarking framework opens up new paths for the development of more creative LLMs, but it also encourages more granular inquiries into the distinctive elements that constitute human inventive thought processes, compared to those that can be artificially generated.
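One of the measures referenced above, divergent association, is commonly scored as the mean pairwise semantic distance between a set of words. The sketch below shows that computation with a placeholder embedding function; the paper's exact framework and embedding model are not reproduced here.

```python
import numpy as np
from scipy.spatial.distance import pdist

def embed(word: str, dim: int = 300) -> np.ndarray:
    # Placeholder embedding, seeded by the word's characters for reproducibility.
    # Real scoring would use pretrained word vectors (e.g. GloVe) or a
    # sentence-embedding model instead.
    rng = np.random.default_rng(sum(ord(c) for c in word))
    return rng.normal(size=dim)

def divergence_score(words: list[str]) -> float:
    # Mean pairwise cosine distance between word embeddings;
    # higher values indicate a more semantically divergent word set.
    vectors = np.stack([embed(w) for w in words])
    return float(pdist(vectors, metric="cosine").mean())

# With real embeddings, an unrelated word list scores higher than a related one;
# the placeholder embeddings above only demonstrate the mechanics of the score.
print(divergence_score(["volcano", "algebra", "perfume", "harvest"]))
```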
Philipp Thölke, Antoine Bellemare-Pepin, Yann Harel, François Lespinasse, Karim Jerbi
Proceedings of the 15th International Conference on Computational Creativity | 2024
This paper introduces the "Bio-Mechanical Poet", an adaptive brain-computer interface that integrates real-time electroencephalography (EEG) data with advanced generative artificial intelligence to create immersive audiovisual poetic experiences. We describe a custom prototyping environment for the exploration of various biosignals and their integration in a multimodal pipeline. By mapping brain states to symbolic representations, we explore trajectories of neural states in a multimodal symbolic latent space. This enables human-interpretable access to that space via the modalities of generative music, diffusion-based visuals and AI-crafted poetry. In doing so, we illustrate how the symbiosis of biosignals and generative systems can provide rich multimodal artworks guiding the user throughout the experience. Our discussion centers on the influence of biofeedback systems integrated with generative AI on evolving storytelling methods and altering perceptual states. We further discuss how translating biosignals into tangible expressions could open new avenues for understanding and interacting with our physiological and subconscious selves. The Bio-Mechanical Poet exemplifies the potential of biofeedback and real-time feedback systems to foster advancements in the field of computational creativity, offering insights into the integration of human brain dynamics with artistic creation.
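As one hedged, concrete example of the kind of "brain state to symbol" mapping described above, the sketch below labels a window of EEG by its dominant frequency band. The actual Bio-Mechanical Poet pipeline is considerably richer; the bands, labels, and parameters here are illustrative assumptions.

```python
import numpy as np
from scipy.signal import welch

fs = 256
eeg = np.random.randn(10 * fs)                  # placeholder for 10 s of one EEG channel

freqs, psd = welch(eeg, fs=fs, nperseg=2 * fs)  # power spectral density

bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}
power = {name: psd[(freqs >= lo) & (freqs < hi)].mean() for name, (lo, hi) in bands.items()}

symbol = max(power, key=power.get)              # dominant band as the symbolic state
print(symbol, power)
```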
Philipp Thölke, Maxine Arcand-Lavigne, Tarek Lajnef, Sonia Frenette, Julie Carrier, Karim Jerbi
bioRxiv | 2024
Caffeine is the most widely consumed psychoactive stimulant worldwide. Yet important gaps persist in understanding its effects on the brain, especially during sleep. We analyzed sleep EEG in 40 subjects, contrasting 200 mg of caffeine against a placebo condition, using inferential statistics and machine learning. We found that caffeine ingestion led to an increase in brain complexity, a widespread flattening of the power spectrum's 1/f-like slope, and a reduction in long-range temporal correlations. These effects were most prominent during NREM sleep, suggesting that caffeine shifts the brain towards a critical regime and more diverse neural dynamics. Interestingly, the effect was more pronounced in younger adults (20-27 years) than in middle-aged participants (41-58 years) during REM sleep, while no significant age effects were observed during NREM. Interpreting these data in the light of modeling and empirical work on EEG-derived measures of excitation-inhibition balance suggests that caffeine promotes a shift in brain dynamics towards increased neural excitation and closer proximity to a critical regime, particularly during NREM sleep.
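For one of the measures mentioned above, the 1/f-like slope of the EEG power spectrum, a minimal estimate can be obtained with a log-log linear fit to the Welch power spectral density. The paper's actual pipeline (e.g. parameterized spectral models, exact fit ranges) may differ; the frequency range and values below are illustrative.

```python
import numpy as np
from scipy.signal import welch

fs = 256                                    # sampling rate in Hz
eeg = np.random.randn(60 * fs)              # placeholder for one minute of EEG

freqs, psd = welch(eeg, fs=fs, nperseg=4 * fs)
band = (freqs >= 1) & (freqs <= 40)         # fit range: 1-40 Hz (assumption)

# slope of log-power vs log-frequency; less negative = "flatter" spectrum
slope, intercept = np.polyfit(np.log10(freqs[band]), np.log10(psd[band]), deg=1)
print(f"1/f-like slope: {slope:.2f}")
```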