Welcome to the Self-Supervised Learning for 3D Light-Sheet Microscopy Image Segmentation (SELMA3D) challenge



Background

In modern biological research, visualizing complex tissue and organism structures is crucial. Light-sheet microscopy (LSM), combined with tissue clearing and specific staining, offers a high-resolution, high-contrast method to observe diverse biological structures including cellular components and organelles.

Tissue clearing renders opaque biological samples transparent, while preserving sample integrity and fluorescence of labeled structures, allowing deep light penetration[1]. Structure staining with dyes, fluorophores, or antibodies can selectively label specific biological structures within samples, enhancing their contrast[2]. When paired with LSM, these techniques enable detailed visualization of intricate biological structures with high spatial resolution, offering new insights into various biomedical research fields such as neuroscience[3], immunology[4], oncology[5] and cardiology[6].

To analyze LSM images, segmentation plays a pivotal role in identifying and distinguishing different biological structures[7]. Manual segmentation is feasible for small-scale images, but it becomes impractical for large-scale or whole-organ images, where a single image can contain 1000³ voxels. Automated segmentation, particularly with deep learning, is therefore becoming essential[8,9]. These deep learning methods perform well but require extensive annotated datasets for diverse LSM image segmentation tasks, which are challenging to create.

Self-supervised learning (SSL) proves advantageous in this context, allowing models to pretrain on large-scale unannotated datasets and then fine-tune on smaller labeled ones[10]. Although not yet widely explored for LSM, SSL is promising because the high signal-to-noise ratio of LSM data makes it well suited to self-supervised pretraining.

[1] H.R. Ueda et al. Tissue clearing and its applications in neuroscience. Nature Reviews Neuroscience 21(2): 61-79, 2020 Jan.
[2] P.K. Poola et al. Light sheet microscopy for histopathology applications. Biomedical Engineering Letters 9: 279-291, 2019 July.
[3] H.R. Ueda et al. Whole-brain profiling of cells and circuits in mammals by tissue clearing and light-sheet microscopy. Neuron, 106(3): 369-387, 2020 May.
[4] D. Zhang et al. Spatial analysis of tissue immunity and vascularity by light sheet fluorescence microscopy. Nature Protocols: 1-30, 2024 Jan.
[5] J. Almagro et al. Tissue clearing to examine tumour complexity in three dimensions. Nature Reviews Cancer, 21(11): 718-730, 2021 July.
[6] P. Fei et al. Cardiac light-sheet fluorescent microscopy for multi-scale and rapid imaging of architecture and function. Scientific Reports 6: 22489, 2016 Mar.
[7] F. Amat et al. Efficient processing and analysis of large-scale light-sheet microscopy data. Nature Protocols 10: 1679-1696, 2015.
[8] N. Kumar et al. A Light sheet fluorescence microscopy and machine learning-based approach to investigate drug and biomarker distribution in whole organs and tumors. bioRxiv 2023.09.16.558068.
[9] M.I. Todorov et al. Machine learning analysis of whole mouse brain vasculature. Nature Methods 17: 442-449, 2020 Mar.
[10] R. Krishnan et al. Self-supervised learning in medicine and healthcare. Nature Biomedical Engineering 6: 1346-1352, 2022 Aug.

Objective

We aim to host a challenge on self-supervised learning for 3D LSM image segmentation, encouraging the development of SSL methods for general segmentation of diverse structures in 3D LSM images. Effective self-supervised learning can leverage extensive unannotated 3D LSM images to pretrain models, capturing high-level representations that generalize across different biological structures. These pretrained models can then be fine-tuned on smaller annotated datasets, significantly reducing the annotation effort needed for 3D LSM segmentation.


Task Description

The task is to develop self-supervised learning methods for 3D LSM image segmentation.

Participants will be provided with a training dataset consisting of two parts. The first part contains a vast collection of unannotated whole-brain 3D LSM images from mice and humans, intended for model pretraining. The second part includes annotated patches cropped from whole-brain 3D LSM images, which will be used for model fine-tuning.
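To make the two-stage workflow concrete, the following NumPy sketch runs it end to end on synthetic data: a linear encoder is pretrained with a masked-voxel reconstruction objective (one common self-supervised pretext task; participants are free to choose others), then a small logistic head is fine-tuned on a handful of labeled patches with the encoder frozen. Every array name, size, and hyperparameter here is an illustrative assumption, not part of the actual SELMA3D data or any required method; real entries would use 3D networks on the challenge images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the challenge data: patches with low-rank voxel
# correlations, so hidden voxels are predictable from visible ones.
D, H = 64, 16                                    # flattened patch size, latent size
basis = rng.standard_normal((8, D))
unlabeled = (rng.standard_normal((256, 8)) @ basis
             + 0.1 * rng.standard_normal((256, D)))   # "unannotated" patches
lab_latent = rng.standard_normal((32, 8))
labeled = lab_latent @ basis + 0.1 * rng.standard_normal((32, D))
labels = (lab_latent[:, 0] > 0).astype(float)    # toy per-patch labels

# --- Stage 1: self-supervised pretraining (masked-voxel reconstruction) ---
W_enc = rng.normal(0.0, 0.1, (D, H))
W_dec = rng.normal(0.0, 0.1, (H, D))

def masked_loss(x, mask):
    recon = (x * mask) @ W_enc @ W_dec
    return float((((recon - x) ** 2) * ~mask).mean())

eval_mask = rng.random(unlabeled.shape) < 0.5
loss_before = masked_loss(unlabeled, eval_mask)

lr = 0.005
for _ in range(1000):
    batch = unlabeled[rng.choice(256, 32, replace=False)]
    mask = rng.random(batch.shape) < 0.5         # hide half the voxels
    z = (batch * mask) @ W_enc                   # encode visible voxels only
    err = (z @ W_dec - batch) * ~mask            # penalize hidden voxels only
    grad_dec = z.T @ err / 32
    grad_enc = (batch * mask).T @ (err @ W_dec.T) / 32
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

loss_after = masked_loss(unlabeled, eval_mask)

# --- Stage 2: fine-tune a linear head on the frozen pretrained encoder ---
feats = labeled @ W_enc                          # frozen encoder features
w_head = np.zeros(H)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-np.clip(feats @ w_head, -30, 30)))
    w_head -= 0.1 * feats.T @ (p - labels) / 32  # logistic-regression step

pred = 1.0 / (1.0 + np.exp(-np.clip(feats @ w_head, -30, 30))) > 0.5
accuracy = float((pred.astype(float) == labels).mean())
```

The key design point the sketch illustrates is that Stage 1 never touches labels: the masked-reconstruction loss is computed purely from the images themselves, and only the small labeled set in Stage 2 drives the segmentation objective.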