Generalizable Self-supervised Monocular Depth Estimation with Mixture of Low-Rank Experts for Diverse Endoscopic Scenes

Liangjing Shao1,2,3,*, Chenkang Du1,2, Benshuang Chen1,2, Xueli Liu4,#, Xinrong Chen1,2,#

1. College of Biomedical Engineering, Fudan University 2. Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention 3. Department of Electronic Engineering, The Chinese University of Hong Kong 4. ENT Institute and Department of Otolaryngology, Eye & ENT Hospital of Fudan University
#Corresponding Authors

*This work was done while Liangjing Shao was with Fudan University.

Demo Videos on Diverse Endoscopic Data

Abstract

Self-supervised monocular depth estimation is a significant task for low-cost and efficient 3D scene perception in endoscopy. However, the variety of illumination conditions and scene features remains the primary challenge for depth estimation in diverse endoscopic scenes. In this work, a self-supervised framework is proposed for monocular depth estimation in diverse endoscopic scenes. Firstly, considering the diverse features of endoscopic scenes with different tissues, a novel block-wise mixture of dynamic low-rank experts is proposed to efficiently finetune the foundation model for endoscopic depth estimation. In the proposed module, experts with a small number of trainable parameters are adaptively selected, based on the input feature, for weighted inference from a pool of low-rank experts, which are allocated to each block according to its generalization ability. Moreover, a novel self-supervised training framework is proposed to jointly cope with brightness inconsistency and reflectance interference. The proposed method outperforms state-of-the-art works on the SCARED and SimCol datasets. Furthermore, the proposed network also achieves the best generalization in zero-shot depth estimation on the C3VD, Hamlyn and SERV-CT datasets. The outstanding performance of our model is further demonstrated through sim-to-real tests, 3D reconstruction and ego-motion estimation. The proposed method could contribute to accurate endoscopy for minimally invasive measurement and surgery. The codes will be released upon acceptance.
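To make the core idea of the adapter concrete, below is a minimal sketch of a mixture-of-low-rank-experts layer with an input-conditioned gate, in the spirit of LoRA-style fine-tuning of a frozen foundation-model layer. This is an illustrative assumption, not the authors' released implementation: the class name, expert count, rank, and softmax gating are all placeholders, and the block-wise allocation of experts by generalization ability described in the abstract is not modeled here.

```python
# Minimal sketch of a mixture-of-low-rank-experts adapter (illustrative only;
# module names, ranks, and gating details are assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixtureOfLowRankExperts(nn.Module):
    """Adapts a frozen linear layer with several low-rank (LoRA-style) experts
    whose contributions are weighted by an input-conditioned gate."""

    def __init__(self, base_linear: nn.Linear, num_experts: int = 4, rank: int = 4):
        super().__init__()
        self.base = base_linear
        for p in self.base.parameters():  # keep the foundation-model weights frozen
            p.requires_grad = False

        in_f, out_f = base_linear.in_features, base_linear.out_features
        # Each expert is a pair of low-rank matrices A (in_f x r) and B (r x out_f).
        self.A = nn.Parameter(torch.randn(num_experts, in_f, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, out_f))
        # Lightweight gate that scores the experts from the input feature.
        self.gate = nn.Linear(in_f, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., in_f); gate weights sum to 1 over the experts.
        w = F.softmax(self.gate(x), dim=-1)                                 # (..., E)
        low_rank = torch.einsum("...i,eir,ero->...eo", x, self.A, self.B)   # (..., E, out_f)
        delta = (w.unsqueeze(-1) * low_rank).sum(dim=-2)                    # weighted mixture
        return self.base(x) + delta


if __name__ == "__main__":
    layer = MixtureOfLowRankExperts(nn.Linear(64, 128), num_experts=4, rank=4)
    print(layer(torch.randn(2, 16, 64)).shape)  # torch.Size([2, 16, 128])
```

Since only the low-rank experts and the gate are trainable, the number of updated parameters stays small relative to the frozen backbone, which is the efficiency argument behind this style of fine-tuning.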