CholecTriplet2022: Show me a tool and tell me the triplet — An endoscopic vision challenge for surgical action triplet detection

Chinedu Innocent Nwoye*, Tong Yu, Saurav Sharma, Aditya Murali, Deepak Alapatt, Armine Vardazaryan, Kun Yuan, Jonas Hajek, Wolfgang Reiter, Amine Yamlahi, Finn Henri Smidt, Xiaoyang Zou, Guoyan Zheng, Bruno Oliveira, Helena R. Torres, Satoshi Kondo, Satoshi Kasai, Felix Holm, Ege Özsoy, Shuangchun Gui, Han Li, Sista Raviteja, Rachana Sathish, Pranav Poudel, Binod Bhattarai, Ziheng Wang, Guo Rui, Melanie Schellenberg, João L. Vilaça, Tobias Czempiel, Zhenkun Wang, Debdoot Sheet, Shrawan Kumar Thapa, Max Berniker, Patrick Godau, Pedro Morais, Sudarshan Regmi, Thuy Nuong Tran, Jaime Fonseca, Jan Hinrich Nölke, Estevão Lima, Eduard Vazquez, Lena Maier-Hein, Nassir Navab, Pietro Mascagni, Barbara Seeliger, Cristians Gonzalez, Didier Mutter, Nicolas Padoy

*Corresponding author for this work

Research output: Contribution to journal › Short survey › peer-review

2 Citations (Scopus)

Abstract

Formalizing surgical activities as triplets of the instruments used, actions performed, and target anatomies is becoming a gold-standard approach for surgical activity modeling. The benefit is that this formalization yields a more detailed understanding of tool-tissue interaction, which can be used to develop better artificial intelligence assistance for image-guided surgery. Earlier efforts, and the CholecTriplet challenge introduced in 2021, brought together techniques aimed at recognizing these triplets from surgical footage. Estimating the spatial locations of the triplets as well would offer more precise intraoperative context-aware decision support for computer-assisted intervention. This paper presents the CholecTriplet2022 challenge, which extends surgical action triplet modeling from recognition to detection. It includes weakly-supervised bounding-box localization of every visible surgical instrument (or tool), as the key actor, and the modeling of each tool activity in the form of an ‹instrument, verb, target› triplet. The paper describes a baseline method and 10 new deep learning algorithms presented at the challenge to solve the task. It also provides thorough methodological comparisons of the methods; an in-depth analysis of the obtained results across multiple metrics, visual and procedural challenges, and their significance; and useful insights for future research directions and applications in surgery.
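
For illustration, the sketch below shows what a single triplet detection of the kind described in the abstract might look like, and how a predicted instrument box could be matched to ground truth at an IoU threshold. The `TripletDetection` class, the field names, and the 0.5 threshold are hypothetical conventions chosen for this example; they are not the challenge's actual API or evaluation code.

```python
# A minimal, hypothetical sketch of a triplet detection: each detection
# pairs an <instrument, verb, target> class triple with a bounding box
# for the instrument (the "key actor"). Names and thresholds here are
# illustrative, not the challenge's evaluation protocol.
from dataclasses import dataclass


@dataclass
class TripletDetection:
    instrument: str  # e.g. "grasper"
    verb: str        # e.g. "retract"
    target: str      # e.g. "gallbladder"
    box: tuple       # (x, y, w, h), normalized to [0, 1]
    score: float     # model confidence


def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


pred = TripletDetection("grasper", "retract", "gallbladder",
                        box=(0.40, 0.30, 0.20, 0.25), score=0.91)
gt_box = (0.42, 0.28, 0.21, 0.26)

# A detection typically counts as a hit only if the triplet classes match
# AND the box overlaps ground truth above a threshold (0.5 assumed here).
hit = iou(pred.box, gt_box) >= 0.5
print(f"IoU = {iou(pred.box, gt_box):.2f}, hit = {hit}")
```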

Original language: English
Article number: 102888
Number of pages: 21
Journal: Medical Image Analysis
Volume: 89
Early online date: 12 Jul 2023
DOI: 10.1016/j.media.2023.102888
Publication status: Published - Oct 2023

Bibliographical note

The organizers would like to thank the IHU and IRCAD research teams for their help with the initial data annotation during the CONDOR project. We also thank Stefanie Speidel, Lena Maier-Hein, Danail Stoyanov, and the entire EndoVis 2022 organizing committee for providing the platform for this challenge.

Funding
This work was supported by French state funds managed within the Plan Investissements d’Avenir by the ANR under references: National AI Chair AI4ORSafety [ANR-20-CHIA-0029-01], Labex CAMI [ANR-11-LABX-0004], DeepSurg [ANR-16-CE33-0009], IHU Strasbourg [ANR-10-IAHU-02] and by BPI France under references: project CONDOR, project 5G-OR [DOS0180017/00].

Software validation and evaluation were performed with servers managed by CAMMA at University of Strasbourg and IHU Strasbourg, as well as HPC resources from Unistra Mésocentre, and GENCI-IDRIS [Grant 2021-AD011011638R2, 2021-AD011011638R3].

Awards for the challenge winners were sponsored by IHU Strasbourg, NVIDIA, and Medtronic Ltd.

Participating teams would like to acknowledge the following funding:

  • CITI: Shanghai Municipal Science and Technology Commission, China [20511105205].
  • SDS-HD: Twinning Grant [DKFZ+RBCT]; the Surgical Oncology Program of the National Center for Tumor Diseases (NCT) Heidelberg; the German Federal Ministry of Health under reference number 2520DAT0P1 as part of the pAItient project; HELMHOLTZ IMAGING, a platform of the Helmholtz Information & Data Science Incubator, Germany; and the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (NEURAL SPICING; grant agreement No. [101002198]).
  • 2AI-ICVS: Fundação para a Ciência e a Tecnologia (FCT), Portugal, and the European Social Fund, European Union, through the "Programa Operacional Capital Humano" (POCH) Ph.D. grants [SFRH/BD/136721/2018, SFRH/BD/136670/2018]; grants [NORTE-01-0145-FEDER-000045, NORTE-01-0145-FEDER-000059], supported by the Northern Portugal Regional Operational Programme (NORTE 2020) under the Portugal 2020 Partnership Agreement, through the European Regional Development Fund (FEDER); and national funds, through FCT and FCT/MCTES, in the scope of projects [UIDB/05549/2020, UIDP/05549/2020].
  • SHUANGCHUN: Guangdong Climbing Plan, China, under Grant [pdjh2023c21602].
  • CAMP: partially supported by Carl Zeiss AG, Germany.

Data Availability Statement

The CholecT50 dataset and the validation data used in the challenge have been made publicly available and are accessible via https://github.com/CAMMA-public/cholect50. The test set spatial labels will be released publicly, as will the baseline model code. Participants may release their code at their own discretion. All released code will be linked from the central GitHub repository for the challenge: https://github.com/CAMMA-public/cholectriplet2022.
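
As a convenience, here is a sketch of how one might iterate over per-video triplet labels after downloading the dataset from the repository above. The directory layout, file naming, and JSON schema shown are assumptions made for illustration only; consult the repository README at https://github.com/CAMMA-public/cholect50 for the actual format.

```python
# Hypothetical sketch of reading per-video triplet annotations from a
# CholecT50 download. The paths and the "annotations" JSON key below
# are ASSUMPTIONS for illustration; check the repository README for
# the real schema before use.
import json
from pathlib import Path


def load_video_labels(labels_dir, video_id):
    """Return {frame_id: [triplet class ids]} for one video (assumed layout)."""
    path = Path(labels_dir) / f"{video_id}.json"
    with open(path) as f:
        data = json.load(f)
    # Assumed: a frame-indexed mapping of active triplet class ids.
    return {int(frame): ids for frame, ids in data["annotations"].items()}


if __name__ == "__main__":
    labels = load_video_labels("labels", "VID01")  # hypothetical paths
    for frame, triplet_ids in sorted(labels.items())[:5]:
        print(frame, triplet_ids)
```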

Keywords

  • Action detection
  • CholecT50
  • Computer-assisted surgery
  • Fine-grained activity recognition
  • Surgical action triplet
  • Tool localization
  • Weak supervision
