Capsule networks as recurrent models of grouping and segmentation

Adrien Doerig* (Corresponding Author), Lynn Schmittwilken, Bilge Sayim, Mauro Manassi, Michael H. Herzog

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)
6 Downloads (Pure)

Abstract

Classically, visual processing is described as a cascade of local feedforward computations. Feedforward Convolutional Neural Networks (ffCNNs) have shown how powerful such models can be. However, using visual crowding as a well-controlled challenge, we previously showed that no classic model of vision, including ffCNNs, can explain human global shape processing. Here, we show that Capsule Neural Networks (CapsNets), combining ffCNNs with recurrent grouping and segmentation, solve this challenge. We also show that ffCNNs and standard recurrent CNNs do not, suggesting that the grouping and segmentation capabilities of CapsNets are crucial. Furthermore, we provide psychophysical evidence that grouping and segmentation are implemented recurrently in humans, and show that CapsNets reproduce these results well. We discuss why recurrence seems needed to implement grouping and segmentation efficiently. Together, we provide mutually reinforcing psychophysical and computational evidence that a recurrent grouping and segmentation process is essential to understand the visual system and create better models that harness global shape computations.
Original language: English
Article number: e1008017
Number of pages: 19
Journal: PLoS Computational Biology
Volume: 16
Issue number: 7
Publication status: Published - 21 Jul 2020

Bibliographical note

Funding: AD was supported by the Swiss National Science Foundation grant n.176153 “Basics of visual processing: from elements to figures”. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Data Availability: The human data for experiment 2 and the full code to reproduce all our results are available here: https://github.com/adriendoerig/Capsule-networks-as-recurrent-models-of-grouping-and-segmentation.

Keywords

  • crowding
  • vision
  • psychophysics
  • visual system
  • Boats
  • deep learning
  • recurrent neural networks
  • Neural Networks, Computer
  • Reproducibility of Results
  • Humans
  • Computational Biology
  • Male
  • Normal Distribution
  • Algorithms
  • Models, Biological
  • Vision, Ocular
  • Computer Simulation
  • Image Processing, Computer-Assisted/methods
  • Female
  • Pattern Recognition, Visual
  • FEEDFORWARD
  • OBJECT RECOGNITION
  • INTERFERENCE
