Summary |
Previous methods rely on camera geometry and the projection matrix to select the image region of each parking space for status classification. With suitable hand-crafted features, outdoor lighting variation and perspective distortion can be handled well. However, the problem becomes considerably harder once parking displacement, non-uniform car sizes, and inter-object occlusion are also taken into account. To address these issues in a systematic way, we propose a deep convolutional network. |
Scientific Breakthrough |
In this technology, we use three modules to achieve a robust system. - First, we introduce a CNN-based deep network to extract robust semantic features instead of relying on hand-crafted low-level features. - Second, we integrate a spatial transformer network (STN) into our deep network. The STN adaptively crops, transforms, and normalizes a 3-space input patch according to car sizes, occlusion patterns, and parking displacements. - Third, to address inter-object occlusion analytically, we group three neighboring spaces as one input unit. A multi-task loss function jointly considers the status estimation of the middle space and the occlusion patterns among neighboring spaces. A Siamese architecture is used to learn a 3-space feature descriptor that preserves the “semantic” distances. |
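The loss design described in the third module can be illustrated with a minimal sketch. The summary does not give the exact loss formulas, so everything here is an assumption made for illustration: the use of cross-entropy for both tasks, the classic contrastive loss for the Siamese branch, the weight `occl_weight`, the `margin`, and all function names are hypothetical and not taken from the original system.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class under a probability vector."""
    # Clamp to avoid log(0); the clamp value is an arbitrary illustrative choice.
    return -math.log(max(probs[label], 1e-12))


def multi_task_loss(status_probs, status_label,
                    occl_probs, occl_label, occl_weight=0.5):
    """Assumed form: status loss for the middle space plus a weighted
    occlusion-pattern loss for the 3-space unit."""
    status_term = cross_entropy(status_probs, status_label)
    occl_term = cross_entropy(occl_probs, occl_label)
    return status_term + occl_weight * occl_term


def contrastive_loss(desc_a, desc_b, same_class, margin=1.0):
    """Classic contrastive loss on two 3-space descriptors (assumed choice).

    Pairs with the same label are pulled together; pairs with different
    labels are pushed apart until they are at least `margin` away.
    """
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(desc_a, desc_b)))
    if same_class:
        return 0.5 * dist ** 2
    return 0.5 * max(0.0, margin - dist) ** 2


if __name__ == "__main__":
    # Hypothetical outputs for one 3-space unit: the middle space is
    # vacant (class 0) and the occlusion pattern is class 2 of 4.
    loss = multi_task_loss([0.8, 0.2], 0, [0.1, 0.1, 0.7, 0.1], 2)
    pair = contrastive_loss([0.1, 0.9], [0.2, 0.8], same_class=True)
    print(loss, pair)
```

In a real network both terms would be computed on features produced by the CNN after the STN has normalized the 3-space patch, and the two Siamese branches would share weights; the sketch only shows how the separate objectives could be combined.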