Computer Vision/Image Processing

  • Multi-view video matching
    We propose a correspondence matching algorithm for multi-view video sequences that performs reliably even when the cameras differ significantly in parameters such as viewing angle and position. As an invariant feature for correspondence matching, we use an activity vector, which represents the temporal occurrence pattern of moving foreground objects at a pixel position. We first devise a novel similarity measure between activity vectors that considers both their joint and individual behavior. Specifically, we define random variables associated with the activity vectors and measure their similarity using the mutual information between the random variables. Moreover, to find a reliable homography transform between views, we find consistent pixel positions by employing iterative bidirectional matching. We also refine the matching results of multiple source pixel positions by minimizing a matching cost function based on a Markov random field. Experimental results show that the proposed algorithm provides more accurate and reliable matching performance than conventional activity-based and feature-based matching algorithms, and can therefore facilitate various applications of visual sensor networks.
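    The mutual-information similarity described above can be illustrated with a minimal sketch. It assumes binary activity vectors (1 when a moving foreground object covers the pixel in a frame, 0 otherwise) and uses the plain empirical mutual information; the exact measure in the proposed algorithm may differ, and the function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(a, b):
    """Empirical mutual information (in bits) between two activity vectors.

    Assumption: a and b are equal-length sequences of 0/1 flags, one per
    frame, marking whether foreground was observed at a pixel position.
    """
    if len(a) != len(b) or not a:
        raise ValueError("activity vectors must be non-empty and of equal length")
    n = len(a)
    joint = Counter(zip(a, b))   # joint occurrence counts
    count_a = Counter(a)         # marginal counts for the first vector
    count_b = Counter(b)         # marginal counts for the second vector
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        p_x = count_a[x] / n
        p_y = count_b[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


# Two pixels that see the same activity pattern versus an unrelated one.
same = mutual_information([0, 1, 0, 1], [0, 1, 0, 1])
unrelated = mutual_information([0, 1, 0, 1], [0, 0, 1, 1])
```

    In this toy case the identical patterns share one full bit of information, while the unrelated patterns share none, so the first pair would be preferred as a correspondence.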
  • Image/video dehazing and deraining
    A fast and optimized dehazing algorithm for hazy images and videos is proposed in this work. Based on the observation that a hazy image generally exhibits low contrast, we restore the hazy image by enhancing its contrast. However, overcompensating for the degraded contrast may truncate pixel values and cause information loss. Therefore, we formulate a cost function that consists of a contrast term and an information loss term. By minimizing the cost function, the proposed algorithm enhances the contrast and preserves the information optimally. Moreover, we extend the static image dehazing algorithm to real-time video dehazing. We reduce flickering artifacts in a dehazed video sequence by making transmission values temporally coherent. Experimental results show that the proposed algorithm effectively removes haze and is sufficiently fast for real-time dehazing applications.
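    The trade-off between the contrast term and the information loss term can be sketched as follows. This is only an illustration under stated assumptions: it uses the commonly cited haze model I = t*J + (1 - t)*A for one block of 8-bit intensities, measures contrast as the variance of the restored block, and measures information loss as the squared amount by which restored values leave [0, 255]. The names `dehazing_cost` and `best_transmission`, the weight `lam`, and the candidate grid are hypothetical and not taken from the original work.

```python
def restore(pixels, airlight, t):
    """Invert the assumed haze model I = t*J + (1 - t)*A for one block."""
    return [(p - airlight) / t + airlight for p in pixels]


def dehazing_cost(pixels, airlight, t, lam=5.0):
    """Cost = -(contrast) + lam * (information loss) for transmission t."""
    restored = restore(pixels, airlight, t)
    n = len(restored)
    mean = sum(restored) / n
    # Contrast term: variance of the restored block (higher is better).
    contrast = sum((v - mean) ** 2 for v in restored) / n
    # Loss term: squared overshoot of values that would be truncated.
    loss = sum(min(v, 0.0) ** 2 + max(v - 255.0, 0.0) ** 2 for v in restored) / n
    return -contrast + lam * loss


def best_transmission(pixels, airlight, lam=5.0, candidates=None):
    """Pick the candidate transmission with the lowest cost (grid search)."""
    if candidates is None:
        candidates = [i / 100 for i in range(10, 101)]  # 0.10 .. 1.00
    return min(candidates, key=lambda t: dehazing_cost(pixels, airlight, t, lam))


block = [150, 170, 190, 210]  # hypothetical hazy block intensities
t_contrast_only = best_transmission(block, airlight=220, lam=0.0)
t_balanced = best_transmission(block, airlight=220, lam=5.0)
```

    With `lam=0.0` the search always picks the smallest transmission, which maximizes contrast but pushes many restored values out of range; a positive `lam` moves the choice toward a larger transmission that truncates fewer pixels.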
  • Saliency detection
    In this work, we propose a graph-based multiscale saliency detection algorithm that models eye movements as a random walk on a graph. The proposed algorithm first extracts intensity, color, and compactness features from an input image. It then constructs a fully connected graph by employing image blocks as the nodes. It assigns a high edge weight if the two connected nodes have dissimilar intensity and color features and if the ending node is more compact than the starting node. Then, the proposed algorithm computes the stationary distribution of the Markov chain on the graph as the saliency map. However, the performance of the saliency detection depends on the relative block size in an image. To provide a more reliable saliency map, we develop a coarse-to-fine refinement technique for multiscale saliency maps based on the random walk with restart (RWR). Specifically, we use the saliency map at a coarse scale as the restarting distribution of RWR at a fine scale. Experimental results demonstrate that the proposed algorithm detects visual saliency precisely and reliably. Moreover, the proposed algorithm can be used efficiently for proto-object extraction and image retargeting.
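    The two random-walk steps above, the stationary distribution and the random walk with restart, can be sketched with simple power iteration. The edge weights, the restart probability, and the function names below are hypothetical; the feature extraction that would produce real edge weights is omitted.

```python
def normalize_rows(weights):
    """Turn a non-negative edge-weight matrix into a transition matrix."""
    return [[w / sum(row) for w in row] for row in weights]


def stationary_distribution(P, restart=None, restart_prob=0.0,
                            iters=1000, tol=1e-12):
    """Power iteration for the stationary distribution of a Markov chain.

    If `restart` is given, each step mixes in that distribution with
    probability `restart_prob`, which yields a random walk with restart.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # One step of the walk: pi_next[j] = sum_i pi[i] * P[i][j].
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if restart is not None:
            nxt = [(1.0 - restart_prob) * v + restart_prob * r
                   for v, r in zip(nxt, restart)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical 3-block graph: edges that end at block 2 carry high weight.
W = [[0, 1, 4],
     [1, 0, 4],
     [1, 1, 0]]
P = normalize_rows(W)
saliency = stationary_distribution(P)

# Coarse-scale saliency (assumed values) used as the restart distribution.
coarse = [0.1, 0.1, 0.8]
refined = stationary_distribution(P, restart=coarse, restart_prob=0.3)
```

    Block 2 receives the largest stationary probability because the walker is drawn toward it, and restarting from a coarse map that also favors block 2 raises its value further.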
  • Object tracking
    A fast visual object tracking algorithm using novel object appearance models is proposed in this work. We develop a color histogram model and a patch difference model to extract color and texture feature vectors, respectively. Then, we apply k-nearest neighbor classifiers to the color and texture feature vectors and obtain the foreground probability map. We then perform a hierarchical mean shift process on the map to identify the object window. Experimental results demonstrate that the proposed algorithm outperforms conventional algorithms in terms of both tracking accuracy and processing speed.
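    The final localization step can be illustrated with a basic mean shift over a foreground probability map. This sketch is a plain single-level mean shift with a square window, not the hierarchical process of the proposed algorithm, and it takes the probability map as given rather than computing it with the classifiers; the function name and parameters are hypothetical.

```python
def mean_shift_window(prob_map, center, half_size, max_iters=50):
    """Move a square window to the local centroid of foreground probability.

    prob_map  : 2-D list of foreground probabilities (rows x columns)
    center    : (row, col) of the initial window center
    half_size : window covers center +/- half_size in both directions
    """
    rows, cols = len(prob_map), len(prob_map[0])
    cy, cx = center
    for _ in range(max_iters):
        total = sy = sx = 0.0
        for y in range(max(0, cy - half_size), min(rows, cy + half_size + 1)):
            for x in range(max(0, cx - half_size), min(cols, cx + half_size + 1)):
                w = prob_map[y][x]
                total += w
                sy += w * y
                sx += w * x
        if total == 0:
            break  # no foreground evidence inside the window
        ny, nx = round(sy / total), round(sx / total)
        if (ny, nx) == (cy, cx):
            break  # converged: the window is centered on the centroid
        cy, cx = ny, nx
    return cy, cx


# Hypothetical 7x7 map with an object blob centered at (4, 4).
prob = [[0.0] * 7 for _ in range(7)]
for y in range(3, 6):
    for x in range(3, 6):
        prob[y][x] = 0.5
prob[4][4] = 1.0

window_center = mean_shift_window(prob, center=(2, 2), half_size=2)
```

    Starting from an offset guess, the window drifts toward the high-probability blob and stops once its center coincides with the weighted centroid.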