Jianjin Xu, Zheyang Xiong, Xiaolin Hu, “Frame difference-based temporal loss for video stylization,” arXiv:2102.05822, 2021. A simple loss that avoids the time-consuming estimation of optical flow.

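For illustration only, here is a minimal NumPy sketch of the frame-difference idea (the function name, shapes, and weighting are assumptions, not the paper's exact formulation): the loss penalizes changes between consecutive stylized frames that are not explained by the corresponding changes in the input frames.

```python
import numpy as np

def frame_difference_temporal_loss(stylized_prev, stylized_curr,
                                   input_prev, input_curr):
    """Penalize stylized-frame changes not explained by input-frame changes
    (a sketch of the general idea; details differ from the paper)."""
    stylized_diff = stylized_curr - stylized_prev  # motion in the stylized video
    input_diff = input_curr - input_prev           # motion in the input video
    return float(np.mean((stylized_diff - input_diff) ** 2))

# When the stylized video moves exactly as the input video, the loss is zero.
rng = np.random.default_rng(0)
frame, shift = rng.random((4, 4, 3)), rng.random((4, 4, 3))
print(frame_difference_temporal_loss(frame, frame + shift, frame, frame + shift))  # → 0.0
```

Note that only pixel-wise frame differences are involved, which is why no optical flow estimation is required.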
Xiao Li, Jianmin Li, Ting Dai, Jie Shi, Jun Zhu, Xiaolin Hu, “Rethinking Natural Adversarial Examples for Classification Models,” arXiv:2102.1173, Feb 2021. How should natural adversarial examples be defined? We propose the ImageNet-A-Plus dataset, which is modified from ImageNet-A.

Gang Zhang, Ziyi Li, Chufeng Tang, Jianmin Li, Xiaolin Hu, “CEDNet: A cascade encoder-decoder network for dense prediction,” Pattern Recognition. (Accepted) Previous neural networks for object detection and image segmentation are usually built upon backbones such as ResNet that were originally designed for classification, a task that does not need high-resolution features. This mismatch calls for backbones designed specifically for dense prediction.
Zhi Cheng, Zhanhao Hu, Yuqiu Liu, Hang Su, Xiaolin Hu, “Full-distance evasion of pedestrian detectors in the physical world,” Advances in Neural Information Processing Systems (NeurIPS), Vancouver, Dec 10-15, 2024. Previous physical adversarial attacks only function at short distances; we propose a method to remedy this.
Ziqin Wang, Jiawei Gao, Zeqi Xiao, Jingbo Wang, Tai Wang, Jinkun Cao, Xiaolin Hu, Si Liu, Jifeng Dai, Jiangmiao Pang, “CooHOI: learning cooperative human-object interaction with manipulated object dynamics,” Advances in Neural Information Processing Systems (NeurIPS), Vancouver, Dec 10-15, 2024. (Spotlight)

Haoran He, Peilin Wu, Chenjia Bai, Hang Lai, Lingxiao Wang, Ling Pan, Xiaolin Hu, Weinan Zhang, “Bridging the Sim-to-Real Gap from the Information Bottleneck Perspective,” Conference on Robot Learning. (Oral)

Kai Li, Fenghua Xie, Hang Chen, Kexin Yuan, Xiaolin Hu, “An audio-visual speech separation model inspired by cortico-thalamo-cortical circuits,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 10, pp. 6637-6651, Oct 2024. A brain-inspired model for audio-visual speech separation, achieving state-of-the-art performance on this task.

Xiaopei Zhu, Peiyang Xu, Guanning Zeng, Yingpeng Dong, Xiaolin Hu, “Natural language induced adversarial images,” ACM Multimedia, Melbourne, Australia, Oct 28-Nov 1, 2024.

Xianghao Kong, Jinyu Chen, Wenguan Wang, Hang Su, Xiaolin Hu, Yi Yang, Si Liu, “Controllable navigation instruction generation with chain of thought prompting,” The 18th European Conference on Computer Vision (ECCV), MiCo Milano, Italy, Sep 29th-Oct 4th, 2024.

Xiao Li, Yining Liu, Na Dong, Sitian Qin, Xiaolin Hu, “PartImageNet++ dataset: scaling up part-based models for robust recognition,” The 18th European Conference on Computer Vision (ECCV), MiCo Milano, Italy, Sep 29th-Oct 4th, 2024. We propose a new dataset called PartImageNet++, providing high-quality part segmentation annotations for all categories of ImageNet-1K.

Kai Li, Runxuan Yang, Fuchun Sun, Xiaolin Hu, “IIANet: an intra- and inter-modality attention network for audio-visual speech separation,” The 41st International Conference on Machine Learning (ICML), Vienna, Austria, July 21-27, 2024. Inspired by cross-modal processing mechanisms in the brain, we design intra- and inter-attention modules that integrate auditory and visual information for efficient speech separation. The model simulates audio-visual fusion at different levels of the sensory cortical areas as well as in higher association areas such as the parietal cortex.

Xiao Li, Qiongxiu Li, Zhanhao Hu, and Xiaolin Hu, “On the privacy effect of data enhancement via the lens of memorization,” IEEE Transactions on Information Forensics and Security, vol. 19, pp. 4686-4699, 2024. We investigated several nonintuitive and seemingly contradictory conclusions about privacy, data augmentation and adversarial robustness.

Gang Zhang, Junnan Chen, Guohuan Gao, Jianmin Li, Si Liu, Xiaolin Hu, “SAFDNet: A simple and effective network for fully sparse 3D object detection,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, June 17-21, 2024. (Oral, 90 out of about 11,500 submissions)

Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu, Jianmin Li, Xiaolin Hu, “Infrared adversarial car stickers,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, June 17-21, 2024. We hide real cars from infrared car detectors.

Xiao Li, Wei Zhang, Yining Liu, Zhanhao Hu, Bo Zhang, Xiaolin Hu, “Language-driven anchors for zero-shot adversarial robustness,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, June 17-21, 2024.

Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, Xiaolin Hu, “Hiding from thermal imaging pedestrian detectors in the physical world,” Neurocomputing, vol. 564, article 126963, 2024. Extension of our AAAI 2021 paper (which used small bulbs).

Samuel Pegg, Kai Li, Xiaolin Hu, “RTFS-Net: recurrent time-frequency modelling for efficient audio-visual speech separation,” Proc. of the 12th International Conference on Learning Representations (ICLR), Vienna, Austria, May 7-11, 2024. The first time-frequency-domain audio-visual speech separation method to outperform all contemporary time-domain counterparts. It uses only 1/100 of the parameters of VisualVoice, one of the previous state-of-the-art methods.

Zhongfu Shen, Jiajun Yang, Qiangqiang Zhang, Kuiyu Wang, Xiaohui Lv, Xiaolin Hu, Jian Ma, Song-Hai Shi, “How variable progenitor clones construct a largely invariant neocortex,” National Science Review, vol. 11, no. 1, January 2024, nwad247.

Z. Cheng, Z. Deng, X. Hu, B. Zhang, T. Yang, “Efficient reinforcement learning of a reservoir network model of parametric working memory achieved with a cluster population winner-take-all readout mechanism,” Journal of Neurophysiology, vol. 114, no. 6, pp. 3296-3305, 2015. Reinforcement learning of a reservoir network modeling working memory in the monkey brain.

X. Li, S. Qian, F. Peng, J. Yang, X. Hu, and R. Xia, “Deep convolutional neural network and multi-view stacking ensemble in Ali mobile recommendation algorithm competition,” The First International Workshop on Mobile Data Mining & Human Mobility Computing (ICDM 2015). The team won the Ali competition, ranking 1st among 7,186 teams.

M. Liang, X. Hu, B. Zhang, “Convolutional neural networks with intra-layer recurrent connections for scene labeling,” Advances in Neural Information Processing Systems (NIPS), Montréal, Canada, Dec. 7-12, 2015. An application of the recurrent CNN. It achieves excellent performance on the Stanford Background and SIFT Flow datasets.

Y. Zhou, X. Hu, B. Zhang, “Interlinked convolutional neural networks for face parsing,” International Symposium on Neural Networks (ISNN), Jeju, Korea, Oct. 15-18, 2015, pp. 222-231. A two-stage pipeline is proposed for face parsing; both stages use iCNN, a set of CNNs interlinked in their convolutional layers.

M. Liang, X. Hu, “Recurrent convolutional neural network for object recognition,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, USA, June 7-12, 2015, pp. 3367-3375. (cuda-convnet2 configs used in the paper and a PyTorch version by Xiao Li are available.) Typical deep learning models for object recognition, such as HMAX and CNN, have feedforward architectures. This is a crude approximation of the visual pathway in the brain, since the visual cortex contains abundant recurrent connections. We show that adding recurrent connections to a CNN improves its performance in object recognition.

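The core mechanism can be sketched in a few lines (a single-channel simplification with hypothetical names, not the paper's implementation): the feedforward drive of a layer is computed once, and the layer state is then unrolled for a few time steps with an additional recurrent convolution, so the effective receptive field grows with each step.

```python
import numpy as np

def conv2d_same(x, w):
    """Minimal 'same'-padding 2D cross-correlation (a conv layer in the
    deep-learning sense) for a single-channel feature map."""
    k = w.shape[0] // 2
    xp = np.pad(x, k)
    out = np.zeros_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + w.shape[0], j:j + w.shape[1]] * w)
    return out

def recurrent_conv_layer(u, w_ff, w_rec, steps=3):
    """Recurrent conv layer: x_t = relu(conv(u, w_ff) + conv(x_{t-1}, w_rec))."""
    ff = conv2d_same(u, w_ff)   # feedforward drive, computed once
    x = np.maximum(ff, 0)       # initial state at t = 0
    for _ in range(steps):      # unrolled recurrent dynamics
        x = np.maximum(ff + conv2d_same(x, w_rec), 0)
    return x
```

With `w_rec` set to zero the layer reduces to an ordinary feedforward conv layer; with nonzero recurrent weights, each unrolling step lets context from a wider neighborhood influence every unit.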
X. Zhang, Q. Zhang, X. Hu, B. Zhang, “Neural representation of three-dimensional acoustic space in the human temporal lobe,” Frontiers in Human Neuroscience, vol. 9, article 203, 2015. doi: 10.3389/fnhum.2015.00203. Humans can localize sounds in the environment, but how locations are encoded in the cortex remains elusive. Using fMRI and machine learning techniques, we investigated how the human temporal cortex encodes three-dimensional acoustic space.

M. Liang, X. Hu, “Predicting eye fixations with higher-level visual features,” IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 1178-1189, 2015. There is a debate about whether low-level or high-level features are more important for predicting eye fixations. Through experiments, we show that mid-level and object-level features are indeed more effective for this task. We obtained state-of-the-art results on several benchmark datasets, including Toronto, MIT, Kootstra and ASCMN, at the time of submission.

M. Liang, X. Hu, “Feature selection in supervised saliency prediction,” IEEE Transactions on Cybernetics, vol. 45, no. 5, pp. 900-912, 2015. (The computed saliency maps are available for download.) There is a trend of incorporating more and more features for supervised learning of visual saliency on natural images. We find much redundancy among these features: a small subset of them leads to excellent performance on several benchmark datasets. Moreover, these features are robust across different datasets.

Q. Zhang, X. Hu, B. Zhang, “Comparison of L1-Norm SVR and Sparse Coding Algorithms for Linear Regression,” IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 8, pp. 1828-1833, 2015. The close connection between L1-norm support vector regression (SVR) and sparse coding (SC) is revealed, and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the L1-SVR algorithms in efficiency. The SC algorithms are then used to design RBF networks, which are more efficient than the well-known orthogonal least squares algorithm.
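Both model families ultimately minimize an L1-regularized least-squares objective. As a point of reference (a textbook ISTA sketch, not one of the specific algorithms compared in the paper), the shared problem min_x 0.5·||y − Dx||² + λ·||x||₁ can be solved by iterative soft-thresholding:

```python
import numpy as np

def ista(D, y, lam, n_iter=500):
    """Iterative shrinkage-thresholding for 0.5*||y - D @ x||**2 + lam*||x||_1."""
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = x - D.T @ (D @ x - y) / L          # gradient step on the smooth part
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding
    return x

# With D = I the solution is just soft-thresholding of y.
print(ista(np.eye(2), np.array([3.0, 0.1]), lam=1.0))  # → [2. 0.]
```

Note how the L1 term produces exact zeros: the second coefficient is pruned entirely, which is the sparsity that both SC and L1-SVR exploit.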
T. Shi, M. Liang, X. Hu, “A reverse hierarchy model for predicting eye fixations,” Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, USA, June 24-27, 2014, pp. 2822-2829. We present a novel approach for saliency detection in natural images. The idea comes from reverse hierarchy theory in cognitive neuroscience, which proposes that attention propagates from the top level of the visual hierarchy to the bottom level.

X. Hu, J. Zhang, J. Li, B. Zhang, “Sparsity-regularized HMAX for visual recognition,” PLOS ONE, vol. 9, no. 1, e81813, 2014. We show that a deep learning model alternating sparse coding/ICA with local max pooling can learn higher-level features from unlabeled images. After training on a dataset of 1500 images containing 150 unaligned faces, 6 units on the top layer became face detectors. Training took a few hours on a 2-core laptop, in contrast to the 16,000 cores Google used in a similar project.

X. Hu, J. Zhang, P. Qi, B. Zhang, “Modeling response properties of V2 neurons using a hierarchical K-means model,” Neurocomputing, vol. 134, pp. 198-205, 2014. We show that the simple data-clustering algorithm K-means can model some properties of V2 neurons when stacked into a hierarchical structure. This is more biologically plausible than the sparse DBN for the same purpose because it can be realized by competitive Hebbian learning. This is an extended version of our ICONIP'12 paper.

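A minimal sketch of the stacking idea (layer sizes, pooling scheme, and names are illustrative assumptions, not the paper's configuration): run K-means on input patches, encode each patch by its nearest centroid, max-pool the codes locally, and cluster the pooled responses again to obtain second-layer units.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns the centroid matrix (k, n_features)."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):            # skip empty clusters
                C[j] = X[labels == j].mean(0)
    return C

def encode(X, C):
    """One-hot hard assignment of each row of X to its nearest centroid."""
    d = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
    onehot = np.zeros_like(d)
    onehot[np.arange(len(X)), d.argmin(1)] = 1.0
    return onehot

# Two-layer hierarchy: cluster patches, encode, max-pool pairs, cluster again.
rng = np.random.default_rng(1)
patches = rng.random((40, 8))                  # 40 toy image patches
C1 = kmeans(patches, 6)                        # layer-1 units
codes = encode(patches, C1)                    # layer-1 responses
pooled = np.maximum(codes[0::2], codes[1::2])  # local max pooling
C2 = kmeans(pooled, 3)                         # layer-2 units on pooled codes
```

The hard winner-take-all assignment in `encode` is what makes the scheme compatible with competitive Hebbian learning: only the winning unit's response (and, during learning, its weights) is updated.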
P. Qi, X. Hu, “Learning nonlinear statistical regularities in natural images by modeling the outer product of image intensities,” Neural Computation, vol. 26, no. 4, pp. 693–711, 2014. This is a hierarchical model aimed at modeling the properties of complex cells in the primary visual cortex (V1). It can be regarded as a simplified version of Karklin and Lewicki's model published in 2009.
X. Hu and J. Wang, “Solving the k-winners-take-all problem and the oligopoly Cournot-Nash equilibrium problem using the general projection neural networks,” Proc. of the 14th International Conference on Neural Information Processing (ICONIP), Kitakyushu, Japan, Nov. 13-16, 2007, pp. 703-712.

S. Liu, X. Hu and J. Wang, “Obstacle Avoidance for Kinematically Redundant Manipulators Based on an Improved Problem Formulation and the Simplified Dual Neural Network,” Proc. of the IEEE Three-Rivers Workshop on Soft Computing in Industrial Applications, Passau, Bavaria, Germany, August 1-3, 2007, pp. 67-72.

X. Hu and J. Wang, “Convergence of a recurrent neural network for nonconvex optimization based on an augmented Lagrangian function,” Proc. of the 4th International Symposium on Neural Networks, Part III, Nanjing, China, June 3-7, 2007.

X. Hu and J. Wang, “Solving extended linear programming problems using a class of recurrent neural networks,” Proc. of the 13th International Conference on Neural Information Processing, Part II, Hong Kong, Oct. 3-6, 2006.