Publications

Preprint


Jianjin Xu, Zheyang Xiong, Xiaolin Hu, “Frame difference-based temporal loss for video stylization,” arXiv:2102.05822, 2021.

A simple loss that does not need the time-consuming estimation of optical flow.

Source codes

FDB loss

Xiao Li, Jianmin Li, Ting Dai, Jie Shi, Jun Zhu, Xiaolin Hu, “Rethinking Natural Adversarial Examples for Classification Models,” arXiv:2102.1173.

How should we define the natural adversarial examples?

We propose the ImageNet-A-Plus dataset, which is modified from ImageNet-A.

ImageNet-A+

Haoran Chen, Jianmin Li, Simone Frintrop, and Xiaolin Hu, “Annotation cleaning for the MSR-Video to Text dataset,” arXiv:2102.06448.

After cleaning the annotations, the performance of existing models consistently improves.

 

MSR-VTT

Xiaolin Hu, Zhigang Zeng, “Bridging the functional and wiring properties of V1 neurons through sparse coding,” Neural Computation. (Accepted)

A standard excitatory-inhibitory neural network shows numerous functional and wiring properties of neurons in layer 2/3 of V1 after unsupervised learning on natural images. Many properties are predictions yet to be verified in biological experiments. One interesting property is the small-worldness.

Source codes

small world

Shangqi Guo, Qi Yan, Xin Su, Xiaolin Hu, Feng Chen, “State-temporal compression in reinforcement learning with the reward-restricted geodesic metric,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

 

state-temporal compression

Jianfeng Wang, Xiaolin Hu, “Convolutional neural networks with gated recurrent connections,” IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021.

Extension of a previous work. We demonstrate the good performance of the Gated RCNN on image classification and object detection.

Source codes

Gated RCNN

 


2021


Ge Gao, Mikko Lauri, Xiaolin Hu, Jianwei Zhang, Simone Frintrop, “CloudAAE: learning 6D object pose regression with on-line data synthesis on point clouds,” IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, May 30 to June 5, 2021.

arXiv version

Source codes

AAE

Gang Zhang, Xin Lu, Jingru Tan, Jianmin Li, Zhaoxiang Zhang, Quanquan Li, Xiaolin Hu, “RefineMask: Towards high-quality instance segmentation with fine-grained features,” Proc. of the 34th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June 19-25, 2021.

arXiv version

A coarse-to-fine strategy.

Source codes

refineMask

Chufeng Tang, Hang Chen, Xiao Li, Jianmin Li, Zhaoxiang Zhang, Xiaolin Hu, “Look closer to segment better: boundary patch refinement for instance segmentation,” Proc. of the 34th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June 19-25, 2021.

arXiv version

A post-processing model applicable to any instance segmentation method. We ranked 1st on the Cityscapes leaderboard as of the CVPR 2021 submission deadline.

Source codes

BPR

Jianfeng Wang, Thomas Lukasiewicz, Xiaolin Hu, Jianfei Cai, Zhenghua Xu, “RSG: A simple yet effective module for learning imbalanced datasets,” Proc. of the 34th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June 19-25, 2021.

Source codes

AI

Xiang Li, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang, “Generalized Focal Loss V2: learning reliable localization quality estimation for dense object detection,” Proc. of the 34th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, June 19-25, 2021.

arXiv version

An extension of the GFL in our NeurIPS 2020 paper.

Source codes

GFLv2

Weiyi Zhang, Shuning Zhao, Le Liu, Jianmin Li, Xingliang Cheng, Thomas Fang Zheng, Xiaolin Hu, “Attack on practical speaker verification system using universal adversarial perturbations,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Virtual, June 6-11, 2021.

A physical attack on speaker verification systems.

Source codes

speaker verification

Xiaopei Zhu, Xiao Li, Jianmin Li, Zheyao Wang, Xiaolin Hu, “Fooling thermal infrared pedestrian detectors in real world using small bulbs,” The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI), Virtual, Feb 2-9, 2021.

If you hold a piece of cardboard embedded with the small bulbs we designed, YOLOv3 will not detect you.

Source codes

small bulbs

Han Liu, Shifeng Zhang, Ke Lin, Jing Wen, Jianmin Li, Xiaolin Hu, “Vocabulary-wide credit assignment for training image captioning models,” IEEE Transactions on Image Processing, vol. 30, pp. 2450-2460, 2021.

At each generation step, we assign a reward to every word in the vocabulary.

Source codes

credit assignment

Zi Yin, Valentin Yiu, Xiaolin Hu, Liang Tang, “End-to-end face parsing via interlinked convolutional neural networks,” Cognitive Neurodynamics, vol. 15, pp. 169-179, 2021.

Extension of a previous work for face parsing.

Source codes

face parsing

 


2020


Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen, “Generating adjacency-constrained subgoals in hierarchical reinforcement learning,” Advances in Neural Information Processing Systems (NeurIPS), Dec 6-12, 2020.

Spotlight paper.

A method for reducing the high-level action space for hierarchical reinforcement learning.

Supplementary Materials

Source codes

pop song structure

Xiang Li, Wenhai Wang, Lijun Wu, Shuo Chen, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang, “Generalized focal loss: learning qualified and distributed bounding boxes for dense object detection,” Advances in Neural Information Processing Systems (NeurIPS), Dec 6-12, 2020.

We propose a joint representation of localization quality and classification for object detection methods.

Source codes

GFL

Weilun Chen, Zhaoxiang Zhang, Xiaolin Hu, Baoyuan Wu, “Boosting decision-based black-box adversarial attacks with random sign flip,” European Conference on Computer Vision (ECCV), pp. 276-293, Springer, Cham, 2020.

 

GFL

Jian Wu, Xiaoguang Liu, Xiaolin Hu, Jun Zhu, “PopMNet: generating structured pop music melodies using neural networks,” Artificial Intelligence, vol. 286, article 103303, 2020.

Generate the structure of a song first, then generate the melody.

Project page

pop song structure

Yulong Wang, Hang Su, Bo Zhang, Xiaolin Hu, “Learning reliable visual saliency for model explanations,” IEEE Transactions on Multimedia, vol. 22, no. 7, pp. 1796-1807, 2020.

If you feed an image of a dog into a deep neural network and use an existing method to highlight the dog region by setting the output label to "dog", the result is fine. But if you set the output label to "cat", you will get some weird results.
reliable explanation

Yulong Wang, Hang Su, Bo Zhang, Xiaolin Hu, “Interpret neural networks by extracting critical subnetworks,” IEEE Transactions on Image Processing, vol. 29, pp. 6707-6720, 2020.

Extension of (Wang et al. CVPR 2018). We extend the idea of critical routes for individual image samples to image categories.

 

melody

Jian Wu, Changran Hu, Yulong Wang, Xiaolin Hu, Jun Zhu, “A hierarchical recurrent neural network for symbolic melody generation,” IEEE Transactions on Cybernetics, vol. 50, no. 6, pp. 2749-2757, 2020.

arXiv:1712.05274

Automatic melody generation

All melodies used in experiments are available

melody

Jianqiao Guo, Yajun Yin, Xiaolin Hu, Gexue Ren, “Self-similar network model for fractional-order neuronal spiking: implications of dendritic spine functions,” Nonlinear Dynamics, vol. 100, pp. 921-935, 2020.

fractional-order

Haoran Chen, Jianmin Li, Xiaolin Hu, “Delving deeper into the decoder for video captioning,” The 24th European Conference on Artificial Intelligence (ECAI), Santiago de Compostela, Spain, August 29-September 2, 2020.

With a few techniques we boost the state-of-the-art results on video captioning benchmark datasets.

Source codes

video captioning

Ge Gao, Mikko Lauri, Yulong Wang, Xiaolin Hu, Jianwei Zhang, Simone Frintrop, “6D object pose regression via supervised learning on point clouds,” IEEE International Conference on Robotics and Automation (ICRA), Paris, France, May 31 to June 4, 2020.

Source codes

point cloud

Qiushan Guo, Xinjiang Wang, Yichao Wu, Zhipeng Yu, Ding Liang, Xiaolin Hu and Ping Luo, “Online knowledge distillation via collaborative learning,” Proc. of the 33rd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, June 16-18, 2020.

knowledge distillation

Yudong Wu, Yichao Wu, Ruihao Gong, Yuanhao Lv, Ken Chen, Ding Liang, Xiaolin Hu, Xianglong Liu and Junjie Yan, “Rotation consistent margin loss for efficient low-bit face recognition,” Proc. of the 33rd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, USA, June 16-18, 2020.

open set

Yulong Wang, Xiaolu Zhang, Xiaolin Hu, Bo Zhang, Hang Su, “Dynamic network pruning with interpretable layerwise channel selection,” The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), New York, USA, Feb 7-12, 2020.

Source codes

dynamic pruning

Yulong Wang, Xiaolu Zhang, Lingxi Xie, Jun Zhou, Hang Su, Bo Zhang, Xiaolin Hu, “Pruning from scratch,” The Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI), New York, USA, Feb 7-12, 2020.

Supplementary material

arXiv:1909.12579v1

We find that pre-training an over-parameterized model is not necessary for obtaining the target pruned structure. One can prune the model with its random initial weights.

Source codes

pruning-from-scratch

Xiang Li, Jun Li, Xiaolin Hu, Jian Yang, “Line-CNN: end-to-end traffic line detection with line proposal unit,” IEEE Transactions on Intelligent Transportation Systems, vol. 21, no. 1, pp. 248-258, 2020.

An end-to-end model that detects traffic lines at 30 frames per second on a Titan X GPU. It is potentially useful for autonomous driving systems.

lineCNN

 


2019


Fangzhou Liao, Ming Liang, Zhe Li, Xiaolin Hu, Sen Song, “Evaluate the malignancy of pulmonary nodules using the 3-D deep leaky noisy-or network,” IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 11, pp. 3484-3495, 2019.

arXiv:1711.08324

The winning solution to the Kaggle Data Science Bowl 2017. A 500,000 US dollar solution!

Source codes

lung

Chufeng Tang, Lu Sheng, Zhaoxiang Zhang, Xiaolin Hu, “Improving pedestrian attribute recognition with weakly-supervised multi-scale attribute-specific localization,” Proc. of IEEE International Conference on Computer Vision (ICCV), Seoul, Korea, Oct 27–Nov 2, 2019. pp. 4997-5006.

Supplementary Materials

Source codes

 

pedestrian detection

Xiao Jin, Baoyun Peng, Yichao Wu, Yu Liu, Jiaheng Liu, Ding Liang, Junjie Yan, Xiaolin Hu, “Knowledge distillation via route constrained optimization,” Proc. of IEEE International Conference on Computer Vision (ICCV), Seoul, Korea, Oct 27–Nov 2, 2019. pp. 1345-1354.

Oral paper.

A new knowledge distillation method for training a small neural network.

 

knowledge distillation

Xiang Li, Shuo Chen, Xiaolin Hu, Jian Yang, “Understanding the Disharmony Between Dropout and Batch Normalization by Variance Shift,” Proc. of the 32nd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, June 15–21, 2019.

This paper explains why the combination of Dropout and Batch Normalization (BN) often leads to worse performance in many modern neural networks.

 

BN-dropout

Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang, “Selective Kernel Networks,” Proc. of the 32nd IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, USA, June 15–21, 2019.

A neural network that performs better than ResNet, ResNeXt, SENet etc. for image classification.

Source codes

SKN

Niange Yu, Xiaolin Hu, Binheng Song, Jian Yang, Jianwei Zhang, “Topic-oriented image captioning based on order-embedding,” IEEE Transactions on Image Processing, vol. 28, no. 6, pp. 2743-2754, 2019.

Generate captions for images from different perspectives.

Source codes

image captioning

Shangqi Guo, Zhaofei Yu, Fei Deng, Xiaolin Hu, Feng Chen, “Hierarchical Bayesian inference and learning in spiking neural networks,” IEEE Transactions on Cybernetics, vol. 49, no. 1, pp. 133-145, 2019.

Spiking neural networks for Bayesian inference.

WTA network

Fangzhou Liao, Xi Chen, Xiaolin Hu, Sen Song, “Estimation of the volume of the left ventricle from MRI images using deep neural networks,” IEEE Transactions on Cybernetics, vol. 49, no. 2, pp. 495-504, 2019.

This algorithm took 4th place in the Kaggle Data Science Bowl 2016.

Source codes

heart network

Qingtian Zhang, Xiaolin Hu, Bo Hong, Bo Zhang, “A hierarchical sparse coding model predicts acoustic feature encoding in both auditory midbrain and cortex,” PLOS Computational Biology, 15(2): e1006766, 2019.

We used a hierarchical sparse coding model to reveal the acoustic feature encoding mechanisms of the auditory system. Interestingly, for example, the artificial neurons in the top layers exhibited phonetic feature encoding properties. We found that response sparseness plays an important role in the emergence of these properties.

Source codes

phoneme-encoding

Wei Feng, Wentao Liu, Tong Li, Jing Peng, Chen Qian, Xiaolin Hu, “Turbo learning framework for human-object interactions recognition and human pose estimation,” The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), Honolulu, Hawaii, USA, Jan 27-Feb 1, 2019.

Learn two tasks simultaneously, which help each other iteratively.

turbo learning

 


2018


Yi Zhang, Weichao Qiu, Qi Chen, Xiaolin Hu, Alan Yuille, “UnrealStereo: controlling hazardous factors to analyze stereo vision,” Proc. of the International Conference on 3D Vision, Verona, Italy, September 5-8, 2018.

A synthetic image generation tool that allows control of hazardous factors, such as making objects more specular or transparent, for developing 3D vision algorithms.

denoiser

Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, Jun Zhu, “Defense against adversarial attacks using high-level representation guided denoiser,” Proc. of the 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, June 18-22, 2018.

Winning solution of the NIPS 2017 Competition on Adversarial Attacks and Defenses organized by Google Brain.

Source codes1

Source codes2

denoiser

Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, Jianguo Li, “Boosting adversarial attacks with momentum,” Proc. of the 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, June 18-22, 2018.

Spotlight paper.

Winning solution of the NIPS 2017 Competition on Adversarial Attacks and Defenses organized by Google Brain.

Source codes for non-targeted attack

Source codes for targeted attack

adversarial examples

Yulong Wang, Hang Su, Bo Zhang, Xiaolin Hu, “Interpret neural networks by identifying critical data routing paths,” Proc. of the 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, June 18-22, 2018.

We found that images with similar semantic meanings have similar critical routes in deep CNNs.

Source codes

routes

Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, Xiaolin Hu, “High Performance Visual Tracking with Siamese Region Proposal Network,” Proc. of the 31st IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, USA, June 18-22, 2018.

siamese

Wentao Liu, Jie Chen, Cheng Li, Chen Qian, Xiao Chu, Xiaolin Hu, “A cascaded inception of inception network with attention modulated feature fusion for human pose estimation,” The Thirty-Second AAAI Conference on Artificial Intelligence (AAAI), New Orleans, USA, Feb 2-7, 2018.

Erratum

Three techniques for human pose estimation: 1. inception of inception block, 2. attention to individual levels, 3. cascaded network.

pose

 


2017


Chengxu Zhuang, Yulong Wang, Daniel Yamins, Xiaolin Hu, “Deep learning predicts correlation between a functional signature of higher visual areas and sparse firing of neurons,” Frontiers in Computational Neuroscience, 2017. doi: 10.3389/fncom.2017.00100

Study the visual system using deep learning models.

Dataset used in the paper

tornado

Jianfeng Wang, Xiaolin Hu, “Gated recurrent convolution neural network for OCR,” Advances in Neural Information Processing Systems (NIPS), Long Beach, USA, Dec. 4-9, 2017.

A modified version of our RCNN proposed in 2015.

Source codes

GRCNN

Zekun Hao, Yu Liu, Hongwei Qin, Junjie Yan, Xiu Li, Xiaolin Hu, “Scale-aware face detection,” Proc. of the 30th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, USA, July 21–26, 2017.

Prior to face detection, use a CNN to predict the scale distribution of the faces.

hierarchy

Tiancheng Sun, Yulong Wang, Jian Yang, Xiaolin Hu, “Convolution neural networks with two pathways for image style recognition,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4102-4113, 2017.

The Gram matrix technique proposed by Gatys et al. is used to classify image styles. Experiments are conducted on three benchmark datasets: WikiPaintings, Flickr Style and AVA Style.

Source codes

art

J. Wu, L. Ma, X. Hu, “Delving deeper into convolutional neural networks for camera relocalization,” Proc. of IEEE International Conference on Robotics and Automation (ICRA), Singapore, May 29-June 3, 2017.

We present three techniques for enhancing the performance of convolutional neural networks for camera relocalization.

branchnet

F. Liao, X. Hu, S. Song, “Emergence of V1 recurrent connectivity pattern in artificial neural network,” Computational and Systems Neuroscience (Cosyne), Salt Lake City, Feb. 23-26, 2017.

 

 

ai

Y. Zhao, X. Jin, X. Hu, “Recurrent convolutional neural network for speech processing,” Proc. of the 42nd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, USA, March 5-9, 2017.

Applications of recurrent CNN to speech processing.

Source codes

speech

 


2016


Q. Zhang, X. Hu, H. Luo, J. Li, X. Zhang, B. Zhang, “Deciphering phonemes from syllables in blood oxygenation level-dependent signals in human superior temporal gyrus,” European Journal of Neuroscience, vol. 43, no. 6, pp. 773-781, 2016.

This is a "mind reading" work. We managed to decode the phoneme information from functional magnetic resonance imaging (fMRI) signals of subjects when they listened to nine syllables. The results indicated that phonemes have unique representations in the superior temporal gyrus (STG). We also revealed certain response patterns of the phonemes in STG.

mind reading

H. Qin, J. Yan, X. Li, X. Hu, “Joint Training of Cascaded CNN for Face Detection,” Proc. of the 29th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, USA, June 26-July 1, 2016, pp. 3456-3465.

 

face detection

S. Wang, Y. Yang, X. Hu, J. Li, B. Xu, “Solving the K-shortest paths problem in timetable-based public transportation systems,” Journal of Intelligent Transportation Systems: Technology, Planning, and Operations, vol. 20, no. 5, pp. 413-427, 2016.

An extended version of the IMECS 2012 paper.

railway

 


2015


Z. Cheng, Z. Deng, X. Hu, B. Zhang, T. Yang, “Efficient reinforcement learning of a reservoir network model of parametric working memory achieved with a cluster population winner-take-all readout mechanism,” Journal of Neurophysiology, vol. 114, no. 6, pp. 3296-3305, 2015.

Learning of a reservoir network that models working memory in the monkey brain.

reservoir network

X. Li, S. Qian, F. Peng, J. Yang, X. Hu, and R. Xia, “Deep convolutional neural network and multi-view stacking ensemble in Ali mobile recommendation algorithm competition,” The First International Workshop on Mobile Data Mining & Human Mobility Computing (ICDM 2015).

The team won the Ali competition, ranking 1st among 7186 teams.

tianchi competition

M. Liang, X. Hu, B. Zhang, “Convolutional neural networks with intra-layer recurrent connections for scene labeling,” Advances in Neural Information Processing Systems (NIPS), Montréal, Canada, Dec. 7-12, 2015.

caffe configs

An application of the recurrent CNN. It achieves excellent performance on the Stanford Background and SIFT Flow datasets.

ai

Y. Zhou, X. Hu, B. Zhang, “Interlinked convolutional neural networks for face parsing,” International Symposium on Neural Networks (ISNN), Jeju, Korea, Oct. 15-18, 2015, pp. 222-231.

A two-stage pipeline is proposed for face parsing and both stages use iCNN, which is a set of CNNs with interlinkage in the convolutional layers.

Source codes

iCNN

M. Liang, X. Hu, “Recurrent convolutional neural network for object recognition,” Proc. of the 28th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, USA, June 7-12, 2015, pp. 3367-3375.

cuda-convnet2 configs (used in the paper)

caffe configs

torch version

pytorch version (by Xiao Li)

Typical deep learning models for object recognition, including HMAX and CNN, have feedforward architectures. This is a crude approximation of the visual pathway in the brain, since there are abundant recurrent connections in the visual cortex. We show that adding recurrent connections to CNN improves its performance in object recognition.

RCNN

X. Zhang, Q. Zhang, X. Hu, B. Zhang, “Neural representation of three-dimensional acoustic space in the human temporal lobe,” Frontiers in Human Neuroscience, vol. 9, article 203, 2015. doi: 10.3389/fnhum.2015.00203

Humans are able to localize the sounds in the environment. How the locations are encoded in the cortex remains elusive. Using fMRI and machine learning techniques, we investigated how the temporal cortex of humans encodes the 3D acoustic space.

fMRI

M. Liang, X. Hu, “Predicting eye fixations with higher-level visual features,” IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 1178-1189, 2015.

codes

There is a debate about whether low-level features or high-level features are more important for predicting eye fixations. Through experiments, we show that mid-level features and object-level features are indeed more effective for this task. We obtained state-of-the-art results on several benchmark datasets including Toronto, MIT, Kootstra and ASCMN at the time of submission.

saliency

M. Liang, X. Hu, “Feature selection in supervised saliency prediction,” IEEE Transactions on Cybernetics, vol. 45, no. 5, pp. 900-912, 2015.

(Download the computed saliency maps here)

There is a trend for incorporating more and more features for supervised learning of visual saliency on natural images. We find much redundancy among these features by showing that a small subset of features leads to excellent performance on several benchmark datasets. In addition, these features are robust across different datasets.

saliency

Q. Zhang, X. Hu, B. Zhang, “Comparison of L1-Norm SVR and Sparse Coding Algorithms for Linear Regression,” IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 8, pp. 1828-1833, 2015.

MATLAB codes

The close connection between the L1-norm support vector regression (SVR) and sparse coding (SC) is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the L1-SVR algorithms in efficiency. The SC algorithms are then used to design RBF networks, which are more efficient than the well-known orthogonal least squares algorithm.

RBF

 


2014


T. Shi, M. Liang, X. Hu, “A reverse hierarchy model for predicting eye fixations,” Proc. of the 27th IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Columbus, USA, June 24-27, 2014, pp. 2822-2829.

We present a novel approach for saliency detection in natural images. The idea is from a theory in cognitive neuroscience, called reverse hierarchy theory, which proposes that attention propagates from the top level of the visual hierarchy to the bottom level.

rhm

X. Hu, J. Zhang, J. Li, B. Zhang, “Sparsity-regularized HMAX for visual recognition,” PLOS ONE, vol. 9, no. 1, e81813, 2014.

MATLAB codes

We show that a deep learning model with alternating sparse coding/ICA and local max pooling can learn higher-level features on images without labels. After training on a dataset with 1500 images, in which there were 150 unaligned faces, 6 units on the top layer became face detectors. This took a few hours on a laptop computer with 2 cores, in contrast to Google's 16,000 cores in a similar project.

sparse hmax

X. Hu, J. Zhang, P. Qi, B. Zhang, “Modeling response properties of V2 neurons using a hierarchical K-means model,” Neurocomputing, vol. 134, pp. 198-205, 2014.

We show that the simple data clustering algorithm K-means can be used to model some properties of V2 neurons when stacked into a hierarchical structure. It is more biologically feasible than the sparse DBN for this purpose because it can be realized by competitive Hebbian learning. This is an extended version of our ICONIP'12 paper.

kmeans

P. Qi, X. Hu, “Learning nonlinear statistical regularities in natural images by modeling the outer product of image intensities,” Neural Computation, vol. 26, no. 4, pp. 693–711, 2014.

MATLAB codes

This is a hierarchical model aimed at modeling the properties of complex cells in the primary visual cortex (V1). It can be regarded as a simplified version of Karklin and Lewicki's model published in 2009.

outerproduct

 


2013


P. Qi, S. Su, X. Hu, “Modeling outer products of features for image classification,” Proc. of the 6th International Conference on Advanced Computational Intelligence (ICACI), Hangzhou, China, Oct. 19-21, 2013, pp. 334-338.

The method described in our 2014 Neural Computation paper was applied on SIFT features for image classification (in the SPM framework), which achieved higher accuracy on two datasets than traditional sparse coding.

ai

M. Liang, M. Yuan, X. Hu, J. Li and H. Liu, “Traffic sign detection by ROI extraction and histogram features-based recognition,” Proc. of the 2013 International Joint Conference on Neural Networks (IJCNN), Dallas, USA, Aug. 4-9, 2013, pp. 739-746.

The paper describes our method used for the IJCNN 2013 German Traffic Sign Detection Competition. This method achieved 100% accuracy on the Prohibitory signs!

traffic sign

Y. Wu, Y. Liu, J. Li, H. Liu, X. Hu, “Traffic sign detection based on convolutional neural networks,” Proc. of the 2013 International Joint Conference on Neural Networks (IJCNN), Dallas, USA, Aug. 4-9, 2013, pp. 747-753.

The paper describes another method used for the IJCNN 2013 German Traffic Sign Detection Competition. This method ranked 2nd and 4th on the Mandatory and Danger signs, respectively!

traffic sign

 


2012


Y. Yang, Q. He, X. Hu, “A compact neural network for training support vector machines,” Neurocomputing, vol. 86, pp. 193-198, 2012.

A simple analog circuit is proposed for training SVMs. It takes advantage of the nonlinear properties of operational amplifiers.

svm

X. Hu and J. Wang, “Solving the assignment problem using continuous-time and discrete-time improved dual networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 23, no. 5, pp. 821-827, 2012.

Assign n entities to n slots and each assignment has a cost.

assignment

X. Hu, P. Qi, B. Zhang, “Hierarchical K-means algorithm for modeling visual area V2 neurons,” Proc. of 19th International Conference on Neural Information Processing (ICONIP), Doha, Qatar, Nov. 12-15, 2012, pp. 373-381.

An extended version is in our 2014 Neurocomputing paper.

ai

Y. Yang, S. Wang, X. Hu, J. Li, B. Xu, “A modified k-shortest paths algorithm for solving the earliest arrival problem on the time-dependent model of transportation systems,” Proc. of International MultiConference of Engineers and Computer Scientists (IMECS), Hong Kong, March 14-16, 2012, pp. 1560-1567.

If one wants to travel from city A to city B by train and arrive at B as early as possible, could you provide some "good" itineraries? Here is a fast solution. It gives the K best solutions for any cities A and B in mainland China within 30 ms on a small server when K<100.

railway

2011

X. Hu, J. Wang, “Solving the assignment problem with the improved dual neural network,” Proc. of 8th International Symposium on Neural Networks, Guilin, China, May 29-June 1, 2011, pp. 547-556.

2010

X. Hu and B. Zhang, “A Gaussian attractor network for memory and recognition with experience-dependent learning,” Neural Computation, vol. 22, no. 5, pp. 1333-1357, 2010.

X. Hu, C. Sun and B. Zhang, “Design of recurrent neural networks for solving constrained least absolute deviation problems,” IEEE Transactions on Neural Networks, vol. 21, no. 7, pp. 1073-1086, July 2010.

X. Hu, “Dynamic system methods for solving mixed linear matrix inequalities and linear vector inequalities and equalities,” Applied Mathematics and Computation, vol. 216, pp. 1181-1193, 2010.

2009

X. Hu and B. Zhang, “An alternative recurrent neural network for solving variational inequalities and related optimization problems,” IEEE Transactions on Systems, Man and Cybernetics - Part B, vol. 39, no. 6, pp. 1640-1645, Dec. 2009.

X. Hu and B. Zhang, “A new recurrent neural network for solving convex quadratic programming problems with an application to the k-winners-take-all problem,” IEEE Transactions on Neural Networks, vol. 20, no. 4, pp. 654–664, April 2009.

X. Hu, “Applications of the general projection neural network in solving extended linear-quadratic programming problems with linear constraints,” Neurocomputing, vol. 72, no. 4-6, pp. 1131-1137, Jan. 2009.

X. Hu, J. Wang and B. Zhang, “Motion planning with obstacle avoidance for kinematically redundant manipulators based on two recurrent neural networks,” Proc. of IEEE International Conference on Systems, Man, and Cybernetics, San Antonio, USA, Oct. 2009, pp. 143-148.

X. Hu, B. Zhang, “Another simple recurrent neural network for quadratic and linear programming,” Proc. of 6th International Symposium on Neural Networks, Wuhan, China, May 26-29, 2009, pp. 116-125.

2008

X. Hu and J. Wang, “An improved dual neural network for solving a class of quadratic programming problems and its k-winners-take-all application,” IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2022–2031, Dec. 2008.

X. Hu, Z. Zeng, B. Zhang, “Three global exponential convergence results of the GPNN for solving generalized linear variational inequalities,” Proc. of 5th International Symposium on Neural Networks, Beijing, China, Sep. 24-28, 2008.

2007

X. Hu and J. Wang, “Design of general projection neural networks for solving monotone linear variational inequalities and linear and quadratic optimization problems,” IEEE Transactions on Systems, Man and Cybernetics - Part B, vol. 37, no. 5, pp. 1414-1421, Oct. 2007.

X. Hu and J. Wang, “Solving generally constrained generalized linear variational inequalities using the general projection neural networks,” IEEE Transactions on Neural Networks, vol. 18, no. 6, pp. 1697-1708, Nov. 2007.

X. Hu and J. Wang, “A recurrent neural network for solving a class of general variational inequalities,” IEEE Transactions on Systems, Man and Cybernetics - Part B, vol. 37, no. 3, pp. 528-539, 2007.

X. Hu and J. Wang, “Solving the k-winners-take-all problem and the oligopoly Cournot-Nash equilibrium problem using the general projection neural networks,” Proc. of 14th International Conference on Neural Information Processing (ICONIP), Kitakyushu, Japan, Nov. 13-16, 2007, pp. 703-712.

S. Liu, X. Hu and J. Wang, “Obstacle Avoidance for Kinematically Redundant Manipulators Based on an Improved Problem Formulation and the Simplified Dual Neural Network,” Proc. of IEEE Three-Rivers Workshop on Soft Computing in Industrial Applications, Passau, Bavaria, Germany, August 1-3, 2007, pp. 67-72.

X. Hu and J. Wang, “Convergence of a recurrent neural network for nonconvex optimization based on an augmented Lagrangian function,” Proc. of 4th International Symposium on Neural Networks, Part III, Nanjing, China, June 3-7, 2007.

2006

X. Hu and J. Wang, “Solving pseudomonotone variational inequalities and pseudoconvex optimization problems using the projection neural network,” IEEE Transactions on Neural Networks, vol. 17, no. 6, pp. 1487-1499, 2006.

X. Hu and J. Wang, “Solving extended linear programming problems using a class of recurrent neural networks,” Proc. of 13th International Conference on Neural Information Processing, Part II, Hong Kong, Oct. 3-6, 2006.

 



© 2021 Xiaolin Hu. All rights reserved.