Computational Visual Media  2020, Vol. 6 Issue (4): 467-476    doi: 10.1007/s41095-020-0181-9
Research Article     
Kernel-blending connection approximated by a neural network for image classification
Xinxin Liu1, Yunfeng Zhang1(✉), Fangxun Bao2, Kai Shao1, Ziyi Sun1, Caiming Zhang1,2
1 Shandong University of Finance and Economics, Jinan 250014, China
2 Shandong University, Jinan 250100, China

Abstract  

This paper proposes a kernel-blending connection approximated by a neural network (KBNN) for image classification. A kernel mapping connection structure, guaranteed by the function approximation theorem, is devised to blend feature extraction and feature classification through neural network learning. First, a feature extractor learns features from the raw images. Next, an automatically constructed kernel mapping connection maps the feature vectors into a feature space. Finally, a linear classifier is used as an output layer of the neural network to provide classification results. Furthermore, a novel loss function involving a cross-entropy loss and a hinge loss is proposed to improve the generalizability of the neural network. Experimental results on three well-known image datasets illustrate that the proposed method has good classification accuracy and generalizability.
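The loss described in the abstract combines a cross-entropy term with a hinge term. The sketch below is an illustrative NumPy implementation of such a blend, not the authors' code: the multiclass hinge formulation (Crammer-Singer style) and the weighting parameter C (modeled on the penalty parameter studied in Fig. 3) are assumptions.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-shift for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def blended_loss(logits, labels, C=1.0):
    """Cross-entropy plus a C-weighted multiclass hinge loss (illustrative).

    logits: (n, k) raw class scores; labels: (n,) integer class ids.
    """
    n = logits.shape[0]
    probs = softmax(logits)
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()

    # Multiclass hinge: margin of 1 over the highest-scoring wrong class.
    correct = logits[np.arange(n), labels]
    margins = np.maximum(0.0, 1.0 + logits - correct[:, None])
    margins[np.arange(n), labels] = 0.0
    hinge = margins.max(axis=1).mean()

    return ce + C * hinge
```

For a confidently correct prediction both terms shrink toward zero, while the hinge term keeps penalizing predictions that are correct but lie inside the margin, which is one way a hinge component can aid generalization.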



Keywords: image classification; blending neural network; function approximation; kernel mapping connection; generalizability
Received: 31 March 2020      Published: 30 November 2020
Fund: National Natural Science Foundation of China (Grant Nos. 61972227 and 61672018); Natural Science Foundation of Shandong Province (Grant No. ZR2019MF051); Primary Research and Development Plan of Shandong Province (Grant No. 2018GGX101013)
Corresponding author: Yunfeng Zhang     E-mail: liuxxin26@163.com; yfzhang@sdufe.edu.cn; fxbao@sdu.edu.cn; shaokai17862921498@126.com; 17862921505@163.com; czhang@sdu.edu.cn
About author: Xinxin Liu received her B.E. degree from the School of Computer Science and Technology, North China Institute of Science and Technology, Langfang, China, in 2016. She is currently working toward her M.S. degree at Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. Her research interests include particle swarm optimization, machine learning, and image processing.|Yunfeng Zhang received his B.E. degree in computational mathematics and application software from Shandong University of Technology, Jinan, China, in 2000, his M.S. degree in applied mathematics from Shandong University in 2003, and his Ph.D. degree in computational geometry from Shandong University in 2007. He is now a professor in Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. His current research interests include computer-aided geometric design, digital image processing, computational geometry, and function approximation.|Fangxun Bao received his M.Sc. degree from the Department of Mathematics of Qufu Normal University, China, in 1994, and his Ph.D. degree from the Department of Mathematics of Northwest University, Xi’an, China, in 1997. His current position is full professor in the Department of Mathematics, Shandong University. His research interests include computer-aided geometric design and computation, computational geometry, and function approximation.|Kai Shao received his B.E. degree from the School of Computer Science and Technology at Shandong University of Finance and Economics in 2018. He is currently working toward his M.S. degree at Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. His research interests include medical image processing and deep learning.|Ziyi Sun received her B.E. degree from the School of Computer Science and Technology at Shandong University of Finance and Economics in 2018. 
She is currently working toward her M.S. degree at Shandong Provincial Key Laboratory of Digital Media Technology, Shandong University of Finance and Economics. Her research interests include image processing and deep learning.|Caiming Zhang is a professor and doctoral supervisor of the School of Computer Science and Technology at Shandong University. He is now also the dean and professor of the School of Computer Science and Technology at Shandong Economic University. He received his B.S. and M.E. degrees in computer science from Shandong University in 1982 and 1984, respectively, and his Dr.Eng. degree in computer science from Tokyo Institute of Technology, Japan, in 1994. From 1997 to 2000, he held a visiting position at the University of Kentucky, USA. His research interests include CAGD, CG, information visualization, and medical image processing.
Cite this article:

Xinxin Liu,Yunfeng Zhang,Fangxun Bao,Kai Shao,Ziyi Sun,Caiming Zhang. Kernel-blending connection approximated by a neural network for image classification. Computational Visual Media, 2020, 6(4): 467-476.

URL:

http://cvm.tsinghuajournals.com/10.1007/s41095-020-0181-9     OR     http://cvm.tsinghuajournals.com/Y2020/V6/I4/467

Fig. 1 Flowchart of our KBNN image classification method.
Fig. 2 Kernel mapping connection.
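The kernel mapping connection (Fig. 2) maps extracted feature vectors into a kernel feature space before the linear output layer. A minimal sketch of one common realization is a Gaussian kernel evaluated against a set of centers; the centers, the bandwidth gamma, and the random inputs below are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def kernel_mapping(features, centers, gamma=0.5):
    """Map each feature vector to its Gaussian-kernel similarities
    against a set of centers (illustrative stand-in for a learned
    kernel mapping connection).

    features: (n, d); centers: (m, d); returns (n, m).
    """
    # Squared Euclidean distance between every feature and every center.
    sq_dist = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dist)

rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 256))      # e.g. 256-d features after GAP
centers = rng.standard_normal((128, 256))  # 128 centers, matching Table 1's width
mapped = kernel_mapping(feats, centers)    # (4, 128) kernel-space features
```

In the paper the mapping is realized as a learnable fully connected layer (see Table 1), with the function approximation theorem guaranteeing that the network can approximate such a kernel map.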
Layer            Type              Kernel   Stride   Padding   Channels
Data             Input             N/A      N/A      N/A       N/A
CONV 1           Convolution       5×5      2        SAME      64
POOL 1           Average pooling   3×3      2        VALID     64
CONV 2           Convolution       3×3      2        SAME      128
CONV 3           Convolution       3×3      2        SAME      256
POOL 2           Max pooling       2×2      2        VALID     256
GAP              Average pooling   1×1      1        VALID     256
Kernel mapping   Fully connected   1×1      N/A      N/A       128
Linear output    Output            1×1      N/A      N/A       10
Table 1 Configuration of the KBNN architecture
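Assuming a 32×32 input (as for CIFAR-10, one of the datasets used), the spatial sizes implied by Table 1 can be checked with the standard SAME/VALID output-size formulas. This is a verification sketch under that input-size assumption, not the authors' code.

```python
import math

def out_size(n, kernel, stride, padding):
    """Spatial output size for one dimension under TensorFlow-style padding."""
    if padding == "SAME":
        return math.ceil(n / stride)
    # VALID: only full windows count.
    return (n - kernel) // stride + 1

# (name, kernel, stride, padding) rows from Table 1 that change spatial size
layers = [
    ("CONV 1", 5, 2, "SAME"),
    ("POOL 1", 3, 2, "VALID"),
    ("CONV 2", 3, 2, "SAME"),
    ("CONV 3", 3, 2, "SAME"),
    ("POOL 2", 2, 2, "VALID"),
]

size = 32  # assumed CIFAR-10 input height/width
sizes = {}
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    sizes[name] = size
```

Under this assumption the map shrinks 32 → 16 → 7 → 4 → 2 → 1, so the feature map entering GAP is already 1×1, consistent with the 1×1 GAP kernel listed in Table 1.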
Fig. 3 Impact of penalty parameter C on error.
Fig. 4 Smoothed test error on MNIST for cross-entropy loss, hinge loss, and proposed loss: (a) epochs 0-400, (b) epochs 300-400.
Fig. 5 Smoothed test error on CIFAR-10 for cross-entropy loss, hinge loss, and proposed loss: (a) epochs 0-250, (b) epochs 180-250.
Method   DLSVM   Niu and Suen's   CNN+softmax   CDBM   PCANet   Deep NCAE   Drplu   KBNN
Error    0.87    0.19             0.68          0.82   0.62     2.09        1.04    0.36
Table 2 Classification errors (%) on MNIST
Method   DLSVM   ResNet110+L-GM   ML-DNN   NIN    Maxout Networks   Drop-Connect   KBNN
Error    11.9    4.96             8.12     8.81   9.38              9.32           1.54
Table 3 Classification errors (%) on CIFAR-10
Method   Stochastic Pooling   Learned Pooling   Maxout Networks   NIN     ML-DNN   ResNet   KBNN
Error    42.51                43.71             38.57             35.68   34.18    28.62    32.71
Table 4 Classification errors (%) on CIFAR-100
[1]   Cortes, C.; Vapnik, V. Support-vector networks. Machine Learning Vol. 20, 273-297, 1995.
[2]   Bagarinao, E.; Kurita, T.; Higashikubo, M.; Inayoshi, H. Adapting SVM image classifiers to changes in imaging conditions using incremental SVM: An application to car detection. In: Computer Vision-ACCV 2009. Lecture Notes in Computer Science, Vol. 5996. Zha, H.; Taniguchi, R.; Maybank, S. Eds. Springer Berlin Heidelberg, 363-372, 2010.
[3]   Guo, Y. Q.; Jia, X. P.; Paull, D. Effective sequential classifier training for SVM-based multitemporal remote sensing image classification. arXiv preprint arXiv:1706.04719, 2017.
[4]   Hinton, G. E.; Osindero, S.; Teh, Y. W. A fast learning algorithm for deep belief nets. Neural Computation Vol. 18, No. 7, 1527-1554, 2006.
[5]   Bengio, Y.; Courville, A.; Vincent, P. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence Vol. 35, No. 8, 1798-1828, 2013.
[6]   LeCun, Y.; Boser, B. E.; Denker, J. S.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. D. Backpropagation applied to handwritten zip code recognition. Neural Computation Vol. 1, No. 4, 541-551, 1989.
[7]   Eitel, A.; Springenberg, J. T.; Spinello, L.; Riedmiller, M.; Burgard, W. Multimodal deep learning for robust RGB-D object recognition. arXiv preprint arXiv:1507.06821, 2015.
[8]   Shi, W. W.; Gong, Y. H.; Tao, X. Y.; Cheng, D.; Zheng, N. N. Fine-grained image classification using modified DCNNs trained by cascaded softmax and generalized large-margin losses. IEEE Transactions on Neural Networks and Learning Systems Vol. 30, No. 3, 683-694, 2018.
[9]   Niu, X. X.; Suen, C. Y. A novel hybrid CNN-SVM classifier for recognizing handwritten digits. Pattern Recognition Vol. 45, No. 4, 1318-1325, 2012.
[10]   Sun, X.; Park, J.; Kang, K.; Hur, J. Novel hybrid CNN-SVM model for recognition of functional magnetic resonance images. In: Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 1001-1006, 2017.
[11]   Hubel, D. H.; Wiesel, T. N. Receptive fields and functional architecture of monkey striate cortex. The Journal of Physiology Vol. 195, No. 1, 215-243, 1968.
[12]   Fukushima, K. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics Vol. 36, No. 4, 193-202, 1980.
[13]   Zeiler, M. D.; Fergus, R. Visualizing and understanding convolutional networks. In: Computer Vision - ECCV 2014. Lecture Notes in Computer Science, Vol. 8689. Fleet, D.; Pajdla, T.; Schiele, B.; Tuytelaars, T. Eds. Springer Cham, 818-833, 2014.
[14]   Sermanet, P.; Eigen, D.; Zhang, X.; Mathieu, M.; Fergus, R.; LeCun, Y. OverFeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229, 2013.
[15]   Zhang, F. L.; Wu, X.; Li, R.-L.; Wang, J.; Zheng, Z. H.; Hu, S. M. Detecting and removing visual distractors for video aesthetic enhancement. IEEE Transactions on Multimedia Vol. 20, No. 8, 1987-1999, 2018.
[16]   Wen, Y. H.; Gao, L.; Fu, H. B.; Zhang, F. L.; Xia, S. H. Graph CNNs with motif and variable temporal block for skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence Vol. 33, 8989-8996, 2019.
[17]   Ioffe, S.; Szegedy, C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.
[18]   Lin, M.; Chen, Q.; Yan, S. C. Network in network. arXiv preprint arXiv:1312.4400, 2013.
[19]   Cybenko, G. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems Vol. 2, No. 4, 303-314, 1989.
[20]   LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proceedings of the IEEE Vol. 86, No. 11, 2278-2324, 1998.
[21]   Krizhevsky, A.; Hinton, G. Learning multiple layers of features from tiny images. Master Thesis. University of Toronto, 2009.
[22]   Tang, Y. Deep learning using support vector machines. arXiv preprint arXiv:1306.0239, 2015.
[23]   Wan, W. T.; Zhong, Y. Y.; Li, T. P.; Chen, J. S. Rethinking feature distribution for loss functions in image classification. arXiv preprint arXiv:1803.02988, 2018.
[24]   Lee, H.; Grosse, R.; Ranganath, R.; Ng, A. Y. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning, 609-616, 2009.
[25]   Chan, T. H.; Jia, K.; Gao, S. H.; Lu, J. W.; Zeng, Z. N.; Ma, Y. PCANet: A simple deep learning baseline for image classification? IEEE Transactions on Image Processing Vol. 24, No. 12, 5017-5032, 2015.
[26]   Hosseini-Asl, E.; Zurada, J. M.; Nasraoui, O. Deep learning of part-based representation of data using sparse autoencoders with nonnegativity constraints. IEEE Transactions on Neural Networks and Learning Systems Vol. 27, No. 12, 2486-2498, 2016.
[27]   Bristow, H.; Eriksson, A.; Lucey, S. Fast convolutional sparse coding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 391-398, 2013.
[28]   Xu, C. Y.; Lu, C. Y.; Liang, X. D.; Gao, J. B.; Zheng, W.; Wang, T. J.; Yan, S. C. Multi-loss regularized deep neural network. IEEE Transactions on Circuits and Systems for Video Technology Vol. 26, No. 12, 2273-2283, 2016.
[29]   Goodfellow, I. J.; Warde-Farley, D.; Mirza, M.; Courville, A.; Bengio, Y. Maxout networks. arXiv preprint arXiv:1302.4389, 2013.
[30]   Wan, L.; Zeiler, M.; Zhang, S.; LeCun, Y.; Fergus, R. Regularization of neural networks using dropconnect. In: Proceedings of the 30th International Conference on Machine Learning, Vol. 28, 1058-1066, 2013.
[31]   Malinowski, M.; Fritz, M. Learnable pooling regions for image classification. arXiv preprint arXiv:1301.3516, 2013.
[32]   Zeiler, M. D.; Fergus, R. Stochastic pooling for regularization of deep convolutional neural networks. arXiv preprint arXiv:1301.3557, 2013.
[33]   He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770-778, 2016.