    Please use this identifier to cite or link to this item: http://nccur.lib.nccu.edu.tw/handle/140.119/125643


    Title: 基於半色調轉換的影像對抗例防禦機制
    Defense mechanism against adversarial attacks using density-based representation of images
    Authors: 黃辰瑋
    Huang, Chen-Wei
    Contributors: 廖文宏
    Liao, Wen-Hung
    黃辰瑋
    Huang, Chen-Wei
    Keywords: deep learning
    adversarial defense
    halftoning
    input recharacterization
    Date: 2019
    Issue Date: 2019-09-05 16:15:04 (UTC+8)
    Abstract: Adversarial examples are slightly modified inputs devised to cause deep learning models to make erroneous inferences; a small perturbation of the input can change the output drastically. Many defenses have been proposed recently, but most have been shown to be ineffective as new ways of generating attacks have surfaced. Protection against adversarial examples is a fundamental issue that must be addressed before deep-learning-based intelligent systems can be widely adopted. In this research, we use input recharacterization to remove the perturbations found in adversarial examples and thereby maintain the performance of the original model.
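    To make "slightly modified inputs" concrete, the following is a minimal sketch of the fast gradient sign method (FGSM), one of the attacks evaluated later in the abstract. It assumes a toy logistic-regression classifier rather than the thesis's VGG-16 models; the function names, weights, and inputs are hypothetical and chosen only for illustration.

```python
import math


def predict_prob(w, b, x):
    """Probability of class 1 under a logistic-regression model (toy stand-in)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def loss_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x."""
    p = predict_prob(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move every input value by eps in the direction that
    increases the loss, then clip back to the valid range [0, 1]."""
    grad = loss_gradient(w, b, x, y)
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, grad)]


# Hypothetical model and input: a perturbation of at most 0.05 per value
# moves the prediction across the 0.5 decision threshold.
w, b = [2.0, -2.0, 1.0, -1.0], 0.0
x, y = [0.6, 0.5, 0.5, 0.5], 1
x_adv = fgsm(w, b, x, y, eps=0.05)
print(predict_prob(w, b, x) > 0.5, predict_prob(w, b, x_adv) > 0.5)
```

    Iterative attacks such as I-FGSM and PGD repeat a step of this kind several times with a smaller step size while keeping the total perturbation within a fixed bound.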
    Input recharacterization consists of two stages: a forward transform and a backward reconstruction. Our hope is that the lossy two-way transformation renders the purposely added perturbation ineffective. In this work, we employ digital halftoning and inverse halftoning for input recharacterization, although many other choices exist. We apply convolution layer visualization, among other techniques, to analyze the results and better understand the characteristics of the networks. The data set is drawn from Tiny ImageNet and consists of about 260 thousand 128x128 grayscale and halftone images belonging to 200 classes.
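    The two stages can be sketched as follows. The abstract does not say which halftoning or inverse-halftoning algorithms the thesis uses, so this example assumes Floyd-Steinberg error diffusion for the forward transform and a simple box low-pass filter for the backward reconstruction; both choices, and all function names, are illustrative assumptions.

```python
def halftone(gray):
    """Forward transform (assumed: Floyd-Steinberg error diffusion).
    gray: rows of intensities in [0, 1]. Returns rows of binary dots (0 or 1)."""
    h, w = len(gray), len(gray[0])
    work = [list(row) for row in gray]  # working copy that accumulates diffused error
    dots = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            old = work[y][x]
            new = 1 if old >= 0.5 else 0
            dots[y][x] = new
            err = old - new  # quantization error pushed to unvisited neighbors
            if x + 1 < w:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < w:
                    work[y + 1][x + 1] += err * 1 / 16
    return dots


def inverse_halftone(dots, radius=2):
    """Backward reconstruction (assumed: box low-pass filter).
    The local dot density approximates the original intensity."""
    h, w = len(dots), len(dots[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [dots[j][i] for j in ys for i in xs]
            out[y][x] = sum(window) / len(window)
    return out


def recharacterize(gray):
    """Lossy round trip: intensity -> dot density -> intensity."""
    return inverse_halftone(halftone(gray))


# A flat 25% gray patch becomes a dot pattern with roughly 25% density.
flat = [[0.25] * 16 for _ in range(16)]
dots = halftone(flat)
print(sum(map(sum, dots)) / 256)
```

    The round trip preserves average local intensity but discards fine per-pixel detail, which is the property the defense hopes will disrupt small adversarial perturbations.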
    Most existing defense mechanisms rely on gradient masking, input transformation, or adversarial training. Among these strategies, adversarial training is widely regarded as the most effective. However, it requires adversarial examples to be generated and included in the training set, which is impractical in most applications. The proposed approach is closest to input transformation. We convert the image from an intensity-based (continuous-tone) representation to a density-based (binary) representation using a halftone operation, in the hope that changing the image representation invalidates the attack. We also investigate whether inverse halftoning can eliminate the adversarial perturbation. The proposed method avoids the heavy cost of adversarial training; only low-cost input pre-processing is needed.
    On the VGG-16 architecture, the top-5 accuracy is 76.5% for the grayscale model, 80.4% for the halftone model, and 85.14% for the hybrid model (trained with both grayscale and halftone images). Under adversarial attacks generated with FGSM, I-FGSM, and PGD, the top-5 accuracy of the hybrid model remains at 80.97%, 78.77%, and 81.56%, respectively. Although accuracy is still affected, the impact of the adversarial examples is greatly reduced. The average accuracy improvement over existing input transformation defense mechanisms is approximately 10%.
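    For readers unfamiliar with the metric, top-5 accuracy counts a prediction as correct when the true class is among the five highest-scoring classes. The following is a minimal sketch of that computation; the function name and the toy scores are hypothetical and are not taken from the thesis.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest-scoring classes.
    scores: one list of per-class scores per sample; labels: true class indices."""
    hits = 0
    for row, label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


# Toy example with three classes and k=2 instead of five.
scores = [[0.1, 0.5, 0.4], [0.7, 0.2, 0.1]]
labels = [2, 1]
print(top_k_accuracy(scores, labels, k=2))
```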
    Description: Master's thesis
    National Chengchi University
    Department of Computer Science
    106753021
    Source URI: http://thesis.lib.nccu.edu.tw/record/#G0106753021
    Data Type: thesis
    DOI: 10.6814/NCCU201900958
    Appears in Collections: [Department of Computer Science] Theses

    Files in This Item:

    File: 302101.pdf, Size: 3554 Kb, Format: Adobe PDF

