Vegetation and Non-Vegetation Classification using Object Detection Techniques and Deep Learning from Low/Mixed Resolution Satellite Images
Abstract
Vegetation cover classification from mixed- or low-resolution satellite images is challenging. Fortunately, deep learning object detection methods have recently emerged as a replacement for conventional machine learning methods for the detection and classification of land use and land cover. This paper presents a deep learning object detection approach for land use and land cover detection using low/mixed resolution satellite images acquired from Google Earth. Google Earth imagery is freely accessible through the Google Earth Pro desktop application. Our dataset consists of two classes (vegetation and non-vegetation) with a total of 450 labeled images captured from different parts of Pakistan. We present a comparison of the recent anchor-free object detection model YOLOX with the anchor-based object detection model YOLOR for solving real-time problems. End-to-end differentiability, efficient GPU utilization, and the absence of hand-crafted anchor parameters make anchor-free models a compelling choice for object detection, yet they have not been explored for land cover classification using satellite images. Our experimental study shows that YOLOX delivers an overall accuracy of 83.50% on the vegetation class and 86% on the non-vegetation class, outperforming YOLOR by 30% and 34%, respectively, on our dataset. We also show how an object detection system can be used for vegetation and non-vegetation classification, which can then support change monitoring and assist in developing geographical maps from freely available low/mixed resolution satellite images.
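The per-class accuracy reported above can be sketched as a simple detection-scoring routine: count a ground-truth box as correct when a prediction of the same class overlaps it sufficiently. This is an illustrative sketch only; the `(x1, y1, x2, y2)` box format, the IoU matching, and the 0.5 threshold are assumptions for the example, not the paper's exact evaluation protocol.

```python
# Sketch: per-class scoring for a two-class (vegetation / non-vegetation)
# detector. Boxes are assumed to be (x1, y1, x2, y2) in pixel coordinates.

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def per_class_accuracy(ground_truth, predictions, iou_thresh=0.5):
    """Fraction of ground-truth boxes matched by a same-class
    prediction with IoU >= iou_thresh, reported per class."""
    hits, totals = {}, {}
    for cls, gt_box in ground_truth:
        totals[cls] = totals.get(cls, 0) + 1
        matched = any(cls == p_cls and iou(gt_box, p_box) >= iou_thresh
                      for p_cls, p_box in predictions)
        if matched:
            hits[cls] = hits.get(cls, 0) + 1
    return {cls: hits.get(cls, 0) / totals[cls] for cls in totals}

# Toy example: one vegetation box is found, the non-vegetation box is
# only covered by a wrong-class prediction and so counts as a miss.
gt = [("vegetation", (0, 0, 100, 100)), ("non-vegetation", (200, 200, 300, 300))]
pred = [("vegetation", (5, 5, 95, 105)), ("vegetation", (210, 190, 310, 290))]
print(per_class_accuracy(gt, pred))
# → {'vegetation': 1.0, 'non-vegetation': 0.0}
```

In practice the predictions would come from running a trained YOLOX or YOLOR model over the labeled Google Earth tiles; the scoring logic stays the same.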
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Pakistan Journal of Emerging Science and Technologies (PJEST), in collaboration with Govt. Islamia Graduate College Civil Lines Lahore, Pakistan, is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
References
Huang, L., Liu, B., Li, B., Guo, W., Yu, W., Zhang, Z., & Yu, W. (2018). OpenSARShip: A dataset dedicated to Sentinel-1 ship interpretation. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 11(1), 195–208. https://doi.org/10.1109/jstars.2017.2755672
Basu, S., Ganguly, S., Mukhopadhyay, S., DiBiano, R., Karki, M., & Nemani, R. (2015). DeepSat: A learning framework for satellite imagery. Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems. https://doi.org/10.1145/2820783.2820816
Yang, Y., & Newsam, S. (2010, November 2). Bag-of-visual-words and spatial extensions for land-use classification. https://doi.org/10.1145/1869790.1869829
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., . . . Fei-Fei, L. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252. https://doi.org/10.1007/s11263-015-0816-y
Penatti, O. A. B., Nogueira, K., & dos Santos, J. A. (2015). Do deep features generalize from everyday objects to remote sensing and aerial scenes domains? 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 44–51. doi:10.1109/CVPRW.2015.7301382
Prasad, D. K. (2012, December 1). Survey of The Problem of Object Detection In Real Images. Retrieved from https://www.researchgate.net/publication/235216716_Survey_of_The_Problem_of_Object_Detection_In_Real_Images
Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2), 91–110. https://doi.org/10.1023/b:visi.0000029664.99615.94
Viola, P., & Jones, M. J. (2001, January 1). Robust Real-Time Object Detection. Retrieved from https://www.researchgate.net/publication/215721846_Robust_Real-Time_Object_Detection
Dalal, N., & Triggs, B. (2005). Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), 1, 886–893. doi:10.1109/CVPR.2005.177
Liu, L., Ouyang, W., Wang, X., Fieguth, P., Chen, J., Liu, X., & Pietikäinen, M. (2019, October 31). Deep Learning for Generic Object Detection: A Survey. https://doi.org/10.1007/s11263-019-01247-4
Taufique, A. M. N., Minnehan, B., & Savakis, A. (2020). Benchmarking Deep Trackers on Aerial Videos. Sensors, 20(2), 547. https://doi.org/10.3390/s20020547
Li, K., Cheng, G., Bu, S., & You, X. (2018). Rotation-insensitive and context-augmented object detection in remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 56(4), 2337–2348. doi:10.1109/TGRS.2017.2778300
Ge, Z., Liu, S., Wang, F., Li, Z., & Sun, J. (2021). YOLOX: Exceeding YOLO series in 2021. Retrieved from https://arxiv.org/abs/2107.08430
Wang, C.-Y., Yeh, I.-H., & Liao, H.-Y. M. (2021). You Only Learn One Representation: Unified network for multiple tasks. Retrieved from https://arxiv.org/abs/2105.04206
Uijlings, J.R.R. (2013). Selective Search for Object Recognition. Retrieved from https://ivi.fnwi.uva.nl/isis/publications/bibtexbrowser.php?key=UijlingsIJCV2013&bib=all.bib
Girshick, R., Donahue, J., Darrell, T., & Malik, J. (2014). Rich feature hierarchies for accurate object detection and semantic segmentation. Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. doi:10.1109/CVPR.2014.81
Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39. doi:10.1109/TPAMI.2016.2577031
Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016). You Only Look Once: Unified, real-time object detection. 779–788. doi:10.1109/CVPR.2016.91
Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, faster, stronger. 6517–6525. doi:10.1109/CVPR.2017.690
Redmon, J., & Farhadi, A. (2018, April 8). YOLOv3: An Incremental Improvement. Retrieved from https://www.researchgate.net/publication/324387691_YOLOv3_An_Incremental_Improvement
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.-Y., & Berg, A. C. (2016). SSD: Single shot multibox detector. 9905, 21–37. doi:10.1007/978-3-319-46448-0_2
Bochkovskiy, A., Wang, C. Y., & Liao, H. Y. M. (2020, April 22). YOLOv4: Optimal Speed and Accuracy of Object Detection. Retrieved from https://www.researchgate.net/publication/340883401_YOLOv4_Optimal_Speed_and_Accuracy_of_Object_Detection
Jocher, G. (n.d.). Comprehensive Guide to Ultralytics YOLOv5. Retrieved from https://docs.ultralytics.com/yolov5/
Liao, M., Shi, B., & Bai, X. (2018). TextBoxes++: A single-shot oriented scene text detector. IEEE Transactions on Image Processing. doi:10.1109/TIP.2018.2825107
Nosaka, R., Ujiie, H., & Kurokawa, T. (2018). Orientation-aware regression for oriented bounding box estimation. 1–6. doi:10.1109/AVSS.2018.8639332
Lei, J., Gao, C., Hu, J., Gao, C., & Sang, N. (2019). Orientation adaptive YOLOv3 for object detection in remote sensing images. doi:10.1007/978-3-030-31654-9_50
Law, H., & Deng, J. (2020). CornerNet: Detecting objects as paired keypoints. International Journal of Computer Vision, 128. doi:10.1007/s11263-019-01204-1
Tian, Z., Shen, C., Chen, H., & He, T. (2019). FCOS: Fully convolutional one-stage object detection. Retrieved from https://arxiv.org/abs/1904.01355
Carrio, A., Sampedro, C., Rodriguez-Ramos, A., & Campoy, P. (2017). A review of deep learning methods and applications for unmanned aerial vehicles. Journal of Sensors, 2017, 1–13. https://doi.org/10.1155/2017/3296874
Zhu, X. X., Tuia, D., Mou, L., Xia, G. S., Zhang, L., Xu, F., & Fraundorfer, F. (2017). Deep Learning in Remote Sensing: A Comprehensive Review and List of Resources. IEEE Geoscience and Remote Sensing Magazine, 5(4), 8–36. https://doi.org/10.1109/mgrs.2017.2762307
Huang, C., Davis, L. S., & Townshend, J. R. G. (2002). An assessment of support vector machines for land cover classification. International Journal of Remote Sensing, 23. doi:10.1080/01431160110040323
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., & Rabinovich, A. (2015). Going deeper with convolutions. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 1–9. doi:10.1109/CVPR.2015.7298594
Ma, L., Li, M., Ma, X., Cheng, L., Du, P., & Liu, Y. (2017). A review of supervised object-based land-cover image classification. ISPRS Journal of Photogrammetry and Remote Sensing, 130. doi:10.1016/j.isprsjprs.2017.06.001
Helber, P., Bischke, B., Dengel, A., & Borth, D. (2018). Introducing EuroSAT: A novel dataset and deep learning benchmark for land use and land cover classification. doi:10.1109/IGARSS.2018.8519248
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., & Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia. doi:10.1145/2647868.2654889
Xia, G.-S., Hu, J., Hu, F., Shi, B., Bai, X., Zhong, Y., Lu, X., & Zhang, L. (2017). AID: A benchmark data set for performance evaluation of aerial scene classification. IEEE Transactions on Geoscience and Remote Sensing, 55, 3965–3981. doi:10.1109/TGRS.2017.2685945