
Abstract

This study compares the performance of the Vision Transformer (ViT-Base) and MobileNet for object detection on edge devices. The evaluation was carried out on a Jetson Orin Nano using four parameters: accuracy, latency, energy consumption, and computational efficiency. MobileNet achieved 100% accuracy, 41.38 ms latency, 0.3937 joules/frame energy consumption, and 0.8332 %/(ms·W) efficiency, while ViT-Base obtained 93.72% accuracy, 63.58 ms latency, 0.5306 joules/frame energy consumption, and 0.4466 %/(ms·W) efficiency. MobileNet is therefore superior in accuracy, efficiency, speed, and energy use. These findings indicate that MobileNet is the better choice for real-time edge applications that demand fast response and low power draw, and it is also suitable for scenarios requiring high accuracy.
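
The unit %/(ms·W) reported above suggests that computational efficiency is defined as accuracy divided by the latency-power product, and joules/frame as average power times per-frame latency. The minimal sketch below illustrates those relationships; the metric definitions and the power values (2.9 W and 9.5 W) are assumptions for illustration, not measurements taken from the paper.

    # Minimal metric sketch. Assumptions (not from the paper): efficiency is
    # accuracy / (latency * power), matching the %/(ms*W) unit; energy per
    # frame is average power * per-frame latency; power values are hypothetical.

    def energy_per_frame(power_w: float, latency_ms: float) -> float:
        # Joules per frame: average power (W) times per-frame time (s).
        return power_w * (latency_ms / 1000.0)

    def efficiency(accuracy_pct: float, latency_ms: float, power_w: float) -> float:
        # Computational efficiency in %/(ms*W), under the assumed definition.
        return accuracy_pct / (latency_ms * power_w)

    # Example: MobileNet's reported accuracy and latency, with hypothetical power draws.
    print(efficiency(100.0, 41.38, 2.9))   # ~0.8332 %/(ms*W) at an assumed 2.9 W
    print(energy_per_frame(9.5, 41.38))    # ~0.39 J/frame at an assumed 9.5 W

Note that the two reported figures are only reproduced here under different assumed power values, so the paper's energy and efficiency metrics may rely on separate power measurements.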

Keywords

Vision Transformer; MobileNet; Edge Computing; Computational Efficiency

