Case study
How Computer Vision in Logistics Reduced Truck Waiting Time by 30% for a Leading Japanese Logistics Company
- Team size: 4
- Development time: 2 years
Background
According to estimates cited by Japan’s Ministry of Land, Infrastructure, Transport and Tourism (MLIT), transportation capacity could fall short by approximately 14% in 2024 and 34% by 2030 if no corrective action is taken. MLIT describes truck transportation as a critical industry supporting Japan’s economy and daily life, and notes that driver shortages have become increasingly severe in recent years.
The policy environment is also changing quickly. MLIT’s logistics-efficiency portal shows that, from the 2026 fiscal year onward, certain shippers and trucking-related businesses will face new obligations such as medium- to long-term planning, periodic reporting, and other compliance measures designed to improve logistics efficiency and reduce bottlenecks such as waiting time and cargo handling time.
For our client, a leading logistics solutions company headquartered in Japan, these national pressures translated into a very practical bottleneck: trucks and drivers were spending too much time on manual pallet counting and inventory reconciliation at warehouses and customer sites.
GEM engaged with the client to apply computer vision in logistics and build an AI-powered pallet detection solution that could turn a slow, manual process into a one-second mobile workflow while preserving auditability and operational control.
Challenges
Manual, slow, and error-prone pallet counting operations
- Warehouse and dispatch staff manually counted stacked pallets one by one, often across dozens of trailers per shift.
- Pallets are visually similar, frequently stacked at varying heights, and partially occluded by adjacent stacks, increasing the rate of miscounts.
- Inconsistent warehouse lighting and viewing angles further reduced counting accuracy.
- Manual re-counts were required to satisfy internal audit and customer reconciliation processes.
- Inventory disputes between drivers, warehouse staff, and customers extended truck dwell time at every stop.
Operational pressure from Japan’s driver overtime regulations
- New overtime caps required logistics operators to materially reduce drivers’ non-driving time.
- The ongoing driver shortage demanded productivity gains rather than headcount expansion.
- Any solution had to run on the mobile devices drivers and warehouse staff already carry, with no requirement for new hardware installations across hundreds of sites.
- Solutions needed to work reliably offline and in real-world warehouse conditions, not just in controlled environments.
- Audit-grade record keeping was essential to meet customer SLAs and internal compliance requirements.
Solution
GEM delivered a mobile-first, AI-powered pallet detection system that converts pallet counting into a two-step user interaction: take a photo, and receive an instant verified count. This is where computer vision in logistics became directly operational, not just conceptual.
Workstream 1: Computer Vision Model for Pallet Detection
GEM’s AI team trained a custom object detection model based on the YOLOv5 architecture, fine-tuned on a curated dataset of warehouse pallet imagery covering varied stack heights, pallet colors, lighting conditions, and occlusion scenarios encountered across the client’s distribution network.
The model ran inference in approximately one second per image while remaining lightweight enough for on-device execution on standard Android handsets. This made computer vision in logistics practical in day-to-day warehouse conditions, rather than only in controlled demo environments.
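The case study does not describe how raw detections are turned into a pallet count. As a minimal sketch of one common approach, the Python below filters detections by confidence, applies greedy non-maximum suppression so that overlapping boxes for the same pallet are counted once, and returns the surviving boxes. The `Detection` type, the `count_pallets` function, and both threshold values are illustrative assumptions, not details of the delivered system.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Detection:
    """One predicted box in pixel coordinates (hypothetical structure)."""
    x1: float
    y1: float
    x2: float
    y2: float
    score: float  # detector confidence in [0, 1]


def iou(a: Detection, b: Detection) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a.x2, b.x2) - max(a.x1, b.x1))
    inter_h = max(0.0, min(a.y2, b.y2) - max(a.y1, b.y1))
    inter = inter_w * inter_h
    area_a = (a.x2 - a.x1) * (a.y2 - a.y1)
    area_b = (b.x2 - b.x1) * (b.y2 - b.y1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_pallets(
    detections: Iterable[Detection],
    min_score: float = 0.5,    # assumed confidence cutoff
    iou_thresh: float = 0.45,  # assumed overlap cutoff for duplicates
) -> List[Detection]:
    """Return the boxes kept after confidence filtering and greedy NMS.

    The pallet count is the length of the returned list; the boxes
    themselves can be drawn for the operator to verify.
    """
    candidates = sorted(
        (d for d in detections if d.score >= min_score),
        key=lambda d: d.score,
        reverse=True,
    )
    kept: List[Detection] = []
    for det in candidates:
        # Keep a box only if it does not heavily overlap a stronger one.
        if all(iou(det, k) < iou_thresh for k in kept):
            kept.append(det)
    return kept
```

Returning the kept boxes rather than a bare integer keeps the count and its visual evidence together, which suits the operator-verification step described in this case study.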
Workstream 2: Android Mobile Application
GEM developed a production-grade Android application optimized for warehouse and dispatch staff. The workflow was deliberately reduced to two steps:
- The user points their device at the pallet stack and captures a photo.
- The application returns the pallet count, with bounding-box visualization for operator verification.
The app supports offline-first operation, so counts can be performed in warehouse zones with limited connectivity and synced once the device returns online. This mobile-first approach made computer vision in logistics accessible to the frontline teams already doing the work.
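The offline-first behavior described above can be illustrated with a small local queue. The sketch below is written in Python for brevity and is not the Android implementation, which the case study does not detail. It assumes counts are written to a local SQLite table first and uploaded oldest-first when a connection is available. The table name, function names, and the `upload` callback are hypothetical.

```python
import json
import sqlite3
from typing import Callable


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local queue of counts awaiting upload."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pending ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def record_count(conn: sqlite3.Connection, event: dict) -> None:
    """Persist a count locally, whether or not the device is online."""
    conn.execute(
        "INSERT INTO pending (payload) VALUES (?)", (json.dumps(event),)
    )
    conn.commit()


def pending_count(conn: sqlite3.Connection) -> int:
    """Number of events still waiting to be uploaded."""
    return conn.execute("SELECT COUNT(*) FROM pending").fetchone()[0]


def sync(conn: sqlite3.Connection, upload: Callable[[dict], bool]) -> int:
    """Upload queued events oldest-first and return how many were sent.

    `upload` is a caller-supplied function that returns True on success.
    Each event is deleted only after its upload succeeds, and the loop
    stops at the first failure so that nothing is lost or reordered.
    """
    sent = 0
    rows = conn.execute(
        "SELECT id, payload FROM pending ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        if not upload(json.loads(payload)):
            break
        conn.execute("DELETE FROM pending WHERE id = ?", (row_id,))
        conn.commit()
        sent += 1
    return sent
```

Deleting a row only after a confirmed upload means a dropped connection can cause a retry but not a lost count. A real backend would then need to tolerate the occasional duplicate submission.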
Workstream 3: Cloud Backend and Audit Repository
GEM implemented a Django-based backend deployed on AWS to serve as the system of record for every pallet detection event. Every count, including the source image, the model’s bounding-box output, the operator, the timestamp, and the location, is automatically stored, creating a fully searchable audit trail for cross-checking, customer dispute resolution, and inventory reconciliation.
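To make the shape of such an audit record concrete, the sketch below models one detection event with the fields listed above: source image, bounding boxes, operator, timestamp, and location. It is a plain Python dataclass rather than a Django model, and every field name, the `image_key` convention, and the JSON layout are assumptions for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Tuple

# One box as (x1, y1, x2, y2, score); a hypothetical layout.
Box = Tuple[float, float, float, float, float]


@dataclass(frozen=True)
class DetectionEvent:
    """Immutable record of a single pallet count (illustrative only)."""
    image_key: str          # assumed storage key of the source photo
    boxes: Tuple[Box, ...]  # the model's bounding-box output
    operator_id: str
    location: str
    captured_at: datetime   # should be timezone-aware

    @property
    def pallet_count(self) -> int:
        # Derive the count from the boxes so the two cannot disagree.
        return len(self.boxes)

    def to_json(self) -> str:
        """Serialize to JSON with a stable key order."""
        record = asdict(self)
        record["captured_at"] = self.captured_at.isoformat()
        record["pallet_count"] = self.pallet_count
        return json.dumps(record, sort_keys=True)


event = DetectionEvent(
    image_key="images/example.jpg",
    boxes=((0, 0, 100, 20, 0.9), (0, 25, 100, 45, 0.85)),
    operator_id="op-1",
    location="dock-3",
    captured_at=datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
)
```

Deriving `pallet_count` from the stored boxes, instead of storing it as a separate editable field, is one way to keep the recorded count consistent with the photographic evidence used in a dispute.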
Operations managers can review aggregated counts and trends through a web dashboard. In other words, computer vision in logistics was paired with auditability, making the solution useful not only for speed, but also for accountability and traceability.
Tech stack
- Android (mobile application)
- Deep Learning (computer vision)
- Python and YOLOv5 (object detection model)
- Django (backend services and REST API)
- AWS (cloud hosting, storage, and model serving)
Output
- Delivered a production-grade Android application for one-shot pallet counting at warehouses, loading docks, and customer sites.
- Trained and deployed a custom YOLOv5 detection model with approximately one-second inference per image on standard mobile devices.
- Built a Django-based backend on AWS that automatically stores every detection event, source image, and bounding-box output for future audit and reconciliation.
- Implemented a two-step user workflow that requires no prior training for warehouse and dispatch staff.
- Enabled offline-first operation with automatic synchronization for warehouse zones with limited connectivity.
- Established a centralized inventory dashboard for operations managers to review daily counts and historical trends.
- Created a searchable audit repository to support customer dispute resolution and internal compliance reviews.
Impacts
Operational efficiency
- Truck waiting time at warehouses and customer sites reduced by 30%.
- Pallet counting time reduced from minutes of manual counting to approximately 1 second per image.
- Manual re-counts and miscount disputes effectively eliminated through verifiable photographic evidence.
- Inventory reconciliation workflows became standardized and consistent across the client’s distribution network.
Business transformation
- Drivers gained back significant non-driving time, helping the client absorb the impact of Japan’s driver overtime regulations without proportional capacity loss.
- Warehouse staff productivity increased, partially offsetting the structural driver shortage facing Japan’s logistics sector.
- The automatic audit trail strengthened the client’s compliance posture and reduced commercial friction with downstream customers.
- The mobile-first deployment model allowed the solution to scale across distribution sites without new on-site hardware investment.
- The client established a reusable AI capability and data foundation for future computer-vision use cases across their logistics operations. This is the long-term value of computer vision in logistics: not just one workflow, but a scalable base for more automation.
Closing remarks
By replacing minutes of manual pallet counting with a one-second mobile interaction, the AI-powered pallet detection system gave our client a practical, scalable response to Japan’s driver shortage and overtime caps. It turned warehouse dwell time back into delivery capacity and built a long-term AI foundation for the next wave of logistics automation.

