Faculty of Engineering, LTH


Digit@LTH: Events

Aerial View Image-Goal Localization with Reinforcement Learning

Figure 3a from John Backsund and Anton Samuelsson's Master's Thesis


From: 2022-06-10 10:15 to 11:00
Place: MH:309A
Contact: kalle [at] maths [dot] lth [dot] se

Master's Thesis Seminar

John Backsund and Anton Samuelsson present their Master's Thesis
Friday June 10 at 10:15-11:00
in MH:309A
and on Zoom

Aerial View Image-Goal Localization with Reinforcement Learning

With the increasing number and availability of unmanned aerial vehicles (UAVs) and other remote sensing devices (e.g. satellites), we have recently seen an explosion in computer vision methodologies tailored towards processing and understanding aerial view data. One application for such technologies is search-and-rescue (SAR), where the task is to localize and assist one or several people who are missing, for example after a natural disaster. In many cases the rough location may be known, and a UAV can be deployed to explore a given, confined area to precisely localize the missing people. In such a time- and resource-constrained setting, controlling the UAV in an informed and intelligent manner, as opposed to exhaustively scanning the whole area along a pre-defined trajectory, could significantly improve the likelihood of mission success.

In this master's thesis we approach this type of problem by abstracting it as an aerial view image-goal localization task within a framework that emulates a SAR-like setup without requiring access to actual UAVs. In this framework, an agent operates on top of a given satellite image and is tasked with localizing a specific goal, specified as a rectangular region within the satellite image, starting from a given location in the image. The agent is never allowed to observe the underlying satellite image in its entirety, not even at low resolution, and thus has to operate solely on sequentially observed partial glimpses when navigating towards the goal location.

To tackle our suggested aerial view image-goal localization task, we propose AiRLoc, a fully trainable reinforcement learning (RL)-based model. AiRLoc can be trained with no annotations of any kind and is hence able to learn the localization task in an entirely self-supervised manner. Extensive experimental results suggest that AiRLoc outperforms heuristic search methods as well as non-RL-based machine learning methods. The results also indicate that providing AiRLoc with mid-level vision capabilities (specifically, a pre-trained semantic segmentation network) can lead to even better performance. We also conduct a proof-of-concept study which suggests that AiRLoc, with or without semantic segmentation as input, outperforms humans on average.
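
For readers who want a concrete picture of the task setup described in the abstract, the following is a minimal sketch in Python, not the thesis implementation. It assumes the image is divided into a grid of non-overlapping square patches, that the agent moves one patch at a time in four directions, and that the goal is shown to the agent as an image patch (as the term "image-goal" suggests). The names `GlimpseEnv` and `random_search`, the grid layout, and the action set are all hypothetical choices made for illustration.

```python
import random


class GlimpseEnv:
    """Toy image-goal localization: the agent only ever sees single patches.

    Hypothetical sketch; grid layout and action set are assumptions.
    """

    # Assumed action set: up, right, down, left on the patch grid.
    MOVES = [(-1, 0), (0, 1), (1, 0), (0, -1)]

    def __init__(self, image, patch, start, goal):
        self.image = image                  # 2D list of pixel values
        self.patch = patch                  # side length of a square patch, in pixels
        self.rows = len(image) // patch     # grid height in patches
        self.cols = len(image[0]) // patch  # grid width in patches
        self.pos = start                    # (row, col) in patch coordinates
        self.goal = goal                    # (row, col) of the goal patch

    def crop(self, cell):
        """Return the pixels of one patch as a 2D list."""
        r, c = cell
        p = self.patch
        return [row[c * p:(c + 1) * p] for row in self.image[r * p:(r + 1) * p]]

    def observe(self):
        """Current glimpse and goal glimpse; the full image is never returned."""
        return self.crop(self.pos), self.crop(self.goal)

    def step(self, action):
        """Move one patch (clamped at the image border); report whether the goal is reached."""
        dr, dc = self.MOVES[action]
        r = min(max(self.pos[0] + dr, 0), self.rows - 1)
        c = min(max(self.pos[1] + dc, 0), self.cols - 1)
        self.pos = (r, c)
        return self.observe(), self.pos == self.goal


def random_search(env, max_steps, rng):
    """Heuristic baseline: uniformly random moves. Returns steps used, or None on failure."""
    if env.pos == env.goal:
        return 0
    for t in range(1, max_steps + 1):
        _, done = env.step(rng.randrange(len(env.MOVES)))
        if done:
            return t
    return None
```

In a sketch like this, a learned policy would take the place of the random action choice in `random_search`, selecting each move from the glimpses returned by `observe()`.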

Aleksis Pirinen, RISE
Kalle Åström, Matematikcentrum, Lunds Universitet

Alexandros Sopasakis, Matematikcentrum, Lunds Universitet