Lessons Learned Report: Super-Resolution for Detection Tasks in Engineering Problem-Solving
Martin Feder, Michal Horovitz, Assaf Chen, Raphael Linker, Ofer M. Shir

TL;DR
This paper evaluates the effectiveness of super-resolution algorithms in agricultural detection tasks, highlighting their limitations, potential benefits like spectral channel learning, and providing practical recommendations for deployment.
Contribution
It offers a detailed analysis of super-resolution use in agro-detection, emphasizing domain-specific challenges and proposing guidelines for effective application.
Findings
Super-resolution may not always improve detection accuracy in agricultural problems.
Algorithms can help learn missing spectral channels and synchronize them.
Limitations include domain-specific constraints and the need for tailored approaches.
Corresponding author: Ofer M. Shir ([email protected]).
Abstract
We describe the lessons learned from targeting agricultural detection problem-solving, when subject to low resolution input maps, by means of Machine Learning-based super-resolution approaches. The underlying domain is the so-called agro-detection class of problems, and the specific objective is to learn a complementary ensemble of sporadic input maps. While super-resolution algorithms are branded with the capacity to enhance various attractive features in generic photography, we argue that they must meet certain requirements, and more importantly, that their outcome does not necessarily guarantee an improvement in engineering detection problem-solving (unlike so-called aesthetics/artistic super-resolution in ImageNet-like datasets). By presenting specific data-driven case studies, we outline a set of limitations and recommendations for deploying super-resolution algorithms for agro-detection problems. Another conclusion states that super-resolution algorithms can be used for learning missing spectral channels, and that their usage may result in some desired side-effects such as channels’ synchronization.
This technical report summarizes a research project targeting the topic of super-resolution. It is structured as follows: Section 1 introduces the challenge and provides the necessary background. Section 2 presents the approach taken and, in particular, describes Super-Resolution. Section 3 describes our experimental system, and Section 4 reports on our practical observations when applying our approach. Finally, Section 5 summarizes the report and lists a set of recommendations for the practitioner.
1 Background: Learning a complementary ensemble of sporadic input maps
1.1 The challenge and motivation
Recent developments in Artificial Intelligence (AI), combined with the advent of modern sensing technologies, now allow effective automated identification of detailed real-world features. Such identification has the potential to enable sophisticated machine-driven operations, e.g., autonomous vehicle control or agricultural field management, to mention a few. Indeed, the broad domain of Precision Agriculture (PA) requires accurate real-time feature detection capabilities in the field, particularly considering the recent climatic trends, the growing resistance to pesticides, and other challenges. The core idea behind PA is the analytical consideration of the spatial variability in the field. In practice, PA translates into a farming management concept wherein specific measures are applied in order to optimize crops’ input and output, i.e., maximizing crop yield while minimizing costs and environmental impact. Toward this end, real-time data on soil, weather, crop maturity and stress conditions are collected in order to measure inter- and intra-field variability, and to respond accordingly. Data processing is usually done via a dedicated decision support system, which is designed to deal with massive amounts of data and to assist in an informed decision-making process, both for the immediate real-time short term and for the long term.
A particular setup of interest in PA concerns spectral reflectance in the visible (VIS), near-infrared (NIR), and thermal ranges, obtained on demand at very high spatial resolution with unmanned aerial vehicles (UAVs) and drones, combined with satellite imagery (e.g., Landsat, Sentinel-2, and Venµs) acquired at lower spatial resolutions, yet in a scheduled (permanent) manner. Although the spatial resolution of the aforementioned satellites is considered low when compared to sensors mounted on UAVs and drones (3-30 meters vs. 0.5-10 cm), they provide an almost daily temporal resolution. In contrast, sensors mounted on UAVs and drones are operated only on an on-demand basis. The inherent trade-off between spatial and temporal sensing resolutions is projected at some level onto the AI’s learnability (where the focus here is on classification problems). Scheduled satellite sensors provide low-grade information which, when processed into training data, is likely to enable only a “weak” learner, and yet they constitute a persistent source of information. Sensors mounted on UAVs and drones provide high-quality data, which carries the potential to enable the training of “strong” learners, but they lack availability. The main aim of this project is therefore to address this trade-off by composition, i.e., to be able to precisely classify prescribed states with the aid of multiple sensors, even when data are missing or insufficient for accurate identification by independent training. Since satellite imagery is broadly accessible, success in this proposed process will save time and resources. In PA, for instance, it will enable cheap scans of vast areas to locate heterogeneity at the field level, and pinpoint suspected locations to be further explored by on-demand sensing.
1.2 Novelty
Our research suggests an AI approach to handling a multiplicity of input maps in order to accomplish successful feature identification when learning from individual input maps is inaccurate. The novelty lies in the consideration of the input space as an ensemble of sporadic maps, which are to be collectively treated to enable image recognition that was heretofore viewed as either irrelevant or too complex to address. The core treatment idea is either to learn each map individually and then Bayesian-infer a single hypothesis, or to fuse the ensemble and learn a single (aggregated) hypothesis. We capitalized on advanced Machine Learning (ML) algorithms to implement our proposed approaches and to validate them on remotely-sensed multi-sensor maps within PA. Overall, combining scheduled and on-demand maps, and utilizing scheduled sensors aided by on-demand sensors, are innovative approaches for pragmatic problem-solving in PA, introducing cost efficiency and outstanding merit. Effectively merging numerous sensors of various resolutions, bands, and costs would help to fully exploit each and every one of them in order to overcome challenging learning problems in PA.
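As a minimal sketch of the first option (learn each map individually, then infer a single hypothesis), the snippet below fuses per-map class posteriors under a naive conditional-independence assumption. The function name `fuse_posteriors`, the uniform prior, the independence assumption and the example probabilities are all illustrative assumptions rather than details taken from this report.

```python
import math

def fuse_posteriors(posteriors, prior=None):
    """Fuse per-map class posteriors into a single posterior.

    Assumption (illustrative only): maps are conditionally independent
    given the class, so P(c | all maps) is proportional to
    prior(c) * prod_i [P(c | map_i) / prior(c)].
    Maps in a void state are passed as None and skipped.
    """
    usable = [p for p in posteriors if p is not None]
    if not usable:
        raise ValueError("no usable maps to fuse")
    n_classes = len(usable[0])
    if prior is None:
        prior = [1.0 / n_classes] * n_classes  # uniform prior (assumption)
    log_scores = []
    for c in range(n_classes):
        score = math.log(prior[c])
        for p in usable:
            # Clamp to avoid log(0) for over-confident classifiers.
            score += math.log(max(p[c], 1e-12)) - math.log(prior[c])
        log_scores.append(score)
    # Normalise in a numerically stable way (log-sum-exp).
    top = max(log_scores)
    weights = [math.exp(s - top) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]

# Two weak but agreeing maps, plus one map that was never acquired.
fused = fuse_posteriors([[0.6, 0.4], None, [0.7, 0.3]])
```

In this toy case two weak learners that agree yield a fused posterior that is more confident than either one alone, which is the intuition behind inferring a single hypothesis from the ensemble.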
1.3 State-of-the-art
Machine learning
AI by means of ML is rooted in indeterminate hypotheses concerning a problem-space. The process of turning a hypothesis into a determinate one using a solved problem-space is termed supervised training. Successful ML training yields an iteratively-updated model that is able to correctly predict a solution for an unseen problem-instance. Real-world problem-spaces, e.g., the agricultural industry, tend to be complex, due to the sheer volume of factors that need addressing. Image recognition is an important type of ML problem of particular interest for the agricultural industry. In recent years, Deep Neural Networks (DNNs) have become the leading method for visual recognition, since they yield better results than previous approaches, such as statistical methods, other ML algorithms, and image processing techniques [RDS+15]. Prior work concerning image recognition in PA does exist, including work involving DNNs using UAVs [DAR+18, DWHC+17, HPL17, MHM16, MOSD12, SCP+18, STT+16]. The most relevant ML development with respect to the current challenge is the topic of Super-Resolution (SR), to be presented in Section 2.1.
Remote sensing
Remote sensing for crop management aims at providing spatial and spectral information for crop classification, crop condition, yield forecast, and weed/disease detection and management. Current satellite-based remotely sensed products can cover large areas, but they are limited by both their temporal resolution (revisit times of 2 and 5 days for the Venµs and Sentinel-2 satellites, respectively) and their spatial resolution (pixel sizes of 5 and 10 m for Venµs and Sentinel-2, respectively) when compared to a UAV or a drone. One of the challenges of satellite imaging is dealing with pixels that contain multiple objects with different spectral signatures (e.g., plants and soil). Such pixels are called mixed pixels. Images acquired by UAVs and drones, which have a much higher spatial resolution, contain many more pure (as opposed to mixed) pixels, which makes vegetation detection and differentiation much easier. Similarly, high spatial resolution allows for a precise estimation of the vegetation cover fraction. Besides their spatial and temporal resolution, remote sensing platforms differ in terms of the sensors they carry. To date, the measurements most commonly used in applications related to agriculture consist of passive measurements in the VIS (~380-740 nm), NIR (~750 nm-1.4 µm) and thermal (~8-15 µm) ranges. Regardless of the spectral range, the sensor is either broadband, multi-spectral (i.e., a small number of relatively wide bands) or hyper-spectral (i.e., a typically high number of narrow bands). Notably, most sensing platforms carried by satellites include only a few, relatively wide, spectral bands, which obviously cannot be changed according to the users’ needs. By comparison, UAV and drone platforms are much more flexible and, if used on demand for a specific need as suggested in the present work, the sensing platform carried by the UAV or the drone can be selected to best fit the monitoring needs.
2 Approach
2.1 Super-Resolution (SR) Algorithms
The capacity of learning algorithms to induce high-resolution imagery from lower-resolution inputs, based upon pre-training, is referred to as super-resolution (SR) [SVE19]. Various algorithms and approaches have been devised (see, e.g., [DPW18, TNNP19, KMC19, MVFM19]). While SR algorithms are branded with the capacity to enhance various attractive features in generic photography, we argue that they must meet certain requirements when applied to engineering detection challenges (unlike aesthetics or artistic challenges). Most importantly, problem-solving of “Aesthetic” versus “Detection/Engineering” SR-tasks differs already at the process definition, especially at the input-output levels. When SR is applied to an Aesthetic task, it performs the primary operation of transforming a degraded input image into an enhanced, visually appealing image, which constitutes the output. SR is then explicit. However, when SR is applied within a learning pipeline, its role is reduced to performing an auxiliary operation of enhancing data instances that undergo training/testing. The output is the classification response of the learning model. SR may then be considered implicit. The comparison between the two tasks is illustrated by means of process diagrams in Figure 1.
Consequently, SR models that were successfully trained on ImageNet-like datasets are not necessarily potent for engineering detection problem-solving. Altogether, we argue that there cannot exist a generic SR algorithm with the capability to enhance any given input of an arbitrary sort (analogous, to some degree, to the No Free Lunch set of theorems [WM97]). In other words, the capacity to enhance low-resolution imagery is feature-dependent, and clearly requires suitable pre-training that relies on instances of the same domain. Furthermore, additional requirements exist, especially concerning the available spatial resolution versus the scale of the targeted features. One of the primary goals of this paper is to outline a set of recommendations concerning the feasibility and infeasibility of SR usage.
2.2 Formulation
Given a multi-source remotely-sensed set of input maps, the primary objective is to devise an automated procedure for classification of predefined states that achieves successful learning. All input maps are assumed to cover a specific area of interest, e.g., an agricultural field, whose targeted states are well-defined (for instance, a binary classification problem with a healthy crop state versus a diseased state). Given a classification problem, the system encompasses at most $N$ remote sensors numbered from $1$ to $N$, where the first $n$ sensors are scheduled sensors and the remaining $N-n$ sensors are on-demand sensors, $1 \le n < N$. The underlying classification problem is assumed to be static in the sense that each individual map carries information to infer or refute the targeted states (unlike dynamic problems that require a time-series of maps to deduce the state).
Explicitly, we denote by $\mathcal{M}_i$ the set of all input maps of sensor $i$. Then, all input maps constitute a disjoint union of two subsets, the set of all the scheduled maps
$$\mathcal{S} = \bigsqcup_{i=1}^{n} \mathcal{M}_i$$
and the set of all the on-demand maps
$$\mathcal{D} = \bigsqcup_{i=n+1}^{N} \mathcal{M}_i .$$
When no sensing takes place, the related map will be in a void state.
2.2.1 Assumptions
- A1: For practical learning purposes, each map may be segmented into an effective feature space of dimension $d$.
- A2: We assume that each set $\mathcal{M}_i$ (the input maps of sensor $i$) is learnable by a classifier with an expected error rate of $\epsilon_i$, when averaged over all input maps within the set (formally, this assumption translates into PAC-learnability [SSBD14] using sample size $m_i$, with pragmatic values, distinguishable from a random learning hypothesis). Note that this learnability perspective focuses on the existence of a learning function, while assuming the existence of a potent learning algorithm.
- A3: We assume that learning of the scheduled sets inevitably leads to high error rates (due to their low spatial resolution). Since straightforward boosting techniques fail, on-demand sensing and the learning of the acquired sets are much needed for sufficiently accurate classification.
- A4: On-demand sensing is hard and expensive to perform, while the scheduled sets are easily accessible. Hence, combining these two classes of map sets is necessary.
2.2.2 Concept Outline
Collecting data over a dedicated period, e.g., a single agricultural season, may serve to construct a training set comprising all types of maps. Tagging (labeling) of the targeted states is assumed to be applied to the dataset. A consecutive period may serve as the testing phase. The training and the testing are applied in the following manner. At first, the learning data is limited to the “scheduled maps” only (simulating a scenario where only scheduled sensors are available). Upon successful training, the learner is examined at classifying the targeted states with “scheduled maps” as the testing data. If the classification output is provided with high accuracy (with respect to the tagged states of the dataset), no action is taken. Otherwise, i.e., when the accuracy falls below some threshold, “on-demand maps” are to be used, according to some selection criterion (simulating a scenario where on-demand sensors are deployed), to play a role in both the training and the testing parts.
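The outline above can be sketched as a short control flow. This is only an illustration under stated assumptions: the one-dimensional threshold learner, the toy samples, the 0.9 accuracy threshold and the function names are hypothetical, and the selection criterion for on-demand maps is reduced to "use all of them".

```python
def train_threshold(samples):
    """Toy 1-D learner: decision threshold halfway between the class means.
    Each sample is a (feature, label) pair with label 0 or 1."""
    mean = {}
    for label in (0, 1):
        values = [x for x, y in samples if y == label]
        mean[label] = sum(values) / len(values)
    return (mean[0] + mean[1]) / 2.0

def accuracy(threshold, samples):
    """Fraction of samples classified correctly (predict 1 above threshold)."""
    hits = sum(1 for x, y in samples if (1 if x > threshold else 0) == y)
    return hits / len(samples)

def run_outline(sched_train, sched_test, od_train, od_test, min_accuracy=0.9):
    """Train and test on scheduled maps; fall back to on-demand maps if needed."""
    model = train_threshold(sched_train)
    acc = accuracy(model, sched_test)
    if acc >= min_accuracy:
        return model, acc, False  # scheduled maps suffice: no action taken
    # Below the threshold: on-demand maps join both training and testing.
    model = train_threshold(sched_train + od_train)
    acc = accuracy(model, sched_test + od_test)
    return model, acc, True

# Hypothetical data: scheduled features overlap, on-demand features separate well.
sched_train = [(0.4, 0), (0.6, 0), (0.5, 1), (0.7, 1)]
sched_test = [(0.6, 0), (0.5, 1), (0.45, 0), (0.7, 1)]
od_train = [(0.1, 0), (0.2, 0), (0.9, 1), (0.8, 1)]
od_test = [(0.15, 0), (0.85, 1)]

model, acc, used_on_demand = run_outline(sched_train, sched_test, od_train, od_test)
```

With these toy values the scheduled-only learner stays below the threshold, so the on-demand samples are pulled in; the remaining errors all come from the overlapping scheduled samples.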
2.2.3 Algorithm
The main research question is how to effectively learn such input maps as a classification problem given targeted states and by employing a sensible number of PAC-learning classifiers.
The plan is to treat this challenge by obtaining an SR-learner for the scheduled maps using the acquired on-demand maps. The obtained SR-learner is denoted as $\Phi$. Formally, the application of $\Phi$ to an input map $\mathcal{M}_i$ induces another map (or, possibly, a set of maps, since $\mathcal{M}_i$ is defined as a set of maps by itself), which is denoted by $\Phi(\mathcal{M}_i)$. Altogether, the proposed pipeline is the following (Algorithm 1), where its termination criterion is set with respect to the error rates of the low-/high-resolution and SR classifiers. (Footnote: The error rates’ notation is rooted in $\ell$ for low-resolution and $h$ for high-resolution, but the careful reader should not be confused by the fact that $\epsilon_\ell > \epsilon_h$.)
3 Experimental System: Irrigation Monitoring
Precision irrigation can be defined as matching water application to the crop needs, which are rarely uniform in space, time and amount. One of the main sources of crop growth variability in semi-arid climates is the lack of irrigation uniformity, which can be due to bad design of the irrigation system or to operational failures (leaks, clogging). Awareness of the extent and severity of possible uniformity problems requires spatial and temporal crop water status information at sufficient resolution and accuracy, attainable mainly by aerial sensing and imagery [ENAEB11, HLWA+15, ZCSJ11]. New enabling technologies for precision irrigation information acquisition became affordable and popular with the appearance on the open market of low-cost, high-performance small UAVs and drones carrying high-resolution digital cameras, as well as fast-revisiting, low-cost satellite services in the VIS (RGB), NIR and long-wave (thermal) infrared ranges. Most irrigation systems, fixed and especially mobile, suffer from a lack of uniformity in terms of water coverage. Mapping the crop water status in space and time is crucial for variable rate irrigation (VRI) application and for adapting irrigation according to the specific crop water requirements. Crop thermal imaging via remote sensing technology enables mapping the state of water in the field. Spectral vegetation indices (VI) built from combinations of channels at various wavelengths allow for better information extraction from remotely sensed data because they reduce the effects of soil, view angle and topography, while enhancing the visibility of the vegetation [HJDM+13]. The objective of this experimental system’s research was to test the efficiency of ML models in detecting irrigation variability of lateral-move and pivot irrigation machines, and in recognizing malfunctions in some cases.
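As a concrete instance of a vegetation index built from channel combinations, the sketch below computes the standard Normalized Difference Vegetation Index, NDVI = (NIR - Red) / (NIR + Red), which also serves as one of the SR targets in Section 4.1. The function name, the list-of-rows image layout and the reflectance values are illustrative assumptions.

```python
def ndvi(nir, red):
    """Per-pixel NDVI for two co-registered bands given as lists of rows.

    Pixels where both bands are zero are mapped to 0.0 to avoid a
    division by zero (a convention chosen for this sketch).
    """
    result = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            denom = n + r
            row.append((n - r) / denom if denom != 0 else 0.0)
        result.append(row)
    return result

# Hypothetical reflectances: a vegetated pixel, a bare-soil-like pixel,
# and an empty (no-data) pixel.
nir_band = [[0.5, 0.3, 0.0]]
red_band = [[0.1, 0.3, 0.0]]
index = ndvi(nir_band, red_band)
```

Because the index is a ratio of band differences to band sums, it is only meaningful when the two bands are spatially aligned, which is why band misalignment matters in the sections that follow.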
In what follows, we describe preliminary computational tasks that we addressed with respect to given irrigation monitoring datasets. The following section will discuss the experimental observations concerning the application of SR algorithms.
3.1 Insights and Limitations
Since the Venµs satellite resolution was believed to be insufficient for classifying irrigation features, we down-scaled images taken by drones to synthesize (“simulate”) lower-resolution satellite imagery. The idea was to determine the resolution at which the irrigation signal can still be detected. Since the desired substitute imagery was a mosaic of multi-spectral images, and due to the spatial misalignment across bands, the best attainable resolution was constrained from the outset.
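A minimal sketch of such a down-scaling step follows, assuming plain block averaging by an integer factor; the report does not specify the pooling operator, so the averaging choice, the function name and the toy image are assumptions.

```python
def downscale(image, factor):
    """Block-average a 2-D image (list of rows) by an integer factor."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [image[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

# Toy "drone" image: the upper half has a 2-pixel-wide pattern,
# the lower half has a 1-pixel-wide pattern.
drone = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
]
coarse = downscale(drone, 2)   # [[0.0, 1.0], [0.5, 0.5]]
coarsest = downscale(drone, 4)  # [[0.5]]
```

At factor 2 the wide pattern survives while the narrow one collapses into uniform mixed pixels, and at factor 4 both disappear; this is the kind of resolution limit the down-scaling experiment was meant to locate.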
3.2 Irrigation Learnability
Here we aimed at the classification of various irrigation policies in Neve Yaar through statistical and classical machine learning methods. Description: Use irrigation regimens to establish the separability of the irrigation areas using different sensory inputs.
Sensor imagery and index-based imagery were explored across the dates, to investigate whether the problem of separating the plots by irrigation regimen was learnable in each. Moreover, each image was downscaled in order to also estimate its learnability at lower resolutions. The thermal statistics of the dataset are shown in Figure 6, depicting the field segments on which an irrigation policy was applied (bottom row) and their associated thermal histograms (upper row).
4 Application of SR
In what follows, we outline the concrete SR-related tasks that we addressed with respect to given irrigation monitoring datasets.
4.1 Baseline Deployment
Description: Use the Neve Yaar multi-spectral data as a proxy for up-scaling from the Venµs multi-spectral data.
Since the goal was to enhance readily available Venµs satellite imagery with on-demand high-resolution mosaics, the acquired imagery (satellite and drone mosaics from 4 dates) was aligned against an irrigation map, georeferenced to rectify alignment imprecision, trimmed to a bounding box around the irrigation map, and resized to a common base resolution of 2.5 cm per pixel (through down-scale pooling or up-scaling by linear extrapolation), resulting in a 3D-matrix image-set. We trained models with different input-output pairings using different channels from the same date. Since the objective was to produce a model that could enhance an image, we sought to understand the circumstances under which a model could reliably add information to an input. SR training was accomplished by reducing the resolution of the input space for any given analysis, using downscaling factors of 2, 4 and 8. The SR network comprised up to 10 sequential convolution and deconvolution layers, with at most 300 thousand parameters per layer. Results were visually examined and assessed using image similarity metrics. We started with an input space that contained all of the channels, with specific channels as the target, and proceeded to exclude input bands. The multi-spectral imagery, reduced to lower resolution, was first trained against an RGB target, and the targets were then switched between RGB, thermal imagery and NDVI.
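To make the training setup concrete, the sketch below builds low-/high-resolution pairs at the three downscaling factors and scores a naive nearest-neighbour up-sampling baseline with PSNR. It is not the SR network described above; the choice of PSNR as the similarity metric, the block-mean downscaling, the synthetic checkerboard target and all function names are assumptions made for illustration.

```python
import math

def block_mean(image, factor):
    """Downscale a 2-D image (list of rows) by averaging factor x factor blocks."""
    size = len(image)
    return [[sum(image[y][x]
                 for y in range(top, top + factor)
                 for x in range(left, left + factor)) / factor ** 2
             for left in range(0, size, factor)]
            for top in range(0, size, factor)]

def upsample_nearest(image, factor):
    """Naive baseline: repeat every pixel factor times along both axes."""
    rows = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        rows.extend([list(wide) for _ in range(factor)])
    return rows

def psnr(estimate, target, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    count = len(target) * len(target[0])
    mse = sum((e - t) ** 2
              for e_row, t_row in zip(estimate, target)
              for e, t in zip(e_row, t_row)) / count
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

# Synthetic 16x16 high-resolution target: checkerboard with 4-pixel squares.
target = [[(x // 4 + y // 4) % 2 for x in range(16)] for y in range(16)]

scores = {}
for factor in (2, 4, 8):
    low_res = block_mean(target, factor)  # simulated low-resolution input
    scores[factor] = psnr(upsample_nearest(low_res, factor), target)
```

In this synthetic case the 4-pixel features are perfectly recoverable at factors 2 and 4, but at factor 8 every low-resolution pixel becomes a uniform mixture and the score drops to about 6 dB, a reminder that no model can add back features smaller than the input pixel without prior knowledge of the domain.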
4.2 Regime Identification
Description: Use the Neve Yaar multispectral data to classify patches by irrigation regimen by means of SR.
The previous step (Section 4.1) was meant to establish a baseline SR. The added step was meant to try and classify the post-SR image (whether it was approximated to be multispectral, thermal or NDVI) with the irrigation regimen as the target. We first attempted this both directly, through the irrigation plot labeling, but also through
Results: The algorithm failed to converge on the test split at the original resolution, and likewise at the downscaled resolutions.
4.3 Irrigation segments using Goshrim
Description: Use Goshrim to test for the ability to identify localized irrigation patterns.
The Neve Yaar plot was a planned experiment of different irrigation regimens. That said, since the plot assignment was of large areas, the difference in application would not necessarily have been locally homogeneous. In contrast, though the Goshrim plot was not experimentally planned for differential irrigation, an irrigation malfunction resulted in irrigation patterns. Since the irrigation patterns were clearly observable and differentiable, it was expected that the patterning would be more easily detectable than in the dedicated Neve Yaar irrigation plots. At the same time, there was no definition of irrigation policies here. Hence, a pseudo-irrigation binary classification problem was approximated by first applying a temperature threshold to the thermal imagery and then manually adjusting the output map wherever consistency could be assessed and corrected visually.
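The two-stage labeling just described can be sketched as a threshold followed by manual overrides. The threshold value, the temperatures, the override dictionary and the convention that label 1 marks the warmer pixels are all hypothetical choices for this illustration.

```python
def pseudo_labels(thermal, threshold, overrides=None):
    """Binary pseudo-labels from a thermal map (list of rows).

    Stage 1: label 1 where the temperature exceeds the threshold, else 0.
    Stage 2: apply manual corrections given as {(row, col): label}.
    """
    labels = [[1 if value > threshold else 0 for value in row]
              for row in thermal]
    for (row, col), label in (overrides or {}).items():
        labels[row][col] = label
    return labels

# Hypothetical canopy temperatures in degrees Celsius.
thermal_map = [
    [24.0, 31.5],
    [29.0, 33.0],
]
automatic = pseudo_labels(thermal_map, threshold=30.0)
corrected = pseudo_labels(thermal_map, threshold=30.0, overrides={(1, 0): 1})
```

Keeping the overrides separate from the thresholding step makes the manual adjustments explicit and repeatable if the threshold is later changed.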
4.4 SR using Venµs alone
Description: The Venµs satellite takes raw images at a resolution of 5 m. A dedicated algorithm is then applied to create an undistorted image. The algorithm relies on aligning several images from proximal dates. Alternatively, the algorithm requires reducing the raw image to a lower resolution.
The choice of dataset and problem was meant to circumvent the spatial and temporal limitations (limited plot size, time repetitions) inherent in Neve Yaar by working directly with Venµs satellite data. The task was to replicate the algorithm that corrects the raw image (5 m), and possibly improve upon it by not requiring multiple images. The goal was to investigate whether we can correct the original image (5 m) for inference without reducing the resolution or requiring multiple images.
The resulting metrics suggest that the model output is closer to the corrected image than the raw image is. Visually, however, the model has not converged, irrespective of the metrics.
5 Summary and Recommendations
The Neve Yaar dataset was not learnable at the high-resolution level: evidently, the classification of the irrigation policies was not generalizable, which rendered the proposed pipeline (Algorithm 1) unfit for this task. Moreover, an attempt to obtain data fusion of maps taken across different dates was unsuccessful due to lack of normalization. Following that, we conducted several computational tasks that focused on the challenge of SR learning.
Next, we describe our concluding checklist for the data scientist who wishes to address a Detection/Engineering SR problem.
5.1 A Concluding Checklist
1. Assess the feature size (linked to the targeted state) versus the pixel size at the scheduled maps.
2. Assess the resolution gaps (e.g., satellites versus drones) – is the factor reasonable?
3. Sensory noise (with respect to a calibrated “white screen”): Assess the signal-to-noise ratio.
4. Spatial noise (due to geo-referencing): Assess the registration error (minor shifts can alter tagging).
5. Address this question: Is it possible to fuse datasets across multiple dates (with respect to normalization and/or noise)?
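The geometric items of the checklist reduce to a few ratios, sketched below. The 5 m and 2.5 cm pixel sizes are the values quoted in this report for the satellite imagery and the common base resolution; the 1 m feature size, the 10 cm registration error and the function name are hypothetical placeholders to be replaced with values from the problem at hand.

```python
def checklist_ratios(feature_m, scheduled_pixel_m, on_demand_pixel_m,
                     registration_error_m):
    """Quick geometric checks before attempting Detection/Engineering SR."""
    return {
        # Item 1: how many scheduled-map pixels span one targeted feature?
        "pixels_per_feature": feature_m / scheduled_pixel_m,
        # Item 2: linear up-scaling factor the SR model would have to bridge.
        "resolution_gap": scheduled_pixel_m / on_demand_pixel_m,
        # Item 4: registration error expressed in on-demand pixels.
        "registration_shift_px": registration_error_m / on_demand_pixel_m,
    }

ratios = checklist_ratios(
    feature_m=1.0,              # hypothetical width of the targeted feature
    scheduled_pixel_m=5.0,      # satellite pixel size quoted in the report
    on_demand_pixel_m=0.025,    # base resolution of 2.5 cm per pixel
    registration_error_m=0.10,  # hypothetical geo-referencing error
)
```

With these numbers the feature covers a fifth of a scheduled pixel and the linear gap is a factor of 200, far beyond the factors of 2, 4 and 8 used in the experiments above; a result of this kind is a prompt to reconsider the setup before investing in SR training.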
Acknowledgments
This research was supported by the Ministry of Science and Technology, Israel.
The authors thank B. Yazmir for his contributions that ignited the SR line of work and for conducting a comprehensive literature survey in that direction.
References
- [DAR+18] J. Duarte-Carvajalino, D. Alzate, A. Ramirez, J. Santa-Sepulveda, A. Fajardo-Rojas, and M. Soto-Suárez, Evaluating late blight severity in potato crops using unmanned aerial vehicles and machine learning algorithms, Remote Sensing 10 (2018), 1513.
- [DPW18] Jonathan P. Dash, Grant D. Pearse, and Michael S. Watt, UAV multispectral imagery can complement satellite data for monitoring forest health, Remote Sensing 10 (2018), no. 8.
- [DWHC+17] Chad DeChant, Tyr Wiesner-Hanks, Siyuan Chen, Ethan L. Stewart, Jason Yosinski, Michael A. Gore, Rebecca J. Nelson, and Hod Lipson, Automated identification of northern leaf blight-infected maize plants from field imagery using deep learning, Phytopathology 107 (2017), no. 11, 1426–1432.
- [ENAEB11] A. H. El Nahry, R. R. Ali, and A. A. El Baroudy, An approach for precision farming under pivot irrigation system using remote sensing and GIS techniques, Agricultural Water Management 98 (2011), no. 4, 517–531.
- [HJDM+13] E. Raymond Hunt Jr., Paul C. Doraiswamy, James E. McMurtrey, Craig S. T. Daughtry, Eileen M. Perry, and Bakhyt Akhmedov, A visible band index for remote sensing leaf chlorophyll content at the canopy scale, International Journal of Applied Earth Observation and Geoinformation 21 (2013), 103–112.
- [HLWA+15] Amir Haghverdi, Brian G. Leib, Robert A. Washington-Allen, Paul D. Ayers, and Michael J. Buschermohle, Perspectives on delineating management zones for variable rate irrigation, Computers and Electronics in Agriculture 117 (2015), 154–167.
- [HPL17] Zhongling Huang, Zongxu Pan, and Bin Lei, Transfer learning with deep convolutional neural network for SAR target classification with limited labeled data, Remote Sensing 9 (2017), no. 12, 907.
- [KMC19] Aleem Khaliq, Vittorio Mazzia, and Marcello Chiaberge, Refining satellite imagery by using UAV imagery for vineyard environment: A CNN based approach, 2019 IEEE International Workshop on Metrology for Agriculture and Forestry (MetroAgriFor), 2019, pp. 25–29.
