Received: October 09, 2023; Published: October 20, 2023
Corresponding author: Igor Agbossou, University of Franche-Comté, ThéMA UMR 6049 Laboratory, IUT NFC, Belfort, France
Urbanization is a global phenomenon, with more than half of the world’s population residing in urban areas. Sustainable development and efficient management of cities require a comprehensive understanding of urban environments, including accurate assessments of urban land cover (ULC) and land use. In this context, modern technologies for capturing multi-source, high-resolution, heterogeneous urban data, together with machine learning techniques, have emerged as powerful tools for urban diagnostics. This paper presents a novel approach to urban land cover assessment (ULCA) using deep learning methods. We leverage high-resolution urban data collection sources and state-of-the-art convolutional neural networks (CNNs) to extract rich features from urban landscapes. Unlike traditional methods that rely on handcrafted features, our approach automatically learns discriminative representations, allowing for more accurate and adaptable land cover classification. Our experiments indicate that deep learning algorithms improve the accuracy and timeliness of ULCA metrics compared to traditional approaches.
Keywords: Urban diagnostics; land cover assessment; remote sensing; deep learning; sustainable urban planning
The global landscape is undergoing an unprecedented transformation characterized by rapid urbanization. According to the United Nations, over half of the world’s population now resides in urban areas, a trend projected to continue in the coming decades. Urbanization brings both opportunities and challenges, demanding innovative solutions for sustainable urban development. Central to this endeavor is the need for accurate and comprehensive urban diagnostics [1-3], particularly in the assessment of land cover and land use, which constitutes a major challenge [4,5]. Urbanization, often seen as a symbol of progress, has led to significant economic growth and improved living standards for millions. However, it also brings forth complex challenges. The sprawling expansion of cities, coupled with population growth, places immense pressure on urban infrastructure, resources, and the environment. This necessitates a deep understanding of urban landscapes to address issues related to urban planning, environmental management, disaster mitigation, and the overall well-being of urban populations. At the heart of this understanding lies the ability to assess land cover, which encompasses the physical and functional characteristics of urban areas. Accurate land cover assessment is essential for various applications, including urban planning [6-8], transportation management, green space conservation, and disaster resilience [11-13]. Traditional methods for land cover assessment, relying on manual interpretation or supervised classification of remote sensing data, have limitations in terms of scalability and accuracy, especially in the context of rapidly changing urban environments. Urban diagnostics, encompassing land cover assessment, is indispensable for informed decision-making in urban planning and development.
It provides insights into the composition of urban areas, helping urban planners allocate resources effectively, optimize infrastructure, and mitigate environmental impacts. The ability to monitor land cover changes over time is crucial for adaptive urban governance, enabling cities to respond proactively to evolving challenges.
Moreover, land cover assessment plays a pivotal role in addressing global concerns such as climate change [15,16], biodiversity conservation, and disaster risk reduction [17,18]. The accurate identification of land cover types allows for the monitoring of urban heat islands, the preservation of green spaces, and the management of flood-prone areas. In recent years, deep learning, a subfield of machine learning inspired by neural networks, has shown remarkable promise in various domains, including computer vision, natural language processing, and healthcare. In urban diagnostics and land cover assessment, deep learning techniques have gained significant traction [20-22]. Convolutional Neural Networks (CNNs) have demonstrated exceptional performance in image analysis tasks [23-25]. Deep learning offers a transformative approach to land cover assessment by automatically learning intricate features from high-resolution satellite and aerial imagery. Unlike traditional methods that rely on hand-crafted features, deep learning models can adapt to the complexity and heterogeneity of urban landscapes.
The primary purpose of this research is to elucidate the potential of deep learning in advancing urban diagnostics through feature-based land cover assessment. To achieve this objective, the following research questions will be addressed:
a) What are the primary urban diagnostic metrics that can be effectively measured and predicted through the application of spatiotemporal forecasting algorithms?
b) What specific data sources and validation techniques must be employed to ensure the precision and dependability of the forecasting outcomes in the context of feature-based land cover assessment?
c) How can deep learning algorithms be effectively harnessed to capture and comprehend the intricate spatiotemporal dynamics inherent to urban systems, particularly in the domain of land cover assessment?
d) What are the practical applications and insights derived from implementing the proposed methodology within the framework of a real-world case study focused on a medium-sized city’s urban diagnostics and land cover assessment?
We present a comprehensive framework that leverages deep learning techniques to extract meaningful features from urban imagery, enabling accurate land cover classification. The paper is structured as follows: in Section 2, we provide a thorough review of the existing literature on urban diagnostics and land cover assessment, highlighting the limitations of traditional methods and the emergence of deep learning. Section 3 delves into the methodology employed in our approach, encompassing data acquisition, preprocessing, the architecture of deep learning models, hyperparameter tuning, and data augmentation. Section 4 presents the results of our experiments, showcasing the effectiveness of deep learning in ULC classification. We employ various evaluation metrics and visualizations to validate our approach. Section 5 concludes this paper and outlines potential future research directions.
Urbanization, characterized by the rapid growth of cities and the concentration of populations in urban areas, is a defining global phenomenon of our time and constitutes a key factor in the evolution of urban landscapes. As of 2021, more than half of the world’s population resided in urban environments, a figure projected to continue its upward trajectory [28,29]. This rapid urban expansion has led to significant economic development and opportunities [30,31], but it has also posed substantial challenges, necessitating a comprehensive understanding of urban environments through the lens of urban diagnostics, which is the essence of informed urban development. Urban diagnostics is a broad assessment of issues and opportunities in the city that is vital to understanding its needs and how the city can move toward achieving comfortable livability. The assessment also identifies areas or sectors where investments can be made so that development work is not haphazard.
It encompasses a broad spectrum of activities aimed at systematically assessing, analyzing, and monitoring urban areas. At its core, this field seeks to provide critical insights into the dynamics of urbanization, including land use, land cover, infrastructure, demographics, and environmental conditions. In the realm of urban planning, the assessment of ULC plays a pivotal role. It constitutes the foundational knowledge upon which urban planners base their decisions, thereby shaping the development and sustainability of cities. The importance of this assessment is underscored by its influence on various critical urban planning aspects. Firstly, understanding the distribution and composition of land cover and land use is instrumental in the delineation of zoning areas within urban landscapes. Zoning, a fundamental concept in urban planning, designates specific areas for residential, commercial, industrial, and recreational purposes. Accurate land cover and land use assessment informs the establishment of these zones, ensuring the rational allocation of urban space in accordance with community needs and development goals.
Secondly, the data derived from such assessments directly informs resource allocation strategies. Efficient allocation of resources, such as infrastructure investments, public services, and utilities, relies on a precise understanding of land use patterns. For instance, areas with predominantly residential land use may require different resource allocations compared to industrial or commercial zones. Ensuring optimal resource distribution is essential for both economic efficiency and the quality of urban life. Moreover, transportation planning is intricately tied to ULCA. Knowledge of where residential, commercial, and industrial areas are concentrated informs decisions regarding transportation infrastructure, road networks, public transit routes, and accessibility. Accurate land use assessments are, therefore, critical for developing sustainable and efficient transportation systems within urban areas. In the realm of urban development, the assessment of urban infrastructure assumes a position of paramount importance. It serves as a foundational pillar upon which the efficiency, sustainability, and functionality of cities rest.
This assertion is substantiated by an array of scientific evidence and established principles in urban planning and management. The efficient provision of utilities, such as water supply and wastewater management, is essential for minimizing resource wastage and pollution. Sustainable infrastructure practices reduce the ecological footprint of urban areas and contribute to environmental resilience. Equally significant is the role of infrastructure assessment in ensuring social equity and inclusivity within cities. Inadequate infrastructure, particularly in marginalized neighborhoods, can lead to disparities in access to basic services and opportunities. Hence, infrastructure assessment is pivotal in identifying and addressing disparities, fostering social cohesion, and promoting equitable urban development.
Additionally, demographic data significantly aids in the provision of services within urban areas. Urban services, including healthcare, education, housing, and transportation, are closely tied to the composition and size of the population. Accurate demographic data enables the efficient allocation of resources and the design of services that cater to the unique requirements of different demographic groups. This understanding allows for the identification of long-term demographic trends, which are invaluable for long-range urban planning [36,37]. Lastly, environmental conservation efforts are profoundly influenced by an understanding of land cover and land use. The identification of green spaces, wetlands, natural habitats, and areas with specific environmental significance hinges on precise land use assessments. Such information is essential for crafting environmental conservation policies and practices aimed at safeguarding urban ecosystems and biodiversity.
As we delve into the realm of urban diagnostics through land cover assessment, it becomes evident that our ability to accurately understand and characterize the ever-evolving urban landscape relies heavily on the methods and tools we employ. Traditional urban land cover assessment approaches have long served as the cornerstone of this endeavor, providing valuable insights into the distribution and composition of land use within urban areas [38,39]. However, these approaches are not immune to challenges that limit their effectiveness in capturing the complexity of urban environments. In this section, we explore the multifaceted challenges encountered in current urban land cover assessment approaches and present innovative solutions poised to transform the field. These challenges encompass issues of spatial and temporal resolution [40-42], manual feature engineering [43,44], data variability and heterogeneity [45-47], limited generalization, and resource-intensive data labeling. Our examination of existing methodologies identified five major challenges, which are summarized in Table 1. To address these challenges, we delve into the potential of deep learning, a cutting-edge technology that offers promising avenues for revolutionizing how we assess ULC.
Moreover, the urban landscape undergoes rapid and intricate changes driven by both human activities and natural processes. These dynamics pose several fundamental questions and challenges:
a) How can we construct robust remote sensing observation models that can accurately capture the complexity of urban scenes?
b) How can we improve the automatic extraction of information from complex urban environments, especially in shaded areas and regions with diverse urban features?
c) How can we effectively capture rapidly changing urban information, including time-sensitive targets and dynamic land cover and land use transitions?
These questions arise because urban remote sensing observations are characterized by their multi-dimension, multi-scale, and multi-mode nature. For multi-dimension, urban remote sensing demands observations in both horizontal and vertical dimensions. While some applications may rely solely on horizontal observations, others necessitate vertical observations. For instance, energy demand estimation and precise positioning require vertical observations. Understanding the impact of urban structure on biophysical processes often requires both horizontal and vertical data. Concerning multi-scale, UVF in urban remote sensing can be categorized into three scales: point, line, and plane. Image feature points, representing specific locations, fall into the point scale. Roads, which are linear features, belong to the line scale. Impervious surfaces and land cover classifications, representing broader areas, are categorized as plane objects.
In multi-mode, UVF encompasses both static and time-sensitive objects. Buildings, as static entities, form a crucial component of the urban landscape. Conversely, vehicles, representing time-sensitive objects, contribute to the dynamic nature of urban scenes. Furthermore, dynamic land cover and land use transitions within urban areas, particularly in developing countries, are also classified as time-sensitive objects. Understanding and effectively capturing multi-dimension, multi-scale, and multi-mode UVF data are essential for comprehensive ULCA. Such assessments hold substantial significance in addressing global challenges, including climate change mitigation, urban ecological sustainability, and urban sprawl management. Typical ULC types, including buildings, roads, parking lots, water bodies, vegetation, and bare soils within the urban landscape, are illustrated in Figure 2. In the pursuit of comprehensive ULCA, the collection of diverse and precise data is a pivotal undertaking. Aerial photogrammetry and remote sensing platforms undeniably offer valuable spatial information and texture features of urban targets, with a primary focus on building surface characteristics.
However, it is important to note that these methods may overlook a substantial wealth of geometric and textural data pertaining to building facades, which play a significant role in the urban landscape. Effective ULCA strategies necessitate the gathering of information spanning a wide spectrum of urban factors, encompassing not only land cover but also demographic data, economic indicators, environmental conditions, infrastructure performance, and social factors. These multi-dimensional datasets facilitate a holistic understanding of urban landscapes and dynamics, aiding in informed decision-making for urban planning and management. OpenStreetMap 3D (OSM 3D) is an open-source platform that offers detailed 3D maps of urban areas, providing comprehensive information about urban infrastructure and building geometry. This resource proves instrumental in applications such as urban planning, disaster management, and geospatial analysis. CityJSON serves as a versatile format for the exchange of 3D urban data, encompassing buildings, roads, landmarks, and more. Its flexibility facilitates seamless data sharing and integration, supporting a wide array of urban analysis tasks.
Cityscapes offers high-quality images coupled with semantic annotations of urban scenes. These annotated datasets are instrumental in the development and evaluation of algorithms for urban scene understanding, including land cover and object detection. Google Earth provides access to high-resolution satellite imagery and 3D models of urban environments. These resources are invaluable for extracting detailed building and infrastructure information, offering a comprehensive view of urban landscapes. Light Detection and Ranging (LIDAR) technology employs laser scanning to capture high-precision 3D data on urban structures and infrastructure. This technology enables the creation of detailed point cloud models that offer insights into urban topography and building characteristics. The integration of data from these diverse sources empowers ULCA efforts with a multi-faceted view of urban environments, enhancing both the accuracy and efficiency of land cover assessment. In the subsequent sections, we describe how deep learning techniques, combined with these multi-faceted UVF data, offer innovative solutions to the challenges posed by urban land cover assessment.
Deep learning (DL) stands as a formidable and contemporary technique in the realm of image processing, finding profound applicability in the analysis of remote sensing (RS) images. This section unveils a sophisticated multilevel DL architecture tailored for ULCA from a multitemporal multisource dataset. It is crucial to underscore that RS data and human activity records, while offering complementary insights, portray distinct facets of the urban landscape. RS data delineate what the land is, whereas human activity records articulate how the land is utilized. Consequently, the dataset employed must be harmonized with the specific problem at hand and the target objects of study. In essence, the fusion of semantic descriptive information about the land with visual information is imperative. Our proposed framework endeavors to unearth contemporary urban land cover patterns, unraveling features concealed within the available multisource data.
The initial step in our methodology revolves around the meticulous processing of multisource data. This encompasses RS images, points of interest (POIs), areas of interest (AOIs), and building footprint data. These disparate data streams are harmonized into compatible formats suitable for integration and analysis.
Multimodal Feature Processing: Our model encompasses three pivotal components:
a) Inception-based Visual Feature Extractor: This component is purpose-built to extract intricate visual features from high-resolution RS images. The Inception-based architecture excels at capturing spatial patterns, textures, and nuanced information concealed within the RS imagery.
b) BERT-based Semantic Feature Extractor: Engineered to delve into the semantic domain, this extractor derives feature vectors from building-related data. BERT (Bidirectional Encoder Representations from Transformers) is adept at discerning intricate semantic relationships within textual data [56,57]. In our context, it is adapted to process non-visual data sources, such as building records and land-use annotations.
c) Feature Fusion and Classification: The extracted visual and semantic feature vectors are subsequently channeled into a feature fusion and classification block.
This block orchestrates the fusion of the extracted visual and semantic features. The resulting fused feature vectors are then funneled into a classification model responsible for the ultimate inference of land cover types. This holistic fusion of visual and semantic information empowers our DL model to discern complex urban land-use patterns and, in turn, enriches the accuracy and interpretability of the ULCA results. In Figure 3, we illustrate the overarching framework for integrating visual and semantic features to discern urban land-use patterns. This innovative approach promises to enhance the capacity of urban diagnostics by unveiling hidden features within multifaceted multisource data.
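The fusion and classification step described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the architecture used in the paper: fusion is assumed to be plain concatenation, the classifier is assumed to be a single linear layer followed by softmax, and the class names, vector sizes, and random weights are hypothetical placeholders standing in for trained parameters and real extractor outputs.

```python
import math
import random


def concat_fusion(visual, semantic):
    """Fuse two feature vectors by simple concatenation (an assumed fusion rule)."""
    return list(visual) + list(semantic)


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Apply one linear layer, then softmax; returns one probability per class."""
    scores = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(scores)


# Hypothetical class names and toy dimensions, chosen only for illustration.
CLASSES = ["building", "road", "vegetation", "water"]
rng = random.Random(0)

visual = [rng.random() for _ in range(8)]    # stand-in for the visual extractor output
semantic = [rng.random() for _ in range(4)]  # stand-in for the semantic extractor output

fused = concat_fusion(visual, semantic)
weights = [[rng.uniform(-1, 1) for _ in fused] for _ in CLASSES]  # untrained placeholder weights
biases = [0.0] * len(CLASSES)

probs = classify(fused, weights, biases)
predicted = CLASSES[probs.index(max(probs))]
```

Concatenation is the simplest fusion rule; weighted sums or attention-based fusion are alternatives, and the text does not say which one the fusion block uses.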
To validate the effectiveness of DL algorithms in ULCA, the city of Lyon, nestled in the Auvergne-Rhône-Alpes region of France (Figure 4), was selected as the experimental area. Lyon, renowned for its cultural heritage, culinary delights, and bustling urban life, serves as an ideal testbed due to its complex and dynamic urban landscape. With a population exceeding 500,000 residents and covering an area of 47.87 square kilometers, Lyon encapsulates the intricacies of a vibrant metropolis. The choice of Lyon as our experimental area provides a valuable opportunity to evaluate and refine DL algorithms for ULCA in a real-world urban context, thus ensuring the applicability and robustness of these algorithms in diverse urban environments. To assemble a comprehensive dataset for experimentation, we harnessed a diverse set of data sources: the Cityscapes dataset, a rich repository of high-quality urban images with semantic annotations, indispensable for training and validation; high-resolution satellite images from Google Earth, contributing to the visual dataset; open-source 3D maps providing detailed urban infrastructure and building geometries; CityJSON, a versatile format facilitating the exchange of 3D urban data, encompassing buildings, roads, and landmarks; and LIDAR data capturing precise urban structures and infrastructure.
The dataset we curated comprises diverse data types. RS Images: These images possess a spatial resolution of approximately 1 meter and encompass three spectral bands, enabling spectral analysis. Building Footprint Data: This dataset is in vector format and includes attributes such as “floor count,” offering insights into building heights and volumes. POIs and AOIs: These datasets label specific locations with distinct shape forms. POIs are represented as geospatial points, while AOIs take the form of geospatial polygons. Both datasets feature consistent type categories and encompass essential information such as name, type, and address. Socio-economic Indicators: These indicators encompass vital socio-economic metrics, including population density, employment rates, public services, and transportation systems. Data spanning from 2000 to 2023 were sourced from authoritative bodies such as the National Institute of Statistics and Economic Studies (INSEE) and the Urban Community of Lyon (Grand Lyon). Table 2 offers a synoptic view of the dataset, providing an overview of the key data types and their temporal coverage. The comprehensive dataset ensures that our experiments encompass a wide array of urban features and characteristics, enabling a robust evaluation of DL algorithms for ULCA in the dynamic context of Lyon.
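To make the point and polygon representations above concrete, the following sketch models POIs and AOIs as small records and links them with a standard ray-casting point-in-polygon test. The record classes, field names, and sample values are hypothetical and are not the paper's schema; the spatial join itself is an illustrative assumption about how such data might be combined.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class POI:
    """A point of interest: one geospatial point plus descriptive attributes."""
    name: str
    category: str  # the "type" attribute mentioned in the text
    address: str
    lon: float
    lat: float


@dataclass
class AOI:
    """An area of interest: one polygon ring of (lon, lat) vertices plus attributes."""
    name: str
    category: str
    address: str
    ring: List[Tuple[float, float]]


def point_in_polygon(lon: float, lat: float, ring: List[Tuple[float, float]]) -> bool:
    """Ray-casting test: count edge crossings of a horizontal ray starting at the point."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the ray's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def pois_within(aoi: AOI, pois: List[POI]) -> List[POI]:
    """Return the POIs whose coordinates fall inside the AOI polygon."""
    return [p for p in pois if point_in_polygon(p.lon, p.lat, aoi.ring)]


# Toy data in arbitrary planar coordinates; all names and values are made up.
park = AOI("Example Park", "green_space", "n/a", [(0, 0), (2, 0), (2, 2), (0, 2)])
pois = [
    POI("Kiosk", "retail", "n/a", 1.0, 1.0),
    POI("Depot", "transport", "n/a", 3.0, 1.0),
]
inside = pois_within(park, pois)
```

Real coordinates would need a shared coordinate reference system before such a test is meaningful, and a geospatial library would normally replace the hand-written geometry.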
In our pursuit of advancing urban diagnostics through DL algorithms for ULCA, we conducted a series of carefully designed experiments to evaluate the performance and robustness of our methodology. This subsection outlines the key experimental settings, including data preprocessing, model architectures, and evaluation metrics. Before delving into model training and evaluation, a series of data preprocessing steps were undertaken to ensure data compatibility and quality:
a) Data Integration: Multisource data, comprising RS images, building footprint data, POIs, AOIs, socio-economic indicators, and LIDAR data, were integrated into a unified dataset.
b) Spatial Alignment: All data sources were spatially aligned to a common reference system to facilitate seamless integration.
c) Temporal Alignment: Temporal data spanning from 2000 to 2023 were synchronized to ensure temporal consistency across the dataset.
d) Normalization: RS images and LIDAR data were normalized to a standardized scale to enhance model convergence.
Our experimental framework leveraged state-of-the-art DL architectures to harness the power of visual and semantic information fusion:
a) Visual Feature Extractor: An Inception-based convolutional neural network (CNN) was employed to extract visual features from RS images, capturing spatial patterns and textures effectively.
b) Semantic Feature Extractor: A BERT-based model was used to derive semantic features from non-visual data sources, such as building records and socio-economic indicators, enabling nuanced understanding of land cover.
c) Feature Fusion Block: To harmonize the extracted visual and semantic features, a dedicated fusion block was implemented, facilitating the effective merging of information from diverse sources.
d) Classification Model: A classification layer was added to the model to perform the final inference of land cover types.
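The normalization step mentioned above does not name a specific scheme. As one common possibility, the sketch below applies per-band min-max scaling to the range [0, 1]; the choice of min-max scaling, the band names, and the flat-list image layout are all assumptions made for illustration.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly to [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_bands(image):
    """Normalize each spectral band independently.

    `image` is assumed to be a dict mapping a band name to a flat list of pixel values.
    """
    return {band: min_max_normalize(pixels) for band, pixels in image.items()}


# Toy three-band image with four pixels per band; the values are made up.
raw = {
    "red": [0, 64, 128, 255],
    "green": [10, 10, 10, 10],
    "blue": [5, 15, 25, 45],
}
scaled = normalize_bands(raw)
```

Standardization to zero mean and unit variance is an equally plausible reading of "a standardized scale"; the structure of the code would be the same with a different per-band function.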
Our deep learning-based approach yielded promising results in terms of urban land cover assessment. The models demonstrated strong classification performance across multiple land cover categories, showcasing their ability to discern complex urban features and patterns. The experiments were conducted on the carefully curated dataset using the proposed model architectures and experimental settings outlined in previous sections. The key performance metrics assessed include:
Overall Accuracy (OA): This metric measures the proportion of correctly classified land cover instances, providing an overall assessment of model performance: $\mathrm{OA} = \frac{TP + TN}{TP + TN + FP + FN}$, where
TP = True Positives (correctly classified positive instances)
TN = True Negatives (correctly classified negative instances)
FP = False Positives (negative instances incorrectly classified as positive)
FN = False Negatives (positive instances incorrectly classified as negative)
Precision: Precision quantifies the accuracy of positive land cover predictions, indicating the model’s ability to minimize false positives: $\mathrm{Precision} = \frac{TP}{TP + FP}$.
Recall: Recall assesses the model’s capacity to identify all relevant land cover instances, capturing the rate of true positives: $\mathrm{Recall} = \frac{TP}{TP + FN}$.
F1-Score: The F1-score strikes a balance between precision and recall, offering a comprehensive evaluation of classification performance: $F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$.
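As a worked illustration of the four metrics listed above, the following sketch computes them from confusion-matrix counts using their standard definitions. The counts in the example are invented for demonstration and are not results from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute overall accuracy, precision, recall, and F1 from confusion counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + tn + fp + fn
    oa = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"oa": oa, "precision": precision, "recall": recall, "f1": f1}


# Invented counts for a single land cover class treated as the positive class.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

With these counts, OA is 170/200 = 0.85, precision is 80/90, recall is 80/100 = 0.8, and F1 is 160/190. For multi-class land cover maps, the per-class values are typically computed one class at a time and then averaged.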
The evaluation is based on ablation experiments conducted at multiple scales to scrutinize the contributions of each feature extraction component. Furthermore, we compare the performance of our proposed DL method with contemporary state-of-the-art methods. To this end, we conducted experiments at various spatial scales, with sample sizes set at 96×96, 128×128, and 164×164 pixels, consistent with established settings from previous research (He et al., 2020; Yao et al., 2022). Each sample encapsulated an image block of the specified size alongside the associated semantic text at the same spatial size. Through rigorous ablation experiments, we ascertained that our method excels in leveraging feature-based algorithms. Notably, our approach improved the F1-score relative to previous methods. This signifies the pivotal role played by our feature extraction modules in elevating the accuracy and robustness of urban land cover classification.
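The multi-scale sampling described above can be illustrated with a simple tiling routine. This is a sketch under assumptions: the text does not state how blocks were cut, so non-overlapping tiles with incomplete edge tiles discarded are assumed here, and the 256×256 toy raster is hypothetical.

```python
def extract_patches(image, size, stride=None):
    """Cut square patches of side `size` from a 2D list of pixel rows.

    Tiles are non-overlapping unless a smaller `stride` is given; patches that
    would extend past the image edge are skipped.
    """
    stride = stride or size
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches


# Toy single-band raster; the three sizes are the sample sizes quoted in the text.
image = [[0] * 256 for _ in range(256)]
counts = {size: len(extract_patches(image, size)) for size in (96, 128, 164)}
```

For the 256×256 toy raster this yields four patches at 96 pixels, four at 128, and one at 164. Overlapping tiles, obtained with a stride smaller than the patch size, are a common alternative when more samples are needed.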
Urban areas, as complex interdependent systems, present a myriad of challenges and opportunities influenced by a multitude of factors. Traditional approaches to understanding urban challenges often adhere to predefined criteria, potentially overlooking critical issues. In contrast, this study advocates for a more independent and objective urban diagnostics process, one that holistically and exploratively identifies challenges and their intricate interactions within a city. Our research has introduced a novel Deep Learning method tailored for urban land cover classification, leveraging high-resolution aerial images and complementary datasets. The proposed multilayer DL architecture, anchored by an ensemble of Convolutional Neural Networks (CNNs), represents a significant advancement in the field. Key architectural elements include the adoption of DenseNet-inspired network connectivity patterns, inception modules for adaptable receptive fields, and spatial and channel relation-enhanced blocks to capture global context information effectively.
Furthermore, our approach introduces parallel multi-kernel deconvolution modules and spatial paths in the decoding stage to facilitate the aggregation of features across multiple scales. Extensive ablation studies conducted on the Lyon dataset underscore the effectiveness of these proposed modules in enhancing the accuracy and robustness of urban land cover classification. The urban diagnostics methodology outlined in this paper, originally applied to a large city like Lyon, demonstrates its adaptability and scalability to various population densities and geographical scales. This diagnostic framework is globally transferable and scalable, offering a common geographic foundation to harmonize datasets and documents with diverse characteristics. Additionally, it allows for the extension of diagnostic processes to other regions, both smaller and larger, via a layered geographic approach. This scalability and adaptability empower urban planners and policymakers to apply our methodology to a wide array of urban settings, aiding in the identification and prioritization of challenges.
While our research lays a strong foundation for feature-based urban land cover assessment, several avenues for future work emerge:
a) Integration of Additional Data Sources: Incorporating new data sources, such as real-time sensor data and social media feeds, can further enrich our diagnostic process, enabling real-time monitoring of urban dynamics.
b) Semantic Segmentation: Exploring advanced semantic segmentation techniques could enhance the granularity of land cover classification, allowing for the identification of specific urban features with higher precision.
c) Explainable AI (XAI): Developing XAI techniques for our DL model can provide transparent insights into classification decisions, fostering trust and interpretability in urban diagnostics.
d) Scalability to Megacities: Extending our methodology to megacities with diverse and complex challenges could uncover unique insights into the urban fabric.
e) Cross-Disciplinary Collaboration: Collaborating with experts from various fields, including environmental science, economics, and sociology, can lead to a more comprehensive understanding of urban challenges.
We think that our research paves the way for an innovative and adaptable urban diagnostics framework powered by deep learning. As cities continue to evolve, embracing data-driven approaches like ours can facilitate informed decision-making, sustainable development, and the creation of more resilient urban environments.