Distributed topology-based terrain analysis using Apache Spark
High-quality and extensive LiDAR data support and enhance large-scale terrain modeling. Triangulated Irregular Networks (TINs) are widely used to represent terrain topology, even on irregularly sampled raw data. One major application of TINs is the efficient extraction of morphological features. Morphological features are defined by critical points on the terrain (such as peaks, valleys, and ridges) and their connectivity, and they are fundamental for terrain analysis in many applications, including urban analysis, forest monitoring, and bathymetric simulations. However, existing data structures for TINs incur a prohibitive memory cost when computing the connectivity of the terrain and when extracting its morphological features, especially on large datasets comprising billions of points. We address this problem by proposing a novel framework for efficient and scalable topological analysis of large TINs using Apache Spark. The proposed framework, called Morse-Spark, is based on a novel data structure for encoding a TIN on distributed frameworks and integrates distributed algorithms inspired by Discrete Morse theory to extract connectivity relations, critical points, and their regions of influence. To evaluate the effectiveness and scalability of the framework, we compare Morse-Spark against three well-established software libraries for topology-based TIN analysis. Our experimental evaluation with real-world TINs shows that Morse-Spark can effectively handle datasets at least 20 times larger than those handled by any of the other approaches for topological analysis.
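To make the notions of connectivity relations and critical points concrete, the following is a minimal single-machine sketch in Python. It is not the Morse-Spark data structure or algorithm: it uses the classical piecewise-linear criterion (counting elevation sign changes around a vertex's link) rather than a discrete Morse gradient, it does not use Spark, and it assumes a manifold TIN. The function names (`vertex_stars`, `ordered_link`, `classify`, `critical_points`) and the toy mesh are hypothetical and introduced only for illustration.

```python
from collections import defaultdict

def vertex_stars(triangles):
    """Vertex-triangle connectivity: map each vertex to its incident triangles."""
    stars = defaultdict(list)
    for tri in triangles:
        for v in tri:
            stars[v].append(tri)
    return stars

def ordered_link(v, star):
    """Return the neighbours of v in cyclic order, or None for boundary
    (or non-manifold) vertices whose link is not a single closed cycle."""
    adj = defaultdict(list)
    for tri in star:
        a, b = [u for u in tri if u != v]  # edge opposite to v
        adj[a].append(b)
        adj[b].append(a)
    if any(len(nbrs) != 2 for nbrs in adj.values()):
        return None
    start = next(iter(adj))
    cycle, prev, cur = [start], None, start
    while True:
        nxt = adj[cur][0] if adj[cur][0] != prev else adj[cur][1]
        if nxt == start:
            break
        cycle.append(nxt)
        prev, cur = cur, nxt
    return cycle if len(cycle) == len(adj) else None

def classify(v, link, z):
    """Classify an interior vertex by sign changes of elevation around its link.
    Ties in elevation are broken by vertex index (an assumption of this sketch)."""
    higher = [(z[u], u) > (z[v], v) for u in link]
    changes = sum(higher[i] != higher[i - 1] for i in range(len(higher)))
    if changes == 0:
        return "minimum" if higher[0] else "maximum"
    return "regular" if changes == 2 else "saddle"

def critical_points(triangles, z):
    """Label every vertex as minimum, maximum, saddle, regular, or boundary."""
    labels = {}
    for v, star in vertex_stars(triangles).items():
        link = ordered_link(v, star)
        labels[v] = "boundary" if link is None else classify(v, link, z)
    return labels

# Toy TIN: vertex 0 at the centre of a fan of four triangles.
triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 4), (0, 4, 1)]
peak = {0: 5.0, 1: 1.0, 2: 2.0, 3: 1.0, 4: 2.0}
saddle = {0: 1.5, 1: 1.0, 2: 2.0, 3: 1.0, 4: 2.0}
print(critical_points(triangles, peak)[0])    # maximum
print(critical_points(triangles, saddle)[0])  # saddle
```

In this sketch each vertex is classified from its own star alone, so the per-vertex work is independent of the rest of the mesh. That locality is what makes this kind of analysis a natural candidate for partitioned, distributed processing, although how Morse-Spark actually partitions and encodes the TIN is not shown here.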