Two Primary Geospatial Architecture Issues with Cognitive Processing

Veracity still struggles to support integrated information products – programs need to learn from these challenges to move to knowledge products

The common theme is that the various geospatial feature classes being managed are typically managed in silos, which causes persistent and major veracity issues – in projection, granularity, level of detail, provenance, intended use vs. requested use, and many more – when attempting to integrate the data to answer knowledge questions. For instance, geospatial data produced for inventory and asset management, whose owners have become reference data stewards for major datasets, typically will not have the accuracy and data quality to support modeling, simulation, accurate mapping, etc., and definitely not artificial intelligence rules that provide strong enough confidence intervals.
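As a minimal sketch of what such a pre-integration veracity check might look like, the snippet below compares dataset metadata from two silos before integration. The metadata fields, threshold, and example records are hypothetical illustrations, not a standard schema or Xentity's actual method.

```python
# Sketch of a pre-integration veracity check between two geospatial
# feature classes managed in separate silos. Field names and the
# granularity threshold are illustrative assumptions.

def check_integration_veracity(a: dict, b: dict) -> list:
    """Return a list of veracity conflicts that would undermine
    integrating dataset `a` with dataset `b`."""
    issues = []
    if a["projection"] != b["projection"]:
        issues.append(f"projection mismatch: {a['projection']} vs {b['projection']}")
    # Granularity / level of detail: flag when map scales differ by
    # more than an order of magnitude (illustrative threshold).
    if max(a["scale_denominator"], b["scale_denominator"]) > \
       10 * min(a["scale_denominator"], b["scale_denominator"]):
        issues.append("granularity mismatch: scales differ by more than 10x")
    if a["intended_use"] != b["intended_use"]:
        issues.append(f"intended use differs: {a['intended_use']} vs {b['intended_use']}")
    return issues

# Hypothetical example: asset-management data vs. modeling data.
inventory = {"projection": "EPSG:3857", "scale_denominator": 100_000,
             "intended_use": "asset inventory"}
modeling = {"projection": "EPSG:4269", "scale_denominator": 24_000,
            "intended_use": "hydrologic modeling"}

conflicts = check_integration_veracity(inventory, modeling)
for c in conflicts:
    print(c)
```

A real check would of course consult richer lineage and accuracy metadata; the point is that these conflicts must be surfaced before the data is treated as a reliable input.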

Xentity has been architecting in the Government space across over 45 data programs and multiple national, agency, state, and local geospatial programs. In doing so, Xentity has developed whitepapers, architecture methods, and common architectural patterns for designing with knowledge first and working backwards to data to establish an accurate, usable pattern. The following breaks down the need to start with knowledge first in a Land & Resource Management (KID Paper) business focus area.

The issue with geospatial is that to achieve geospatial knowledge integration, the data needs to be designed with the knowledge question first, with information management production and product generation built from that, and with planning and data acquisition for common reference data and end-user thematic static or streaming data aligned to achieve the knowledge product.

This needs to be addressed before geospatial data can be considered a reliable source for cognitive processing. Most cognitive systems are not addressing the geospatial "where" dimension.

Geospatial data complexity beyond xy point data creates much broader challenges, but also much more value

Most cognitive systems are focused on content, math, language, culture, and temporal analytics. Those that are entering geospatial are taking on the point data challenge. The approach changes when moving to integrating disparate feature network/line data (e.g. utility, hydrographic, roads, railroad): the veracity issues noted above come into play, the cognitive rule base changes, and possibly the engine processing concepts change as well.


Geospatial Data Types

                            Reference Data,     Thematic Data,      Thematic Data,
                            Static              Static              Dynamic
                            (e.g. landscape     (e.g. specific      (e.g. IoT,
                            topographic)        agency mission      sensors)
                                                data)

Data Dimension Complexity
  Volume                    Low                 Medium              High
  Variety                   Low                 High                Medium
  Veracity                  Medium              Medium              High
  Velocity                  Low                 Low                 High
  Value                     High                Medium              Medium

Data Type Complexity
  Point                     Basic               Basic               Intermediate
  Line                      High                High                High
  Polygon                   High                High                High
  Raster/pixel              Intermediate        Intermediate        Intermediate

It is clearly more difficult still when furthering into polygonal data. Raster/pixel data can vary in complexity from intermediate to high, mostly due to variance in veracity, but high-quality raster data can be considered intermediate for real-time feature extraction (e.g. moving-object classification for autonomous cars).
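The complexity matrix above can be encoded as a simple lookup that a rules engine might consult before routing data to a cognitive pipeline. The values mirror the table; the function and type labels are an illustrative sketch, not an established API.

```python
# Encode the "Data Type Complexity" rows of the matrix above as a
# lookup table. Scores mirror the table; the structure is illustrative.

DATA_TYPES = ("reference-static", "thematic-static", "thematic-dynamic")

GEOMETRY_COMPLEXITY = {
    "point":        ("Basic", "Basic", "Intermediate"),
    "line":         ("High", "High", "High"),
    "polygon":      ("High", "High", "High"),
    "raster/pixel": ("Intermediate", "Intermediate", "Intermediate"),
}

def geometry_complexity(geometry: str, data_type: str) -> str:
    """Look up processing complexity for a geometry within a data type."""
    return GEOMETRY_COMPLEXITY[geometry][DATA_TYPES.index(data_type)]

print(geometry_complexity("point", "reference-static"))  # Basic
print(geometry_complexity("line", "thematic-dynamic"))   # High
```

A rules engine could use such a score to decide, for instance, that only "Basic" combinations are ready for confidence-based inference today, while the "High" combinations need the veracity work described earlier first.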

All of these remain high when considering spatio-temporal analysis, whether comparing the current state against future modeling or leveraging historical data, which can vary in quality but usually suffers degradation that limits its use in knowledge, cognitive, or even usual information processing.

It will be interesting to see how cognitive systems come to incorporate these two issues: handling the veracity problems of geospatial data and the various complexity principles noted above.

So, while moving to faster batch, service, and performance architectures will likely be needed to support these concepts, the rules-based engine will first need to understand how to handle the complexities beyond xy point data.