Creating Predictability in the Government’s Geospatial Data Supply Chain…
This article expands upon the presentation "What does geodata.gov mean to data.gov?" given at the First International Open Government Data Conference in November 2010, as well as the GAO report on the FGDC's role and geospatial information, which places a similar emphasis on getting the data right.
Would it be valuable to establish predictability in the government's geospatial data supply chain?
As examples, what if one could be guaranteed that every year or two the United States Census Bureau, in cooperation with state and local authorities, produced a high-quality updated county boundary dataset, or that HHS produced a geocoded, attributed list of all the hospitals in the country, validated by the health care providers? Of course it would be valuable, and it could provide the means to minimize redundant data purchasing, collection, and processing.
If the answer is “of course”, then why haven’t we done so already?
It is a simple concept, but one without an implementation strategy. Twenty years after the establishment of Circular A-16 and the FGDC metadata content standards, we still look at metadata from a dataset-centric point of view: "what has been," not "what will be." Knowing what is coming, and when it is coming, enables one to plan.
The model can be shifted to the "what will be" perspective if we adopt a systems-driven data lifecycle view, which means looking at data predictability and crowdsourcing together.
It may seem ironic, in the age of crowdsourcing, to argue for predictable data lifecycle releases of pedigreed information and seemingly deny the power of the crowd. But the fact remains that civilian government entities in the US systematically collect and produce untold volumes of geospatial information (raster, vector, geocodable attributes) through many systems, including earth observation systems, mission programs using human capital, business IT systems, regulatory mandates, funding processes, and cooperative agreements between multiple agencies and all levels of government. The governments in the US are enormous geospatial data aggregators, but much of this work is accomplished in systems their owners and operators view as special but not "spatial."
An artificial boundary, or perception, has been created that geospatial data is different from other types of data, and by extension so are the supporting systems.
There remain challenges with data resolution, geometry types, attribution, and so on, but more importantly there is a management challenge here. All of these data aggregation systems have, or could have, a predictable data lifecycle accompanied by publishing schedules and processing-authority metadata. The crowd and the geospatial communities could then use their digital muscle to complement these systems' resources, if that is their desire, and all government programs would be informed by having predictable data resources.
What is required is communicating each system's outputs, owner, and timetables.
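To make the idea concrete, the "what will be" metadata for a system could be as simple as a record naming the dataset, the producing system, its steward, and the release cadence. The sketch below is illustrative only; the field names and the example values are assumptions, not an actual FGDC or Census schema.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class DatasetReleaseRecord:
    """Minimal 'what will be' metadata: system, steward, and schedule."""
    dataset: str               # name of the published dataset
    producing_system: str      # the aggregation system of record
    steward: str               # processing authority / owner
    release_cycle_months: int  # expected publishing cadence
    next_release: str          # ISO date of the next scheduled publication

# Hypothetical example: a county boundary dataset on a two-year cycle
record = DatasetReleaseRecord(
    dataset="County Boundaries",
    producing_system="Census boundary program (illustrative)",
    steward="U.S. Census Bureau",
    release_cycle_months=24,
    next_release="2012-01-01",
)
print(json.dumps(asdict(record), indent=2))
```

A catalog of such records, one per aggregation system, would constitute the "predictable publishing calendar" discussed below, and could be harvested alongside conventional dataset-centric metadata.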
Once a data baseline is established, the geospatial users and crowd could determine the most valuable content gaps and use their resources more effectively; in essence, creating an expanded and informed community. To date, looking for geospatial information is more akin to an archaeological discovery process than searching for a book at the library.
What to do?
Not to downplay the significance of geospatial and subject-matter experts publishing value-added datasets and metadata into clearinghouses and catalogs, but we would stand to gain much more by determining which finite set of systems aggregate and produce the geospatial data, and by creating a predictable publishing calendar for them.
In the current environment of limited resources, Xentity seeks to support efforts such as the FGDC, data.gov, other National Geospatial Data Assets, and OMB to help shift the focus to these primary sources of information, enabling the community of use and organizing the community of supply. This model would include publishing milestones, both past and future, that could be used to evaluate mission and geospatial end-user requirements, allow crowdsourcing to contribute, and simplify searching for quality data.
The workshop theme and community output notes may be of high interest. The focus was more on how Federal Geodata “Operations” / Assets can improve and help Geoscientists through improved interagency coordination.
There are excellent breakout notes on roadblocks, geoscience perspective, concrete steps, etc. across the following topics on the following URL (Google Docs under “notes” links): http://tw.rpi.edu/web/Workshop/Community/GeoData2014/Agenda
Day 1 Breakouts (Culture/Management)
Governmental open data
Interagency coordination of geodata – progress and challenges
Feedback from the academic and commercial sectors
Collaborating environment and culture building
Day 2 Breakouts (Tech)
data citation and data integration frameworks – technical progress
Experience and best practices on data interoperability
Connections among distributed data repositories – looking forward
The workshop bore some fruit:
- About 50 attendees, with NOAA and USGS as the primary federal representatives.
- Pushing forward on agenda to see if we have progressed on ideation pragmatism since Geodata2011.
- Focus is on Cultural and Financial issues limiting inter-agency connection.
- Term agile government came up often… with some laughs, but some defenders (Relates to our smartleangovernment.com efforts with ACT-IAC)
- Scientists hear "Architecture" as big IT contracts and IT infrastructure, not process improvement, data integration, or goal/mission alignment, so there are clear vernacular issues.
- FGDC and tons of other standards/organizing bodies seen as competing and confusing
- data.gov and open data policy hot topic (Seen as good steps, low quality data) – “geoplatform” mentioned exactly “zero” times (doh!)
- Geodata lifecycle discussion focused primarily on the latter end of the cycle (citations, discovery, publication for reusability, credit), with little on project coordination or data acquisition coordination, no marketplace chatter, and little on coordinating sensor investment
- General questions from scientists on how the intel groups can be reached
- Big push on ESIP
- Concrete steps suggested were best practices to agencies and professors
- Data Management is not taught, so what do we expect? You get what you pay for.
- Finally, big push on how to tie grassroots efforts and top-down efforts together – grassroots agreed we need to showcase more, earlier, and get into the communities top-down folks are looking at.
- Federal representation was not high, and attendees agreed that with limited government travel budgets, we need to bring these concepts to them: meet at their meetings, agendas, conferences, and circuits, and push these concepts and needs there.
Again, there are lots of great notes from the breakouts on roadblocks, the geoscience perspective, concrete steps, etc., across the topics listed at the URL above (Google Docs under the "notes" links).
Questions we posed in general sessions:
- Of performance, portfolio, architecture, evangelism, policy, or other, which is most important to the geoscientist and needs to be addressed in order to improve inter-agency coordination?
- You noted you want to truly disrupt or re-invent the motivation and other aspects of the culture; what was discussed related to doing so? An inter-agency wiki commons a la Intellipedia? Gamification, the way resource-management MMORPGs incentivize players, i.e. a transparent and fun way to incentivize data maturity? Crowdsourcing a la Mechanical Turk to help with cross-agency knowledge-sharing? Hackathon/TC Disrupt competitions to help showcase? A combination, i.e. gamify the metadata lifecycle with a crowd model?
- After registering data, metadata, and good citations, and doing all the data lifecycle management, if we are to "assume internet," who is responsible for the SEO rank that helps people find scientific data on the internet? Who assures/enhances schema.org registrations? Who aligns signals to help with keywords and the thousands of other potential signals, especially in response to events needing geoscience data? Who helps push data.gov and domain catalogs to be harvested by others?
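The schema.org registration mentioned above typically means embedding a JSON-LD record of type `Dataset` in the dataset's landing page so search engines can index it. A minimal sketch, using illustrative values rather than any agency's actual record:

```python
import json

# Hypothetical schema.org Dataset record for a landing page. Field names
# follow the schema.org/Dataset vocabulary; the values are placeholders.
dataset_jsonld = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "name": "County Boundaries",
    "description": "Geocoded county boundary dataset (example entry).",
    "publisher": {"@type": "Organization", "name": "U.S. Census Bureau"},
    "keywords": ["geospatial", "boundaries", "counties"],
    "temporalCoverage": "2012",
}

# This JSON would be embedded in the page inside a
# <script type="application/ld+json"> element.
print(json.dumps(dataset_jsonld, indent=2))
```

Whoever owns the landing page owns this record, which is exactly the responsibility question the workshop was raising.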