Upgrading and Supporting DOI Data.gov

Data.gov is a web service managed and hosted by the U.S. General Services Administration (GSA). Since its launch in May 2009, Data.gov has been the open data repository for U.S. federal, state, local, and tribal government information made available to the public.

In 2010, Xentity supported Data.gov’s initial architecture efforts, along with the geodata.gov migration, the initial ideation phases of geoplatform.gov, and early outreach and communication efforts and events. Xentity has also supported the Department of the Interior’s (DOI) data.doi.gov platform in its migration to a Data.gov Comprehensive Knowledge Archive Network (CKAN) 2.3 platform. CKAN is a web-based, open-source management system for storing and distributing open data, so operating on a CKAN-based platform is essential for a project like Data.gov. Xentity also supported ongoing operations, including harvest coordination with Data.gov.

Major Issues Related To Harvesting And Configuration

At the time this project began, DOI had at least 12 major issues related to data harvesting. Most notably, the data.doi.gov website contained two to three times more records than DOI had provided via CKAN. DOI also needed to reduce its agile-managed backlog of open tasks and ensure the data and operational integrity of the catalog. The existing CKAN 2.3 installation and configuration would have benefited from an upgrade to CKAN 2.8, and Xentity hoped to help carry out that upgrade. Because the older version remained in use, Xentity had to support 27 existing extensions, each of which required testing and validation.
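
The record-count discrepancy described above is the kind of problem a simple identifier audit can help narrow down. The following is a minimal sketch, not the method Xentity or DOI used: it assumes that each record carries a stable identifier and that two lists of identifiers (one from the source, one from the catalog) have already been exported. The function name `reconcile` and the sample identifiers are hypothetical.

```python
from collections import Counter


def reconcile(source_ids, catalog_ids):
    """Compare identifiers provided by the source with those found in the catalog.

    Hypothetical helper for illustration only. Returns the identifiers that
    appear more than once in the catalog, those in the catalog with no
    matching source record, those missing from the catalog, and the ratio
    of catalog records to source records.
    """
    counts = Counter(catalog_ids)
    source = set(source_ids)
    catalog = set(counts)
    return {
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
        "orphans": sorted(catalog - source),
        "missing": sorted(source - catalog),
        # Ratio of total catalog records to total source records.
        "ratio": len(catalog_ids) / len(source_ids) if source_ids else None,
    }


# Made-up identifiers: the catalog holds twice as many records as the source.
report = reconcile(["a", "b", "c"], ["a", "a", "b", "x", "x", "x"])
print(report)
```

A ratio well above 1 together with a long list of duplicates would point toward repeated harvesting of the same records, while a long list of orphans would point toward stale records that were never removed. Either reading is only a starting point for investigation.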

To address these issues, Xentity provided DOI with a data.doi.gov CKAN 2.3 platform, a Data.gov baseline install, and operations and maintenance of the CKAN 2.8 development environment. Xentity also provided Catalogue Service for the Web (CSW) development and deployment support for Data.gov. This support was based on the CSW specification, which defines common interfaces. Xentity used these interfaces to discover, browse, and query metadata about source data, services, and other potential resources. Xentity also provided system configuration documentation for CKAN, drawing on its experience with open data and CKAN.
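
To make the CSW interfaces mentioned above more concrete, here is a minimal sketch of how a CSW 2.0.2 `GetRecords` query can be expressed as a key-value-pair URL. The endpoint `https://example.invalid/csw`, the function name `build_getrecords_url`, and the search keyword are assumptions for illustration; the actual Data.gov CSW endpoint and the queries Xentity ran are not given in this article.

```python
from urllib.parse import urlencode


def build_getrecords_url(endpoint, keyword, start=1, max_records=10):
    """Build a CSW 2.0.2 GetRecords URL that searches record text for a keyword.

    Hypothetical helper for illustration. The parameter names follow the
    CSW 2.0.2 key-value-pair encoding; the endpoint is supplied by the caller.
    """
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "elementSetName": "summary",  # brief, summary, or full
        "resultType": "results",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        # Full-text filter expressed in CQL; the keyword is not sanitized here.
        "constraint": f"csw:AnyText LIKE '%{keyword}%'",
        "startPosition": start,
        "maxRecords": max_records,
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint; substitute a real CSW service URL before sending a request.
url = build_getrecords_url("https://example.invalid/csw", "water", max_records=5)
print(url)
```

The sketch only builds the URL, so it runs without network access. Paging through a large catalog would mean repeating the request with an increasing `startPosition`.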

Xentity expected that additional DOI data harvesting issues would likely arise after the upgrade and planned to respond to them as well, so that DOI would benefit from proactive planning. For example, the harvester’s error reporting and handling could be improved, a topic Xentity researched in 2017 along with DOI, USGS, and other bureaus. The improvements identified could benefit the federal community as a whole in triaging and handling errors during harvesting.

Smooth Sailing For Data.Gov

Xentity’s support resolved the 12 major issues described above, along with the anticipated post-upgrade DOI issues on Data.gov. CKAN has now been upgraded to 2.8 for Data.gov. CKAN 2.8 offers a simpler, DevOps-based installation and other improvements over previous versions, and the upgrade gave Data.gov the most stable version. Xentity provided configuration documents to help support the updates and upgrades. These efforts have ensured a smoother-running, up-to-date system for Data.gov.