
Boulder Datasets In Need Of Migration

In 2018, Xentity was tasked with helping the City of Boulder migrate its existing datasets from the Open Colorado CKAN instance to Boulder’s own open data catalog, and with establishing federated harvesting between the two catalogs. CKAN is an open-source data management system that makes data accessible by providing tools to streamline publishing, sharing, finding, and using data.

Boulder’s open data catalog shares data in categories such as recreation, permits and licenses, and public safety. Open data provides value by making this information available to the public, including results such as surveys, fire response areas and times, and bicycle traffic counts. What the public does with that data is up to them, which makes it all the more important that the data be available in one dedicated place, such as a city’s open data catalog.

For this project, Xentity was to install extensions, including a FileStore CKAN extension, and run a custom script to migrate data, first as a test and later as a full migration. Metadata needed to be harvested and then pushed from Boulder CKAN. All previous Boulder datasets on Open Colorado had to be removed, and data from Boulder CKAN had to be federated to Open Colorado CKAN. Xentity also had to train technical staff, transferring knowledge of the process and scripts. Finally, Xentity had to perform final testing to confirm the desired user experience.

Moving Old Data To A New Data Catalog

The City of Boulder contracted Xentity to migrate existing datasets from the Open Colorado CKAN instance to the new Boulder Open Data Catalog instance of CKAN and to set up federated harvesting between Boulder and OpenColorado.org. This would move Boulder-themed datasets to a specialized, easily accessible catalog for Boulder-based data. With the new open data catalog released, the existing datasets needed to be migrated to it properly.

Data catalogs are valuable to almost any organization. They bring together the details of an organization’s data assets, which may be spread across multiple data dictionaries, and organize them into a simple, easy-to-digest format. When a new catalog is created, data assets therefore need to be moved quickly and thoroughly so that the data stays together and accessible. The process is delicate: it involves installing extensions, testing credentials, and setting up automated data harvesting for the sake of efficiency.

To meet these needs, Xentity installed, tested, and set up a FileStore CKAN extension, which serves as a storage pool for CKAN’s datasets. Xentity ran migration tests before performing the full migration to Boulder CKAN, then removed the previous datasets from OpenColorado. Next, they set up harvesting capabilities to push metadata, followed by a harvest job to federate data between the two catalogs. Finally, Xentity performed final testing to confirm that the desired results were achieved.
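The migration step can be illustrated with a minimal Python sketch built on CKAN's Action API (`package_show` and `package_create`). This is not Xentity's actual script, which the source does not show. The field lists, the helper names (`prepare_for_target`, `call_action`, `migrate_dataset`), and the organization slug are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: fields a CKAN instance assigns itself, dropped so the
# target catalog can generate its own values.
INSTANCE_FIELDS = {"id", "revision_id", "metadata_created",
                   "metadata_modified", "organization"}
RESOURCE_FIELDS = {"id", "package_id", "revision_id",
                   "created", "last_modified"}


def prepare_for_target(dataset, target_org):
    """Return a copy of a dataset dict that is safe to create on another CKAN."""
    cleaned = {k: v for k, v in dataset.items() if k not in INSTANCE_FIELDS}
    cleaned["owner_org"] = target_org  # hypothetical organization slug
    cleaned["resources"] = [
        {k: v for k, v in res.items() if k not in RESOURCE_FIELDS}
        for res in dataset.get("resources", [])
    ]
    return cleaned


def call_action(base_url, action, payload, api_key=None):
    """POST a JSON payload to a CKAN Action API endpoint and return its result."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    if api_key:
        request.add_header("Authorization", api_key)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["result"]


def migrate_dataset(source_url, target_url, name, target_org, api_key):
    """Copy one dataset's metadata from the source catalog to the target."""
    dataset = call_action(source_url, "package_show", {"id": name})
    cleaned = prepare_for_target(dataset, target_org)
    return call_action(target_url, "package_create", cleaned, api_key)
```

Running `prepare_for_target` on a handful of datasets before calling `migrate_dataset` across the whole catalog mirrors the test-then-full-migration order described above. The sketch copies metadata only: files uploaded to the source catalog would keep their original URLs unless they are re-uploaded to the target's FileStore.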

New And Up To Date

Through Xentity’s work migrating data to an open data catalog dedicated to Boulder-based data, the City of Boulder now has automated CKAN-to-CKAN harvesting that keeps the catalogs in sync. With all requirements met, the migration was successful. Now that all data from the previous catalog has been moved to the new one, the public can access data on Boulder in a single, accessible place.
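As a rough illustration of CKAN-to-CKAN harvesting, the sketch below builds the payload for a harvest source. It assumes the ckanext-harvest extension and its `harvest_source_create` action, which the source does not name. The function name, source name, URL, and organization slugs are hypothetical placeholders.

```python
import json

# Update frequencies accepted by ckanext-harvest (assumed from that extension).
FREQUENCIES = {"MANUAL", "MONTHLY", "WEEKLY", "BIWEEKLY", "DAILY", "ALWAYS"}


def build_harvest_source(name, source_url, owner_org, include_orgs,
                         frequency="DAILY"):
    """Build a payload for the ckanext-harvest `harvest_source_create` action."""
    if frequency not in FREQUENCIES:
        raise ValueError(f"unsupported frequency: {frequency}")
    return {
        "name": name,
        "title": name.replace("-", " ").title(),
        "url": source_url,        # the catalog to harvest from
        "source_type": "ckan",    # use the CKAN-to-CKAN harvester
        "frequency": frequency,
        "owner_org": owner_org,   # organization that owns harvested datasets
        # The harvester expects its configuration as a JSON string.
        "config": json.dumps({"organizations_filter_include": include_orgs}),
    }


payload = build_harvest_source(
    name="boulder-open-data",              # hypothetical source name
    source_url="https://data.example.org",  # placeholder URL
    owner_org="city-of-boulder",            # hypothetical slug
    include_orgs=["city-of-boulder"],
)
```

Posting such a payload to the receiving catalog's Action API would register the source. With a frequency other than `MANUAL`, harvest jobs are then scheduled on a recurring basis, which is what keeps the two catalogs in sync without manual copying.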

Updates – CKAN Work in Year 4 and Year 5

In the fourth and fifth years of the project, Xentity continues to provide CKAN technology platform migration, harvesting support, and data stewardship operations and maintenance support for the DOI.