Boulder Datasets In Need Of Migration

In 2018, Xentity was tasked with helping the City of Boulder migrate its existing datasets from Open Colorado CKAN to Boulder’s open data catalog and establish federated harvesting between the two catalogs. CKAN is a data management system that makes data accessible by providing tools to streamline publishing, sharing, finding, and using data. Boulder’s open data catalog shares data in categories such as recreation, permits and licenses, and public safety. Open data provides value by making this information available to the public, including survey results, fire response areas and times, and bicycle traffic counts. What the public does with that data is up to them, which makes it all the more important that the data be available in one dedicated place such as a city’s open data catalog.

Throughout this project, Xentity had to install extensions, including a FileStore CKAN extension, and run a custom script to migrate data, first as a test and later as a full migration. Metadata needed to be harvested and then pushed from Boulder CKAN. All previous datasets on Open Colorado needed to be removed, and data from Boulder CKAN needed to be federated to Open Colorado CKAN. Training had to be held with technical staff to transfer knowledge of the process and scripts. Finally, Xentity had to perform final testing to ensure the desired user experience.

Moving Old Data To A New Data Catalog

The City of Boulder contracted Xentity to migrate existing datasets from the Open Colorado CKAN instance to the new Boulder Open Data Catalog instance of CKAN and to set up federated harvesting between Boulder and OpenColorado.org. This would move Boulder-themed datasets to a specialized, easily accessible catalog for Boulder-based data. With the new catalog released, existing datasets needed to be properly migrated to it. A data catalog matters to an organization because it gathers the details about the organization’s data assets, often spread across multiple data dictionaries, into a simple, easy-to-digest format. So when a new catalog is created, data assets need to be moved over quickly and thoroughly to keep the data together and accessible. It is a delicate process that involves installing extensions, testing credentials, and setting up automated data harvesting for the sake of efficiency.

In response, Xentity installed, tested, and set up a FileStore CKAN extension to serve as a storage pool for CKAN’s datasets. For the data migration, Xentity ran migration tests before performing the full migration to Boulder CKAN. It then removed the previous datasets from OpenColorado, set up harvesting so that metadata could be pushed from Boulder CKAN, and set up a harvest job to federate data from Boulder CKAN to Open Colorado CKAN. Finally, Xentity performed final testing to confirm the desired results.
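
To make the migration step more concrete, here is a minimal Python sketch of how one dataset's metadata could be copied between two CKAN instances through the CKAN Action API (package_show on the source, package_create on the target). It is not Xentity's actual script, which is not published here. The helper names, the lists of fields to strip, and the organization name in the usage note are all assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: these fields are assigned by the source instance and can be
# dropped so that the target CKAN regenerates them on creation.
INSTANCE_FIELDS = {"id", "revision_id", "metadata_created", "metadata_modified",
                   "organization", "num_resources", "num_tags", "creator_user_id"}
RESOURCE_FIELDS = {"id", "package_id", "revision_id", "created",
                   "last_modified", "position"}


def prepare_for_target(dataset, target_org):
    """Return a copy of a package_show result that is ready for package_create."""
    cleaned = {k: v for k, v in dataset.items() if k not in INSTANCE_FIELDS}
    cleaned["owner_org"] = target_org
    # Keep only tag names; tag ids belong to the source instance.
    cleaned["tags"] = [{"name": t["name"]} for t in dataset.get("tags", [])]
    cleaned["resources"] = [
        {k: v for k, v in res.items() if k not in RESOURCE_FIELDS}
        for res in dataset.get("resources", [])
    ]
    return cleaned


def call_action(base_url, action, payload, api_key=None):
    """POST a JSON payload to a CKAN Action API endpoint and return its result."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/api/3/action/{action}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    if api_key:
        request.add_header("Authorization", api_key)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["result"]


def migrate_dataset(source_url, target_url, name, target_org, api_key):
    """Read one dataset from the source catalog and create it on the target."""
    dataset = call_action(source_url, "package_show", {"id": name})
    payload = prepare_for_target(dataset, target_org)
    return call_action(target_url, "package_create", payload, api_key)
```

A call such as migrate_dataset(source_url, target_url, "bicycle-counts", "city-of-boulder", api_key) would copy a single record, where the dataset and organization names are hypothetical. The sketch copies metadata only. Files that were uploaded to the source catalog would still need to be re-uploaded to the target's FileStore, and an existing dataset with the same name on the target would cause package_create to fail.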

New And Up To Date

Through Xentity’s efforts to migrate data to an open data catalog dedicated to Boulder-based data, the City of Boulder now has automated CKAN-to-CKAN harvesting that keeps the two catalogs in sync. With all requirements met, the migration was successful. And with all data from the previous catalog now moved over to the new catalog, the public can access data on Boulder in a single, accessible place.