Duplicate Entries Removed from Data.gov Catalog

In the last few days, you may have noticed a significant drop in the total number of datasets displayed on the Data.gov catalog. The decrease did not reflect an actual reduction in the number of datasets; it resulted from the removal of duplicate entries for many datasets. The duplicates did not originate in the sources of the metadata in the Data.gov catalog (agency metadata harvest sources), but were caused by a problem with the Data.gov harvester. The duplicate entries were removed during the weekend of February 23-24, and archives of the duplicates have been retained to allow agencies to fully review these changes.

Harvesting for most federal agency dataset sources occurs on a daily, automated basis. As agencies update their metadata inventories, the Data.gov catalog is updated at the next scheduled harvest. As a result, the total number of datasets on the Data.gov catalog normally fluctuates. In the coming weeks and months, we expect to see an increase in datasets provided by a greater number of federal agencies under the recently enacted Open Government Data Act (Title II of the Foundations for Evidence-Based Policymaking Act, Pub. L. 115-435). We will provide updates on our progress in working with federal agencies to provide access to more federal data under the statute.
