Data cleaning using OpenRefine
Generating data can be messy, especially when the data are collected by hand by multiple people. This month's Data and Donuts will discuss how to wrangle messy tabular data using OpenRefine (http://openrefine.org/), a free, open-source tool for cleaning and transforming data. We will cover the concepts of faceting, clustering, and splitting data. We will also show you how to export scripts to help you automate the cleaning process.
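
To give a feel for what faceting and clustering do, here is a minimal Python sketch. It is not OpenRefine's code: it is a simplified approximation of the key-collision "fingerprint" idea, and the function names (fingerprint, text_facet, cluster) and the sample values are made up for illustration.

```python
import re
import unicodedata
from collections import Counter, defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalized key (simplified, illustrative only)."""
    # Trim, lowercase, and strip accents by decomposing to ASCII.
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    # Drop punctuation, then sort and de-duplicate the remaining tokens.
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(sorted(set(text.split())))


def text_facet(values):
    """Count how often each distinct value occurs, like a simple text facet."""
    return Counter(values)


def cluster(values):
    """Group distinct values that share a fingerprint; keep groups of 2 or more."""
    groups = defaultdict(set)
    for v in values:
        groups[fingerprint(v)].add(v)
    return [sorted(g) for g in groups.values() if len(g) > 1]


# Hypothetical hand-entered values for one column.
names = [
    "Duke University",
    "duke university ",
    "University, Duke",
    "Dúke University",
    "UNC Chapel Hill",
]

print(text_facet(names))  # every spelling is counted as its own value
print(cluster(names))     # the four Duke variants land in one group
```

The facet shows that five different spellings exist in the column. The cluster step suggests which of them probably refer to the same thing, so they can be merged into one value.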