Clean Data: The Role of HR in Collection, Context and Collaboration

This article was updated on July 17, 2018.

HR leaders depend on data. Hiring practices, employee evaluations and salary discussions all rely on the availability and accuracy of data. The problem? While big data tools are great at collecting and storing large information sets, they're not so great at "cleaning" this data. That leaves HR pros with the potential to improve their practices but without reliable data to act on.

Why does clean data matter so much? As noted by Marc Rind, vice president of product development and chief data scientist for ADP, this data is "understandable, repeatable and normalized," but it's not naturally occurring. How can HR use clean data to advance both its own department's aims and those of the organization at large?

Why Clean Data?

Much like IT, the role of HR has shifted. No longer a niche department, HR is a critical part of long-term business strategy and revenue discussions. As a result, clean data is paramount. HR must be able to understand at scale what employees are doing, what skills they possess and what this means for overall performance. According to Rind, this lets organizations "quantify what those people do and relate it to actual business outcomes." In turn, businesses can move past performance levels or skill sets as isolated metrics and find their relevance to corporate strategy at large.


The first step in clean data? Collection. While it's impossible to obtain fully cleaned data from initial capture, HRVoice points out that there are ways to minimize potential issues, such as using pull-down menus to prevent typos, implementing automated checks to detect inaccurate data entry and regularly auditing for errors to backtrack and discover root causes.

For example, enterprises might encounter an issue with inaccurate skills reporting. Tracing it back could uncover confusion on the part of staff concerning how to classify particular skill sets. By addressing this issue up front, overall data cleanliness goes up, while time spent "fixing" this data is reduced. As noted by Harvard Business Review, projects can spend up to 12 percent of their time addressing data quality issues, which may result in cost overruns.
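The automated entry check described above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular HR system: the skill list, the `EntryCheck` container and the `check_skills` function are all hypothetical names chosen for the example. The idea is to compare each entry against an approved list (the same list a pull-down menu would offer) and to keep rejected entries verbatim so that an audit can trace them back to a root cause.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary; a real list would come from the HR system.
ALLOWED_SKILLS = {"python", "payroll", "recruiting", "data analysis"}


@dataclass
class EntryCheck:
    """Holds the outcome of checking one batch of skill entries."""
    accepted: list = field(default_factory=list)
    rejected: list = field(default_factory=list)


def check_skills(raw_skills):
    """Split free-text skill entries into accepted and rejected values."""
    result = EntryCheck()
    for raw in raw_skills:
        skill = raw.strip().lower()  # ignore case and stray whitespace
        if skill in ALLOWED_SKILLS:
            result.accepted.append(skill)
        else:
            # Keep the original text so an audit can see what was typed.
            result.rejected.append(raw)
    return result


result = check_skills(["Python", " payroll ", "recruting"])
print(result.accepted)  # ['python', 'payroll']
print(result.rejected)  # ['recruting']
```

Reviewing the rejected list on a regular basis is one way to spot patterns, such as staff repeatedly misclassifying the same skill, before they spread through the data.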


Next up? Context. In HR, context is everything: improving business functions isn't possible without an accurate picture of how specific roles and jobs impact the organization as a whole. Rind points to the use of machine learning to "fill in the blanks," such as when it comes to job titles. Not all titles are clear about employee function, and many positions report to multiple managers. Salary data is easily misconstrued without information about current market conditions, employee experience and time spent with the organization. By tapping the "signals" that naturally occur around specific data points, it's possible to add context and improve HR efficacy.


It's hard to know when specific data sets will become valuable. As noted by TechRepublic, however, many organizations get carried away when it comes to tossing "bad" data and cleaning current data. While the result is streamlined and actionable, tossing nonideal information eliminates the ability to mine this data for insight later. Rind suggests adding "as much understandable information to the data, and you can then normalize it after the fact." It's part art, part science: HR leaders need to identify the cutoff between "bad" and "useless" data, then throw out the latter and improve the former so it can be used to collaborate and inform decision-making. Put simply? While it might not be spotless right now, imperfect data could offer clear benefits down the line.
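One way to picture keeping imperfect data instead of tossing it is to annotate records instead of deleting them. The sketch below is an assumption-laden illustration only: the field names, the `quality_flags` key and the `flag_record` function are hypothetical and not taken from any specific tool. Incomplete records stay in the data set with a note about what is missing, and a "clean" view is filtered out separately for immediate use.

```python
def flag_record(record, required_fields):
    """Return a copy of the record with quality notes attached; nothing is deleted."""
    missing = [name for name in required_fields
               if record.get(name) in (None, "")]
    annotated = dict(record)  # copy so the original record is left untouched
    annotated["quality_flags"] = ["missing:" + name for name in missing]
    return annotated


# Hypothetical sample records for illustration.
records = [
    {"employee_id": 1, "job_title": "Analyst", "salary": 52000},
    {"employee_id": 2, "job_title": "", "salary": 61000},
]

flagged = [flag_record(r, ["job_title", "salary"]) for r in records]

# The clean view is used today; the flagged records remain available to fix later.
clean = [r for r in flagged if not r["quality_flags"]]
```

Under this approach the second record is excluded from the clean view but is still stored, so it can be corrected and normalized after the fact.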

HR departments run on data, but they run better with clean data. While HR pros can't be expected to act as internal data scientists, they're nonetheless instrumental when it comes to empowering data collection, improving data context and amassing data for long-term collaboration.

To see what the three-step success model is for turning people data into business impact, check out the Better Decisions Start With HR Insights guide.