Data-driven decision making is all the rage these days and it makes sense. Studies show that organizations leveraging big data see an average increase in profits of 8-10% and a reduction in operational costs of around 10%. Yet, an astonishing number of companies – 72% to be exact – have yet to forge a data culture. Why is it the case? Why do executives readily admit the enormous benefits of data and analytics, yet fail in their efforts to drive any meaningful change within their organizations? To make a really long answer short, it is due to the perception of data-driven transformations as costly, both in terms of the financial and human resources required.

What many executives do not seem to fully grasp is that embedding data-driven decision making into the organizational fabric is a multi-step process that can start with incremental changes. Building a strong data foundation requires developing certain habits, and today we will talk about one such habit (perhaps the most crucial one) – keeping your data clean. 

In one of our earlier articles (The ABCs of Transitioning Out of Spreadsheets) we discussed how you can transition your data out of a spreadsheet. One of the steps we strongly recommend taking is cleaning your data before moving it onto the next platform. Research shows that data scientists spend 80% of their time finding, cleansing, and organizing data. Only 20% is spent on actually analyzing it and deriving valuable insights. If down the line you want to improve the efficiency of your data scientists, start cleaning your data now. Clean your data while it is still within the capacities of a spreadsheet and is relatively manageable. Here are a few tips to get you started.

1. Remove duplicates

First things first, eliminate redundant data. This should always be your first step as it will save you time when applying other data cleansing techniques. Here’s how to easily find and remove duplicate data:

  • Select the entire spreadsheet
  • Click on ‘Data’ → ‘Data Tools’ → ‘Remove Duplicates’
  • Identify the columns you want to focus on or select ‘Select All’
  • Click on ‘Remove Duplicates’
  • You will be provided with a count of how many values will be deleted and how many will remain.

2. Fill in the blank spots

Too many cells with missing data can significantly skew your findings. If there is a way to find the missing bits of information, do so and add it to the empty cells. If not, you should at least enter a ‘Not Available’ or ‘0’ value into them. If you are working with a relatively small spreadsheet, identifying such cells is pretty easy. However, it can be quite painful with larger spreadsheets if done manually, which is why you can save time and sanity using the following shortcut:

  • Select a few columns/rows or the entire spreadsheet. 
  • Click on ‘Edit’ → ‘Find’ → ‘Go To’ → ‘Go To Special’
  • From the list that appears, select ‘Blanks’ and click ‘OK’
  • Once all the blank cells are selected, enter the value you want to appear in them by pressing ‘Control’ + ‘Enter’. Tada! You’re done!

 3. Identify and fix all broken formulas and errors

If your spreadsheet is slightly advanced, it probably contains multiple formulas and numerous dependent and independent variables. You want to make sure that none of the dependencies are broken and your spreadsheet is free of errors. Here’s an easy way to spot errors:

  • Select a few columns/rows or the entire spreadsheet. 
  • Click on ‘Edit’ → ‘Find’ → ‘Go To’ → ‘Go To Special’
  • From the list that appears, select ‘Formulas’ → ‘Errors’ → ‘OK’
  • All the cells with broken formulas will be selected. Highlight them in a different color and start fixing those errors.

4. Turn cells into rows and vice versa 

Sometimes it makes sense to reorganize your data and have data that’s currently in rows appear in columns and vice versa. Here’s how you can do it in under a minute:

  • Select the rows and columns that you want inverted
  • In the upper-left corner (next to the formula line) look at how many rows and columns you’ve selected
  • Find an empty spot on your spreadsheet and select the necessary number of cells (e.g. if your data is currently in 3 rows and 8 columns, you want to select 8 rows and 3 columns worth of empty cells)
  • In the formula line type in ‘=TRANSPOSE’ and select the range (the original data you’re inverting the structure of)
  • Press ‘CTRL’ + ‘SHIFT’ + ‘ENTER’. That’s it.

5. Eliminate extra space within cells

There’s a hassle-free way to identify all cells that have an extra space in between words and to fix the issue.

  • Select a few columns/rows or the entire spreadsheet. 
  • Click on ‘Edit’ → ‘Find’ → ‘Replace’
  • Type in the double space in the ‘Find what’ section, a single space in the ‘Replace with’ section, and click ‘Replace All’

6. Run a spell check

Seems like an obvious step, doesn’t it? Maybe that’s why so many companies forget to do it. 

  • Select the rows/columns you want spell checked or the entire spreadsheet
  • Go to ‘Tools’ → ‘Spelling’
  • Decide what you want to do with the misspelled words: add them to a dictionary, ignore, or correct.

7. Split text in one cell into multiple cells

Say your spreadsheet contains a list of everyone you’d like to send a mass email to. You’d like to personalize the email by addressing everyone by their first name. The problem is that your spreadsheet lists a person’s first and last name in the same cell. Here’s how you can separate them with very little effort.

  • Select the columns/rows you want to split
  • Go to ‘Data’ → ‘Text to Columns’
  • Select ‘Delimited’ → ‘Next’ and check every way in which the text within a cell is currently separated (e.g. ‘John Smith’ is separated by a space; ‘Smith, John’ – by a comma, etc.)
  • Click ‘Next’ → Check that the preview of the new format looks good → ‘Finish’

8. Sort your data to identify errors

Try sorting your data by various columns. This will change the view of your spreadsheet and sort it according to the values in the selected column. Use this exercise to scroll through your data and spot irregularities. Repeat the exercise by sorting your data by other columns to increase your chances of spotting errors.

  • Select the spreadsheet
  • Click on ‘Data’ → ‘Sort’
  • Select the column you want to sort your data by and click ‘OK’

These are just a few spreadsheet hacks that will allow you to clean up your data and maintain it clean. Learn these shortcuts and turn them into a habit, and you’ll be surprised by how much more organized and neat your spreadsheet data becomes. Remember that clean data is an absolute must if you want to derive accurate insights from it. So start small. Start now. Start with what you have.

Leave a Reply

Your email address will not be published. Required fields are marked *