Data Management in Excel

Part 2 the Spreadsheet Design series will focus on Data Management, an often overlooked component of any large Excel project. This may well be because Excel is not meant to manage data; its primary purpose is to display data or derive analysis off of it.

Most corporations have elaborate data warehouse systems to properly store and maintain organizational data. Data warehouses can store data from countless systems—sometimes hundreds, or even thousands—in a centralized server system which include IT architecture to allow powerful database query and join functionality. While these systems are great for maintaining data, they are not so good at analyzing it. This is where Excel typically comes in for most users.

The approaches in this blog do not advocate for using Excel as your central database tool or means to maintain organizational data. That would be grossly inefficient and prone to countless errors. However, there is a small space in the data management world for Excel to exist. That is creation, maintenance, and output of forecasted and projected data.

A typical data warehouse will contain data from various CRM, HR, and accounting systems which tend to look back-in-time at existing data, providing a current snapshot or trend. Even if your organization has software to analyze past data and provide simple projections, they tend to be unreliable when pulling inputs that are specific to your organization or market. Trend analysis may be fine if an organization is non-cyclical, but for any industries which are heavily impacted by market conditions or business changes, the storage of more dynamic forward-looking data may be required.

In summary, the position of this blog is follows:

  • Import historical data into Excel from external sources whenever available.
  • Create data to facilitate forward looking assumptions, whether it be for forecasting, analysis, or budgeting.
  • Use Excel to produce dynamic reports to easily satisfy the demands of management.

This part of the Spreadsheet Design series will provide a general overview of how you can implement a strong data management system for your Excel application. This includes developing a user-friendly interface for importing data from other sources, creating simple data management tools, and an overview of how to properly output and review the data. To summarize, data management has three primary components: inputs, maintenance, and outputs.

From an Excel design perspective, there are a few techniques for establishing a strong data management system within your Excel application.

Separate the user from the data.

How many times have you gone into existing data (usually from another system) to tweak categories or other attributes to make things work for a report? Making these tweaks directly in the dataset generally a very bad practice, and it’s something you should not let users or your Excel application do. By using data input forms, listboxes, and data validators, you can eliminate the need to modify the actual data. All of the data management posts in this series related to keeping the user and data separate.

Formulas should be used very sparingly within datasets.

It’s a good practice to avoid using formulas in data altogether. Raw data is static and should never change. Even if the data produced by the formulas will never change, the formulas may add significant calculation overhead to your workbook, especially for large datasets.

If a formula is absolutely necessary to create additional data within existing datasets, it’s a good idea to copy/paste values, or even use a VBA script to hard code the calculations.

Ensure proper formatting.

In datasets, make sure that numbers and dates are not stored as text. Text values tend to be difficult to work with in analysis.

Follow normal form, sometimes.

If you are not familiar with database normalization, understanding it may provide you with some ideas on how to structure data in your Excel application. While database normalization provides good guidance for maintaining a database, Excel is not a database management system. Normal form works well with relational databases where SQL is built to efficiently utilize it. Attempting to follow normal form with Excel will have mixed results, as may approaches will lead to additional system overhead. Regardless, normal form will provide a general framework on how to structure data within your Excel application.

Data Management Exercises

Adding New Data to Datasets in Excel

This blog’s first venture into VBA will provide an overview of a user-friendly method of inputting data into datasets.

Data Validation with Excel’s Built-in Data Validation Tool

Many developers will admit that when it comes to creating a powerful application, more effort can be committed to controlling user inputs than actual planning and design.

Data Validation with Excel’s Built-In Formulas

Today’s post continues the discussion on data validation, introducing a new approach that provides some level of flexibility.

Data Validation with VBA

With VBA, it’s possible to develop user input forms with validation features and responses similar to those found on professional websites and desktop applications.

Building a Custom Listbox in Excel without VBA

As discussed in previous posts, separating users from raw datasets is a cornerstone of excellent spreadsheet design. Today’s post provides a non-VBA solution to displaying a dynamic listbox of data to users.

Building a Custom Listbox in Excel with VBA

While the previous post provided an overview on a non-VBA solution to creating custom listboxes, implementing listboxes with VBA provides access to robust customization options and features.

Modifying Data in Spreadsheets with VBA

Modifying data on a spreadsheet while maintaining the integrity of the dataset can be difficult. Learn about a technique that can help manage modifications.

Leave a Reply