What is Low Base Data and How Do You Hide It?

This support article describes how the Hide Low Bases feature works within survey data analysis and how it is presented in different chart types.

Link to the Glossary

Please refer to the above Glossary while reading to understand some common terms used in this article.

Low Base Threshold

Every client site has an attribute called the “low base threshold”, which is set by the client. It gives users a visual indication when the data they are viewing has a low base size. To review, a base is the count of respondents who saw a question response. A data point whose base size is below the threshold is not invalid; it simply represents a very small sample and should be assessed with that in mind. The platform's current default threshold is 100.
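The rule above can be sketched in a few lines of Python. This is only an illustration, not KnowledgeHound's implementation: the function name `is_low_base` is hypothetical, and treating a base exactly equal to the threshold as not low is an assumption drawn from the phrase “below the threshold”.

```python
# Platform default stated in this article; each client can set its own value.
DEFAULT_LOW_BASE_THRESHOLD = 100


def is_low_base(base, threshold=DEFAULT_LOW_BASE_THRESHOLD):
    """Return True when a base size counts as low base.

    Assumption: only bases strictly below the threshold are low base.
    """
    return base < threshold


print(is_low_base(85))                # low under the default threshold of 100
print(is_low_base(85, threshold=40))  # not low for a client whose threshold is 40
```

The same data point can therefore be flagged on one client site and not on another, depending on the threshold each client has chosen.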

Let's see how Hide Low Bases works in a two-variable spreadsheet case

For example, let's say the question “What is your Gender?” was cross-tabbed by the question “What is your favorite color?”. For this client, the low base threshold has been set to 40.

Looking at the Base row, you can see that the “Blue”, “Green” and “Red” columns all have bases above 40, whereas the “Yellow” and “Orange” columns both have bases below 40. In the spreadsheet visualization, low base data is shown in red, and the low base icon is placed in the base row cell. As a user viewing this spreadsheet, you can see that all the data in the “Yellow” and “Orange” columns has a low base size and should be analyzed with caution.

Looking at that same example, if the user turns on “Hide Low Bases”, all low base data is removed from the visualization so that the user can see only the most valuable data.

In this visualization the “Yellow” and “Orange” columns have been completely removed.
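As a rough sketch of the column-level behavior in this example, the Python below filters columns by their base. The base counts are made up for illustration (the article only states which columns fall above or below 40), and `visible_columns` is a hypothetical helper, not part of the platform.

```python
LOW_BASE_THRESHOLD = 40  # the threshold set for this example client

# Hypothetical base counts per column; only their relation to 40 comes from the article.
column_bases = {"Blue": 120, "Green": 85, "Red": 60, "Yellow": 22, "Orange": 15}


def visible_columns(bases, threshold, hide_low_bases):
    """Return the columns to display, in their original order."""
    if not hide_low_bases:
        # Low base columns stay visible (shown in red with the low base icon).
        return list(bases)
    return [column for column, base in bases.items() if base >= threshold]


print(visible_columns(column_bases, LOW_BASE_THRESHOLD, hide_low_bases=False))
print(visible_columns(column_bases, LOW_BASE_THRESHOLD, hide_low_bases=True))
```

With the toggle off, all five columns are returned; with it on, only “Blue”, “Green” and “Red” remain.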

Let's look at a more complicated spreadsheet example

Things can get more complicated when more variables are added, but the concepts and the visual indications remain the same. When viewing a spreadsheet with two row variables (“What is your Gender?” and “Language”) and one column variable (“What is your favorite color?”), you get the following visual:

In this visual, all data in the “Red” column is low base (base size less than 40) and so the entire column is in red text. The “Green” column is a bit more interesting here. The top half of the data, representing respondents who selected both “English” and “Green”, is not low base, that is, each cell has a base size above 40. The bottom half of this column (“Green” and “French”) shows data that does have a low base, that is, each cell has a base size below 40.

Looking at that same example, if the user turns on “Hide Low Bases”, all low base data is removed from the visualization so that the user can see only the most valuable data.

In this case the entire “Red” column is dropped, but only the bottom half of the data from the “Green” column is hidden. When a user turns on “Hide Low Bases”, only data that has a low base size is removed or hidden.
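The partial hiding described here can be sketched by checking each language and color combination separately. The base counts and the `hide_low_bases` function below are hypothetical; the sketch only assumes that a column disappears once none of its cells remain.

```python
LOW_BASE_THRESHOLD = 40

# Hypothetical bases keyed by (language, color); only their relation to 40 follows the article.
cell_bases = {
    ("English", "Blue"): 90, ("French", "Blue"): 75,
    ("English", "Green"): 60, ("French", "Green"): 25,
    ("English", "Red"): 30, ("French", "Red"): 12,
}


def hide_low_bases(bases, threshold):
    """Keep only cells at or above the threshold and list the columns that survive."""
    kept = {key: base for key, base in bases.items() if base >= threshold}
    columns = []
    for _language, color in kept:
        if color not in columns:
            columns.append(color)
    return kept, columns


kept_cells, kept_columns = hide_low_bases(cell_bases, LOW_BASE_THRESHOLD)
print(kept_columns)       # "Red" is gone because every Red cell was low base
print(sorted(kept_cells)) # "Green" survives only for English
```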

Bar/Column Chart

Let’s view the exact same data in a bar chart.

In the KnowledgeHound platform, low base data is represented using a diagonal striped fill in bar/column charts. You can see here that all of the “Red” series data is low base, and so each red bar is filled with the diagonal red stripes. Since only the “Language equals French” data is low base for the “Green” series, only half of these bars have the low base pattern applied.

Looking at that same example, if the user turns on “Hide Low Bases”, all low base data is removed from the visualization so that the user can see only the most valuable data.

In this visual all of the low base data has either been removed or hidden from view. The “Red” series was all low base data, so the entire series has been removed from the graph. In the case of the “Green” series, the “English” data remains, since it was not low base, but the “French” data has been removed. The entire “Blue” series is not low base, so it remains untouched.

Stacked Chart

Let’s view the exact same data in a stacked column chart.

In this visual both “Red” bars (“English - Red” and “French - Red”) are shown to have low base sizes, as is the “French - Green” bar. All the data in these categories has a base size below 40. Similar to the bar chart, KnowledgeHound denotes this using the diagonal striped fill for those bars.

Looking at that same example, if the user turns on “Hide Low Bases”, all low base data is hidden within the visualization so that the user can see only the most valuable data.

Here the “English - Red”, “French - Red” and “French - Green” bars have all been hidden in the visual.

Line Chart

Let’s view similar data in a line chart.

In this chart you can identify the low base data points by their hollow markers and red data labels. You can see that all data points in the “Male-Red” and “Female-Red” series are low base. Additionally, the “Male-Green” and “Female-Green” data points for “Q2-2020” also have a low base.

Looking at that same example, if the user turns on “Hide Low Bases”, all low base data is removed from the visualization so that the user can see only the most valuable data.

You can see that the “Male-Red” and “Female-Red” series are completely removed, and the “Q2-2020” data points for “Male-Green” and “Female-Green” have also been dropped from the visual.
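For line charts, the same idea applies per data point. The sketch below is an assumption-laden illustration: the base counts, the “Q1-2020” period, the “Male-Blue” series, and the `hide_low_base_points` function are all hypothetical, chosen only to mirror the pattern described above.

```python
LOW_BASE_THRESHOLD = 40

# Hypothetical bases per series and period.
series_bases = {
    "Male-Blue": {"Q1-2020": 80, "Q2-2020": 72},
    "Male-Green": {"Q1-2020": 55, "Q2-2020": 31},
    "Male-Red": {"Q1-2020": 20, "Q2-2020": 18},
}


def hide_low_base_points(bases, threshold):
    """Drop low base points; a series with no remaining points is removed entirely."""
    result = {}
    for name, points in bases.items():
        kept = [period for period, base in points.items() if base >= threshold]
        if kept:
            result[name] = kept
    return result


print(hide_low_base_points(series_bases, LOW_BASE_THRESHOLD))
```

Here “Male-Red” disappears as a whole series, while “Male-Green” keeps its first point and loses only “Q2-2020”.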

Summary

To wrap up, every client site in KnowledgeHound has an attribute called the “low base threshold”, a number set by the client as a general cutoff for categorizing low base data. Each chart type in the KnowledgeHound platform has its own way of displaying low base data, so that the user is aware of its presence in a visual. If a user would like to remove or hide the low base data completely, they can turn on the “Hide Low Bases” feature using the button in the action bar, and that data will be removed from any visual.