It’s not uncommon for organizations that have migrated to Google Cloud Platform (GCP) to encounter sticker shock.
After all, one of the major selling points of GCP, and of all public clouds, is the ability to reduce your costs compared to managing your own datacenter on-premises or in a colocation facility.
There are a number of reasons why costs on GCP may be higher than expected, including:
Still, while these issues can certainly add to your overall costs, in our experience the main culprit for skyrocketing expenses in GCP is also one of its most powerful features: BigQuery.
It’s also a single tool for both storage and compute in GCP. Not only is it compatible with a wide range of popular tools (Looker, Power BI, Tableau), it can also be accessed via an array of methods, including:
Combined, these and other features add up to a uniquely powerful analytics platform for enterprises. But in order to reap the benefits of BigQuery—and avoid running up costs—you need to use it as it’s been designed to be used.
In other words, the ways you’re used to running queries in a traditional database are not going to be effective in BigQuery. In fact, they can be outright detrimental to your business.
The first thing to know about BigQuery is that, under its default on-demand pricing, you’re charged for two things: the amount of data you store and the amount of data your queries process.
The second thing to know is that, unlike traditional databases, BigQuery is column-based.
Because BigQuery stores each column separately, a query reads only the columns it references, which lets it return analytical results very quickly compared to row-oriented traditional databases.
The tradeoff, though, is that unless you limit the number of columns you ask BigQuery to read, it’s easy to overspend. That old database habit of simply using SELECT * forces BigQuery to scan every column in the table whether or not you need it, at a cost you definitely don’t want.
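To make the column effect concrete, here is a rough back-of-the-envelope sketch in Python. The table layout, column names, per-row byte sizes, and row count are all hypothetical values chosen for illustration, and the calculation is a simplification of how a columnar engine accounts for scanned data, not BigQuery’s actual billing logic.

```python
# Hypothetical table layout: assumed average stored bytes per row, per column.
COLUMN_BYTES_PER_ROW = {
    "order_id": 8,
    "customer_id": 8,
    "order_ts": 8,
    "total_amount": 8,
    "notes": 200,          # wide free-text column
    "raw_payload": 1_500,  # wide serialized-blob column
}


def bytes_scanned(columns, row_count, layout=COLUMN_BYTES_PER_ROW):
    """Approximate the bytes a columnar engine reads for the requested columns."""
    if columns == "*":
        columns = list(layout)  # SELECT * touches every column
    return sum(layout[name] for name in columns) * row_count


ROWS = 100_000_000  # assumed table size

full_scan = bytes_scanned("*", ROWS)
narrow_scan = bytes_scanned(["order_id", "total_amount"], ROWS)

print(f"SELECT *            : {full_scan / 1e9:.1f} GB")
print(f"SELECT two columns  : {narrow_scan / 1e9:.1f} GB")
```

With these assumed sizes, the query that names two columns reads roughly 1% of the data that SELECT * does, because the wide columns are never touched.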
One of the best ways to start cutting your costs is to use the built-in cost control measures of BigQuery.
For specific projects, you can set a soft limit on spending by setting up a budget alert that will hit your administrator’s inbox as you near your monthly budget. You can also set hard limits on query usage per project or per user using custom quotas.
Beyond setting limitations, BigQuery provides you with relatively easy cost-monitoring tools via the Cloud Console dashboard.
Another critical step is to use the query validator or a dry run to estimate how much data a query will process, and therefore what it will cost, before you actually run it.
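As a minimal sketch of a dry run from code, the snippet below uses the google-cloud-bigquery Python client, which must be installed and authenticated separately. The price per TiB is a placeholder assumption (check the current pricing page for your region), and the helper names `estimate_cost_usd` and `dry_run_bytes` are our own, not part of the library.

```python
TIB = 1024 ** 4

# Placeholder on-demand rate in USD per TiB; an assumption, not a quoted price.
ASSUMED_USD_PER_TIB = 5.0


def estimate_cost_usd(bytes_processed, usd_per_tib=ASSUMED_USD_PER_TIB):
    """Convert a processed-bytes figure into a rough on-demand cost estimate."""
    return bytes_processed / TIB * usd_per_tib


def dry_run_bytes(sql):
    """Ask BigQuery how many bytes a query would process, without running it."""
    # Imported lazily so the cost helper above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # relies on application default credentials
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=job_config)
    # For a dry run, no data is read; the job only reports the estimate.
    return job.total_bytes_processed
```

You could then call `dry_run_bytes` with your query text and pass the result to `estimate_cost_usd` to get a rough dollar figure before deciding whether to run the query or trim its column list first.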
As for setting up your data sets for queries, here are some important tips to follow:
Together, these tips and tools will go a long way toward helping you avoid unexpected costs in GCP.
They will also help you fully leverage BigQuery for rapid, quality results as you mine your data for insights.
For more on avoiding GCP sticker shock, check out our offerings for governance and cost control. You can also contact one of our experts to get started right away.