How to Use Table Functions in BigQuery

Jash Bhatt
5 min readNov 3, 2023

Introduction

If you are a BigQuery user, you probably know what views are. Views are virtual tables that are defined by a SQL query. They are useful for saving complex queries, hiding sensitive data, and creating logical layers of data.

But views have some limitations. For example, you can’t pass parameters to views, so you have to write different queries for different scenarios. You also can’t use views in DDL statements, such as CREATE TABLE.

That’s where table functions come in handy. Table functions are user-defined functions that return a table. You can use them anywhere that you can use a table, such as in the FROM clause of a SELECT statement, or in a JOIN clause.

In this story, I will show you how to create and use table functions in BigQuery, and why they are more fun and profitable than views.

Creating Table Functions

To create a table function, you use the CREATE TABLE FUNCTION statement. A table function contains a query that produces a table. The function returns the query result.

For example, let’s say you have a public dataset called bigquery-public-data.usa_names.usa_1910_current that contains baby names data from the US Social Security Administration. You want to create a table function that takes an INT64 parameter for the year and returns the most popular names for that year.

You can write the following table function:

CREATE OR REPLACE TABLE FUNCTION mydataset.popular_names_by_year(y INT64) AS (
SELECT year, name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_current`
WHERE year = y
GROUP BY year, name
ORDER BY total DESC
LIMIT 10
);

This table function takes an INT64 parameter called y and uses it in the WHERE clause of the query. The query returns the top 10 names by total number for the given year.

Using Table Functions

You can call a table function in any context where a table is valid. For example, you can call the table function in the FROM clause of a SELECT statement:

SELECT *
FROM mydataset.popular_names_by_year(2020)

This query returns the following result:

You can also join the output from a table function with another table. For example, let’s say you have another public dataset called bigquery-public-data.wikipedia.pageviews_2020 that contains Wikipedia pageviews data for the year 2020. You want to join this table with the table function output and see how many pageviews each name had on Wikipedia in 2020.

You can write the following query:

SELECT n.name, n.total AS babies, SUM(w.views) AS pageviews
FROM mydataset.popular_names_by_year(2020) AS n
JOIN `bigquery-public-data.wikipedia.pageviews_2020` AS w
ON n.name = w.title
GROUP BY n.name, n.total
ORDER BY pageviews DESC

This query joins the table function output with the Wikipedia pageviews table on the name column and aggregates the views by name. The query returns the following result:

As you can see, the table function makes it easy to reuse the same query logic with different parameters and join it with other tables.

Why Table Functions are Better than Views

Table functions have some benefits over views that make them more fun and profitable to use. Here are some of them in a nutshell:

  • Parameters: You can pass parameters to table functions, which makes them more flexible and dynamic.
  • DDL statement: You can use table functions in DDL statements, such as CREATE TABLE.
  • UDFs: You can use table functions in user-defined functions (UDFs).
  • Scripting: You can use table functions in scripting.

Table functions let you do more with less code, and save you time and money.

Limitations

Table functions are very useful, but they are not perfect. They have some common restrictions that you should be aware of before using them. Here are some of the limitations of table functions in BigQuery.

  • Parameters must be scalar values. Table functions in BigQuery cannot take tables as input parameters. This means that you cannot pass a table name or a subquery result to a table function. You can only pass simple values, such as numbers, strings, booleans, or dates.
  • The query body must be a SELECT statement and cannot modify anything. For example, data definition language (DDL) and data manipulation language (DML) statements are not allowed in table functions. You cannot create, alter, or delete tables or other objects inside a table function. You can only select data from existing tables or other sources.
  • Table functions must be stored in the same location as the tables they reference. This means that you cannot use a table function that is stored in one region to query data from another region. You have to make sure that your table function and your data are in the same location, such as US, EU, or ASIA.

Conclusion

Table functions are user-defined functions that return a table. They behave similarly to views, but they have some advantages. You can use them anywhere that you can use a table, such as in the FROM clause of a SELECT statement, or in a JOIN clause.

Table functions allow you to pass parameters, use them in DDL statements, UDFs, and scripting. They make your queries more flexible, dynamic, and reusable.

If you want to learn more about table functions, you can check out the official documentation.

I hope you enjoyed this story and learned something new. If you did, please share it with your friends and colleagues who might be interested in BigQuery and table functions. To keep the data-driven party going, do clap and follow me here on Medium and on LinkedIn.

Thank you for reading and have a great day!

--

--