Configure additional table settings

This document shows you how to configure additional table settings in a table definition file. With Dataform core, you can define pre_operations and post_operations to execute a SQL statement before or after table creation. You can also override table settings, such as database or schema, and disable table creation.

Before you begin

  1. In the Google Cloud console, go to the Dataform page.

    Go to Dataform

  2. Select or create a repository.

  3. Select or create a development workspace.

  4. Create a table.

Required roles

To get the permissions that you need to configure additional table settings, ask your administrator to grant you the Dataform Editor (roles/dataform.editor) IAM role on workspaces. For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Override table settings

You can override the schema, database, and name of a selected table.

By default, a table follows the schema and database configuration you set in dataform.json. The name of a table is the same as the name of the table definition SQLX file.

To override the schema and name of a selected table, follow these steps:

  1. Go to your development workspace.

  2. In the Files pane, expand definitions/.

  3. Open a SQLX table definition file.

  4. In the config block, enter the following code snippet:

     {
       schema: "OVERRIDDEN_SCHEMA",
       database: "OVERRIDDEN_DATABASE",
       name: "OVERRIDDEN_NAME"
     }
    

    Replace the following:

    • OVERRIDDEN_SCHEMA: the BigQuery dataset in which you want to create the table

    • OVERRIDDEN_DATABASE: the ID of the BigQuery project in which you want to create the table

    • OVERRIDDEN_NAME: the name for the table, different from the SQLX table definition filename

  5. Optional: Click Format.

Reference a table with an overridden table name

  • To reference a table with an overridden table name, in the ref function, enter the overridden table name set in name: "".

The following code sample references a table with name overridden to overridden_name:

  SELECT * FROM ${ref("overridden_name")}

Define a SQL statement to be executed before table creation

You can configure Dataform to execute one or more SQL statements before creating a selected table in BigQuery. To execute a SQL statement before Dataform creates a selected table, add your statement to the pre_operations block in the table definition SQLX file.

To create a custom SQL statement executed before Dataform creates a specific table, follow these steps:

  1. Go to your development workspace.

  2. In the Files pane, expand definitions/.

  3. Open a SQLX table definition file.

  4. Outside the config block, enter pre_operations { ... }.

  5. Inside pre_operations { ... }, add your SQL statement.

  6. Optional: To add multiple statements, separate them with ---.

  7. Optional: Click Format.

The following code sample shows a pre_operations statements that creates a temporary function that can be used in the select statement:

  pre_operations {
    CREATE TEMP FUNCTION AddFourAndDivide(x INT64, y INT64)
      RETURNS FLOAT64
      AS ((x + 4) / y);
  }

Define a SQL statement to be executed after table creation

You can configure Dataform to execute one or more SQL statements after creating a selected table in BigQuery. To execute a SQL statement after Dataform creates a selected table, add your statement to the post_operations block in the table definition SQLX file. You can add multiple SQL statements to the post_operations block.

To create a custom SQL statement executed after Dataform creates a specific table, follow these steps:

  1. Go to your development workspace.

  2. In the Files pane, expand definitions/.

  3. Open a SQLX table definition file.

  4. Outside the config block, enter post_operations { ... }.

  5. Inside post_operations { ... }, add your SQL statement.

  6. Optional: Click Format.

The following code sample shows post_operations statements that grant groups access to the created table:

    post_operations {
      GRANT `roles/bigquery.dataViewer`
      ON
      TABLE ${self()}
      TO "group:[email protected]", "user:[email protected]"
    }

Disable table creation

To stop Dataform from creating a selected table in BigQuery, you can disable the table in its SQLX table definition file. Dataform keeps a disabled table in the dependency graph, but does not compile and create it. This can be useful, for example, if a table fails and you don't want your whole workflow to fail while you fix the issue.

To disable a table, follow these steps:

  1. Go to your development workspace.

  2. In the Files pane, expand definitions/.

  3. Select a SQLX table definition file.

  4. In the config block of the file, enter disabled: true.

  5. Optional: Click Format.

The following code sample shows a disabled table:

  config {
    type: "table",
    disabled: true
  }

  select * from ${ref("source_data")}

What's next