Introduction

Each SQL dialect has slightly different syntax. This article goes through the specifics of Count's SQL syntax.

SELECT list

Syntax:

SELECT [{ ALL | DISTINCT }] { [ expression. ]* [ EXCEPT ( column_name [, ...] ) ] [ REPLACE ( expression [ AS ] column_name [, ...] ) ] | expression [ [ AS ] alias ] } [, ...]

The SELECT list defines the columns that the query will return. Expressions in the SELECT list can refer to columns in any of the from_items in its corresponding FROM clause.


Select *

SELECT * FROM TMDB_Movies

This will return all columns from the TMDB_Movies table.


Select expression

SELECT title, status, budget FROM TMDB_Movies

This will return the columns title, status, and budget from the TMDB_Movies table.


Select DISTINCT

SELECT DISTINCT status FROM TMDB_Movies

This will return only the unique status values from TMDB_Movies table.


Select EXCEPT

SELECT * EXCEPT unique_row_number FROM TMDB_Movies

A SELECT * EXCEPT statement specifies the names of one or more columns to exclude from the result. All matching column names are omitted from the output.


Select REPLACE

SELECT * REPLACE (revenue *0.8 as revenue) FROM TMDB_Movies

A SELECT * REPLACE statement specifies one or more expression AS identifier clauses. Each identifier must match a column name from the SELECT * statement. In the output column list, the column that matches the identifier in a REPLACE clause is replaced by the expression in that REPLACE clause. A SELECT * REPLACE statement does not change the names or order of columns. However, it can change the value and the value type.



See the Alias section to see how to rename your SELECT list.


FROM Clause

Syntax:

from_item: { table_name [ [ AS ] alias ] [ FOR SYSTEM_TIME AS OF timestamp_expression ] | join | ( query_expr ) [ [ AS ] alias ] | field_path | { UNNEST( array_expression ) | UNNEST( array_path ) | array_path } [ [ AS ] alias ] [ WITH OFFSET [ [ AS ] alias ] ] | with_query_name [ [ AS ] alias ] }

The FROM clause indicates the table or tables from which to retrieve rows, and specifies how to join those rows together to produce a single stream of rows for processing in the rest of the query.


TableName

SELECT * FROM TMDB_Movies


See the Alias section to see how to rename your FROM tables.

Join Types

Syntax:

join: from_item [ join_type ] JOIN from_item [ ON bool_expression | USING ( join_column [, ...] ) ] join_type: { INNER | CROSS | FULL [OUTER] | LEFT [OUTER] | RIGHT [OUTER] }

The JOIN clause merges two from_items so that the SELECT clause can query them as one source. The join_type and ON or USING clause (a "join condition") specify how to combine and discard rows from the two from_items to form a single source


[INNER] JOIN

An INNER JOIN, or simply JOIN, effectively calculates the Cartesian product of the two from_items and discards all rows that do not meet the join condition. "Effectively" means that it is possible to implement an INNER JOIN without actually calculating the Cartesian product.


CROSS JOIN

CROSS JOIN returns the Cartesian product of the two from_items. In other words, it combines each row from the first from_item with each row from the second from_item. If there are M rows from the first and N rows from the second, the result is M * N rows. Note that if either from_item has zero rows, the result is zero rows.


FULL [OUTER] JOIN

A FULL OUTER JOIN (or simply FULL JOIN) returns all fields for all rows in both from_items that meet the join condition.FULL indicates that all rows from both from_items are returned, even if they do not meet the join condition.OUTER indicates that if a given row from one from_item does not join to any row in the other from_item, the row will return with NULLs for all columns from the other from_item.


LEFT [OUTER] JOIN

The result of a LEFT OUTER JOIN (or simply LEFT JOIN) for two from_items always retains all rows of the left from_item in the JOIN clause, even if no rows in the right from_item satisfy the join predicate.LEFT indicates that all rows from the left from_item are returned; if a given row from the left from_item does not join to any row in the right from_item, the row will return with NULLs for all columns from the right from_item. Rows from the right from_item that do not join to any row in the left from_item are discarded.


RIGHT [OUTER] JOIN

The result of a RIGHT OUTER JOIN (or simply RIGHT JOIN) is similar and symmetric to that of LEFT OUTER JOIN.



ON clause

The ON clause contains a bool_expression. A combined row (the result of joining two rows) meets the join condition if bool_expression returns TRUE.


USING clause

The USING clause requires a column_list of one or more columns which occur in both input tables. It performs an equality comparison on that column, and the rows meet the join condition if the equality comparison returns TRUE.

ON and USING Equivalency

The ON and USING keywords are not equivalent, but they are similar. ON returns multiple columns, and USING returns one.


Where

Syntax:

WHERE bool_expression

The WHERE clause filters out rows by evaluating each row against bool_expression, and discards all rows that do not return TRUE (that is, rows that return FALSE or NULL).


You cannot reference column aliases from the SELECT list in the WHERE clause.


Examples:

SELECT title start_wars_films FROM TMDB_Moves WHERE title LIKE '%Star Wars'
SELECT title successful_sw_films FROM TMDB_Moves WHERE contains(title,'Star Wars') AND revenue >= 1e6
SELECT title reboot_films FROM TMDB_Movies WHERE release_date>= '2000-01-01'


Group By

Syntax:

GROUP BY { expression [, ...]}

The GROUP BY clause groups together rows in a table with non-distinct values for the expression in the GROUP BY clause. For multiple rows in the source table with non-distinct values for expression, the GROUP BY clause produces a single combined row. GROUP BY is commonly used when aggregate functions are present in the SELECT list, or to eliminate redundancy in the output. The data type of the expression must be groupable.


SELECT avg(revenue), year(release_date) year_released FROM TMDB_Movies GROUP BY year(release_date)


GROUP BY clauses may also refer to aliases. If a query contains aliases in the SELECT clause, those aliases override names in the corresponding FROM clause.

SELECT avg(revenue), year(release_date) year_released FROM TMDB_Movies GROUP BY year_released


The GROUP BY clause can refer to expression names in the SELECT list. The GROUP BY clause also allows ordinal references to expressions in the SELECT list using integer values. 1 refers to the first expression in the SELECT list, 2 the second, and so forth. The expression list can combine ordinals and expression names.

SELECT avg(revenue), year(release_date) year_released FROM TMDB_Movies GROUP BY 2


Having

Syntax:

HAVING bool_expression

The HAVING clause is similar to the WHERE clause: it filters out rows that do not return TRUE when they are evaluated against the bool_expression.As with the WHERE clause, the bool_expression can be any expression that returns a boolean, and can contain multiple sub-conditions.The HAVING clause differs from the WHERE clause in that:

  • The HAVING clause requires GROUP BY or aggregation to be present in the query.
  • The HAVING clause occurs after GROUP BY and aggregation, and before ORDER BY. This means that the HAVING clause is evaluated once for every aggregated row in the result set. This differs from the WHERE clause, which is evaluated before GROUP BY and aggregation.

The HAVING clause can reference columns available via the FROM clause, as well as SELECT list aliases. Expressions referenced in the HAVING clause must either appear in the GROUP BY clause or they must be the result of an aggregate function:


SELECT count(movie_id), collection FROM TMDB_Movies HAVING count(movie_id)>3


If a query contains aliases in the SELECT clause, those aliases override names in a FROM clause.

SELECT count(movie_id) movie_count, collection FROM TMDB_Movies HAVING movie_count>3


Order By

Syntax:

ORDER BY expression [{ ASC | DESC }] [, ...]

The ORDER BY clause specifies a column or expression as the sort criterion for the result set. If an ORDER BY clause is not present, the order of the results of a query is not defined. Column aliases from a FROM clause or SELECT list are allowed. If a query contains aliases in the SELECT clause, those aliases override names in the corresponding FROM clause.Optional Clauses

  • ASC | DESC: Sort the results in ascending or descending order of expression values. ASC is the default value.
SELECT title, revenue FROM TMDB_Movies ORDER BY revenue DESC

SELECT title, release_date FROM TMDB_Movies ORDER BY release_date ASC


Window

Syntax:

WINDOW named_window_expression [, ...] named_window_expression: named_window AS { named_window | ( [ window_specification ] ) }

A WINDOW clause defines a list of named windows. A named window represents a group of rows in a table upon which to use an analytic function. A named window can be defined with a window specification or reference another named window. If another named window is referenced, the definition of the referenced window must precede the referencing window.


SELECT title, vote_average, collection, LAST_VALUE(title)   OVER (collection_window) AS most_popular FROM TMDB_MOVIES WINDOW collection_window AS (   PARTITION BY collection   ORDER BY vote_average   ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)


Set Operators

Syntax:

UNION { ALL | DISTINCT } | INTERSECT DISTINCT | EXCEPT DISTINCT

Set operators combine results from two or more input queries into a single result set. You must specify ALL or DISTINCT; if you specify ALL, then all rows are retained. If DISTINCT is specified, duplicate rows are discarded.If a given row R appears exactly m times in the first input query and n times in the second input query (m >= 0, n >= 0):

  • For UNION ALL, R appears exactly m + n times in the result.
  • For UNION DISTINCT or UNION as shorthand, the DISTINCT is computed after the UNION is computed, so R appears exactly one time.
  • For INTERSECT DISTINCT, the DISTINCT is computed after the result above is computed.
  • For EXCEPT DISTINCT, row R appears once in the output if m > 0 and n = 0.
  • If there are more than two input queries, the above operations generalize and the output is the same as if the inputs were combined incrementally from left to right.

The following rules apply:

  • For set operations other than UNION ALL, all column types must support equality comparison.
  • The input queries on each side of the operator must return the same number of columns.
  • The operators pair the columns returned by each input query according to the columns' positions in their respective SELECT lists. That is, the first column in the first input query is paired with the first column in the second input query.
  • The result set always uses the column names from the first input query.
  • The result set always uses the supertypes of input types in corresponding columns, so paired columns must also have either the same data type or a common supertype.
  • You must use parentheses to separate different set operations; for this purpose, set operations such as UNION ALL and UNION DISTINCT are different. If the statement only repeats the same set operation, parentheses are not necessary.


SELECT * FROM (
SELECT sum(revenue)revenue, month_of_year(release_date) month_released, year(release_date)
FROM TMDB_Movies
WHERE year(release_date)= 2019)
UNION ALL(
SELECT sum(revenue)revenue, month_of_year(release_date) month_released, year(release_date)
FROM TMDB_Movies
WHERE year(release_date)= 2018)

UNION

The UNION operator combines the result sets of two or more input queries by pairing columns from the result set of each query and vertically concatenating them.

INTERSECT

The INTERSECT operator returns rows that are found in the result sets of both the left and right input queries. Unlike EXCEPT, the positioning of the input queries (to the left versus right of the INTERSECT operator) does not matter.

EXCEPT

The EXCEPT operator returns rows from the left input query that are not present in the right input query.


Limit

Syntax:

LIMIT count

LIMIT specifies a non-negative count of type INT64, and no more than count rows will be returned. LIMIT 0 returns 0 rows.If there is a set operation, LIMIT is applied after the set operation is evaluated.


Select title top_grossing_films, revenue FROM TMDB_Movies ORDER BY revenue DESC LIMIT 3


Aliases

An alias is a temporary name given to a table, column, or expression present in a query. You can introduce explicit aliases in the SELECT list or FROM clause, or BigQuery will infer an implicit alias for some expressions. Expressions with neither an explicit nor implicit alias are anonymous and the query cannot reference them by name.


Optional [AS]

SELECT DISTINCT title AS TITLE FROM TMDB_Movies movies_data


SELECT * FROM Singers as s, Songs as s2 ORDER BY Singers.LastName;  // INVALID.


SELECT list aliases

Aliases in the SELECT list are visible only to the following clauses:

  • GROUP BY clause
  • ORDER BY clause
  • HAVING clause


Did this answer your question?