Introduction

One of the unique parts of the Count notebook is that you can build queries using our own "notebook SQL" dialect. It offers the full power of the SQL language but it is optimized for a notebook interface. You can switch between this language, drag and drop and standard SQL at any time.

Syntax

When in code mode, you'll notice a series of SQL-like expressions. This is SQL optimized for a notebook analysis. Specifically, we have removed:

  • Keywords (SELECT, WHERE, FROM, etc.),
  • The need for nested statements or temporary tables,
  • FROM statements are implied by the other keywords (e.g. SELECT title FROM Movie -> Movie.title)

Lexical Structure

Like SQL, Notebook SQL uses expressions to create a query.


Expressions are:

[function] + TableName.column

where the function is optional.


Examples of expressions

  • a column from a table: Movie.director
  • a function applied over a column: sum(Movie.revenue), year(Movie.release_date)
  • a custom column: "Miami" "city" -> creates a new column called "city" where every row is the value "Miami"
  • conditional statements: case when Movie.collection is null then 'not_franchise' else 'franchise' end -> returns a new column whose value is either 'franchise' or 'not_franchise' depending on the value of the collection column
  • a combination of columns: Movie.revenue - Movie.budget


Query Syntax:

Similar to SQL, when adding expressions to a query, you should do so in the following order:


[Select Expressions], [Filter expressions], [order by expression asc/desc] [limit n]


Example:

Select Expressions: Movie.director, sum(Movie.revenue),

Filter Expressions: Movie.best_picture = 'nominated'

Order By Expression: order by sum(Movie.revenue) desc


which will return the total revenue for each director, for films that have been nominated for best picture at the Oscars. The resulting table will be sorted from the director with the highest total revenue to the smallest.


These are expressions that reflect which rows should/should not be included in the query.

  • Movie.director

These are the expressions that are going to be visible in your resulting table. Make sure you include it in this section if you want to see it in the chart!

  • Conditional columns


  • Columns from the original tables:

Select Expressions [SELECT]

These are the expressions that are going to be visible in your resulting table. Make sure you include it in this section if you want to see it in the chart!

These can be:

  • standard column selections: Movie.director,
  • column index selections: Movie[1:3] will select the first 3 columns in the Movie table
  • aggregated columns: sum(Movie.revenue)
  • transformed columns: year(Movie.release_date)
  • conditional columns: case when Movie.collection is null then "not_franchise" else "franchise" end
  • combinations of columns: Movie.revenue - Movie.budget


Filter Expressions [WHERE]

Filter expressions dictate what rows should/should not be included in the executing the query.

They are composed of expressions + operators + condition.


Example:

  • Movie.best_picture = 'nominated'
  • year(Movie.release_date) >= 2000
  • Movie.collection in ("The Avengers Collection", "James Bond Collection")
  • Movie.best_picture != 'not_nominated', Movie.language!= 'en'
  • Movie.vote_count is not null


Operators:

Equality:

=
!=
is null
is not null

Inequality:

>
<
>=
<=

Part of Whole:

in (option 1, ..., option n)
not in (option 1, ..., option n)


And vs Or:

Filters will meet all conditions unless otherwise specified with an OR:

Movie.revenue is not null OR Movie.budget is not null will return movies that either have revenue OR budget as not nulls.


Order By Expressions [ORDER BY]

Sorting the resulting table is always doable by clicking the column header in the table, but if you wanted to type it out, you can always do:

order by expression [asc/desc]

ASC will put things in ascending order, DESC will put things in descending order.


Examples:

  • order by sum(Movie.revenue) desc
  • order by Movie.title asc

Limit Expressions [LIMIT]

After your order by expression, you can add a limit expression that will only show a subset of results. Simply add the following to the end of your query:

limit n

where n is the number of rows returned.

Examples:

  • order by sum(Movie.revenue) desc limit 1
  • order by Movie.title asc limit 100

Examples:


Did this answer your question?