SQL

Hello, SQL!

Let’s dive into the world of SQL where we don’t just address tables, we also join them!

With marimo, you can mix-and-match both Python and SQL. To create a SQL cell, you first need to install some additional dependencies, including duckdb. Obtain these dependencies with

pip install 'marimo[sql]'

Creating SQL cells

Once the required dependencies are installed, you can create SQL cells in one of the following ways:

  • right click the Add Cell buttons on the left of a cell;
  • click the Convert to SQL button in the cell menu
  • click the Add SQL Cell at the bottom of the page;

Python representation

marimo is still just Python, even when using SQL. Here is an example of how marimo embeds SQL in Python in its file format:

output_df = mo.sql(f"SELECT * FROM my_table LIMIT {max_rows.value}")

Notice that we have an output_df variable in the cell. This is a resulting Polars DataFrame (if you have polars installed) or a Pandas DataFrame (if you don’t). One of them must be installed in order to interact with the SQL result.

The SQL statement itself is an formatted string (f-string), so this means they can contain any valid Python code, such as the values of UI elements. This means your SQL statement and results can be reactive! 🚀

Querying dataframes with SQL

TipData sources panel

Click the database “barrel” icon in the left toolbar to see all dataframes and in-memory tables that your notebook has access to.

Let’s take a look at a SQL cell. The next cell generates a dataframe called df.

Next, we create a SQL query, refercing the Python dataframe df directly.

-- This SQL cell is special since we can reference existing dataframes in the global scope as a table in the SQL query. For example, we can reference the `df` dataframe in the global scope, which was defined in another cell using Python.

SELECT * FROM df;

-- By default, the output variable starts with an underscore (`_df`), making it private to this cell. To access the query result in another cell, change the name of the output variable.

From Python to SQL and back

You can create SQL statements that depend on Python values, such as UI elements:

-- Change the dropdown to see the SQL query filter itself!
--
-- Here we use a duckdb function called `starts_with`:
SELECT * FROM df WHERE starts_with(token, '{token_prefix.value}')

-- Notice that we named the output variable `result`

Since we named the output variable above result, we can use it back in Python.

CSVs, Parquet, Postgres, and more …

We’re not limited to querying dataframes. We can also query an HTTP URL, S3 path, or a file path to a local csv or parquet file.

-- or
SELECT * FROM 's3://my-bucket/file.parquet';
-- or
SELECT * FROM read_csv('path/to/example.csv');
-- or
SELECT * FROM read_parquet('path/to/example.parquet');

With a bit of boilerplate, you can even read and write to Postgres, and join Postgres tables with dataframes in the same query. For a full list of supported data sources, check out the duckdb extensions and our example notebook on duckdb connections.

For this example, we will query a dataframe loaded from a csv.

-- Download a CSV and create an in-memory table; this is optional.
CREATE OR replace TABLE cars as (
    SELECT * FROM cars_df
);

-- Query the table
SELECT * from cars;
SELECT * FROM cars
WHERE
    Cylinders >= {cylinders_dropdown.value[0]}
    AND
    Cylinders <= {cylinders_dropdown.value[1]}
    AND
    ORIGIN = '{origin_dropdown.value}'
Back to top