SQL - jupyter-book-marimo

Hello, SQL!¶

Let’s dive into the world of SQL where we don’t just address tables, we also join them!

With marimo, you can mix-and-match both Python and SQL. To create a SQL cell, you first need to install some additional dependencies, including duckdb. Obtain these dependencies with

pip install 'marimo[sql]'

Creating SQL cells¶

Once the required dependencies are installed, you can create SQL cells in one of the following ways:

right click the Add Cell ::lucide:circle-plus:: buttons on the left of a cell;
click the Convert to SQL ::lucide:database:: button in the cell menu ::lucide:ellipsis::
click the Add SQL Cell at the bottom of the page;

Python representation¶

marimo is still just Python, even when using SQL. Here is an example of how marimo embeds SQL in Python in its file format:

output_df = mo.sql(f"SELECT * FROM my_table LIMIT {max_rows.value}")

Notice that we have an output_df variable in the cell. This is a resulting Polars DataFrame (if you have polars installed) or a Pandas DataFrame (if you don’t). One of them must be installed in order to interact with the SQL result.

The SQL statement itself is a formatted string (f-string), which means it can contain any valid Python code, such as the values of UI elements. This means your SQL statement and results can be reactive! 🚀

Querying dataframes with SQL¶

/// Tip | “Data sources panel”

Click the database "barrel" icon in the left toolbar to see all dataframes and in-
memory tables that your notebook has access to.

///

Let’s take a look at a SQL cell. The next cell generates a dataframe called df.

Next, we create a SQL query, referencing the Python dataframe df directly.

From Python to SQL and back¶

You can create SQL statements that depend on Python values, such as UI elements:

Since we named the output variable above result, we can use it back in Python.

CSVs, Parquet, Postgres, and more ...¶

We’re not limited to querying dataframes. We can also query an HTTP URL, S3 path, or a file path to a local csv or parquet file.

-- or
SELECT * FROM 's3://my-bucket/file.parquet';
-- or
SELECT * FROM read_csv('path/to/example.csv');
-- or
SELECT * FROM read_parquet('path/to/example.parquet');

With a bit of boilerplate, you can even read and write to Postgres, and join Postgres tables with dataframes in the same query. For a full list of supported data sources, check out the duckdb extensions and our example notebook on duckdb connections.

For this example, we will query an HTTP endpoint for a CSV.