SQL

Hello, SQL!

Let’s dive into the world of SQL where we don’t just address tables, we also join them!

With marimo, you can mix-and-match both Python and SQL. To create a SQL cell, you first need to install some additional dependencies, including duckdb. Obtain these dependencies with

pip install 'marimo[sql]'

Creating SQL cells

Once the required dependencies are installed, you can create SQL cells in one of the following ways:

right click the Add Cell buttons on the left of a cell;
click the Convert to SQL button in the cell menu
click the Add SQL Cell at the bottom of the page;

Python representation

marimo is still just Python, even when using SQL. Here is an example of how marimo embeds SQL in Python in its file format:

output_df = mo.sql(f"SELECT * FROM my_table LIMIT {max_rows.value}")

Notice that we have an output_df variable in the cell. This is a resulting Polars DataFrame (if you have polars installed) or a Pandas DataFrame (if you don’t). One of them must be installed in order to interact with the SQL result.

The SQL statement itself is an formatted string (f-string), so this means they can contain any valid Python code, such as the values of UI elements. This means your SQL statement and results can be reactive! 🚀

Querying dataframes with SQL

Data sources panel

Click the database “barrel” icon in the left toolbar to see all dataframes and in-memory tables that your notebook has access to.

Let’s take a look at a SQL cell. The next cell generates a dataframe called df.

Next, we create a SQL query, refercing the Python dataframe df directly.

-- This SQL cell is special since we can reference existing dataframes in the global scope as a table in the SQL query. For example, we can reference the `df` dataframe in the global scope, which was defined in another cell using Python.

SELECT * FROM df;

-- By default, the output variable starts with an underscore (`_df`), making it private to this cell. To access the query result in another cell, change the name of the output variable.

From Python to SQL and back

You can create SQL statements that depend on Python values, such as UI elements:

-- Change the dropdown to see the SQL query filter itself!
--
-- Here we use a duckdb function called `starts_with`:
SELECT * FROM df WHERE starts_with(token, '{token_prefix.value}')

-- Notice that we named the output variable `result`

Since we named the output variable above result, we can use it back in Python.

CSVs, Parquet, Postgres, and more …

We’re not limited to querying dataframes. We can also query an HTTP URL, S3 path, or a file path to a local csv or parquet file.

-- or
SELECT * FROM 's3://my-bucket/file.parquet';
-- or
SELECT * FROM read_csv('path/to/example.csv');
-- or
SELECT * FROM read_parquet('path/to/example.parquet');

With a bit of boilerplate, you can even read and write to Postgres, and join Postgres tables with dataframes in the same query. For a full list of supported data sources, check out the duckdb extensions and our example notebook on duckdb connections.

For this example, we will query a dataframe loaded from a csv.

-- Download a CSV and create an in-memory table; this is optional.
CREATE OR replace TABLE cars as (
    SELECT * FROM cars_df
);

-- Query the table
SELECT * from cars;

SELECT * FROM cars
WHERE
    Cylinders >= {cylinders_dropdown.value[0]}
    AND
    Cylinders <= {cylinders_dropdown.value[1]}
    AND
    ORIGIN = '{origin_dropdown.value}'

%23%20We%20use%20pandas%20for%20the%20CSV%20fetch%20because%20DuckDB%20in%20Pyodide%20does%20not%20support%20HTTPFS%20extensions%0Adef%20read_cars_df()%3A%0A%20%20%20%20import%20pandas%20as%20pd%0A%0A%20%20%20%20try%3A%0A%20%20%20%20%20%20%20%20return%20pd.read_csv(%22https%3A%2F%2Fdatasets.marimo.app%2Fcars.csv%22)%0A%20%20%20%20except%20Exception%3A%0A%20%20%20%20%20%20%20%20%23%20datasets.marimo.app%20returns%20403%20from%20time%20to%20time%2C%20so%20we%20add%20this%20as%20a%20fallback%0A%20%20%20%20%20%20%20%20return%20pd.DataFrame(%0A%20%20%20%20%20%20%20%20%20%20%20%20%5B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Name%22%3A%20%22Honda%20Civic%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Origin%22%3A%20%22Asia%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Cylinders%22%3A%204%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_City%22%3A%2030%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_Highway%22%3A%2038%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Name%22%3A%20%22Toyota%20Camry%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Origin%22%3A%20%22Asia%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Cylinders%22%3A%204%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_City%22%3A%2028%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_Highway%22%3A%2039%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Name%22%3A%20%22Ford%20F-150%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Origin%22%3A%20%22USA%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Cylinders%22%3A%206%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_City%22%3A%2020%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_Highway%22%3A%2026%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Name%22%3A%20%22BMW%20328i%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Origin%22%3A%20%22Europe%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Cylinders%22%3A%204%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_City%22%3A%2023%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_Highway%22%3A%2034%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7B%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Name%22%3A%20%22Mazda%20MX-5%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Origin%22%3A%20%22Asia%22%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22Cylinders%22%3A%204%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_City%22%3A%2026%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%22MPG_Highway%22%3A%2034%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%20%7D%2C%0A%20%20%20%20%20%20%20%20%20%20%20%20%5D%0A%20%20%20%20%20%20%20%20)%0A%0Acars_df%20%3D%20read_cars_df()

--- title: SQL marimo-version: 0.23.1 width: medium header: |- # Copyright 2026 Marimo. All rights reserved pyproject: | requires-python = ">=3.11" dependencies = [ "marimo[sql]>=0.23.1", "altair", "matplotlib", "pandas", "plotly", ] --- # Hello, SQL! _Let's dive into the world of SQL where we don't just address tables, we also join them!_  With marimo, you can mix-and-match both **Python and SQL**. To create a SQL cell, you first need to install some additional dependencies, including [duckdb](https://duckdb.org/). Obtain these dependencies with ```bash pip install 'marimo[sql]' ``` ```python {.marimo hide_code="true"} has_duckdb_installed = False try: import duckdb has_duckdb_installed = True except ImportError: pass has_polars_installed = False try: import polars has_polars_installed = True except ImportError: pass has_pandas_installed = False try: import pandas has_pandas_installed = True except ImportError: pass ``` ```python {.marimo hide_code="true"} if has_duckdb_installed: mo.output.replace( mo.md( """ /// Tip | "Installed" If you see this, DuckDB is already installed. /// """ ) ) else: mo.output.replace( mo.md( """ /// Warning | "Not Installed" If you see this, DuckDB is not installed. /// """ ) ) ``` ## Creating SQL cells Once the required dependencies are installed, you can create SQL cells in one of the following ways: - right click the **Add Cell** buttons on the left of a cell; - click the **Convert to SQL** button in the cell menu - click the **Add SQL Cell** at the bottom of the page; ## Python representation marimo is still just Python, even when using SQL. Here is an example of how marimo embeds SQL in Python in its file format: ```python output_df = mo.sql(f"SELECT * FROM my_table LIMIT {max_rows.value}") ``` Notice that we have an **`output_df`** variable in the cell. This is a resulting Polars DataFrame (if you have `polars` installed) or a Pandas DataFrame (if you don't). One of them must be installed in order to interact with the SQL result. The SQL statement itself is an formatted string (f-string), so this means they can contain any valid Python code, such as the values of UI elements. This means your SQL statement and results can be reactive! 🚀  ## Querying dataframes with SQL  ::: {.callout-tip title="Data sources panel"} Click the database "barrel" icon in the left toolbar to see all dataframes and in-memory tables that your notebook has access to. :::  Let's take a look at a SQL cell. The next cell generates a dataframe called `df`. ```python {.marimo hide_code="true"} _SIZE = 1000 def _create_token_data(n_items=100): import random import string def generate_random_string(length): letters = string.ascii_lowercase result_str = "".join(random.choice(letters) for i in range(length)) return result_str def generate_random_numbers(mean, std_dev, num_samples): return [int(random.gauss(mean, std_dev)) for _ in range(num_samples)] random_numbers = generate_random_numbers(50, 15, n_items) random_strings = sorted( list(set([generate_random_string(3) for _ in range(n_items)])) ) return { "token": random_strings, "count": random_numbers[: len(random_strings)], } _data = _create_token_data(_SIZE) # Try polars if has_polars_installed: import polars as pl df = pl.DataFrame(_data) # Fallback to pandas (maybe trying to install it) else: import pandas as pd df = pd.DataFrame(_data) ``` Next, we create a SQL query, refercing the Python dataframe `df` directly. ```sql {.marimo} -- This SQL cell is special since we can reference existing dataframes in the global scope as a table in the SQL query. For example, we can reference the `df` dataframe in the global scope, which was defined in another cell using Python. SELECT * FROM df; -- By default, the output variable starts with an underscore (`_df`), making it private to this cell. To access the query result in another cell, change the name of the output variable. ``` ## From Python to SQL and back  You can create SQL statements that depend on Python values, such as UI elements: ```python {.marimo hide_code="true"} token_prefix = mo.ui.dropdown( list(string.ascii_lowercase), label="token prefix", value="a" ) token_prefix ``` ```sql {.marimo query="result"} -- Change the dropdown to see the SQL query filter itself! -- -- Here we use a duckdb function called `starts_with`: SELECT * FROM df WHERE starts_with(token, '{token_prefix.value}') -- Notice that we named the output variable `result` ``` Since we named the output variable above **`result`**, we can use it back in Python. ```python {.marimo hide_code="true"} charting_library = mo.ui.radio(["matplotlib", "altair", "plotly"]) mo.md( f""" Let's chart the result with a library of your choice: {charting_library} """ ) ``` ```python {.marimo hide_code="true"} _header = mo.md( f""" We can re-use the dropdown from above: {token_prefix} Now we have a histogram visualizing the token count distribution of tokens starting with {token_prefix.value}, powered by your SQL query and UI element. """ ) render_chart( charting_library.value, _header ) if charting_library.value else None ``` ```python {.marimo hide_code="true"} def render_chart(charting_library, header): return mo.vstack( [header, render_charting_library(charting_library)] ).center() def render_charting_library(charting_library): if charting_library == "matplotlib": return render_matplotlib() if charting_library == "altair": return render_altair() if charting_library == "plotly": return render_plotly() def render_matplotlib(): import matplotlib.pyplot as plt plt.hist(result["count"], label=token_prefix.value) plt.xlabel("token count") plt.legend() plt.tight_layout() return plt.gcf() def render_altair(): import altair as alt chart = ( alt.Chart(result) .mark_bar() .encode(x=alt.X("count", bin=True), y=alt.Y("count()")) ) return mo.ui.altair_chart(chart, chart_selection=False) def render_plotly(): import plotly.graph_objects as go return go.Figure(data=[go.Histogram(x=result["count"])]) ``` ## CSVs, Parquet, Postgres, and more ...  We're not limited to querying dataframes. We can also query an **HTTP URL, S3 path, or a file path to a local csv or parquet file**. ```sql -- or SELECT * FROM 's3://my-bucket/file.parquet'; -- or SELECT * FROM read_csv('path/to/example.csv'); -- or SELECT * FROM read_parquet('path/to/example.parquet'); ``` With a bit of boilerplate, you can even read and write to **Postgres**, and join Postgres tables with dataframes in the same query. For a full list of supported data sources, check out the [duckdb extensions](https://duckdb.org/docs/extensions/overview) and our [example notebook on duckdb connections](https://github.com/marimo-team/marimo/blob/main/examples/sql/duckdb_connections.**py**). For this example, we will query a dataframe loaded from a csv. ```python {.marimo hide_code="true"} # We use pandas for the CSV fetch because DuckDB in Pyodide does not support HTTPFS extensions def read_cars_df(): import pandas as pd try: return pd.read_csv("https://datasets.marimo.app/cars.csv") except Exception: # datasets.marimo.app returns 403 from time to time, so we add this as a fallback return pd.DataFrame( [ { "Name": "Honda Civic", "Origin": "Asia", "Cylinders": 4, "MPG_City": 30, "MPG_Highway": 38, }, { "Name": "Toyota Camry", "Origin": "Asia", "Cylinders": 4, "MPG_City": 28, "MPG_Highway": 39, }, { "Name": "Ford F-150", "Origin": "USA", "Cylinders": 6, "MPG_City": 20, "MPG_Highway": 26, }, { "Name": "BMW 328i", "Origin": "Europe", "Cylinders": 4, "MPG_City": 23, "MPG_Highway": 34, }, { "Name": "Mazda MX-5", "Origin": "Asia", "Cylinders": 4, "MPG_City": 26, "MPG_Highway": 34, }, ] ) cars_df = read_cars_df() ``` ```sql {.marimo query="cars"} -- Download a CSV and create an in-memory table; this is optional. CREATE OR replace TABLE cars as ( SELECT * FROM cars_df ); -- Query the table SELECT * from cars; ``` ```python {.marimo hide_code="true"} cylinders_dropdown = mo.ui.range_slider.from_series( cars["Cylinders"], debounce=True, show_value=True ) origin_dropdown = mo.ui.dropdown.from_series(cars["Origin"], value="Asia") mo.hstack([cylinders_dropdown, origin_dropdown]).left() ``` ```sql {.marimo query="filtered_cars"} SELECT * FROM cars WHERE Cylinders >= {cylinders_dropdown.value[0]} AND Cylinders <= {cylinders_dropdown.value[1]} AND ORIGIN = '{origin_dropdown.value}' ``` ```python {.marimo hide_code="true"} mo.hstack( [ mo.stat(label="Total cars", value=str(len(filtered_cars))), mo.stat( label="Average MPG Highway", value=f"{filtered_cars['MPG_Highway'].mean() or 0:.1f}", ), mo.stat( label="Average MPG City", value=f"{filtered_cars['MPG_City'].mean() or 0:.1f}", ), ] ) ``` ```python {.marimo hide_code="true"} import marimo as mo import random ``` ```python {.marimo hide_code="true"} import string ```