Dataflow
How marimo notebooks run
Reactive execution is based on a single rule: when a cell is run, all other cells that reference any of the global variables it defines run automatically.
To provide reactive execution, marimo creates a dataflow graph out of your cells.
Tip: disabling automatic execution.
marimo lets you disable automatic execution: just go into the notebook settings and set
“Runtime > On Cell Change” to “lazy”.
When the runtime is lazy, after running a cell, marimo marks its descendants as stale instead of automatically running them. The lazy runtime puts you in control over when cells are run, while still giving guarantees about the notebook state.
References and definitions
A marimo notebook is a directed acyclic graph in which nodes represent cells and edges represent data dependencies. marimo creates this graph by analyzing each cell (without running it) to determine its
- references (“refs”), the global variables it reads but doesn’t define;
- definitions (“defs”), the global variables it defines.
There is an edge from one cell to another if the latter cell references any global variables defined by the former cell.
The rule for reactive execution can be restated in terms of the graph: when a cell is run, its descendants are run automatically.
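To make the analysis step concrete, here is a minimal sketch of how refs and defs could be extracted from cell source with Python's standard `ast` module, without running any code. This is not marimo's actual implementation: the helper `refs_and_defs` and the three cell names are hypothetical, and the sketch ignores function scopes, `del` statements, and other subtleties a real analyzer must handle.

```python
import ast
import builtins


def refs_and_defs(source: str) -> tuple[set[str], set[str]]:
    """Return (refs, defs) for one cell's source, without running it."""
    defs, loads = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)   # the name is assigned
            elif isinstance(node.ctx, ast.Load):
                loads.add(node.id)  # the name is read
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defs.add((alias.asname or alias.name).split(".")[0])
    # Refs are names that are read but not defined here (builtins excluded).
    refs = {name for name in loads - defs if not hasattr(builtins, name)}
    return refs, defs


# Hypothetical cells, keyed by an illustrative name.
cells = {
    "imports": "import numpy as np",
    "period": "period = 2 * np.pi",
    "wave": "x = np.linspace(0, 10)\ny = np.sin(2 * np.pi * x / period)",
}
analysis = {name: refs_and_defs(source) for name, source in cells.items()}

# There is an edge a -> b when b references a variable that a defines.
edges = {
    (a, b)
    for a in cells
    for b in cells
    if a != b and analysis[a][1] & analysis[b][0]
}
```

In this sketch the `wave` cell ends up downstream of both `imports` and `period`, so rerunning either of them would rerun `wave`.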
Example
The next four cells plot a sine wave with a given period and amplitude. Each cell is labeled with its refs and defs.
Tip: use mo.refs() and mo.defs() to inspect the refs and defs of any
given cell. This can help with debugging complex notebooks.
For example, here are the refs and defs of this cell:
🌊 Try it! In the above cells, try changing the value of period or amplitude, then click the run button ( ▷ ) to register your changes. See what happens to the sine wave.
Here is the dataflow graph for the cells that make the sine wave plot, plus the cells that import libraries. Each cell is labeled with its defs.
                   +------+               +-----------+
       +-----------| {mo} |-----------+   | {np, plt} |
       |           +---+--+           |   +----+------+
       |               |              |        |
       |               |              |        |
       v               v              v        v
  +----------+   +-------------+   +--+----------+
  | {period} |   | {amplitude} |   | {plot_wave} |
  +---+------+   +-----+-------+   +------+------+
      |                |                  |
      |                v                  |
      |              +----+               |
      +------------> | {} | <-------------+
                     +----+
The last cell, which doesn’t define anything, produces the plot.
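The graph above can also be written down as plain data, which makes it easy to ask which cells rerun after a change. The following is a small sketch, with the edges copied from the diagram; the label `plot` for the last cell and the helper `descendants` are names chosen here for illustration.

```python
from collections import deque

# Edges copied from the diagram; each cell is named by its defs.
children = {
    "mo": ["period", "amplitude", "plot_wave"],
    "np, plt": ["plot_wave"],
    "period": ["plot"],
    "amplitude": ["plot"],
    "plot_wave": ["plot"],
    "plot": [],  # the cell labeled {} in the diagram
}


def descendants(cell):
    """Return every cell reachable from `cell` by following edges."""
    seen, queue = set(), deque(children[cell])
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(children[current])
    return seen


rerun_after_period = descendants("period")
rerun_after_mo = descendants("mo")
```

Running the `period` cell reruns only the plotting cell, while rerunning the cell that defines `mo` reruns everything except the `np, plt` import cell.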
Dataflow programming
marimo’s runtime rule has some important consequences that may seem surprising if you are not used to dataflow programming. We list these below.
Execution order is not cell order
The order in which cells are executed is determined entirely by the dataflow graph. This makes marimo notebooks more reproducible than traditional notebooks. It also lets you place boilerplate, like imports or long markdown strings, at the bottom of the editor.
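One way to picture this is a topological sort of the graph. The sketch below uses the standard library's `graphlib` on a hypothetical three-cell notebook whose import cell sits at the bottom of the editor. It illustrates the idea only and makes no claim about how marimo schedules cells internally.

```python
from graphlib import TopologicalSorter

# Hypothetical notebook, listed in editor order: the import cell is last.
editor_order = ["plot", "period", "imports"]

# Map each cell to the cells it depends on (its parents in the graph).
depends_on = {
    "plot": {"period", "imports"},
    "period": {"imports"},
    "imports": set(),
}

# A valid run order places every cell after the cells it depends on.
run_order = list(TopologicalSorter(depends_on).static_order())
```

Here `run_order` starts with `imports` even though that cell appears last in `editor_order`.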
Global variable names must be unique
Every global variable can be defined by only one cell. Without this constraint, there would be no way for marimo to know which order to execute cells in.
If you violate this constraint, marimo provides a helpful error message, like below:
🌊 Try it! In the previous cell, change the name planet to home, then run the cell.
Because defs must be unique, global variables cannot be modified with operators like += or -= in cells other than the one that created them; these operators count as redefinitions of a name.
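The sketch below shows why an augmented assignment counts as a definition: in Python's syntax tree, the target of `+=` is stored to, exactly like the target of `=`. The helper `defs_of` and the cell names are hypothetical, and the check is a simplified stand-in for marimo's own analysis.

```python
import ast
from collections import defaultdict


def defs_of(source):
    """Names assigned in a cell (simplified: ignores imports and scopes)."""
    return {
        node.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }


# Hypothetical cells: cell_2 tries to update a variable owned by cell_1.
cells = {
    "cell_1": "count = 0",
    "cell_2": "count += 1",
    "cell_3": "total = count * 2",
}

defined_in = defaultdict(list)
for name, source in cells.items():
    for variable in defs_of(source):
        defined_in[variable].append(name)

# Any variable defined by more than one cell violates the constraint.
conflicts = {
    variable: names for variable, names in defined_in.items() if len(names) > 1
}
```

Merging `cell_1` and `cell_2` into a single cell would leave `count` with one defining cell and clear the conflict.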
🌊 Try it! Get rid of the following errors by merging the next two cells into a single cell.
Underscore-prefixed variables are local to cells
Global variables prefixed with an underscore are “private” to the cells that define them. This means that multiple cells can define the same underscore-prefixed name, and one cell’s private variables won’t be made available to other cells.
Example.
Deleting a cell deletes its variables
Deleting a cell deletes its global variables and then runs all cells that reference them. This prevents severe bugs that can arise when state has been deleted from the editor but not from the program memory.
Cycles are not allowed
Cycles among cells are not allowed. For example:
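As an illustration, consider two hypothetical cells that each read the variable the other defines. The sketch below detects the resulting cycle with the standard library's `graphlib`; marimo reports cycles in its own way, which is not reproduced here.

```python
from graphlib import CycleError, TopologicalSorter

# Two hypothetical cells that depend on each other:
#   cell_a:  one = two - 1
#   cell_b:  two = one + 1
depends_on = {
    "cell_a": {"cell_b"},  # cell_a reads `two`, defined by cell_b
    "cell_b": {"cell_a"},  # cell_b reads `one`, defined by cell_a
}

try:
    list(TopologicalSorter(depends_on).static_order())
    has_cycle = False
except CycleError:
    # No valid execution order exists, so the notebook is rejected.
    has_cycle = True
```

Neither cell can run first, because each needs a value that only the other produces.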
marimo doesn’t track attributes
marimo only tracks global variables. Writing object attributes does not trigger reactive execution.
🌊 Example. Change the value of state.number in the next cell, then run the cell. Notice how the subsequent cell isn’t updated.
marimo doesn’t track mutations
In Python, it’s impossible to know whether code will mutate an object without running it. So: mutations (such as appending to a list) will not trigger reactive execution.
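A short sketch shows what static analysis can and cannot see. Parsing a cell with the standard `ast` module, a method call like `numbers.append(4)` or an attribute write like `state.number = 1` only reads the name on the left of the dot, so nothing is recorded as defined. The helper `assigned_names` is hypothetical and is not part of marimo.

```python
import ast


def assigned_names(source):
    """Global names a cell assigns, found without running the cell."""
    return {
        node.id
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }


defined = assigned_names("numbers = [1, 2, 3]")       # assigns `numbers`
mutated = assigned_names("numbers.append(4)")         # only reads `numbers`
attribute_write = assigned_names("state.number = 1")  # only reads `state`
```

Because the last two cells define nothing, no other cell can become their descendant through `numbers` or `state`, and so nothing reruns when they execute.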
ui tutorial.
Best practices
The constraints marimo puts on your notebooks are all natural consequences of the fact that marimo programs are directed acyclic graphs. As long as you keep this fact in mind, you’ll quickly adapt to the marimo way of writing notebooks.
Ultimately, these constraints will enable you to create powerful notebooks and apps, and they’ll encourage you to write clean, reproducible code.
Follow these tips to stay on the marimo way:
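Don't mutate a variable in a cell other than the one that defines it:

# a cell
numbers = [1, 2, 3]
# another cell
numbers.append(4)

Instead, mutate it in the cell that defines it:

# a cell
numbers = [1, 2, 3]
numbers.append(4)

Or, if the change must happen in another cell, create a new variable:

# a cell
numbers = [1, 2, 3]
# another cell
more_numbers = numbers + [4]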
Use mo.cache to cache the return value of expensive functions.
You can do this if you abstract complex logic into idempotent
functions, following earlier tips.
For example:
import marimo as mo

@mo.cache
def compute_predictions(problem_parameters):
    ...
When compute_predictions is called with a value of
problem_parameters it has not seen, it will compute the predictions
and store them in a cache. The next time it is called with the same
parameters, instead of recomputing the predictions, it will just
fetch the previously computed ones from the cache.
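The caching behavior described above can be observed with the standard library's `functools.cache`, used here as a stand-in because it runs without marimo installed. The counter `call_count` and the summing body are hypothetical placeholders for an expensive computation. Note that `functools.cache` requires hashable arguments, which is why the parameters are passed as a tuple.

```python
import functools

call_count = 0  # counts how often the function body actually runs


@functools.cache
def compute_predictions(problem_parameters):
    global call_count
    call_count += 1
    return sum(problem_parameters)  # stand-in for an expensive computation


first = compute_predictions((1, 2, 3))   # computed and stored in the cache
second = compute_predictions((1, 2, 3))  # same arguments: fetched from the cache
third = compute_predictions((4, 5, 6))   # unseen arguments: computed again
```

After these three calls the body has run only twice, once per distinct set of parameters.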
If you are familiar with functools.cache, mo.cache is
similar but more robust, with the cache persisting even
if the cell defining the function is re-run.
What’s next?
Check out the tutorial on interactivity for a tour of UI elements:
marimo tutorial ui