JEP: Add a dirty state to code cells in notebook format #69

martinRenou · 2021-05-31T10:27:08Z

Originally suggested by @davidbrochart in jupyter/nbformat#222 and related to the work done in JupyterLab here: jupyterlab/jupyterlab#10296.

Reproducibility is at the heart of Jupyter, but some notebook workflows can harm it. For example, a user can create a code cell, execute it to generate some output, re-edit the code without re-running it, and then save. The saved notebook is then in a dirty state: the cell's output no longer reflects its input.

I suggested a UI change in JupyterLab (jupyterlab/jupyterlab#10296) that adds a visual indication that a cell has been re-edited since its last run, i.e. that the cell is "dirty".

Discussing with @davidbrochart, we wondered whether this should be part of the notebook format, as a new entry in the cell format. It would signal that the output may not reflect the result of executing the current code input.
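To make the proposal concrete, here is a minimal sketch of how such a flag could behave. The field name `dirty`, its placement under the cell's `metadata`, and the helper functions are all assumptions made for illustration; nothing here is part of the current notebook format.

```python
import copy

# Hypothetical field name; NOT part of the current notebook format.
DIRTY_KEY = "dirty"


def make_code_cell(source):
    """Build a minimal code cell dict in the shape of a v4 notebook cell."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def record_execution(cell, execution_count, outputs):
    """Return a copy of the cell after a run: outputs now match the source."""
    cell = copy.deepcopy(cell)
    cell["execution_count"] = execution_count
    cell["outputs"] = outputs
    cell["metadata"][DIRTY_KEY] = False
    return cell


def edit_source(cell, new_source):
    """Return a copy of the cell after an edit, flagging it if already run."""
    cell = copy.deepcopy(cell)
    if new_source != cell["source"]:
        cell["source"] = new_source
        # A never-executed cell has no output that could be out of date.
        if cell["execution_count"] is not None:
            cell["metadata"][DIRTY_KEY] = True
    return cell
```

With this sketch, a cell that is executed and then edited keeps its old outputs but carries `"dirty": true` when saved, which is the clue a reader or tool would look for.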

choldgraf · 2021-05-31T16:08:47Z

I think this is a great idea from a user-facing perspective.

Also, I hope that it's OK: I updated the title to make it a bit clearer that we are talking about the notebook format in general, not nbformat the Python package.

martinRenou · 2021-06-01T08:11:28Z

> Also, I hope that it's OK: I updated the title to make it a bit clearer that we are talking about the notebook format in general, not nbformat the Python package.

Isn't nbformat the place where the specification is implemented? It is unclear to me where this change should happen. I thought nbformat was the first place to look, and that Jupyter front-ends could implement it later.

choldgraf · 2021-06-01T16:57:02Z

I think you're right, but I'm not positive either. I just felt like "notebook format" helped disambiguate a bit :-) If you want to change it back, that's fine too.

blois · 2021-06-01T17:24:12Z

An alternative approach to consider is to build on #68 and persist the cell ID in the IPython history. Then, on connecting to the kernel, fetch the execution history and check whether the executed code differs from the source.

Colab has been sending cell IDs for a while and subclasses HistoryManager to persist the cell ID: https://github.com/googlecolab/colabtools/blob/main/google/colab/_history.py. We then use this to populate the cell's execution_count in the UI when reconnecting to a kernel.

I think it'd also be useful to populate execution timing info from the kernel when connecting to a runtime. There's an awkward discontinuity when connecting to a kernel, where the execution state of a cell in the notebook may differ from the kernel's.
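The history-based check described above could be sketched roughly as follows. The shape of `history` (a list of `(cell_id, executed_source)` pairs in execution order) is an assumption for illustration only; it is not the actual format returned by IPython or by Colab's HistoryManager subclass.

```python
def normalize_source(source):
    """Notebook cell source may be a string or a list of line strings."""
    return "".join(source) if isinstance(source, list) else source


def find_dirty_cells(cells, history):
    """Return IDs of code cells whose source differs from what was last run.

    cells:   notebook cell dicts with "id", "cell_type" and "source".
    history: hypothetical list of (cell_id, executed_source) pairs,
             oldest first, as a kernel might report them.
    """
    # Keep only the most recent execution per cell ID.
    last_executed = {}
    for cell_id, executed_source in history:
        last_executed[cell_id] = executed_source

    dirty = []
    for cell in cells:
        if cell.get("cell_type") != "code":
            continue
        executed = last_executed.get(cell["id"])
        # Cells absent from the history were never run in this kernel,
        # so there is nothing to compare against.
        if executed is not None and executed != normalize_source(cell["source"]):
            dirty.append(cell["id"])
    return dirty
```

One consequence of this design is that nothing extra needs to be stored in the notebook file: the dirty state is derived on connection, at the cost of only being available while the kernel's history is.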

krassowski · 2021-06-03T13:13:41Z

Interesting thought on using the cell IDs, @blois. Could we take it a step further and extend the kernel execution response (execute_result) to include an optional "depends_on": list[cell id], which front-ends could then use to implement something like https://github.com/nbsafety-project/nbsafety? This would likely be another JEP (unrelated to the dirty state, as it stands as a feature of its own). Does this idea make sense?
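As a rough illustration of what a front-end might do with such a field, the sketch below propagates staleness through the dependency lists. The `depends_on` mapping is the hypothetical field from this comment, and the function name and data shapes are assumptions; this is not how nbsafety itself is implemented.

```python
from collections import deque


def stale_cells(edited, depends_on):
    """Return every cell ID that is stale given a set of edited cells.

    edited:     set of cell IDs edited since their last run.
    depends_on: hypothetical mapping of cell ID -> list of cell IDs it
                depends on, as a kernel might report per execution.
    """
    # Invert the mapping so we can walk from a cell to its dependents.
    dependents = {}
    for cell_id, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(cell_id)

    # Breadth-first walk: a cell is stale if it was edited or if it
    # depends, directly or transitively, on a stale cell.
    stale = set(edited)
    queue = deque(edited)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale
```

For example, with `depends_on = {"b": ["a"], "c": ["b"]}`, editing cell `a` marks `a`, `b` and `c` as stale, while unrelated cells stay clean.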

choldgraf changed the title ~~JEP: Add a dirty state to code cells in nbformat~~ JEP: Add a dirty state to code cells in notebook format May 31, 2021
