JEP: Add a dirty state to code cells in notebook format #69

Open
martinRenou opened this issue May 31, 2021 · 5 comments

@martinRenou
Member

Originally suggested by @davidbrochart in jupyter/nbformat#222 and related to the work done in JupyterLab here: jupyterlab/jupyterlab#10296.

Reproducibility is at the heart of Jupyter, but some notebook workflows can harm it. For example, a user can create a code cell, execute it to generate some output, then edit the code again and save without re-running the cell. The saved notebook is then in a dirty state: the cell input no longer matches its output.

I suggested a UI change in JupyterLab: jupyterlab/jupyterlab#10296 which shows a visual indication that the cell has been re-edited since the last run, showing that the cell is "dirty":

[Image: "dirty"]

Discussing with @davidbrochart, we wondered whether this should be included in the notebook format itself, as a new entry in the cell format. This would give a clue that the output may not reflect the result of executing the code input.
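As a rough illustration only, here is a minimal sketch of how such an entry could behave. The key name `dirty`, its placement under the cell's `metadata`, and the helper functions `edit_source` and `record_execution` are all assumptions made for this example; none of them is part of the current notebook format or of any Jupyter library.

```python
import copy

# A code cell laid out like an nbformat v4 cell. The "dirty" key under
# "metadata" is hypothetical and exists only for this sketch.
cell = {
    "cell_type": "code",
    "execution_count": 3,
    "metadata": {"dirty": False},
    "source": "x = 1\nprint(x)",
    "outputs": [{"output_type": "stream", "name": "stdout", "text": "1\n"}],
}


def edit_source(cell, new_source):
    """Return a copy of the cell with new source (hypothetical helper).

    The copy is flagged dirty when the source changed after an execution.
    """
    updated = copy.deepcopy(cell)
    if new_source != updated["source"]:
        updated["source"] = new_source
        # A cell that was never executed has no output to fall out of sync with.
        if updated["execution_count"] is not None:
            updated.setdefault("metadata", {})["dirty"] = True
    return updated


def record_execution(cell, outputs, execution_count):
    """Return a copy of the cell with fresh outputs and the dirty flag cleared."""
    updated = copy.deepcopy(cell)
    updated["outputs"] = outputs
    updated["execution_count"] = execution_count
    updated.setdefault("metadata", {})["dirty"] = False
    return updated
```

With this shape, a front-end would set the flag on edit and clear it on execution, so a saved notebook carries the information even when no kernel is attached.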

@choldgraf changed the title from "JEP: Add a dirty state to code cells in nbformat" to "JEP: Add a dirty state to code cells in notebook format" on May 31, 2021
@choldgraf
Contributor

I think this is a great idea from a user-facing perspective.

Also, I hope that it's OK: I updated the title to make it a bit clearer that we are talking about the notebook format in general, not nbformat, the Python package.

@martinRenou
Member Author

> also I hope that it’s ok, I updated the title to make it a bit clearer that we are talking about the notebook format in general, not nbformat the python package

Isn't nbformat the place where the specification is implemented? It is unclear to me where this change could happen. I thought nbformat was the first place to look at, then Jupyter front-ends could implement it later.

@choldgraf
Contributor

I think you're right but I'm not positive either - I just felt like notebook format helped disambiguate a bit :-) if you want to change it back that's fine too

@blois

blois commented Jun 1, 2021

An alternate approach to consider is to build on #68 and persist the cell ID in the IPython history. Then, on connection to the kernel, fetch the execution history and check whether the executed code differs from the source.

Colab has been sending cell IDs for a while and subclasses HistoryManager to persist the cell ID: https://github.com/googlecolab/colabtools/blob/main/google/colab/_history.py. We then use this to populate the cell's execution_count in the UI when reconnecting to a kernel.
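The comparison described above could be sketched roughly as follows. This is not Colab's code: the shape of `history` (a sequence of `(cell_id, code)` pairs in execution order) and the function name `find_dirty_cells` are assumptions for illustration, standing in for whatever a kernel that persists cell IDs would actually return.

```python
def normalize(source):
    """Join a cell's source, which may be a string or a list of lines."""
    return "".join(source) if isinstance(source, list) else source


def find_dirty_cells(cells, history):
    """Return IDs of code cells whose source differs from what last ran.

    `history` is assumed to be an iterable of (cell_id, code) pairs in
    execution order; this shape is hypothetical.
    """
    last_executed = {}
    for cell_id, code in history:
        # Later entries overwrite earlier ones, keeping the latest run.
        last_executed[cell_id] = code

    dirty = []
    for cell in cells:
        if cell.get("cell_type") != "code":
            continue
        executed = last_executed.get(cell.get("id"))
        # Cells that never ran in this kernel are skipped, not flagged.
        if executed is not None and executed != normalize(cell["source"]):
            dirty.append(cell["id"])
    return dirty
```

One consequence of this design is that the dirty state is derived rather than stored, so it is only available while the kernel that holds the history is still reachable.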

I think it'd also be useful to populate execution timing info from the kernel when connecting to a runtime. There's an awkward discontinuity when connecting to a kernel: the execution state of a cell in the notebook may differ from the state the kernel holds.

@krassowski
Member

Interesting thought on using the cell IDs, @blois. Could we take it a step further and extend the kernel execution response (execute_result) to include an optional "depends_on": list[cell id], which frontends could then use to implement something like https://github.com/nbsafety-project/nbsafety? This would likely be another JEP (unrelated to the dirty state, since it stands as a feature of its own). Does this idea make sense?
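To make the question concrete, here is a minimal sketch of what a frontend might do with such data. It assumes the frontend has already collected each cell's "depends_on" list into a dictionary; the function name `stale_cells` and the dictionary shape are hypothetical, and this is not how nbsafety itself works.

```python
from collections import deque


def stale_cells(depends_on, changed):
    """Return sorted IDs of cells that depend, directly or not, on changed cells.

    `depends_on` maps a cell ID to the list of cell IDs it depends on
    (a hypothetical shape built from the proposed "depends_on" field).
    """
    # Invert the mapping: for each cell, which cells depend on it?
    dependents = {}
    for cell_id, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(cell_id)

    stale = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, ()):
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return sorted(stale)
```

For example, if cell "b" depends on "a" and cell "c" depends on "b", re-executing or editing "a" would mark both "b" and "c" as potentially out of date.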
