nbconvert requires lxml_clean_html #3228

Open
sjkim2322 opened this issue May 2, 2024 · 6 comments
Labels
component:docker (Container/Docker images related issues), dependencies (Pull requests that update a dependency file)

Comments

@sjkim2322

sjkim2322 commented May 2, 2024

Describe the issue

If you run the notebook component on a runtime image that has neither an lxml version below 5.2.0 nor lxml_html_clean preinstalled, the following error occurs.

        "lxml.html.clean module is now a separate project lxml_html_clean.\n"
        "Install lxml[html_clean] or lxml_html_clean directly." 

To Reproduce
Steps to reproduce the behavior:

  1. Prepare a Python runtime image of a clean environment (in my case it is “docker.io/python:3.10.12”).
  2. Run the elyra notebook component with the above image set as the runtime image.
  3. The above error occurs while initializing the notebook environment.
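The failing import can be checked directly inside a candidate runtime image before running a pipeline. The following is a minimal sketch, not Elyra or nbconvert code: the helper names `can_import` and `diagnose` are made up for illustration, and the only assumption taken from the error message is that the missing module is `lxml.html.clean`.

```python
import importlib


def can_import(module_name):
    """Return True if module_name imports cleanly in this environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and the explicit ImportError that
        # newer lxml raises when lxml_html_clean is not installed.
        return False
    return True


def diagnose():
    """Summarize whether this environment would hit the error above."""
    if can_import("lxml.html.clean"):
        return "ok: lxml.html.clean is importable"
    if can_import("lxml"):
        return "broken: lxml is installed but lxml.html.clean is missing"
    return "missing: lxml is not installed"


if __name__ == "__main__":
    print(diagnose())
```

A "broken" result means the image has lxml without the separated cleaner module, which is the situation in which the error message suggests installing lxml[html_clean] or lxml_html_clean.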

Cause I'm guessing

Expected behavior

  • I can pre-install it on the runtime image I use, but since every image would then need the same extra step, it would be better to add it to Elyra's requirements.

Deployment information
Describe what you've deployed and how:

  • Elyra version: [3.16.0.dev0]
  • Installation source: [PyPI]
  • Deployment type: [Kubeflow [notebook server] ]
  • Operating system: [linux]
@lresende
Member

lresende commented May 4, 2024

Should this be in Elyra? Or in nbconvert ?

@lresende added the component:docker and dependencies labels and removed the status:Needs Triage label on May 4, 2024
@sjkim2322
Author

@lresende
Thank you for the answer.
I will also raise the issue on nbconvert.

However, I don't know whether it is possible to change the declared dependencies of a specific version that has already been released.

In Elyra, when the Jupyter notebook component is executed, it appears to download and install the requirements here. Since nbconvert is installed here, I think it would be good if lxml_html_clean were also added here.

Alternatively, if there is a spec that can run an initialization step before the notebook component's cells are executed, it may be possible to use that.

@lresende
Member

lresende commented May 8, 2024

As you can see, we already removed a lot of transitive dependencies from the requirements file so that we don't have to keep syncing them to new versions. If they don't fix this in nbconvert itself, then we can continue this conversation.

@sjkim2322
Author

sjkim2322 commented May 8, 2024

Yes, I left an issue on nbconvert.
jupyter/nbconvert#2148

@lresende
Member

lresende commented May 8, 2024

@sjkim2322 based on your nbconvert issue, can't we update the version of nbconvert?
Does that still work with JupyterLab < 4?

@shalberd

shalberd commented May 12, 2024

@lresende @sjkim2322

I can confirm that we can just update the version of nbconvert without any issues in runtime behavior with the runtime image.

I can specifically confirm that a higher version of nbconvert works fine with JupyterLab below 4. See this code from the Red Hat Open Data Hub folks, who still use JupyterLab below 4 as well:

https://github.com/opendatahub-io/notebooks/blob/main/runtimes/minimal/ubi9-python-3.9/utils/requirements-elyra.txt

https://github.com/opendatahub-io/notebooks/blob/main/jupyter/datascience/ubi9-python-3.9/Pipfile

We at our org have this running successfully. So yes, you can change to the following, most likely along with other version updates:

nbconvert==7.1.0

There are no compatibility issues with JupyterLab 3.6.7.

See the list of packages in my runtime image / Jupyter image below. (I bake the packages into a combined Jupyter and runtime image so that Elyra's runtime requirements are available air-gapped, no problem.)

I think if you sync or update requirements-elyra in this project in line with the requirements-elyra file from opendatahub-io linked above, you won't have any more issues, be it with nbconvert or the other runtime requirement packages.

[1001050000@s-testjupyter-0 ~]$ pip list | grep nbconvert
nbconvert                                7.16.4
[1001050000@s-testjupyter-0 ~]$ pip list | grep jupyter
jupyter-bokeh                            3.0.7
jupyter_client                           7.4.9
jupyter_core                             5.7.2
jupyter-events                           0.10.0
jupyter-lsp                              2.2.5
jupyter_packaging                        0.12.3
jupyter-resource-usage                   0.7.2
jupyter_server                           2.14.0
jupyter_server_fileid                    0.9.2
jupyter-server-mathjax                   0.2.6
jupyter_server_proxy                     4.0.0
jupyter_server_terminals                 0.5.3
jupyter_server_ydoc                      0.8.0
jupyter-ydoc                             0.2.5
jupyterlab                               3.6.7
jupyterlab_git                           0.44.0
jupyterlab-lsp                           4.2.0
jupyterlab_pygments                      0.3.0
jupyterlab_server                        2.27.1
jupyterlab-streamlit-menu                0.1.0
jupyterlab_widgets                       3.0.10

# This is a comprehensive list of python dependencies that Elyra requires to execute Jupyter notebooks.
ipykernel = "==6.13.0"
ipython = "==8.10.0"
ipython-genutils = "==0.2.0"
jinja2 = "==3.0.3"
jupyter-client = "==7.3.1"
jupyter-core = "==4.11.2"
MarkupSafe = "==2.1.1"
minio = "==7.1.15"
nbclient = "==0.6.3"
nbconvert = "==7.1.0"
nbformat = "==5.4.0"
papermill = "==2.3.4"
pyzmq = "==24.0.1"
prompt-toolkit = "==3.0.30"
requests = "==2.31.0"
tornado = "==6.3.3"
traitlets = "==5.10.0"
urllib3 = "==1.26.18"

@lresende Agreed, the runtime and transitive dependencies issue is a bit of a pain, but as mentioned, raising the library version should be fine, even with JupyterLab below 4. I have tested this in conjunction with Airflow as the runtime engine, executing notebooks in tasks.
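To see how far a given runtime image drifts from a pinned list like the one above, the pins can be compared with what is actually installed. This is only a rough sketch using the standard library's `importlib.metadata`: the function names `parse_pins` and `compare` are hypothetical, and `PINS` is a small excerpt copied from the list above rather than the full file.

```python
from importlib import metadata

# A few pins copied from the list above, in the same `name = "==version"` form.
PINS = '''
nbclient = "==0.6.3"
nbconvert = "==7.1.0"
nbformat = "==5.4.0"
papermill = "==2.3.4"
'''


def parse_pins(text):
    """Parse `name = "==version"` lines into {name: version}.

    Blank lines and lines starting with '#' are skipped.
    """
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, spec = line.partition("=")
        pins[name.strip()] = spec.strip().strip('"').lstrip("=")
    return pins


def compare(pins):
    """Return {name: (pinned, installed_or_None)} for every pin that differs."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    for name, (pinned, installed) in compare(parse_pins(PINS)).items():
        print(f"{name}: pinned {pinned}, installed {installed}")
```

Run inside the runtime image, this prints one line per package whose installed version differs from the pin, which makes it easier to judge whether raising a pin such as nbconvert is the only change needed.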
