Initial changes to support runtime sub package installs #2259

Draft
wants to merge 4 commits into
base: main

akchinSTC
Member

@akchinSTC akchinSTC commented Oct 27, 2021

What changes were proposed in this pull request?

pip install elyra['airflow', 'airflow-examples', 'kfp', 'kfp-examples', 'kfp-tekton', 'test','all']

Enables installing selected packages/groups in order to include or exclude certain runtimes and their dependencies.
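
As a rough illustration of how such groups are usually declared, here is a minimal sketch of a setuptools `extras_require` mapping. The group contents are assumptions made for the example (including `pygithub` as the Airflow dependency), not the contents of this PR's setup.py. Note that pip's standard extras syntax is a comma-separated list without quotes inside the brackets, e.g. `pip install "elyra[airflow,kfp]"`.

```python
# Hypothetical extras layout; the dependency names are assumptions for illustration.
runtime_extras = {
    "airflow": ["pygithub"],
    "kfp": ["kfp"],
    "kfp-tekton": ["kfp", "kfp-tekton"],
}


def build_extras(groups):
    """Return an extras_require mapping with an aggregate 'all' group added."""
    extras = {name: sorted(set(deps)) for name, deps in groups.items()}
    # 'all' is the union of every group's dependencies.
    extras["all"] = sorted({dep for deps in groups.values() for dep in deps})
    return extras


extras_require = build_extras(runtime_extras)
# This mapping would be passed along as setup(..., extras_require=extras_require).
```

Extras only control which dependencies get installed; they do not add or remove entry points, which is the limitation discussed later in this thread.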

How was this pull request tested?

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@elyra-bot

elyra-bot bot commented Oct 27, 2021

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

@akchinSTC akchinSTC linked an issue Oct 27, 2021 that may be closed by this pull request
@lresende
Member

lresende commented Nov 7, 2021

@akchinSTC Could you please add the supported syntaxes you have in mind to the description?

@akchinSTC akchinSTC added component:build build and build related issues(dependencies and docker) component:install Installation (pip, conda, etc) or packaging (components) labels Mar 9, 2022
@akchinSTC akchinSTC added this to the 3.8.0 milestone Apr 4, 2022
@akchinSTC
Member Author

akchinSTC commented Apr 5, 2022

@lresende
After further testing and research, this approach will probably not work. There doesn't seem to be any support for "conditional entrypoints" in any packaging tools.
We do have several workarounds to mimic the behavior that we want.

  1. Create new Python packages specifically for adding the entrypoints and dependencies for [kfp, kfp-tekton, airflow, possibly local (depending on current discussions), etc.]. We could then pull these in as dependencies, with their respective entrypoints, in elyra. I would argue that kfp/kfp-tekton should be included in the default installation. Creation of these packages could be automated and templated via the release script we currently use for the extension packages. The downside is that the number of packages we maintain will increase and the release pipeline will get longer, e.g. pypi -> conda.
  2. Leave the entrypoints in place and add conditional imports that check for the installed Python libraries and display information on how to remedy the situation when a module and its associated package are not found.
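
Workaround 2 could look roughly like the following sketch. It assumes a hypothetical mapping from runtime name to an importable module and an install hint; the names and messages are illustrative and are not taken from the Elyra code base.

```python
import importlib.util

# Hypothetical mapping: runtime name -> (importable module, install hint).
RUNTIME_REQUIREMENTS = {
    "kfp": ("kfp", 'pip install "elyra[kfp]"'),
    "airflow": ("github", 'pip install "elyra[airflow]"'),
}


def check_runtime(name, requirements=RUNTIME_REQUIREMENTS):
    """Return (available, message) for a runtime without importing its module."""
    module, hint = requirements[name]
    # find_spec returns None when a top-level module cannot be located.
    if importlib.util.find_spec(module) is None:
        message = f"Runtime '{name}' requires module '{module}'. Install it with: {hint}"
        return False, message
    return True, ""
```

Using `find_spec` instead of a bare `import` avoids paying the import cost just to probe for availability.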

@lresende
Member

lresende commented Apr 6, 2022

@lresende After further testing and research, this approach will probably not work. There doesn't seem to be any support for "conditional entrypoints" in any packaging tools. We do have several workarounds to mimic the behavior that we want.

  1. Create new Python packages specifically for adding the entrypoints and dependencies for [kfp, kfp-tekton, airflow, possibly local (depending on current discussions), etc.]. We could then pull these in as dependencies, with their respective entrypoints, in elyra. I would argue that kfp/kfp-tekton should be included in the default installation. Creation of these packages could be automated and templated via the release script we currently use for the extension packages. The downside is that the number of packages we maintain will increase and the release pipeline will get longer, e.g. pypi -> conda.
  2. Leave the entrypoints in place and add conditional imports that check for the installed Python libraries and display information on how to remedy the situation when a module and its associated package are not found.

How would option 2 work? Basically, don't continue processing the entrypoint load if a given dependency is not there? I think that would be a simple way to implement this.

@kevin-bates
Member

If we take workaround option 1 and split processors into their own packages, I would argue that the only OOTB runtime type should be the local runtime type. This would also satisfy those cases where folks want to simply try things out. I believe including any KFP or Airflow in the base installation is contrary to what #2136 is trying to address since Airflow users don't want/need KFP and vice versa.

If we take workaround option 2 (which seems more attainable and less "expensive" maintenance-wise), then the responsibility is on us to operate with no platform-specific packages and, more importantly, defer the loading of the processor entrypoints until necessary.

One approach that we could use here is to add code to the RuntimesSchemas SchemasProvider that confirms the applicable packages are installed (kinda similar to what already occurs with KFP engines) and, if not present, simply does not include those schemas in the results of its get_schemas() method. This information could then be used by the processor manager (or whatever loads the entrypoints) to load a processor entrypoint only if its corresponding schema is present in the Runtimes schemaspace. This is possible by virtue of the fact that Runtime entrypoint names must match the runtime schema name (e.g., 'kfp' and 'airflow').
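
A minimal sketch of that idea follows. The functions are free-standing stand-ins rather than the real SchemasProvider or processor manager, schemas are plain dicts with a `name` key, and the schema-to-module mapping is an assumption made for illustration.

```python
import importlib.util

# Hypothetical: runtime schema name -> module that must be importable.
REQUIRED_MODULES = {"kfp": "kfp", "airflow": "github"}


def get_schemas(all_schemas, required=REQUIRED_MODULES):
    """Return only the schemas whose backing module is importable."""
    available = []
    for schema in all_schemas:
        module = required.get(schema["name"])
        # Schemas with no registered requirement are always included.
        if module is None or importlib.util.find_spec(module) is not None:
            available.append(schema)
    return available


def load_processors(entry_points, schemas):
    """Load only entry points whose name matches an available schema name."""
    names = {schema["name"] for schema in schemas}
    # ep.load() is deferred until we know the matching schema is present.
    return {ep.name: ep.load() for ep in entry_points if ep.name in names}
```

Here `entry_points` is any iterable of objects exposing `name` and `load()`, which is the shape of the entry point objects returned by `importlib.metadata`.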

@kevin-bates
Member

@lresende and my responses crossed, but it looks like we're suggesting similar approaches.

@akchinSTC akchinSTC force-pushed the i-2136 branch 2 times, most recently from 65dd225 to 9e16d98 Compare April 6, 2022 16:32
try:
    # Check whether the Airflow package is installed. Since we do not have any
    # dependencies on any Airflow package, we use GitHub as our canary.
    from github import Github  # noqa
Member

I don't think this will be sufficient. Won't we still need the ability for the user to physically indicate their intentions? Like pip install elyra[kfp] or pip install elyra[airflow].

Do we need a dummy airflow package that we could use (and install via the previous example) to make the user's intention known to server code? This package, while on PyPi and Conda-forge, would likely never have to change, so I'm not sure there would be much of a burden other than its initial creation.

Then elyra[kfp] would include the KFP SDK, elyra[kfp-tekton] would include elyra[kfp] + kfp-tekton, and elyra[airflow] would include this dummy airflow placeholder as its dependency. It could even have a dependency on GitHub, if we needed that (and elyra[gitlab] could include elyra[airflow] + gitlab if we wanted that package tied to airflow).
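
For illustration, server code could detect such a placeholder by checking for the installed distribution rather than importing a module. This is only a sketch; the distribution name below is hypothetical, since no placeholder package is named in this thread.

```python
from importlib import metadata

# Hypothetical distribution name, used purely for illustration.
AIRFLOW_MARKER = "elyra-airflow-placeholder"


def is_distribution_installed(dist_name):
    """Return True if a distribution with this name is installed."""
    try:
        metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return False
    return True


# True only if the user installed the (hypothetical) placeholder distribution.
airflow_requested = is_distribution_installed(AIRFLOW_MARKER)
```

Checking the distribution name sidesteps the canary problem, because an unrelated package that happens to pull in `github` would not satisfy the check.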

Member

I agree. There may be other reasons a package (like github) might be installed. For example, once the community provides a component catalog connector that relies on the package.

@ptitzler
Member

This PR should address the question raised in #2089

@ptitzler
Member

Do we need a dummy airflow package that we could use (and install via the previous example) to make the user's intention known to server code? This package, while on PyPi and Conda-forge, would likely never have to change, so I'm not sure there would be much of a burden other than its initial creation.

Instead of a dummy package, would it be possible to create a package for each runtime that serves a purpose, such as identifies the runtime capabilities? This way we would be more consistent across runtimes.

@kevin-bates
Member

kevin-bates commented Apr 12, 2022

Instead of a dummy package, would it be possible to create a package for each runtime that serves a purpose, such as identifies the runtime capabilities? This way we would be more consistent across runtimes.

That works too. We only need something for airflow, but could maybe have something relative to kfp as well. If so, I think these packages would need to be fairly static so as to avoid the need for regular releases, etc.

Do you have an idea of how "runtime capabilities" would be expressed across the various runtimes? What are examples of such? And these capabilities would need to be included in "true" BYO runtime implementations as well.

@ptitzler
Member

Instead of a dummy package, would it be possible to create a package for each runtime that serves a purpose, such as identifies the runtime capabilities? This way we would be more consistent across runtimes.

That works too. We only need something for airflow, but could maybe have something relative to kfp as well. If so, I think these packages would need to be fairly static so as to avoid the need for regular releases, etc.

For what I was considering that would be the case. I'm also thinking about a potential future 'local' runtime here, which doesn't necessarily have any third party dependencies.

Do you have an idea of how "runtime capabilities" would be expressed across the various runtimes? What are examples of such? And these capabilities would need to be included in "true" BYO runtime implementations as well.

Focusing on what might be useful to encode, I was thinking this would include whether a runtime supports certain features, such as:

  • export pipeline to a runtime native format (e.g. not applicable to 'local')
  • generic components
  • custom components
  • ...

The UI could use those hints to disable or hide things, like this PR proposes to do for VPE launcher tiles.
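
One possible shape for such hints is sketched below. The class, field names, and registry are hypothetical; only the point that export does not apply to 'local' comes from the list above, and the remaining values are assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RuntimeCapabilities:
    """Feature flags a runtime package could advertise (hypothetical shape)."""

    export_to_native_format: bool
    generic_components: bool
    custom_components: bool


# Illustrative values only; apart from 'local' not supporting export,
# these flags are assumptions.
CAPABILITIES = {
    "local": RuntimeCapabilities(False, True, False),
    "kfp": RuntimeCapabilities(True, True, True),
}


def enabled_features(runtime_name, registry=CAPABILITIES):
    """Return the names of features the UI could leave enabled for a runtime."""
    caps = registry[runtime_name]
    return [name for name, supported in asdict(caps).items() if supported]
```

A BYO runtime would contribute its own `RuntimeCapabilities` entry, which keeps the UI logic independent of any specific runtime name.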

@akchinSTC akchinSTC removed this from the 3.10 milestone May 31, 2022
@akchinSTC akchinSTC added the status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise. label Nov 28, 2022
Labels
component:build build and build related issues(dependencies and docker) component:install Installation (pip, conda, etc) or packaging (components) priority:stretch status:Work in Progress Development in progress. A PR tagged with this label is not review ready unless stated otherwise.
Development

Successfully merging this pull request may close these issues.

Enable installing only specific runtimes when deploying elyra
4 participants