Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make generic pipelines to DAG rendering work with Airflow 2.x #3166

Open
shalberd opened this issue Jun 21, 2023 · 4 comments · May be fixed by #3167
Open

make generic pipelines to DAG rendering work with Airflow 2.x #3166

shalberd opened this issue Jun 21, 2023 · 4 comments · May be fixed by #3167
Assignees

Comments

@shalberd
Copy link

shalberd commented Jun 21, 2023

Describe the issue
Currently, even when using generic pipeline editor with python files and notebooks, the rendered DAG is not compatible with Airflow 2..6.0 or higher, or any Airflow 2.x, for that matter :-)
Broken DAG: [/opt/airflow/dags/repo/SvensWetterapprootpvc-0202162036.py] Traceback (most recent call last): File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed File "/opt/airflow/dags/repo/SvensWetterapprootpvc-0202162036.py", line 2, in <module> from airflow.contrib.kubernetes.volume_mount import VolumeMount ModuleNotFoundError: No module named 'airflow.contrib.kubernetes'

To Reproduce
Steps to reproduce the behavior:

  1. Create a generic pipeline
  2. Submit it to the Airflow runtime with Airflow 2.x

Screenshots or log output
If applicable, add screenshots or log output to help explain your problem.

Log Output
Paste the log output here.

Expected behavior
DAg is rendered correctly for use in Airflow 2.x

Starting with jinja template
https://github.com/elyra-ai/elyra/blob/main/elyra/templates/airflow/airflow_template.jinja2

and elyra/elyra/pipeline/airflow/processor_airflow.py

possible hints
https://airflow.apache.org/docs/apache-airflow-providers-cncf-kubernetes/stable/operators.html

https://airflow.apache.org/docs/apache-airflow/2.5.3/howto/upgrading-from-1-10/index.html

https://airflow.apache.org/docs/apache-airflow/2.5.3/core-concepts/dags.html

apache/airflow#21653

Deployment information
Describe what you've deployed and how:

  • Elyra version: latest
  • Installation source: [e.g. PyPI, conda, from source, official container image, custom container image]
  • Deployment type: Open Data Hub pre-RedHat takeover
  • Operating system: [e.g. macos, linux]
@ianonavy
Copy link

My team was able to get generic pipeline DAGs to work with Airflow 2.5.3 on this commit (kflow-ai@f9d1329). We don't have unit tests working locally, so this commit isn't ready for a PR. Perhaps we can work with an active maintainer to get this in proper shape for an upstream merge?

Off the top of my head, we need automated tests to work and potentially test with both earlier and later versions of Airflow. My understanding from this comment is we may actually want to take a different approach to keep Airflow 1 support at the same time.

screenshot of Airflow 2.5.3 showing a working pipeline generated from a dev build of Elyra

@shalberd
Copy link
Author

shalberd commented Jun 21, 2023

Do we really need to test it with Airflow 1.x?

By the way, your code looks really good, I have almost exactly that in my local code base right now :-)

Two differences:

https://github.com/apache/airflow/pull/21653/files

days_ago will not be there anymore in Airflow 3.x, so
import pendulum

instead of

from airflow.utils.dates import days_ago

and

schedule='@once'

instead of
schedule_interval='@once'

and

start_date=pendulum.today('UTC').add(days=-1),

instead of

start_date=days_ago(1),

@shalberd
Copy link
Author

shalberd commented Jun 21, 2023

@ianonavy yup, we will certainly work with @lresende I am also sporadically in contact with @thesuperzapper who is involved in both KF Pipelines and airflow helm charts.

@shalberd shalberd linked a pull request Jun 21, 2023 that will close this issue
shalberd added a commit to shalberd/elyra that referenced this issue Jan 12, 2024
…ocessor airflow and jinja2 template airflow to comply with Airflow 2.x. Added new location of KubernetesPodOperator library in Airflow 2.x to test Pipeline for processor airflow. Added cpu and memory limits fields in airflow 2.x fashion as well.

Signed-off-by: Sven Thoms <21118431+shalberd@users.noreply.github.com>
@shalberd
Copy link
Author

shalberd commented Jan 12, 2024

@ianonavy "Off the top of my head, we need automated tests to work and potentially test with both earlier and later versions of Airflow." we will deprecate airflow 1.x support with elyra 4.0 and I am thinking of adding more unit tests aside from the existing ones, too.

so far, I only had to modify one existing test in community elyra for it to work https://github.com/elyra-ai/elyra/pull/3167/files#diff-f4fbb5d86d69f450c9235421616ec18e28f22eb174fb97332ba1952026c0a9f6 Do you think more than that unit test change or additonal unit test adding is needed? The existing Elyra tests still all run though in PR 3167 and spcifically https://github.com/elyra-ai/elyra/actions/runs/7505710196/job/20435558496?pr=3167#step:6:506

as mentioned, we are assuming full switchover to airflow 2.x in Elyra 4.0. Discussed with and okayed by the maintainers.

@shalberd shalberd self-assigned this Jan 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants