Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pyarrow import in CI pipelines #11

Merged
merged 3 commits into from
Dec 4, 2024
Merged

Conversation

scott-routledge2
Copy link
Contributor

@scott-routledge2 scott-routledge2 commented Dec 4, 2024

Changes included in this PR

Fixes issue where a newer version of libutf8proc gets installed which breaks our pyarrow package

Testing strategy

Ran Nightly, PR CI to confirm that building bodo succeeded

User facing changes

Checklist

  • Pipelines passed before requesting review. To run CI you must include [run CI] in your commit message.
  • I am familiar with the Contributing Guide
  • I have installed + ran pre-commit hooks.

Copy link
Contributor

@srilman srilman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ehsantn ehsantn merged commit 9c01285 into main Dec 4, 2024
28 of 121 checks passed
@ehsantn ehsantn deleted the scott/fix_pyarrow_import branch December 4, 2024 17:28
srilman pushed a commit that referenced this pull request Dec 4, 2024
scott-routledge2 pushed a commit that referenced this pull request Dec 16, 2024
* Initial commit

* [BP-454] IPyParallel Wrapper Kernel for Platform (#1)

* [BP-454] Improvements and fixes (#2)

* Fix README

* Add PyPi badge

* Allow users to create their own IPy clusters, formatting

* Fix typo

* Simplify a little

* Invert magic check logic

* Small change in example ipcluster config

* Generate IPYPARALLEL_MAGICS from line and cell magics

* Fix typo in comment

* Modify `example_ipcluster_config.py`  (#3)

* [BP-1032] Check `update_hostfile.sh` in multiple locations including `/home/bodo` (#4)

* Check update_hostfile in multiple locations including /home/bodo

* Add .DS_Store to .gitignore

* [BP-902] feat/custom-launcher (#5)

* feat/custom-launcher

* rfac: rename launcher

* rfac: default error message when exit code doesnt match

* chore: register launcher entrypoint

* chore: handle px autopx errors

* Update launcher.py

* Update launcher.py

* chore: update logging

* chore: minor fixes

* chore: minor fixes

* rfac: check for engine running status

* Update launcher.py

* fix: include bodo custom magics

* Update bodo_platform_ipyparallel_kernel/kernel.py

Co-authored-by: Sahil Gupta <[email protected]>

* chore: minor fixes

* chore: minor fixes

* fix: formatting

* chore: minor fixes

* chore: allow users to run pxresult or disable autopx when engines not running

Co-authored-by: Sahil Gupta <[email protected]>

* Fix issue with autopx saying it's enabled after cluster creation failed (#7)

* Return when cluster isn't created succesfully

* Fix issue with cell remaining in pending state after cluster creation fails

Co-authored-by: Isaac <[email protected]>

* BP-1607: Shutdown view on error (#8)

* Shutdown view

* Add comment

Co-authored-by: Isaac <[email protected]>

* BP 1607: IPyParallel Engines not killed (#9)

* Shutdown view

* Add comment

* Add extra shutdown to stop_ipyparallel_cluster

* Kill engines after shutdown

* Add async

* Change order of shutdown

* Stop engines

* Add debug messages

* Switch to sigterm

* Remove async

* Remove check

* If there are outstanding tasks abort and kill engines

* Add await

* Use client instead of cluster

* Fix member name

* Try sync

* Remove unecessary shutdown

* Readd

* Move if

* Run formatter

* Try forcekilling after stop_cluster

* Revert

* Add comment

* Stop engines and controller through cluster interface

* Add comments and manually remove cluster file

* Formatting

* Readd shutdown

* Replace stop_cluster_sync with shutdown

* Change to cluster api

* Add debug

* Add block=True

* Add wrap future

* Wait with timneout

* Wrap in wait

* Always SigKill

* Add stop_cluster_sync at end of shutdown

* Add catch for filenotfound

* Add comments

Co-authored-by: Isaac <[email protected]>
Co-authored-by: Sahil <[email protected]>

* [BP-1486][BP-1826] Add support for SQL language mode using the `%%sql` magic (#6)

This adds support for the new SQL language mode on our platform. This is done by converting SQL queries to a %%sql magic execution.
The language mode and catalog detection is done in execute_request where this information is readily available. This function is run before do_execute, which is why we get the required information from the execution metadata here, and convert the code that's then sent to do_execute. Since we want to run SQL queries on the IPyParallel cluster by default, we also add the %%px magic automatically (if %autopx isn't enabled). Just like any other IPP magics, this would therefore start an IPP cluster if one is not already initialized.

Co-authored-by: Isaac <[email protected]>
Co-authored-by: meehawk <[email protected]>
Co-authored-by: Sahil <[email protected]>

* [BP-2119] Replace MPI EngineSetLauncher with Slurm (#11)

* replace MPI EngineSetLauncher with Slurm

* check if Slurm installed

* workaround ipp signal bug

* add comment

* Remove dead ipcluster config file

* [BP-880] Add Parallel-Python Mode Support (#10)

This adds support for a "Parallel-Python" language mode, in addition to the existing "Python" and "SQL" language modes. This has the following implications:
- `%autopx` is no longer required, and in fact will be ignored by the kernel.
- Functionality of the "Python" mode has been limited to running things on a single core (note that it's not rank 0 of the IPyParallel cluster, rather a separate process). This mode is now limited to running sequential code (for debugging, etc.) and our custom line magics like `%pconda`, etc. IPyParallel magics are not supported, and the kernel will show an error message if they're used.
- "Parallel-Python" is really the new default. It will add `%%px` to all code (unless the user is already using an IPyParallel magic).
- `execute_request` acts like a pre-parser for all code based on the language mode, reducing the complexity in `do_execute`.
- `execute_request` does the following:
    - For SQL mode:
        - If IPyParallel, SQL or Bodo Platform magics (`%pconda`, etc.) are used, an error will be shown and no code will be run.
        - If the code is regular SQL (which we assume it is if it doesn't use the magics mentioned above), we will add the `%%sql` magic along with the catalog details to the code. If a catalog is not specified, an error message will be shown to the users. We will also add `%%px` so it's executed on the IPyParallel cluster.
        - If the code turns out not to be SQL, the behavior is undefined (most likely a BodoSQL parsing error, but could be something else).
    - For Python mode:
        - If any of the IPyParallel magics are used, an error will be shown to the user to use the Parallel-Python mode instead.
        - Everything else will be passed along as is, including our custom line magics (`%pconda`, etc.)
    - For Parallel Python mode:
        - If `%autopx` is used, a warning will be shown to the user and the code will be disregarded.
        - Any other IPyParallel magics will be passed through as is (this is safe since `autopx` won't be on, and the semantics should always be correct while in Parallel-Python mode).
        - Our custom magics (`%pconda`, etc.) will be passed through as is (this is safe since `autopx` cannot be on, so they will get run on the separate single core process, which is the intended use case).
        - All other code will be prepended with `%%px`.
- `do_execute` is now simpler, in that it just starts an IPyParallel cluster whenever the code has any IPyParallel magics (`autopx` can't get through `execute_request`, but even if it did, `do_execute` will detect it and show a warning). It also handles not allowing running IPyParallel magics (except `%pxresult`) when the engines are dead (same as before). 


Co-authored-by: Isaac <[email protected]>

* revert to using MPIEngineLauncher (#12)

* revert to using MPI

* switching to using environment vars

* switching to using environment vars

* Update bodo_platform_ipyparallel_kernel/launcher.py

Co-authored-by: Sahil Gupta <[email protected]>

Co-authored-by: Aaditya Srivathsan <[email protected]>
Co-authored-by: Sahil Gupta <[email protected]>

* changes

* update conda-lock

* change location

* fix test

* suggested changes

* fix

* update buildspec

* [run ci]

---------

Co-authored-by: Sahil Gupta <[email protected]>
Co-authored-by: Sahil Gupta <[email protected]>
Co-authored-by: meehawk <[email protected]>
Co-authored-by: Isaac Warren <[email protected]>
Co-authored-by: Isaac <[email protected]>
Co-authored-by: meehawk <[email protected]>
Co-authored-by: Ehsan Totoni <[email protected]>
Co-authored-by: Aaditya Srivathsan <[email protected]>
Co-authored-by: Aaditya Srivathsan <[email protected]>
scott-routledge2 added a commit that referenced this pull request Dec 16, 2024
IsaacWarren pushed a commit that referenced this pull request Dec 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants