Bodo automatically scales Python analytics and ML code to bare-metal cluster and cloud performance. It compiles a subset of Python (pandas/NumPy) to efficient parallel binaries that use MPI, requiring only minimal code changes. Bodo is orders of magnitude faster than alternatives like Apache Spark.
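As a minimal sketch of what "minimal code changes" can look like, the snippet below applies the `bodo.jit` decorator to an ordinary function. The function name `sum_of_squares` and the no-op fallback used when Bodo is not installed are illustrative assumptions, not part of Bodo. Real workloads would typically use pandas or NumPy operations, which is where Bodo's automatic parallelization applies; this plain loop only shows the decorator pattern.

```python
# Sketch only: shows the decorator pattern, not a representative workload.
try:
    import bodo  # assumption: Bodo is installed in this environment

    jit = bodo.jit
except ImportError:
    # Hypothetical fallback so the sketch still runs without Bodo:
    # the decorator simply returns the function unchanged.
    def jit(func):
        return func


@jit
def sum_of_squares(n):
    # Plain numeric loop; the only change from regular Python
    # is the decorator above.
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print(sum_of_squares(10))  # prints 285
```

In MPI-based setups, a script like this may be launched across several processes with an MPI launcher, for example `mpiexec -n 4 python script.py`, where `script.py` is a placeholder file name.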
This development guide collects Bodo's development documentation. Useful documents are listed below:
- Getting Started
- Building Bodo from Source
- Bodo Engine Architecture: Compiler Stages, Builtin Functions, IR Extensions.
- Development Lifecycle: Process of contributing to Bodo.
- How to add Tests: Writing tests and using the pytest framework
- Debugging
- Code Style: Bodo code style guide
- CI/CD: Testing with Continuous Integration
- Performance Benchmarking Tips
- Code Coverage
- Useful Numba knowledge
- Development using Docker
- Conda Build
- Release Checklist
- Customer Ops
- Bodo User Documentation
Most of the relevant documentation can be found on the BodoSQL Confluence. For Bodo-related issues, use the Bodo-Engine Confluence instead.