BSE-4358: BodoSQL support for S3 tables #126

Merged 81 commits on Jan 10, 2025
e76ddfb
Start planner API movement
njriasan Jan 2, 2025
64812e8
Updated the entry point for parseQuery
njriasan Jan 2, 2025
b1bbc8a
Fixed the import
njriasan Jan 2, 2025
88e6673
Merge branch 'main' into nick/bodosql_python_interface
njriasan Jan 2, 2025
325486c
Merge branch 'main' into nick/bodosql_python_interface
njriasan Jan 2, 2025
85cbbad
Moved getOptimizedPlanString
njriasan Jan 3, 2025
777458d
Moved getPandasAndPlanString
njriasan Jan 3, 2025
293e3b8
Moved getPandasString
njriasan Jan 3, 2025
2a835ec
Moved getWriteType
njriasan Jan 3, 2025
f9d9611
Moved executeDDL
njriasan Jan 3, 2025
abab715
Moved the last RelationalAlgebra API
njriasan Jan 3, 2025
1ec8501
Added missing docstrings [run CI]
njriasan Jan 3, 2025
7f9d2e9
Fixed the import ordering [run CI]
njriasan Jan 3, 2025
7da882e
Move reset to package private [run CI]
njriasan Jan 3, 2025
6d02755
Updated fromTypeID API [run CI]
njriasan Jan 3, 2025
2a943b9
Updated WriteTargetEnum [run CI]
njriasan Jan 3, 2025
add7295
Updated the logger API [run CI]
njriasan Jan 3, 2025
d8665d5
Fixed save issue [run CI]
njriasan Jan 3, 2025
28cce7d
Fixed API copying issue [run CI]
njriasan Jan 3, 2025
7ac23dc
Merge branch 'nick/bodosql_python_interface' into nick/fix_BodoSQLCol…
njriasan Jan 3, 2025
af1a9cf
Added missing static declaration [run CI]
njriasan Jan 3, 2025
864ee4e
Merge branch 'nick/bodosql_python_interface' into nick/fix_BodoSQLCol…
njriasan Jan 3, 2025
48605d6
Added more missing static declarations [run CI]
njriasan Jan 3, 2025
1b909c9
Merge branch 'main' into nick/bodosql_python_interface [run CI]
njriasan Jan 3, 2025
04db869
Removed ArrayList
njriasan Jan 3, 2025
80fd441
Removed hash map interaction
njriasan Jan 3, 2025
98bafde
Added stack trace API
njriasan Jan 3, 2025
052ce93
Removed properties
njriasan Jan 3, 2025
c25928c
Added builders
njriasan Jan 3, 2025
2349f14
Add local table builder
njriasan Jan 3, 2025
142c45f
Update LocalSchema
njriasan Jan 3, 2025
7febfe5
Removed LocalSchemaClass [run CI]
njriasan Jan 3, 2025
f87b383
Removed more dead code [run CI]
njriasan Jan 3, 2025
bca4fdc
Fixed bugs [run CI]
njriasan Jan 3, 2025
3b36b27
Merge branch 'nick/bodosql_python_interface' into nick/fix_BodoSQLCol…
njriasan Jan 3, 2025
688f131
Merge branch 'nick/fix_BodoSQLColumn' into nick/remove_constructors_b…
njriasan Jan 3, 2025
6918eb9
Removed planner type
njriasan Jan 3, 2025
09f51b6
Removed DDLExecutionResult calls
njriasan Jan 3, 2025
7ebec08
Added get lowered globals
njriasan Jan 3, 2025
f484159
Refactored pair access
njriasan Jan 3, 2025
1c5d5be
Fixed the exception interface
njriasan Jan 3, 2025
b1cee44
Removed ColumnDataTypeClass
njriasan Jan 3, 2025
76a0c56
Removed ColumnDataTypeClass
njriasan Jan 3, 2025
1e0139e
Removed everything but the constructors
njriasan Jan 3, 2025
f63733d
Refactored RelationalAlgebraGenerator constructor
njriasan Jan 3, 2025
374e4bf
Removed RelationalAlgebraGeneratorClass [run CI]
njriasan Jan 3, 2025
db61339
Merge branch 'main' into nick/bodosql_python_interface [run CI]
njriasan Jan 4, 2025
4b7890e
Merge branch 'nick/bodosql_python_interface' into nick/fix_BodoSQLCol…
njriasan Jan 4, 2025
a9c46f3
Merge branch 'nick/fix_BodoSQLColumn' into nick/remove_constructors_b…
njriasan Jan 4, 2025
10952b6
Merge branch 'nick/remove_constructors_bodosql' into nick/remove_rema…
njriasan Jan 4, 2025
395d644
Fixed map APIs [run CI]
njriasan Jan 5, 2025
95052cb
Merge branch 'nick/remove_constructors_bodosql' into nick/remove_rema…
njriasan Jan 5, 2025
2ca5b36
Fixed create_java_dynamic_parameter_type_list [run CI]
njriasan Jan 6, 2025
db9edfa
Merge branch 'nick/remove_constructors_bodosql' into nick/remove_rema…
njriasan Jan 6, 2025
d066870
Fixed merge conflict [run CI]
njriasan Jan 6, 2025
c873e60
Merge branch 'nick/fix_BodoSQLColumn' into nick/remove_constructors_b…
njriasan Jan 6, 2025
2dfb822
Apply Isaac's feedback [run CI]
njriasan Jan 6, 2025
c66bafc
Merge branch 'nick/remove_constructors_bodosql' into nick/remove_rema…
njriasan Jan 6, 2025
fcebae7
Merged with main [run CI]
njriasan Jan 6, 2025
0fb2a07
Merge branch 'nick/remove_constructors_bodosql' into nick/remove_rema…
njriasan Jan 6, 2025
2fddacb
Fixed APIs
njriasan Jan 6, 2025
3164ada
moved definitions [run CI]
njriasan Jan 6, 2025
86947e4
Add S3TablesCatalog skeleton
IsaacWarren Jan 6, 2025
e91d009
Add implementations for bodosql catalog
IsaacWarren Jan 6, 2025
0f7b77c
Add BodoS3TablesCatalog to PythonEntryPoint
IsaacWarren Jan 7, 2025
550691b
Add s3_tables_catalog to bodosql
IsaacWarren Jan 7, 2025
0585d51
Add s3_tables_catalog fixture
IsaacWarren Jan 7, 2025
01fe6ac
Actually cleanup written table
IsaacWarren Jan 7, 2025
d712c80
Add basic s3 tables read test
IsaacWarren Jan 7, 2025
a7cd361
Add write test
IsaacWarren Jan 7, 2025
43844d7
Update docs
IsaacWarren Jan 7, 2025
e9372c4
Allow catalogs to not specify a default schema
IsaacWarren Jan 7, 2025
d5bb594
Don't give default schema
IsaacWarren Jan 7, 2025
c015ad2
[Run CI]
IsaacWarren Jan 7, 2025
43f60fc
Merge remote-tracking branch 'origin/main' into isaac/bodosql_s3_tables
IsaacWarren Jan 8, 2025
5a8c201
[Run CI]
IsaacWarren Jan 8, 2025
6e2cc69
Fix docstring
IsaacWarren Jan 9, 2025
21de7a1
Fix doc
IsaacWarren Jan 9, 2025
d4389f5
Fix comment
IsaacWarren Jan 9, 2025
06c9ed3
Add infra requirements to docstrings
IsaacWarren Jan 9, 2025
7095e2d
[Run CI]
IsaacWarren Jan 9, 2025
Add local table builder
njriasan committed Jan 3, 2025
commit 2349f14af695edd68a04fcf7f8999edb32768813
6 changes: 2 additions & 4 deletions BodoSQL/bodosql/context.py
@@ -22,11 +22,9 @@
 from bodosql.bodosql_types.database_catalog import DatabaseCatalog
 from bodosql.bodosql_types.table_path import TablePath, TablePathType
 from bodosql.imported_java_classes import (
-    ColumnClass,
     ColumnDataTypeClass,
     JavaEntryPoint,
     LocalSchemaClass,
-    LocalTableClass,
     RelationalAlgebraGeneratorClass,
 )
 from bodosql.py4j_gateway import build_java_array_list, build_java_hash_map
@@ -234,7 +232,7 @@ def construct_json_array_type(arr_type):
 
 def get_sql_column_type(arr_type, col_name):
     data_type = get_sql_data_type(arr_type)
-    return ColumnClass(col_name, data_type)
+    return JavaEntryPoint.buildBodoSQLColumnImpl(col_name, data_type)
 
 
 def get_sql_data_type(arr_type):
@@ -580,7 +578,7 @@ def add_table_type(
     estimated_ndvs = {} if estimated_ndvs is None else estimated_ndvs
     estimated_ndvs_java_map = build_java_hash_map(estimated_ndvs)
 
-    table = LocalTableClass(
+    table = JavaEntryPoint.buildLocalTable(
         table_name,
         schema.getFullPath(),
         col_arr,
4 changes: 0 additions & 4 deletions BodoSQL/bodosql/imported_java_classes.py
@@ -17,9 +17,7 @@
 gateway = get_gateway()
 if bodo.get_rank() == 0:
     try:
-        ColumnClass = gateway.jvm.com.bodosql.calcite.table.BodoSQLColumnImpl
         ColumnDataTypeClass = gateway.jvm.com.bodosql.calcite.table.ColumnDataTypeInfo
-        LocalTableClass = gateway.jvm.com.bodosql.calcite.table.LocalTable
         LocalSchemaClass = gateway.jvm.com.bodosql.calcite.schema.LocalSchema
         RelationalAlgebraGeneratorClass = (
             gateway.jvm.com.bodosql.calcite.application.RelationalAlgebraGenerator
@@ -37,9 +35,7 @@
         saw_error = True
         msg = str(e)
 else:
-    ColumnClass = None
     ColumnDataTypeClass = None
-    LocalTableClass = None
     LocalSchemaClass = None
     RelationalAlgebraGeneratorClass = None
     SnowflakeDriver = None
@@ -8,7 +8,10 @@ import com.bodosql.calcite.catalog.SnowflakeCatalog
import com.bodosql.calcite.catalog.TabularCatalog
import com.bodosql.calcite.ddl.DDLExecutionResult
import com.bodosql.calcite.table.BodoSQLColumn
import com.bodosql.calcite.table.BodoSQLColumnImpl
import com.bodosql.calcite.table.ColumnDataTypeInfo
import com.bodosql.calcite.table.LocalTable
import com.google.common.collect.ImmutableList
import org.apache.commons.lang3.exception.ExceptionUtils
import java.util.Properties

@@ -308,5 +311,57 @@ class PythonEntryPoint {
             icebergVolume: String?,
         ): SnowflakeCatalog =
             SnowflakeCatalog(username, password, accountName, defaultDatabaseName, warehouseName, accountInfo, icebergVolume)
+
+        /**
+         * Build a BodoSQLColumnImpl object.
+         * @param columnName The column name to use.
+         * @param dataTypeInfo The data type info to use.
+         * @return The BodoSQLColumnImpl object.
+         */
+        @JvmStatic
+        fun buildBodoSQLColumnImpl(
+            columnName: String,
+            dataTypeInfo: ColumnDataTypeInfo,
+        ): BodoSQLColumnImpl = BodoSQLColumnImpl(columnName, dataTypeInfo)
+
+        /**
+         * Build a LocalTable object.
+         * @param tableName The table name to use.
+         * @param path The path to use.
+         * @param columns The columns to use.
+         * @param isWriteable Whether the table is writeable.
+         * @param readCode The read code to use.
+         * @param writeCodeFormatString The write code format string to use.
+         * @param useIORead Whether to use IO read.
+         * @param dbType The database type to use.
+         * @param estimatedRowCount The estimated row count to use. If this is not known, this should be null.
+         * @param estimatedNdvs The estimated NDVs to use. If these are not known, this should be an empty map.
+         * @return The LocalTable object.
+         */
+        @JvmStatic
+        fun buildLocalTable(
+            tableName: String,
+            path: ImmutableList<String>,
+            columns: List<BodoSQLColumn>,
+            isWriteable: Boolean,
+            readCode: String,
+            writeCodeFormatString: String,
+            useIORead: Boolean,
+            dbType: String,
+            estimatedRowCount: Long?,
+            estimatedNdvs: Map<String, Int>,
+        ): LocalTable =
+            LocalTable(
+                tableName,
+                path,
+                columns,
+                isWriteable,
+                readCode,
+                writeCodeFormatString,
+                useIORead,
+                dbType,
+                estimatedRowCount,
+                estimatedNdvs,
+            )
     }
 }
@@ -9,13 +9,13 @@
import com.bodosql.calcite.ir.Variable;
import com.google.common.collect.ImmutableList;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.calcite.rel.type.RelDataType;
import org.apache.calcite.rel.type.RelDataTypeField;
import org.apache.calcite.schema.Statistic;
import org.apache.calcite.schema.Table;
import org.checkerframework.checker.nullness.qual.NonNull;
import org.checkerframework.checker.nullness.qual.Nullable;

/**
@@ -77,16 +77,16 @@ public class LocalTable extends BodoSqlTable {
    * @param estimatedRowCount Estimated row count passed from Python.
    */
   public LocalTable(
-      String name,
-      ImmutableList<String> schemaPath,
-      List<BodoSQLColumn> columns,
-      boolean isWriteable,
-      String readCode,
-      String writeCodeFormatString,
-      boolean useIORead,
-      String dbType,
+      @NonNull String name,
+      @NonNull ImmutableList<String> schemaPath,
+      @NonNull List<BodoSQLColumn> columns,
+      @NonNull boolean isWriteable,
+      @NonNull String readCode,
+      @NonNull String writeCodeFormatString,
+      @NonNull boolean useIORead,
+      @NonNull String dbType,
       @Nullable Long estimatedRowCount,
-      @Nullable HashMap<String, Integer> estimatedNdvs) {
+      @NonNull Map<String, Integer> estimatedNdvs) {
 
     super(name, schemaPath, columns);
     this.isWriteable = isWriteable;
@@ -214,7 +214,7 @@ public Table extend(List<RelDataTypeField> extensionFields) {
         useIORead,
         dbType,
         null,
-        null);
+        estimatedNdvs);
   }
 
   /**
@@ -41,7 +41,7 @@ object BodoGenTest {
                 false,
                 "MEMORY",
                 null,
-                null,
+                mapOf(),
             )
         schema.addTable(table)
         var cols2: ArrayList<BodoSQLColumn> = ArrayList()
@@ -61,7 +61,7 @@ object BodoGenTest {
                 false,
                 "MEMORY",
                 null,
-                null,
+                mapOf(),
             )
         schema.addTable(table2)
         val table3: BodoSqlTable =
@@ -75,7 +75,7 @@ object BodoGenTest {
                 false,
                 "MEMORY",
                 null,
-                null,
+                mapOf(),
             )
         schema.addTable(table3)
         val generator =
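
Note on the `estimatedNdvs` contract in this commit: the `LocalTable` constructor moves from a nullable `HashMap` to a non-null `Map`, callers that previously passed `null` now pass an empty map, and the Python side already normalizes `None` to `{}` before building the Java map. The sketch below illustrates that calling pattern in plain Python, with no JVM or Py4J involved. `FakeEntryPoint`, `FakeLocalTable`, `build_local_table`, and `add_table` are hypothetical stand-ins invented for illustration; they are not part of BodoSQL, and the real builder takes more parameters than shown here.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class FakeLocalTable:
    """Hypothetical stand-in for the JVM-side LocalTable (illustration only)."""

    name: str
    path: Tuple[str, ...]
    columns: Tuple[str, ...]
    estimated_row_count: Optional[int]
    estimated_ndvs: Dict[str, int]


class FakeEntryPoint:
    """Hypothetical stand-in for an entry point that exposes static builders."""

    @staticmethod
    def build_local_table(
        name: str,
        path: Sequence[str],
        columns: Sequence[str],
        estimated_row_count: Optional[int],
        estimated_ndvs: Dict[str, int],
    ) -> FakeLocalTable:
        # Assumed contract, modeled on the non-null Map parameter:
        # the row count may be unknown (None), the NDV mapping may not.
        if estimated_ndvs is None:
            raise TypeError("estimated_ndvs must be a mapping, not None")
        return FakeLocalTable(
            name, tuple(path), tuple(columns), estimated_row_count, dict(estimated_ndvs)
        )


def add_table(
    name: str,
    path: Sequence[str],
    columns: Sequence[str],
    estimated_row_count: Optional[int] = None,
    estimated_ndvs: Optional[Dict[str, int]] = None,
) -> FakeLocalTable:
    # Same normalization as in context.py: unknown NDVs become an empty mapping
    # before the builder is called.
    estimated_ndvs = {} if estimated_ndvs is None else estimated_ndvs
    return FakeEntryPoint.build_local_table(
        name, path, columns, estimated_row_count, estimated_ndvs
    )
```

With this shape, the call site depends only on the entry point, and "unknown NDVs" has a single representation (an empty mapping) on both sides of the boundary.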