Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

language agnostic checkpointing for azure eventhub scaler #1621

Merged
merged 7 commits into from
May 5, 2021

Conversation

christle
Copy link
Contributor

@christle christle commented Feb 17, 2021

this is a solution for a language agnostic checkpointing.

It comes with a new konfiguration for eventhub scaler: checkpointType.

Actually with 3 Types:

AzureWebJob: the is default if no blobcontainer is set. this is the current working behavior for azure functions

GoSdk: GoSdk has a very special storage path and a special checkpoint structure.

Default: Default is set if a blobcontainer is set. Default, because this path and checkpoint structure is used by java, python and maybe others. It stores the values as metadata on the checkpoint blob.

Fixes #769 #767 #741

@ahmelsayed
Copy link
Contributor

Thanks @christle for the contribution. My main concern is that this is a breaking change for eventhub scaler. @tomkerkhove would probably know better than me how we should handle such breaking changes.

@christle
Copy link
Contributor Author

As I understand, the azure functions are currently the only checkpointing that works. This is the type AzureWebJob and for that, everything stays the same. No additional configuration is necessary either. All other frameworks are currently not executable. But it may be that I'm wrong. If there are currently still variants, they use probably this old Python checkpointing behavior. If want to support old Phyton checkpointing we could add a specific python checkpointer and set it as the default behavior if a container name is set.

@christle
Copy link
Contributor Author

i think the point with the old python behavior is not bad. I could add it as an additional backward checkpointer. Then we make sure all stays the same. But if anybody will use checkpointing with die current python- dapr- java- eventhub framework and so on, then the checkpointType have to be set explcitly, in this case.

@tomkerkhove
Copy link
Member

As I understand, the azure functions are currently the only checkpointing that works. This is the type AzureWebJob and for that, everything stays the same. No additional configuration is necessary either. All other frameworks are currently not executable. But it may be that I'm wrong. If there are currently still variants, they use probably this old Python checkpointing behavior. If want to support old Phyton checkpointing we could add a specific python checkpointer and set it as the default behavior if a container name is set.

That is not correct, actually.

As per the docs:

As of now, the Event Hub scaler only supports reading checkpoints from Blob Storage, as well as scaling only Event Hub applications written in C#, Java, Python or created with Azure Functions.

We'll need to be backwards compatible with all of them.

@christle christle marked this pull request as draft February 18, 2021 16:14
@christle
Copy link
Contributor Author

christle commented Feb 18, 2021

The current implementation in 2.1 is this:

containerName == "" -> use Webjob Behavior (azure function) -> that means use "azure-webjobs-event" as containername
containerName != "" -> use the same checkpointing like above but with a different path (that was suitable for c#, java before the change inside the checkpoint behavior see #767)
when read checkpoint from blob -> try it for default one (both above) and try it for specific python.

currently my change is:
containername == "" or checkpointType is "AzureWebJob" -> use Webjob Behavior (azure function) > that means use "azure-webjobs-event" as containername and the checkpointing is the same like before
containername != "" or checkpointType is "Default" -> use the new path for java and python eventhub sdks (and maybe others, i have to check c# )and read the checkpoint form blob metadata
containername != "" AND checkpointType is "GoSdk" use special gosdk path and checkpointing (also for dapr apps)

what i can add to make it backwards compatible even for old sdks:
containername != "" AND checkpointType is "" -> use an additional checkpointer for older java sdks and the python part, with the try to merge statement on the checkpoint struct. (But then the checkpointing Java, Python with the sdk have to explicit set the checkpointType. That should not be problem because this currently not working in 2.1 so they have to update keda)

i think then we have all former SDK's supported and additional the new checkpointing for java, python + the gosdk. And a open structure to support new checkpointer for more languages which special checkpoint behavior. But i think there will be more sdks that use it like the new Java and Python checkpointer. So i think the suggested default checkpointer support more than these both at the moment.

@christle christle force-pushed the main branch 3 times, most recently from 69293dd to 19c38ed Compare April 11, 2021 23:34
@christle
Copy link
Contributor Author

hi @ahmelsayed,

i've added a new checkpointer for backward compatibility. Now there a 4 checkpointer:

DefaultCheckpointer (Old Java SDK, Old Python SDK, C#, maybe more)
this checkpointer handle all checkpoint formats and storage paths, from the current release. if no CheckpointType is configured, this Checkpointer is active.

AzureWebJobCheckpointer (Azure Functions)
this checkpointer handle the format and storage path for azure functions. Works in the same way like before, if no containername is configured (or the checkpointTypeis set to AzureWebJob but this is not required because the backward compatibility).

BlobMetadataCheckpointer (Current Java SDK, Current Python, maybe more)
new checkpointer which reads checkpoint data from blob metadata, and not from the body. I've tested java and python. Maybe there are more eventhub sdk's that use metadata instead of the body. For this behavior the checkpointType: BlobMetadata and containerName are required. fixes #767 #741

GoSdkCheckpointer (GoSdk and Dapr for example which use the GoSdk)
new checkpoint for special golang path and format. For this behavior the checkpointType: GoSdk and containerName are required.

new Checkpointer could be added at the same way. Only the implementation of this interface is required:

type checkpointer interface {
	resolvePath(info azure.EventHubInfo) (*url.URL, error)
	extractCheckpoint(get *azblob.DownloadResponse) (Checkpoint, error)
}

could review that? i would left the PR on DRAFT, because i've already tested all checkpointer on a real storage blob and integrationtests but i want to test it with a complete keda build inside our cluster. Then i can remove the draft, from my side.

@tomkerkhove
Copy link
Member

tomkerkhove commented Apr 12, 2021

These are a lot of different types to checkpoint, which is fine for me!

However, instead of being so implicit, I would introduce checkpointingStrategy: {strategyName} to be more explicit as this is pretty important.

Wdyt @ahmelsayed @zroubalik?

@tomkerkhove
Copy link
Member

The logic itself and all looks great btw, thanks!

@christle
Copy link
Contributor Author

i just renamed checkpointType to checkpointStrategy. The values are unchanged. Is this ok?

@tomkerkhove
Copy link
Member

Fine for me, what do you think @zroubalik @ahmelsayed?

Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

@christle christle marked this pull request as ready for review April 25, 2021 20:37
@christle
Copy link
Contributor Author

i just removed the draft flag. Yesterday i've made a successful live test on our dev cluster.

@tomkerkhove
Copy link
Member

Would you mind:

@christle christle force-pushed the main branch 2 times, most recently from 132ff93 to 06ce04a Compare April 29, 2021 08:07
@christle
Copy link
Contributor Author

christle commented Apr 29, 2021

done. I move over to the docs repo

@christle
Copy link
Contributor Author

the docs PR: kedacore/keda-docs#429

@nzthiago
Copy link
Contributor

nzthiago commented May 3, 2021

For JavaScript the Azure SDK for JS provides a BlobCheckpointStore class which implements the required read/writes to a durable store by using Azure Blob Storage in case you want to see if this implementation would also work for JS.

@christle
Copy link
Contributor Author

christle commented May 3, 2021

i changed the code to lowercase for checkpointStrategy parameters, @zroubalik

@nzthiago thanks, for your link! I've checked the implementation and it looks like the JS framework works with the "BlobMetadata" checkpointStrategy like Java and Python.

Copy link
Member

@zroubalik zroubalik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@zroubalik
Copy link
Member

@tomkerkhove ok to merge this?

@christle
Copy link
Contributor Author

christle commented May 4, 2021

fixed merge conflict on CHANGELOG.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Azure Event Hubs scaler checkpointing should be language agnostic
5 participants