PartitionKey in Stream Analytics and Event Hubs

Karl Gardner 190 Reputation points
2024-07-29T01:01:07.6533333+00:00

Hello,

I'm trying to learn more about Event Hubs partitions. I'm running a simple .NET program that sends 20 events to an Event Hub with 3 partitions, and then I query the events with the input preview in Stream Analytics. The program is as follows:

using Azure.Messaging.EventHubs;
using Azure.Messaging.EventHubs.Producer;

var connectionString = "Endpoint=sb://eventhubsnamespace321.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=";
var eventHubName = "";
var producer = new EventHubProducerClient(connectionString, eventHubName);

try
{
    var eventBatch = new List<EventData>();
    for (var counter = 0; counter < 20; ++counter)
    {
        var eventBody = new BinaryData($$"""{"numCars": {{new Random().Next(50)}}, "highway": {{counter%3}}}""");
        var eventData = new EventData(eventBody);
        eventBatch.Add(eventData);
    }
    await producer.SendAsync(eventBatch, new SendEventOptions() {PartitionKey = "highway"});
}
catch(Exception ex)
{
    Console.WriteLine($"Exception occurred: {ex}");
}

The Event Hub is named "highway2". Each event is a JSON object with two properties, "numCars" and "highway". The "highway" value is 0, 1, or 2, one for each of the three partitions. I set the PartitionKey in the SendEventOptions. The Event Hub has 3 partitions:

User's image

I then connected the Stream Analytics job to the Event Hub, but I only see one partition in the PartitionId column:

User's image

Shouldn't the PartitionId be different for each highway since the PartitionKey is set to highway?

Thanks!

Azure Event Hubs
An Azure real-time data ingestion service.
Azure Stream Analytics
An Azure real-time analytics service designed for mission-critical workloads.

Accepted answer
  1. PRADEEPCHEEKATLA-MSFT 90,146 Reputation points Microsoft Employee
    2024-08-22T08:18:48.12+00:00

    @Karl Gardner - I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to accept the answer.

    Ask: PartitionKey in Stream Analytics and Event Hubs

    Solution: The issue is resolved.

    The problem is in the SendEventOptions passed to EventHubProducerClient.SendAsync. SendEventOptions has two routing properties, only one of which may be set for a given send: PartitionId, which names the exact partition that receives the events, and PartitionKey, an arbitrary string that the service hashes to choose a partition. The key is never matched against the event body, so PartitionKey = "highway" does not refer to the "highway" JSON property. It is just the literal string "highway", and because all 20 events were sent with that same key, they all landed on the same partition. That is why only one value appears in the PartitionId column in Stream Analytics. To send each event to the partition that matches its highway number, specify the partition explicitly through the PartitionId property. Because the options apply to the whole SendAsync call, this means sending a separate batch in each iteration of the for loop:

    using Azure.Messaging.EventHubs;
    using Azure.Messaging.EventHubs.Producer;

    // Placeholder: supply your own shared access key here.
    var connectionString = "Endpoint=sb://eventhubsnamespace364.servicebus.windows.net/;SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=<your-shared-access-key>";
    var eventHubName = "highway2";
    await using var producer = new EventHubProducerClient(connectionString, eventHubName);

    try
    {
        for (var counter = 0; counter < 20; ++counter)
        {
            // The highway number doubles as the target partition ID.
            var highway = counter % 3;
            var eventBody = new BinaryData($$"""{"numCars": {{Random.Shared.Next(50)}}, "highway": {{highway}}}""");
            var eventBatch = new List<EventData> { new EventData(eventBody) };

            // One send per event, because the options apply to the whole call.
            // The partition IDs of a 3-partition Event Hub are "0", "1", and "2".
            await producer.SendAsync(eventBatch, new SendEventOptions { PartitionId = $"{highway}" });
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception occurred: {ex}");
    }
    
    

    Note that the PartitionKey is not looked up in the JSON body of the event. It is an opaque string that is hashed to select a partition, so every event sent with the same key value goes to the same partition.
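
    If the goal is only to keep events for the same highway together, without hard-coding partition numbers, the highway value itself can serve as the partition key. The following is a minimal sketch under that assumption rather than a definitive implementation. The connection string placeholder and the grouping dictionary (eventsByHighway) are illustrative and not part of the original program.

    ```csharp
    using System;
    using System.Collections.Generic;
    using Azure.Messaging.EventHubs;
    using Azure.Messaging.EventHubs.Producer;

    // Assumption: placeholder values, replace with your own.
    var connectionString = "<your-event-hubs-connection-string>";
    var eventHubName = "highway2";

    await using var producer = new EventHubProducerClient(connectionString, eventHubName);

    // Build the same 20 events, grouped by highway value (0, 1, or 2).
    var eventsByHighway = new Dictionary<int, List<EventData>>();
    for (var counter = 0; counter < 20; ++counter)
    {
        var highway = counter % 3;
        var body = new BinaryData($$"""{"numCars": {{Random.Shared.Next(50)}}, "highway": {{highway}}}""");

        if (!eventsByHighway.TryGetValue(highway, out var events))
        {
            events = new List<EventData>();
            eventsByHighway[highway] = events;
        }
        events.Add(new EventData(body));
    }

    // One send per highway. The key is the highway value, not the property name;
    // the service hashes it to choose the partition.
    foreach (var (highway, events) in eventsByHighway)
    {
        var options = new SendEventOptions { PartitionKey = highway.ToString() };
        await producer.SendAsync(events, options);
    }
    ```

    Events that share a key always land on the same partition, but two different keys may still hash to the same partition, so with only three keys the PartitionId column in Stream Analytics can show fewer than three values. When an exact highway-to-partition mapping is required, PartitionId is the property to use.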


