Event Hub body cannot be decompressed when a gzipped Event Hub message is used as the trigger

Actual behavior

I use an Event Hub binding as a trigger. The event content is compressed with gzip. The message body returned by azure.functions.EventHubEvent.get_body() on the input event cannot be decompressed.

Known workarounds

When I read the message using azure-eventhub, it can be decompressed.

Example Code:

The function TimerTrigger.py sends a gzipped string message to the Event Hub. The function EventHubTrigger.py uses it as a trigger and reads the message body. But the message content differs from what was sent and cannot be un-gzipped.

TimerTrigger.py

import datetime
import logging
import gzip
import io

import azure.functions as func
from azure.eventhub import EventHubClient, Sender, EventData

ADDRESS = "Removed"
USER = "Removed"
KEY = "Removed"

def main(mytimer: func.TimerRequest) -> None:
    utc_timestamp = datetime.datetime.utcnow().replace(
        tzinfo=datetime.timezone.utc).isoformat()

    if mytimer.past_due:
        logging.info('The timer is past due!')

    client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
    sender = client.add_sender(partition="0")
    client.run()
    message = gZipString(('Test Event @ '+ utc_timestamp).encode('utf-8'))
    sender.send(EventData(message))
    logging.info('TimerTrigger: sent message %s', message )

def gZipString(stringtoZip):
    out = io.BytesIO()
    with gzip.GzipFile(fileobj=out, mode="wb") as f:
        f.write(stringtoZip)
    return out.getvalue()

EventHubTrigger.py

import logging
import gzip
import io
import azure.functions as func

def main(event: func.EventHubEvent):
    logging.info('EventHubEvent: SN = %s, Partition = %s', event.sequence_number, event.partition_key)
    logging.info('  EventHubEvent: Body= %s', event.get_body() )
    try:
        decompressed_data = gunzip_bytes_obj(event.get_body())
    except Exception as error:
        # Log the failure and continue with a sentinel value.
        logging.info('EventHubEvent: Uzip Error = %s', error)
        decompressed_data = 'Failed'
    logging.info('  EventHubEvent: Decompressed Data= %s', decompressed_data)
    
def gunzip_bytes_obj(bytes_obj):
    # Wrap the raw bytes in an in-memory stream and read them back through gzip.
    with gzip.GzipFile(fileobj=io.BytesIO(bytes_obj), mode='rb') as fo:
        return fo.read()
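
As a sanity check, the compression helpers can be exercised locally with no Event Hub in between. The following is a minimal sketch, not part of the original report: it mirrors gZipString and gunzip_bytes_obj under the hypothetical names gzip_bytes and gunzip_bytes, and uses a shortened sample payload. If the round trip restores the payload, the helpers themselves are unlikely to be the cause.

```python
import gzip
import io


def gzip_bytes(data: bytes) -> bytes:
    """Compress bytes in memory (mirrors gZipString; name is hypothetical)."""
    out = io.BytesIO()
    with gzip.GzipFile(fileobj=out, mode="wb") as f:
        f.write(data)
    return out.getvalue()


def gunzip_bytes(data: bytes) -> bytes:
    """Decompress gzip bytes in memory (mirrors gunzip_bytes_obj)."""
    with gzip.GzipFile(fileobj=io.BytesIO(data), mode="rb") as f:
        return f.read()


# Shortened sample payload; the real function appends a UTC timestamp.
original = "Test Event".encode("utf-8")
compressed = gzip_bytes(original)
restored = gunzip_bytes(compressed)
print(restored == original)
```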

Example Output:

Here is an example from the function logs. From the TimerTrigger function log:

TimerTrigger: sent message b'\x1f\x8b\x08\x00\xa1Q\xed\\\x02\xff\x0bI-.Qp-K\xcd+QpP020\xb4\xd450\xd55\xb2\x0814\xb522\xb020\xd03107\xb60\xd56\x00q\x00b|\x10\xfc-\x00\x00\x00'

From the EventHubTrigger function log:

EventHubEvent: Body= b'\x1f\xef\xbf\xbd\x08\x00\xef\xbf\xbdQ\xef\xbf\xbd\\\x02\xef\xbf\xbd\x0bI-.Qp-K\xef\xbf\xbd+QpP020\xef\xbf\xbd\xef\xbf\xbd50\xef\xbf\xbd5\xef\xbf\xbd\x0814\xef\xbf\xbd22\xef\xbf\xbd20\xef\xbf\xbd3107\xef\xbf\xbd0\xef\xbf\xbd6\x00q\x00b|\x10\xef\xbf\xbd-\x00\x00\x00'
EventHubEvent: Uzip Error = Not a gzipped file (b'\x1f\xef')

The received message body differs from what was sent. The '\xef\xbf\xbd' sequence was not in the original message. Could it come from a different encoding (e.g. Unicode)?
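
The byte sequence \xef\xbf\xbd is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The sketch below is an assumption about the mechanism, not a confirmed description of what the Functions host does: it decodes the sent bytes as UTF-8 with invalid bytes replaced, then re-encodes them, so the result can be compared with the Body= line in the trigger log. The helper name mangle_as_text is hypothetical.

```python
import gzip

# First ten bytes of the message as logged by TimerTrigger.
SENT_PREFIX = b"\x1f\x8b\x08\x00\xa1Q\xed\\\x02\xff"


def mangle_as_text(data: bytes) -> bytes:
    """Treat binary data as UTF-8 text, replacing invalid bytes with U+FFFD."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


# Compare this with the start of the Body= line in the EventHubTrigger log.
mangled_prefix = mangle_as_text(SENT_PREFIX)
print(mangled_prefix)

# A full gzip stream damaged this way no longer starts with the gzip magic
# number, so decompression fails.
damaged = mangle_as_text(gzip.compress(b"Test Event"))
try:
    gzip.decompress(damaged)
    failed = False
except OSError as error:
    failed = True
    print("decompress failed:", error)
```

The replacement is lossy: every invalid byte becomes the same three bytes, so the original payload cannot be recovered on the receiving side.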

Related information

azure-functions==1.0.0b4
azure-functions-worker==1.0.0b6
grpcio==1.20.1
grpcio-tools==1.20.1
protobuf==3.7.1
six==1.12.0

Issue Analytics

  • State: closed
  • Created 4 years ago
  • Comments: 9 (5 by maintainers)

Top GitHub Comments

maiqbal11 commented, Jun 21, 2019

Hi @AEYWang, you should be able to change the function app configuration and code to unblock your scenario:

  1. Modify your function.json to add "dataType": "binary". This will transmit the raw gzipped bytes without applying any encoding to them.
  2. Change the function signature from def main(event: func.EventHubEvent) to def main(event). You should still be able to access all the properties associated with the EventHubEvent but this annotation change is needed to match up the function.json with the function definition.

I was able to make these changes and run your function successfully. Please try and feel free to circle back with the results.
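
The two steps above might look like the following on the Python side. This is a sketch based on the maintainer's description, not a verified implementation: it assumes the Event Hub trigger binding in function.json carries "dataType": "binary", and that the un-annotated event object still exposes get_body(). The helper name decode_gzipped_body is hypothetical.

```python
import gzip
import logging


def decode_gzipped_body(body: bytes) -> str:
    """Decompress a gzip payload and decode it as UTF-8 text."""
    return gzip.decompress(body).decode("utf-8")


# No type annotation on `event`, per step 2 above.
# Assumes function.json sets "dataType": "binary" on the trigger binding.
def main(event):
    text = decode_gzipped_body(event.get_body())
    logging.info("EventHubEvent: Decompressed Data= %s", text)
```

Outside the Functions host, decode_gzipped_body can be checked with any object whose get_body() returns gzip bytes.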

maiqbal11 commented, Jun 25, 2019

Hi @AEYWang, the behavior that you are encountering is not fully documented. When you specify binary in the function.json, we try to match it against the annotation that you have provided (in your case EventHubEvent). If you requested a particular dataType, this does not match up with the annotation of EventHubEvent and we would error out. This is our way of ensuring that users do not specify inconsistent settings. However, the only way to request raw bytes is to use the dataType so the annotation needs to be removed to avoid the error. In the future, we will most likely make this a warning since it is not an inconsistency in every case. Thanks for pointing out the issue and asking all the right questions. 😄 Closing this issue for now. Please re-open if you have further concerns.
