Event Hub body cannot be decompressed when a gzipped Event Hub message is used as the trigger
Actual behavior
I use an Event Hub binding as a trigger. The event content is compressed with gzip. The message body returned by azure.functions.EventHubEvent.get_body() on the input event cannot be decompressed.
Known workarounds
When I read the message using azure-eventhub, it can be decompressed.
Example Code:
Function TimerTrigger.py sends a gzipped string message to the Event Hub. Function EventHubTrigger.py is triggered by that message and reads its body. However, the received content differs from what was sent and cannot be gunzipped.
TimerTrigger.py
import datetime
import logging
import gzip
import io
import azure.functions as func
from azure.eventhub import EventHubClient, Sender, EventData

ADDRESS = "Removed"
USER = "Removed"
KEY = "Removed"

def main(mytimer: func.TimerRequest) -> None:
    utc_timestamp = datetime.datetime.utcnow().replace(
        tzinfo=datetime.timezone.utc).isoformat()
    if mytimer.past_due:
        logging.info('The timer is past due!')
    client = EventHubClient(ADDRESS, debug=False, username=USER, password=KEY)
    sender = client.add_sender(partition="0")
    client.run()
    message = gZipString(('Test Event @ ' + utc_timestamp).encode('utf-8'))
    sender.send(EventData(message))
    logging.info('TimerTrigger: sent message %s', message)

def gZipString(stringtoZip):
    out = io.BytesIO()
    with gzip.GzipFile(fileobj=out, mode="wb") as f:
        f.write(stringtoZip)
    return out.getvalue()
EventHubTrigger.py
import logging
import gzip
import io
import azure.functions as func

def main(event: func.EventHubEvent):
    logging.info('EventHubEvent: SN = %s, Partition = %s', event.sequence_number, event.partition_key)
    logging.info(' EventHubEvent: Body= %s', event.get_body())
    decompressed_data = ''
    try:
        decompressed_data = gunzip_bytes_obj(event.get_body())
    except Exception as error:
        logging.info('EventHubEvent: Uzip Error = %s', error)
        decompressed_data = 'Failed'
    logging.info(' EventHubEvent: Decompressed Data= %s', decompressed_data)

def gunzip_bytes_obj(bytes_obj):
    in_ = io.BytesIO()
    in_.write(bytes_obj)
    in_.seek(0)
    with gzip.GzipFile(fileobj=in_, mode='rb') as fo:
        gunzipped_bytes_obj = fo.read()
    return gunzipped_bytes_obj
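As a side note, both helpers can probably be written more compactly with the standard library's gzip.compress and gzip.decompress. The sketch below is not the code from the issue; the function names gzip_bytes and gunzip_bytes and the sample payload are made up for illustration. It only checks that a payload built this way round-trips locally, which helps separate a compression bug from a transport problem.

```python
import gzip


def gzip_bytes(data: bytes) -> bytes:
    """Compress raw bytes into a gzip container (hypothetical helper name)."""
    return gzip.compress(data)


def gunzip_bytes(data: bytes) -> bytes:
    """Decompress a gzip container back into raw bytes (hypothetical helper name)."""
    return gzip.decompress(data)


# Placeholder payload; the real function appends a UTC timestamp.
payload = "Test Event @ <timestamp>".encode("utf-8")
compressed = gzip_bytes(payload)

# A gzip stream always starts with the magic bytes 0x1f 0x8b.
assert compressed[:2] == b"\x1f\x8b"
# The local round trip returns the original payload unchanged.
assert gunzip_bytes(compressed) == payload
```

If the round trip succeeds locally but fails inside the trigger, the bytes are being altered somewhere between the sender and the function.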
Example Output:
Here is an example from the function logs. From the TimerTrigger function log:
TimerTrigger: sent message b'\x1f\x8b\x08\x00\xa1Q\xed\\\x02\xff\x0bI-.Qp-K\xcd+QpP020\xb4\xd450\xd55\xb2\x0814\xb522\xb020\xd03107\xb60\xd56\x00q\x00b|\x10\xfc-\x00\x00\x00'
From the EventHubTrigger function log:
EventHubEvent: Body= b'\x1f\xef\xbf\xbd\x08\x00\xef\xbf\xbdQ\xef\xbf\xbd\\\x02\xef\xbf\xbd\x0bI-.Qp-K\xef\xbf\xbd+QpP020\xef\xbf\xbd\xef\xbf\xbd50\xef\xbf\xbd5\xef\xbf\xbd\x0814\xef\xbf\xbd22\xef\xbf\xbd20\xef\xbf\xbd3107\xef\xbf\xbd0\xef\xbf\xbd6\x00q\x00b|\x10\xef\xbf\xbd-\x00\x00\x00'
EventHubEvent: Uzip Error = Not a gzipped file (b'\x1f\xef')
The received message body is different from what was sent. The '\xef\xbf\xbd' sequence was not in the original message. Could it come from a different encoding (e.g. Unicode)?
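For context, '\xef\xbf\xbd' is the UTF-8 encoding of U+FFFD, the Unicode replacement character. The sketch below assumes (it is not confirmed by the logs alone) that somewhere between the Event Hub and the function the binary payload is decoded as UTF-8 text with invalid bytes replaced, then re-encoded. Applying that to the first ten bytes of the sent message reproduces the first bytes of the received body from the log.

```python
# First ten bytes of the message, copied from the TimerTrigger log.
sent_prefix = b"\x1f\x8b\x08\x00\xa1Q\xed\\\x02\xff"

# Assumption: a lossy text round trip. Bytes that are not valid UTF-8
# are replaced with U+FFFD, which encodes to b"\xef\xbf\xbd".
received_prefix = sent_prefix.decode("utf-8", errors="replace").encode("utf-8")

# Matching prefix of the body, copied from the EventHubTrigger log.
logged_prefix = b"\x1f\xef\xbf\xbd\x08\x00\xef\xbf\xbdQ\xef\xbf\xbd\\\x02\xef\xbf\xbd"

assert "\ufffd".encode("utf-8") == b"\xef\xbf\xbd"
assert received_prefix == logged_prefix
```

Because every invalid byte collapses to the same replacement sequence, the original bytes cannot be recovered from the received body; the fix has to prevent the text decoding in the first place.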
Related information
azure-functions==1.0.0b4
azure-functions-worker==1.0.0b6
grpcio==1.20.1
grpcio-tools==1.20.1
protobuf==3.7.1
six==1.12.0
Top GitHub Comments
Hi @AEYWang, you should be able to change the function app configuration and code to unblock your scenario:
- Update function.json to add "dataType": "binary". This will transmit the raw gzipped bytes without applying any encoding to them.
- Change def main(event: func.EventHubEvent) to def main(event). You should still be able to access all the properties associated with the EventHubEvent, but this annotation change is needed to match up the function.json with the function definition.
I was able to make these changes and run your function successfully. Please try and feel free to circle back with the results.
Hi @AEYWang, the behavior that you are encountering is not fully documented. When you specify binary in the function.json, we try to match it against the annotation that you have provided (in your case EventHubEvent). If you requested a particular dataType, this does not match up with the annotation of EventHubEvent and we would error out. This is our way of ensuring that users do not specify inconsistent settings. However, the only way to request raw bytes is to use the dataType so the annotation needs to be removed to avoid the error. In the future, we will most likely make this a warning since it is not an inconsistency in every case. Thanks for pointing out the issue and asking all the right questions. 😄 Closing this issue for now. Please re-open if you have further concerns.