Skip to content

Issue with url query string in the version of furl library used by unoconv extension's renderer #250

@rija

Description

@rija

What:

After launching an instance of MFR (configured for http provider) from a Docker image locally built from up-to-date code from github ;
when I tried to render a PPTX document by calling MFR as follow:
http://192.168.99.100:7778/render?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx

I got the following error from PDF.js:

PDF.js v1.0.1145 (build: 76a24d8)
Message: Unexpected server response (400) while retrieving PDF "http://192.168.99.100:7778/export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf".

The url is clearly malformed (?format instead of &format) and the log entry reflect that:

[2017-04-21 12:22:35,994][ERROR][tornado.application]: Uncaught exception GET /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf (192.168.99.1)
HTTPServerRequest(protocol='http', host='192.168.99.100:7778', method='GET', uri='/export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf', version='HTTP/1.1', remote_ip='192.168.99.1', headers={'Accept-Language': 'en-us', 'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.4.8 (KHTML, like Gecko) Version/10.0.3 Safari/602.4.8', 'Accept': '*/*', 'Referer': 'http://192.168.99.100:7778/assets/pdf/build/pdf.worker.min.js', 'Accept-Encoding': 'gzip, deflate', 'Host': '192.168.99.100:7778', 'Connection': 'keep-alive', 'Dnt': '1'})
Traceback (most recent call last):
 ...
  File "/code/mfr/server/handlers/export.py", line 27, in prepare
    self.format = self.request.query_arguments['format'][0].decode('utf-8')
KeyError: 'format'

Investigation:

The error in the log seems to indicate an error in the renderer code (that call the export handler) for unoconv extensions so I've surrounded the following line of code in mfr/extrensions/unoconv/render.py by logging of the url before and after adding the query string:

24 exported_url.args['format'] = self.map['format']

Expected outcome:

[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url before format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx
[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url after format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx&format=pdf

Actual outcome:

[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url before format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx
[2017-04-21 12:22:35,472][WARNING][mfr.extensions.unoconv.render]: url after format: /export?url=http://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx?format=pdf

Cause & Workaround:

I noticed in mfr/requirements.txt the following line:

furl==0.4.2

I've installed furl 0.5.7 on the docker container

pip3 install furl==0.5.7

and it fixed the issue:

[2017-04-21 13:34:14,158][WARNING][mfr.extensions.unoconv.render]: url before format: /export?url=https://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx
[2017-04-21 13:34:14,159][WARNING][mfr.extensions.unoconv.render]: url after format: /export?url=https://gigadb-bundles-test.s3.amazonaws.com/ComparingReplicates.pptx&format=pdf

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions