Here is a summary of the discussion on custom response parsing (#108, #125, and others):
In version 1 of the HTTP Client Module, the `override-media-type` option was available (inspired by the `override-content-type` option in XProc). It could be used to overwrite the `Content-Type` header of a response.
In practice, the approach turned out to be fairly flexible, but it had some shortcomings: it did not allow for fine-grained processing of multipart bodies, and it was not intuitive enough for all users.
The following alternatives have been proposed in the scope of version 2 of the spec:
`parse-response` (boolean)
- Original draft: https://github.com/expath/expath-cg/blob/1da836628bbdf831fcfc1e4ad9dc487d05e7c663/specs/http-client-2/index.html
- Description: Parsing of the response body can be disabled via the `parse-response` option. All bodies of single and multipart responses will be returned as binary items of type `xs:base64Binary`, and the values can be processed (stored, parsed, forwarded) in a second step.
Adam pointed out that the name may be misleading, so it’s named `parse-response-entity-body` in the current draft. My suggestion in #108 was to call it `parse-bodies`: Only responses are “parsed” (requests are serialized), and the plural form indicates that we may have multiple bodies in a response.
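The "second step" mentioned above could look roughly like the following. This is a minimal sketch that does not call the HTTP Client Module at all: `$body` is a locally built stand-in for a body returned as `xs:base64Binary`, and the payload being UTF-8 JSON is an assumption. It uses `bin:encode-string` and `bin:decode-string` from the EXPath Binary Module; the plain namespace declaration assumes a processor that ships that module built in (others may need an `import module` instead).

```xquery
xquery version "3.1";
declare namespace bin = "http://expath.org/ns/binary";

(: Stand-in for a body that was returned unparsed as xs:base64Binary.
   Assumption: the payload is UTF-8 encoded JSON. :)
let $body as xs:base64Binary := bin:encode-string('{ "id": 123 }', 'UTF-8')
(: Second step: decode the bytes explicitly, then parse the text. :)
let $text := bin:decode-string($body, 'UTF-8')
return fn:json-to-xml($text)
```

The point of the sketch is that decoding and parsing are ordinary function calls once the raw bytes are available, so the caller decides on the encoding and the target format.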
`parse-response` (enum)
Adam made a suggestion for extending the proposal in #108:
- `raw`. We don't have an equivalent option at the moment, but the idea is that the raw response from the server is returned, i.e. no parsing occurs, no status, no headers. This has applications for debugging and also for logging responses.
- `status`. This would be equivalent to `status-only: true()`.
- `headers`. This would be the equivalent to `parse-response-entity-body: false()`.
- `multipart-raw`. This would extract the headers of the response, and locate the multipart bodies, however this would present each multipart in a raw manner, i.e. no multipart headers would be parsed.
- `full`. This would be the default, and basically the same as the current `parse-response-entity-body: true()`.
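To make the stated equivalences easier to compare, they can be written down as a lookup. The function below is purely illustrative: `local:legacy-options` is a hypothetical helper, not part of any draft, and the option names are simply the ones quoted in the list above. `raw` and `multipart-raw` have no existing counterpart, so the sketch returns an empty map for them.

```xquery
xquery version "3.1";

(: Hypothetical helper: maps an enum value to the existing option it is
   described as equivalent to. Values without a counterpart yield an empty map. :)
declare function local:legacy-options($mode as xs:string) as map(*) {
  switch ($mode)
    case 'status'  return map { 'status-only': true() }
    case 'headers' return map { 'parse-response-entity-body': false() }
    case 'full'    return map { 'parse-response-entity-body': true() }
    case 'raw'
    case 'multipart-raw' return map { }
    default return fn:error(xs:QName('local:unknown-mode'), 'Unknown mode: ' || $mode)
};

local:legacy-options('headers')
```

Seen this way, three of the five values restate options that already exist, and only `raw` and `multipart-raw` would add new behaviour.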
`parse-response` (map)
In #125, the proposal was extended to a nested map (for further discussion, see #125 (comment)).
I decided to summarize the proposals as I believe that a plain and simple solution might lead to less confusion and may even be more flexible, because a user can always do post-processing in XQuery.
In my opinion, the major requirement for (non-implicit) response parsing is to be able to retrieve bodies (single part, multiple bodies) in their original representation. In #125, I proposed the following solution:
`parse` / `parse-bodies` (string)
Option | Description |
---|---|
`auto` | implicit parsing (default) |
`string` | return all bodies as strings |
`binary` | return all bodies as binaries |
`skip` | ignore response body |
I believe this approach would be sufficient to cover most challenges people will be confronted with (but, honestly, not all that we could envision):
- In most cases, people will use the default (`auto`).
- If the requested result cannot be converted to the implicit target format, or if a format other than the one resulting from the implicit conversion is required, the `string` option can be used for textual results. All bodies will be converted to strings, based on the encoding that is returned by the server (optionally) via the original `Content-Type` header and the `charset` option.
- The `binary` option is helpful…
  - if the result is not text,
  - if the string conversion fails,
  - if some bodies of a multipart response are textual and some are binary, or
  - if the result needs to be processed only as a simple stream.
- The `skip` option is used if only the headers of a result are required.
Some more thoughts on this simplified approach are listed in #125.
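The case "the string conversion fails" can also be handled by the caller once the bytes are available. The following is a sketch under assumptions, not proposed spec text: `local:string-or-binary` is a hypothetical helper, the bodies are built locally instead of being fetched, and it relies on `bin:decode-string` from the EXPath Binary Module raising an error for bytes that are invalid in the given encoding.

```xquery
xquery version "3.1";
declare namespace bin = "http://expath.org/ns/binary";

(: Hypothetical helper: returns the body as a string if it decodes cleanly,
   and the untouched binary item otherwise. :)
declare function local:string-or-binary(
  $body     as xs:base64Binary,
  $encoding as xs:string
) as item() {
  try {
    bin:decode-string($body, $encoding)
  } catch * {
    $body
  }
};

(: The first body is valid UTF-8; the second is a single 0xFF byte, which is not. :)
for $body in (bin:encode-string('plain text', 'UTF-8'), bin:hex('FF'))
return local:string-or-binary($body, 'UTF-8')
```

With a `binary` mode, this kind of per-body decision stays in user code, which is also what makes mixed textual and binary multipart responses manageable.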
Examples for using the approach:
(: return single JSON response as XML :)
http:get('http://json.db/doc123', map { 'parse': 'string' })?body
=> fn:json-to-xml()
(: store returned multipart bodies :)
for $part at $pos in http:get('http://multipart.db/data123', map { 'parse': 'binary' })?body
return file:write-binary($pos || '.bin', $part?body)
(: ignore response bodies :)
http:get('http://json.db/doc123', map { 'parse': 'skip' })
@adamretter: Maybe my thoughts are too plain and simple? Do you have more use cases in mind that we should consider? Looking forward to feedback!