Coverage for /pythoncovmergedfiles/medio/medio/usr/local/lib/python3.8/site-packages/botocore/parsers.py: 22%
1# Copyright 2014 Amazon.com, Inc. or its affiliates. All Rights Reserved.
2#
3# Licensed under the Apache License, Version 2.0 (the "License"). You
4# may not use this file except in compliance with the License. A copy of
5# the License is located at
6#
7# http://aws.amazon.com/apache2.0/
8#
9# or in the "license" file accompanying this file. This file is
10# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
11# ANY KIND, either express or implied. See the License for the specific
12# language governing permissions and limitations under the License.
13"""Response parsers for the various protocol types.
15The module contains classes that can take an HTTP response, and given
16an output shape, parse the response into a dict according to the
17rules in the output shape.
19There are many similarities amongst the different protocols with regard
20to response parsing, and the code is structured in a way to avoid
21code duplication when possible. The diagram below shows the
22inheritance hierarchy of the response classes.
24::
28                                 +--------------+
29                                 |ResponseParser|
30                                 +--------------+
31                                    ^    ^    ^
32               +--------------------+    |    +-------------------+
33               |                         |                        |
34    +----------+----------+       +------+-------+        +-------+------+
35    |BaseXMLResponseParser|       |BaseRestParser|        |BaseJSONParser|
36    +---------------------+       +--------------+        +--------------+
37              ^         ^          ^           ^           ^        ^
38              |         |          |           |           |        |
39              |         |          |           |           |        |
40              |        ++----------+-+       +-+-----------++       |
41              |        |RestXMLParser|       |RestJSONParser|       |
42        +-----+-----+  +-------------+       +--------------+  +----+-----+
43        |QueryParser|                                          |JSONParser|
44        +-----------+                                          +----------+
47The diagram above shows that there is a base class, ``ResponseParser`` that
48contains logic that is similar amongst all the different protocols (``query``,
49``json``, ``rest-json``, ``rest-xml``). Amongst the various services there
50is shared logic that can be grouped several ways:
52* The ``query`` and ``rest-xml`` protocols both have XML bodies that are parsed in the
53 same way.
54* The ``json`` and ``rest-json`` protocols both have JSON bodies that are
55 parsed in the same way.
56* The ``rest-json`` and ``rest-xml`` protocols have additional attributes
57 besides body parameters that are parsed the same (headers, query string,
58 status code).
60This is reflected in the class diagram above. The ``BaseXMLResponseParser``
61and the ``BaseJSONParser`` contain logic for parsing the XML/JSON body,
62and the ``BaseRestParser`` contains logic for parsing out attributes that
63come from other parts of the HTTP response. Classes like the
64``RestXMLParser`` inherit from the ``BaseXMLResponseParser`` to get the
65XML body parsing logic and the ``BaseRestParser`` to get the HTTP
66header/status code/query string parsing.
68Additionally, there are event stream parsers that are used by the other parsers
69to wrap streaming bodies that represent a stream of events. The
70``BaseEventStreamParser`` extends ``ResponseParser`` and defines the logic for
71parsing values from the headers and payload of a message from the underlying
72binary encoding protocol. Currently, event streams support parsing bodies
73encoded as JSON and XML through the following hierarchy.
76                                 +--------------+
77                                 |ResponseParser|
78                                 +--------------+
79                                   ^    ^    ^
80              +--------------------+    |    +------------------+
81              |                         |                       |
82   +----------+----------+   +----------+----------+    +-------+------+
83   |BaseXMLResponseParser|   |BaseEventStreamParser|    |BaseJSONParser|
84   +---------------------+   +---------------------+    +--------------+
85                     ^                ^        ^                 ^
86                     |                |        |                 |
87                     |                |        |                 |
88                   +-+----------------+-+    +-+-----------------+-+
89                   |EventStreamXMLParser|    |EventStreamJSONParser|
90                   +--------------------+    +---------------------+
92Return Values
93=============
95Each call to ``parse()`` returns a dict that has this form::
97 Standard Response
99 {
100 "ResponseMetadata": {"RequestId": <requestid>}
101 <response keys>
102 }
104 Error response
106 {
107 "ResponseMetadata": {"RequestId": <requestid>}
108 "Error": {
109 "Code": <string>,
110 "Message": <string>,
111 "Type": <string>,
112 <additional keys>
113 }
114 }
116"""
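The two return shapes described in the docstring above can be told apart by checking for an ``Error`` key. The following is a minimal sketch of that check; the helper name ``summarize_parsed`` and the sample dicts are assumptions made up for illustration and are not part of botocore.

```python
def summarize_parsed(parsed):
    """Describe a parsed response dict (hypothetical helper, not botocore API)."""
    request_id = parsed.get("ResponseMetadata", {}).get("RequestId")
    error = parsed.get("Error")
    if error is not None:
        # Error responses carry Code/Message (and possibly Type) under "Error".
        code = error.get("Code", "")
        message = error.get("Message", "")
        return f"error {code}: {message} (request {request_id})"
    # Standard responses hold the modeled keys next to ResponseMetadata.
    return f"ok (request {request_id})"


# Made-up sample data following the two documented shapes.
standard = {"ResponseMetadata": {"RequestId": "abc"}, "TableNames": []}
failed = {
    "ResponseMetadata": {"RequestId": "def"},
    "Error": {"Code": "Throttling", "Message": "Rate exceeded", "Type": "Sender"},
}
```

Callers that only need to know whether a call failed can test ``"Error" in parsed`` directly; the helper just shows where each piece of information lives.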
118import base64
119import http.client
120import json
121import logging
122import re
124from botocore.compat import ETree, XMLParseError
125from botocore.eventstream import EventStream, NoInitialResponseError
126from botocore.utils import (
127 is_json_value_header,
128 lowercase_dict,
129 merge_dicts,
130 parse_timestamp,
131)
133LOG = logging.getLogger(__name__)
135DEFAULT_TIMESTAMP_PARSER = parse_timestamp
138class ResponseParserFactory:
139 def __init__(self):
140 self._defaults = {}
142 def set_parser_defaults(self, **kwargs):
143 """Set default arguments when a parser instance is created.
145 You can specify any kwargs that are allowed by a ResponseParser
146 class. There are currently two arguments:
148 * timestamp_parser - A callable that can parse a timestamp string
149 * blob_parser - A callable that can parse a blob type
151 """
152 self._defaults.update(kwargs)
154 def create_parser(self, protocol_name):
155 parser_cls = PROTOCOL_PARSERS[protocol_name]
156 return parser_cls(**self._defaults)
159def create_parser(protocol):
160 return ResponseParserFactory().create_parser(protocol)
163def _text_content(func):
164 # This decorator hides the difference between
165 # an XML node with text or a plain string. It's used
166 # to ensure that scalar processing operates only on text
167 # strings, which allows the same scalar handlers to be used
168 # for XML nodes from the body and HTTP headers.
169 def _get_text_content(self, shape, node_or_string):
170 if hasattr(node_or_string, 'text'):
171 text = node_or_string.text
172 if text is None:
173 # If an XML node is empty <foo></foo>,
174 # we want to parse that as an empty string,
175 # not as a null/None value.
176 text = ''
177 else:
178 text = node_or_string
179 return func(self, shape, text)
181 return _get_text_content
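To see what the decorator above buys, here is a simplified standalone sketch of the same idea using the standard library's ``xml.etree.ElementTree``. The names ``text_content``, ``handle_integer`` and ``handle_string`` are stand-ins for this example (plain functions rather than methods), so treat it as an illustration of the pattern and not as botocore's code.

```python
import xml.etree.ElementTree as ET


def text_content(func):
    """Call ``func`` with plain text, whether given an XML node or a string."""

    def wrapper(shape, node_or_string):
        if hasattr(node_or_string, "text"):
            text = node_or_string.text
            if text is None:
                # An empty element such as <n></n> becomes '' rather than None.
                text = ""
        else:
            text = node_or_string
        return func(shape, text)

    return wrapper


@text_content
def handle_integer(shape, text):
    return int(text)


@text_content
def handle_string(shape, text):
    return text


# The same handler accepts an XML node from a body or a raw header string.
from_body = handle_integer(None, ET.fromstring("<n>42</n>"))
from_header = handle_integer(None, "7")
```

Because the wrapper normalizes its input first, each scalar handler only needs to be written once for text.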
184class ResponseParserError(Exception):
185 pass
188class ResponseParser:
189 """Base class for response parsing.
191 This class represents the interface that all ResponseParsers for the
192 various protocols must implement.
194 This class will take an HTTP response and a model shape and parse the
195 HTTP response into a dictionary.
197 There is a single public method exposed: ``parse``. See the ``parse``
198 docstring for more info.
200 """
202 DEFAULT_ENCODING = 'utf-8'
203 EVENT_STREAM_PARSER_CLS = None
205 def __init__(self, timestamp_parser=None, blob_parser=None):
206 if timestamp_parser is None:
207 timestamp_parser = DEFAULT_TIMESTAMP_PARSER
208 self._timestamp_parser = timestamp_parser
209 if blob_parser is None:
210 blob_parser = self._default_blob_parser
211 self._blob_parser = blob_parser
212 self._event_stream_parser = None
213 if self.EVENT_STREAM_PARSER_CLS is not None:
214 self._event_stream_parser = self.EVENT_STREAM_PARSER_CLS(
215 timestamp_parser, blob_parser
216 )
218 def _default_blob_parser(self, value):
219 # Blobs are always returned as bytes type (this matters on python3).
220 # We don't decode this to a str because it's entirely possible that the
221 # blob contains binary data that actually can't be decoded.
222 return base64.b64decode(value)
224 def parse(self, response, shape):
225 """Parse the HTTP response given a shape.
227 :param response: The HTTP response dictionary. This is a dictionary
228 that represents the HTTP response. The dictionary must have the
229 following keys: ``body``, ``headers``, and ``status_code``.
231 :param shape: The model shape describing the expected output.
232 :return: Returns a dictionary representing the parsed response
233 described by the model. In addition to the members described by
234 the model, each response will also have a ``ResponseMetadata``
235 key whose value holds metadata about the response, including at
236 least the ``RequestId`` and ``HTTPStatusCode`` keys. Some
237 responses may populate additional keys, but ``RequestId`` will
238 always be present.
240 """
241 LOG.debug('Response headers: %r', response['headers'])
242 LOG.debug('Response body:\n%r', response['body'])
243 if response['status_code'] >= 301:
244 if self._is_generic_error_response(response):
245 parsed = self._do_generic_error_parse(response)
246 elif self._is_modeled_error_shape(shape):
247 parsed = self._do_modeled_error_parse(response, shape)
248 # We don't want to decorate the modeled fields with metadata
249 return parsed
250 else:
251 parsed = self._do_error_parse(response, shape)
252 else:
253 parsed = self._do_parse(response, shape)
255 # We don't want to decorate event stream responses with metadata
256 if shape and shape.serialization.get('eventstream'):
257 return parsed
259 # Add ResponseMetadata if it doesn't exist and inject the HTTP
260 # status code and headers from the response.
261 if isinstance(parsed, dict):
262 response_metadata = parsed.get('ResponseMetadata', {})
263 response_metadata['HTTPStatusCode'] = response['status_code']
264 # Ensure that the http header keys are all lower cased. Older
265 # versions of urllib3 (< 1.11) would unintentionally do this for us
266 # (see urllib3#633). We need to do this conversion manually now.
267 headers = response['headers']
268 response_metadata['HTTPHeaders'] = lowercase_dict(headers)
269 parsed['ResponseMetadata'] = response_metadata
270 self._add_checksum_response_metadata(response, response_metadata)
271 return parsed
273 def _add_checksum_response_metadata(self, response, response_metadata):
274 checksum_context = response.get('context', {}).get('checksum', {})
275 algorithm = checksum_context.get('response_algorithm')
276 if algorithm:
277 response_metadata['ChecksumAlgorithm'] = algorithm
279 def _is_modeled_error_shape(self, shape):
280 return shape is not None and shape.metadata.get('exception', False)
282 def _is_generic_error_response(self, response):
283 # There are times when a service will respond with a generic
284 # error response such as:
285 # '<html><body><b>Http/1.1 Service Unavailable</b></body></html>'
286 #
287 # This can also happen if you're going through a proxy.
288 # In this case the protocol specific _do_error_parse will either
289 # fail to parse the response (in the best case) or silently succeed
290 # and treat the HTML above as an XML response and return
291 # nonsensical parsed data.
292 # To prevent this case from happening we first need to check
293 # whether or not this response looks like the generic response.
294 if response['status_code'] >= 500:
295 if 'body' not in response or response['body'] is None:
296 return True
298 body = response['body'].strip()
299 return body.startswith(b'<html>') or not body
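The check above can be exercised in isolation. The sketch below restates it as a standalone function over a plain dict; the name ``looks_like_generic_error`` is an assumption for this example. Unlike the method above, which implicitly returns ``None`` for status codes below 500, the sketch returns ``False`` explicitly.

```python
def looks_like_generic_error(response):
    """Return True for a 5xx response whose body is missing, empty, or HTML."""
    if response["status_code"] < 500:
        return False
    body = response.get("body")
    if body is None:
        return True
    body = body.strip()
    # An empty body or one starting with <html> is not a protocol error document.
    return body.startswith(b"<html>") or not body


# Made-up responses for illustration.
proxy_error = {"status_code": 503, "body": b"<html><body>Unavailable</body></html>"}
modeled_error = {"status_code": 500, "body": b"<ErrorResponse></ErrorResponse>"}
client_error = {"status_code": 404, "body": b"<html></html>"}
```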
301 def _do_generic_error_parse(self, response):
302 # There's not really much we can do when we get a generic
303 # html response.
304 LOG.debug(
305 "Received a non protocol specific error response from the "
306 "service, unable to populate error code and message."
307 )
308 return {
309 'Error': {
310 'Code': str(response['status_code']),
311 'Message': http.client.responses.get(
312 response['status_code'], ''
313 ),
314 },
315 'ResponseMetadata': {},
316 }
318 def _do_parse(self, response, shape):
319 raise NotImplementedError(f"{self.__class__.__name__}._do_parse")
321 def _do_error_parse(self, response, shape):
322 raise NotImplementedError(f"{self.__class__.__name__}._do_error_parse")
324 def _do_modeled_error_parse(self, response, shape):
325 raise NotImplementedError(
326 f"{self.__class__.__name__}._do_modeled_error_parse"
327 )
329 def _parse_shape(self, shape, node):
330 handler = getattr(
331 self, f'_handle_{shape.type_name}', self._default_handle
332 )
333 return handler(shape, node)
335 def _handle_list(self, shape, node):
336 # Enough implementations share list serialization that it's moved
337 # up here in the base class.
338 parsed = []
339 member_shape = shape.member
340 for item in node:
341 parsed.append(self._parse_shape(member_shape, item))
342 return parsed
344 def _default_handle(self, shape, value):
345 return value
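The handler lookup used by ``_parse_shape`` is a name-based dispatch: the shape's ``type_name`` selects a ``_handle_<type>`` method, with a pass-through default. A toy version, assuming stand-in shape objects built with ``types.SimpleNamespace`` (real shapes come from the service model) and a made-up class name ``MiniParser``, could look like this:

```python
from types import SimpleNamespace


class MiniParser:
    """Toy parser showing handler lookup by shape type name."""

    def parse_shape(self, shape, node):
        # Look up _handle_<type_name>; fall back to the default handler.
        handler = getattr(self, f"_handle_{shape.type_name}", self._default_handle)
        return handler(shape, node)

    def _handle_list(self, shape, node):
        # Each item is parsed according to the list's member shape.
        return [self.parse_shape(shape.member, item) for item in node]

    def _handle_integer(self, shape, node):
        return int(node)

    def _default_handle(self, shape, value):
        # Types without a dedicated handler pass through unchanged.
        return value


# Stand-in shapes; only the attributes used above are provided.
integer_shape = SimpleNamespace(type_name="integer")
list_shape = SimpleNamespace(type_name="list", member=integer_shape)
string_shape = SimpleNamespace(type_name="string")

parser = MiniParser()
numbers = parser.parse_shape(list_shape, ["1", "2", "3"])
```

Adding support for a new type then only requires defining another ``_handle_<type>`` method on the class or a subclass.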
347 def _create_event_stream(self, response, shape):
348 parser = self._event_stream_parser
349 name = response['context'].get('operation_name')
350 return EventStream(response['body'], shape, parser, name)
352 def _get_first_key(self, value):
353 return list(value)[0]
355 def _has_unknown_tagged_union_member(self, shape, value):
356 if shape.is_tagged_union:
357 cleaned_value = value.copy()
358 cleaned_value.pop("__type", None)
359 if len(cleaned_value) != 1:
360 error_msg = (
361 "Invalid service response: %s must have one and only "
362 "one member set."
363 )
364 raise ResponseParserError(error_msg % shape.name)
365 tag = self._get_first_key(cleaned_value)
366 if tag not in shape.members:
367 msg = (
368 "Received a tagged union response with member "
369 "unknown to client: %s. Please upgrade SDK for full "
370 "response support."
371 )
372 LOG.info(msg % tag)
373 return True
374 return False
376 def _handle_unknown_tagged_union_member(self, tag):
377 return {'SDK_UNKNOWN_MEMBER': {'name': tag}}
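The tagged-union handling above can be summarized in a small standalone sketch. It assumes a plain set of known member names in place of a shape object, and raises ``ValueError`` where the class above raises ``ResponseParserError``; the function name ``classify_union`` is made up for this example.

```python
def classify_union(known_members, value):
    """Return an SDK_UNKNOWN_MEMBER marker for unknown tags, else None."""
    cleaned = dict(value)
    # "__type" is bookkeeping and does not count as a union member.
    cleaned.pop("__type", None)
    if len(cleaned) != 1:
        raise ValueError("a tagged union must have one and only one member set")
    tag = next(iter(cleaned))
    if tag not in known_members:
        return {"SDK_UNKNOWN_MEMBER": {"name": tag}}
    return None


known = {"text", "image"}
unknown_result = classify_union(known, {"__type": "Content", "video": {}})
known_result = classify_union(known, {"text": "hello"})
```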
380class BaseXMLResponseParser(ResponseParser):
381 def __init__(self, timestamp_parser=None, blob_parser=None):
382 super().__init__(timestamp_parser, blob_parser)
383 self._namespace_re = re.compile('{.*}')
385 def _handle_map(self, shape, node):
386 parsed = {}
387 key_shape = shape.key
388 value_shape = shape.value
389 key_location_name = key_shape.serialization.get('name') or 'key'
390 value_location_name = value_shape.serialization.get('name') or 'value'
391 if shape.serialization.get('flattened') and not isinstance(node, list):
392 node = [node]
393 for keyval_node in node:
394 for single_pair in keyval_node:
395 # Within each <entry> there's a <key> and a <value>
396 tag_name = self._node_tag(single_pair)
397 if tag_name == key_location_name:
398 key_name = self._parse_shape(key_shape, single_pair)
399 elif tag_name == value_location_name:
400 val_name = self._parse_shape(value_shape, single_pair)
401 else:
402 raise ResponseParserError(f"Unknown tag: {tag_name}")
403 parsed[key_name] = val_name
404 return parsed
406 def _node_tag(self, node):
407 return self._namespace_re.sub('', node.tag)
409 def _handle_list(self, shape, node):
410 # When we use _build_name_to_xml_node, repeated elements are aggregated
411 # into a list. However, we can't tell the difference between a scalar
412 # value and a single element flattened list. So before calling the
413 # real _handle_list, we know that "node" should actually be a list if
414 # it's flattened, and if it's not, then we make it a one element list.
415 if shape.serialization.get('flattened') and not isinstance(node, list):
416 node = [node]
417 return super()._handle_list(shape, node)
419 def _handle_structure(self, shape, node):
420 parsed = {}
421 members = shape.members
422 if shape.metadata.get('exception', False):
423 node = self._get_error_root(node)
424 xml_dict = self._build_name_to_xml_node(node)
425 if self._has_unknown_tagged_union_member(shape, xml_dict):
426 tag = self._get_first_key(xml_dict)
427 return self._handle_unknown_tagged_union_member(tag)
428 for member_name in members:
429 member_shape = members[member_name]
430 if (
431 'location' in member_shape.serialization
432 or member_shape.serialization.get('eventheader')
433 ):
434 # All members with locations have already been handled,
435 # so we don't need to parse these members.
436 continue
437 xml_name = self._member_key_name(member_shape, member_name)
438 member_node = xml_dict.get(xml_name)
439 if member_node is not None:
440 parsed[member_name] = self._parse_shape(
441 member_shape, member_node
442 )
443 elif member_shape.serialization.get('xmlAttribute'):
444 attribs = {}
445 location_name = member_shape.serialization['name']
446 for key, value in node.attrib.items():
447 new_key = self._namespace_re.sub(
448 location_name.split(':')[0] + ':', key
449 )
450 attribs[new_key] = value
451 if location_name in attribs:
452 parsed[member_name] = attribs[location_name]
453 return parsed
455 def _get_error_root(self, original_root):
456 if self._node_tag(original_root) == 'ErrorResponse':
457 for child in original_root:
458 if self._node_tag(child) == 'Error':
459 return child
460 return original_root
462 def _member_key_name(self, shape, member_name):
463 # This method is needed because we have to special case flattened list
464 # with a serialization name. If this is the case we use the
465 # locationName from the list's member shape as the key name for the
466 # surrounding structure.
467 if shape.type_name == 'list' and shape.serialization.get('flattened'):
468 list_member_serialized_name = shape.member.serialization.get(
469 'name'
470 )
471 if list_member_serialized_name is not None:
472 return list_member_serialized_name
473 serialized_name = shape.serialization.get('name')
474 if serialized_name is not None:
475 return serialized_name
476 return member_name
478 def _build_name_to_xml_node(self, parent_node):
479 # If the parent node is actually a list, we should not be trying
480 # to serialize it to a dictionary. Instead, build the mapping from
481 # the first element in the list.
482 if isinstance(parent_node, list):
483 return self._build_name_to_xml_node(parent_node[0])
484 xml_dict = {}
485 for item in parent_node:
486 key = self._node_tag(item)
487 if key in xml_dict:
488 # If the key already exists, the most natural
489 # way to handle this is to aggregate repeated
490 # keys into a single list.
491 # <foo>1</foo><foo>2</foo> -> {'foo': [Node(1), Node(2)]}
492 if isinstance(xml_dict[key], list):
493 xml_dict[key].append(item)
494 else:
495 # Convert from a scalar to a list.
496 xml_dict[key] = [xml_dict[key], item]
497 else:
498 xml_dict[key] = item
499 return xml_dict
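The aggregation of repeated tags described in the comments above can be tried out with the standard library alone. This is a simplified sketch: the function name ``name_to_nodes`` is an assumption, and it skips the namespace stripping and the list-input case that the method above handles.

```python
import xml.etree.ElementTree as ET


def name_to_nodes(parent):
    """Map child tag names to nodes, collecting repeated tags into lists."""
    mapping = {}
    for child in parent:
        key = child.tag
        if key in mapping:
            existing = mapping[key]
            if isinstance(existing, list):
                existing.append(child)
            else:
                # Second occurrence: convert the single node into a list.
                mapping[key] = [existing, child]
        else:
            mapping[key] = child
    return mapping


root = ET.fromstring("<r><foo>1</foo><foo>2</foo><bar>3</bar></r>")
mapping = name_to_nodes(root)
foo_texts = [node.text for node in mapping["foo"]]
bar_text = mapping["bar"].text
```

Note that a tag appearing once maps to a single node, which is why the flattened-list handlers wrap a lone node in a list before iterating.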
501 def _parse_xml_string_to_dom(self, xml_string):
502 try:
503 parser = ETree.XMLParser(
504 target=ETree.TreeBuilder(), encoding=self.DEFAULT_ENCODING
505 )
506 parser.feed(xml_string)
507 root = parser.close()
508 except XMLParseError as e:
509 raise ResponseParserError(
510 f"Unable to parse response ({e}), "
511 f"invalid XML received. Further retries may succeed:\n{xml_string}"
512 )
513 return root
515 def _replace_nodes(self, parsed):
516 for key, value in parsed.items():
517 if list(value):
518 sub_dict = self._build_name_to_xml_node(value)
519 parsed[key] = self._replace_nodes(sub_dict)
520 else:
521 parsed[key] = value.text
522 return parsed
524 @_text_content
525 def _handle_boolean(self, shape, text):
526 if text == 'true':
527 return True
528 else:
529 return False
531 @_text_content
532 def _handle_float(self, shape, text):
533 return float(text)
535 @_text_content
536 def _handle_timestamp(self, shape, text):
537 return self._timestamp_parser(text)
539 @_text_content
540 def _handle_integer(self, shape, text):
541 return int(text)
543 @_text_content
544 def _handle_string(self, shape, text):
545 return text
547 @_text_content
548 def _handle_blob(self, shape, text):
549 return self._blob_parser(text)
551 _handle_character = _handle_string
552 _handle_double = _handle_float
553 _handle_long = _handle_integer
556class QueryParser(BaseXMLResponseParser):
557 def _do_error_parse(self, response, shape):
558 xml_contents = response['body']
559 root = self._parse_xml_string_to_dom(xml_contents)
560 parsed = self._build_name_to_xml_node(root)
561 self._replace_nodes(parsed)
562 # Once we've converted xml->dict, we need to make one or two
563 # more adjustments to extract nested errors and to be consistent
564 # with ResponseMetadata for non-error responses:
565 # 1. {"Errors": {"Error": {...}}} -> {"Error": {...}}
566 # 2. {"RequestId": "id"} -> {"ResponseMetadata": {"RequestId": "id"}}
567 if 'Errors' in parsed:
568 parsed.update(parsed.pop('Errors'))
569 if 'RequestId' in parsed:
570 parsed['ResponseMetadata'] = {'RequestId': parsed.pop('RequestId')}
571 return parsed
573 def _do_modeled_error_parse(self, response, shape):
574 return self._parse_body_as_xml(response, shape, inject_metadata=False)
576 def _do_parse(self, response, shape):
577 return self._parse_body_as_xml(response, shape, inject_metadata=True)
579 def _parse_body_as_xml(self, response, shape, inject_metadata=True):
580 xml_contents = response['body']
581 root = self._parse_xml_string_to_dom(xml_contents)
582 parsed = {}
583 if shape is not None:
584 start = root
585 if 'resultWrapper' in shape.serialization:
586 start = self._find_result_wrapped_shape(
587 shape.serialization['resultWrapper'], root
588 )
589 parsed = self._parse_shape(shape, start)
590 if inject_metadata:
591 self._inject_response_metadata(root, parsed)
592 return parsed
594 def _find_result_wrapped_shape(self, element_name, xml_root_node):
595 mapping = self._build_name_to_xml_node(xml_root_node)
596 return mapping[element_name]
598 def _inject_response_metadata(self, node, inject_into):
599 mapping = self._build_name_to_xml_node(node)
600 child_node = mapping.get('ResponseMetadata')
601 if child_node is not None:
602 sub_mapping = self._build_name_to_xml_node(child_node)
603 for key, value in sub_mapping.items():
604 sub_mapping[key] = value.text
605 inject_into['ResponseMetadata'] = sub_mapping
608class EC2QueryParser(QueryParser):
609 def _inject_response_metadata(self, node, inject_into):
610 mapping = self._build_name_to_xml_node(node)
611 child_node = mapping.get('requestId')
612 if child_node is not None:
613 inject_into['ResponseMetadata'] = {'RequestId': child_node.text}
615 def _do_error_parse(self, response, shape):
616 # EC2 errors look like:
617 # <Response>
618 # <Errors>
619 # <Error>
620 # <Code>InvalidInstanceID.Malformed</Code>
621 # <Message>Invalid id: "1343124"</Message>
622 # </Error>
623 # </Errors>
624 # <RequestID>12345</RequestID>
625 # </Response>
626 # This is different from QueryParser in that it's RequestID,
627 # not RequestId
628 original = super()._do_error_parse(response, shape)
629 if 'RequestID' in original:
630 original['ResponseMetadata'] = {
631 'RequestId': original.pop('RequestID')
632 }
633 return original
635 def _get_error_root(self, original_root):
636 for child in original_root:
637 if self._node_tag(child) == 'Errors':
638 for errors_child in child:
639 if self._node_tag(errors_child) == 'Error':
640 return errors_child
641 return original_root
644class BaseJSONParser(ResponseParser):
645 def _handle_structure(self, shape, value):
646 final_parsed = {}
647 if shape.is_document_type:
648 final_parsed = value
649 else:
650 member_shapes = shape.members
651 if value is None:
652 # If the value comes across the wire as "null" (None in python),
653 # we should be returning this unchanged, instead of as an
654 # empty dict.
655 return None
656 final_parsed = {}
657 if self._has_unknown_tagged_union_member(shape, value):
658 tag = self._get_first_key(value)
659 return self._handle_unknown_tagged_union_member(tag)
660 for member_name in member_shapes:
661 member_shape = member_shapes[member_name]
662 json_name = member_shape.serialization.get('name', member_name)
663 raw_value = value.get(json_name)
664 if raw_value is not None:
665 final_parsed[member_name] = self._parse_shape(
666 member_shapes[member_name], raw_value
667 )
668 return final_parsed
670 def _handle_map(self, shape, value):
671 parsed = {}
672 key_shape = shape.key
673 value_shape = shape.value
674 for key, value in value.items():
675 actual_key = self._parse_shape(key_shape, key)
676 actual_value = self._parse_shape(value_shape, value)
677 parsed[actual_key] = actual_value
678 return parsed
680 def _handle_blob(self, shape, value):
681 return self._blob_parser(value)
683 def _handle_timestamp(self, shape, value):
684 return self._timestamp_parser(value)
686 def _do_error_parse(self, response, shape):
687 body = self._parse_body_as_json(response['body'])
688 error = {"Error": {"Message": '', "Code": ''}, "ResponseMetadata": {}}
689 headers = response['headers']
690 # Error responses can have slightly different structures for json.
691 # The basic structure is:
692 #
693 # {"__type":"ConnectClientException",
694 # "message":"The error message."}
696 # The error message can either come in the 'message' or 'Message' key
697 # so we need to check for both.
698 error['Error']['Message'] = body.get(
699 'message', body.get('Message', '')
700 )
701 # if the message did not contain an error code
702 # include the response status code
703 response_code = response.get('status_code')
705 code = body.get('__type', response_code and str(response_code))
706 if code is not None:
707 # code has a couple forms as well:
708 # * "com.aws.dynamodb.vAPI#ProvisionedThroughputExceededException"
709 # * "ResourceNotFoundException"
710 if '#' in code:
711 code = code.rsplit('#', 1)[1]
712 if 'x-amzn-query-error' in headers:
713 code = self._do_query_compatible_error_parse(
714 code, headers, error
715 )
716 error['Error']['Code'] = code
717 self._inject_response_metadata(error, response['headers'])
718 return error
720 def _do_query_compatible_error_parse(self, code, headers, error):
721 """
722 Error response may contain an x-amzn-query-error header to translate
723 error codes from former `query` services into `json`. We use this to
724 do our lookup in the errorfactory for modeled errors.
725 """
726 query_error = headers['x-amzn-query-error']
727 query_error_components = query_error.split(';')
729 if len(query_error_components) == 2 and query_error_components[0]:
730 error['Error']['QueryErrorCode'] = code
731 error['Error']['Type'] = query_error_components[1]
732 return query_error_components[0]
733 return code
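The header handling above splits a ``Code;Type`` value and swaps the error code. A standalone sketch of that step follows; the function name ``apply_query_error_header`` and the sample header value are assumptions chosen for illustration.

```python
def apply_query_error_header(code, header_value, error):
    """Return the code to report, updating ``error`` when the header is usable."""
    parts = header_value.split(";")
    if len(parts) == 2 and parts[0]:
        # Keep the JSON-protocol code and record the type from the header.
        error["QueryErrorCode"] = code
        error["Type"] = parts[1]
        return parts[0]
    # Malformed or empty header: leave the original code untouched.
    return code


# Illustrative header value in the "<code>;<type>" form.
error = {"Message": "", "Code": ""}
new_code = apply_query_error_header(
    "QueueDoesNotExist", "NonExistentQueue;Sender", error
)

untouched = {}
same_code = apply_query_error_header("QueueDoesNotExist", ";Sender", untouched)
```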
735 def _inject_response_metadata(self, parsed, headers):
736 if 'x-amzn-requestid' in headers:
737 parsed.setdefault('ResponseMetadata', {})['RequestId'] = headers[
738 'x-amzn-requestid'
739 ]
741 def _parse_body_as_json(self, body_contents):
742 if not body_contents:
743 return {}
744 body = body_contents.decode(self.DEFAULT_ENCODING)
745 try:
746 original_parsed = json.loads(body)
747 return original_parsed
748 except ValueError:
749 # if the body cannot be parsed, include
750 # the literal string as the message
751 return {'message': body}
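The fallback behaviour of the JSON body parsing above is easy to check on its own. Below is a minimal standalone sketch using only the standard library; the name ``parse_json_body`` is an assumption for this example.

```python
import json


def parse_json_body(body_bytes, encoding="utf-8"):
    """Decode and parse a JSON body, falling back to a message dict."""
    if not body_bytes:
        return {}
    text = body_bytes.decode(encoding)
    try:
        return json.loads(text)
    except ValueError:
        # json.JSONDecodeError subclasses ValueError; keep the raw text
        # so an error message is still available to the caller.
        return {"message": text}


empty = parse_json_body(b"")
valid = parse_json_body(b'{"__type": "ResourceNotFoundException"}')
not_json = parse_json_body(b"<html>Bad Gateway</html>")
```

Returning the raw text under ``message`` means the error-parsing path can still surface something readable when a proxy or load balancer replies with non-JSON content.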
754class BaseEventStreamParser(ResponseParser):
755 def _do_parse(self, response, shape):
756 final_parsed = {}
757 if shape.serialization.get('eventstream'):
758 event_type = response['headers'].get(':event-type')
759 event_shape = shape.members.get(event_type)
760 if event_shape:
761 final_parsed[event_type] = self._do_parse(
762 response, event_shape
763 )
764 else:
765 self._parse_non_payload_attrs(
766 response, shape, shape.members, final_parsed
767 )
768 self._parse_payload(response, shape, shape.members, final_parsed)
769 return final_parsed
771 def _do_error_parse(self, response, shape):
772 exception_type = response['headers'].get(':exception-type')
773 exception_shape = shape.members.get(exception_type)
774 if exception_shape is not None:
775 original_parsed = self._initial_body_parse(response['body'])
776 body = self._parse_shape(exception_shape, original_parsed)
777 error = {
778 'Error': {
779 'Code': exception_type,
780 'Message': body.get('Message', body.get('message', '')),
781 }
782 }
783 else:
784 error = {
785 'Error': {
786 'Code': response['headers'].get(':error-code', ''),
787 'Message': response['headers'].get(':error-message', ''),
788 }
789 }
790 return error
792 def _parse_payload(self, response, shape, member_shapes, final_parsed):
793 if shape.serialization.get('event'):
794 for name in member_shapes:
795 member_shape = member_shapes[name]
796 if member_shape.serialization.get('eventpayload'):
797 body = response['body']
798 if member_shape.type_name == 'blob':
799 parsed_body = body
800 elif member_shape.type_name == 'string':
801 parsed_body = body.decode(self.DEFAULT_ENCODING)
802 else:
803 raw_parse = self._initial_body_parse(body)
804 parsed_body = self._parse_shape(
805 member_shape, raw_parse
806 )
807 final_parsed[name] = parsed_body
808 return
809 # If we didn't find an explicit payload, use the current shape
810 original_parsed = self._initial_body_parse(response['body'])
811 body_parsed = self._parse_shape(shape, original_parsed)
812 final_parsed.update(body_parsed)
814 def _parse_non_payload_attrs(
815 self, response, shape, member_shapes, final_parsed
816 ):
817 headers = response['headers']
818 for name in member_shapes:
819 member_shape = member_shapes[name]
820 if member_shape.serialization.get('eventheader'):
821 if name in headers:
822 value = headers[name]
823 if member_shape.type_name == 'timestamp':
824 # Event stream timestamps are in milliseconds, so we
825 # divide by 1000 to convert to seconds.
826 value = self._timestamp_parser(value / 1000.0)
827 final_parsed[name] = value
829 def _initial_body_parse(self, body_contents):
830 # This method should do the initial xml/json parsing of the
831 # body. We still need to walk the parsed body in order
832 # to convert types, but this method will do the first round
833 # of parsing.
834 raise NotImplementedError("_initial_body_parse")
837class EventStreamJSONParser(BaseEventStreamParser, BaseJSONParser):
838 def _initial_body_parse(self, body_contents):
839 return self._parse_body_as_json(body_contents)
842class EventStreamXMLParser(BaseEventStreamParser, BaseXMLResponseParser):
843 def _initial_body_parse(self, xml_string):
844 if not xml_string:
845 return ETree.Element('')
846 return self._parse_xml_string_to_dom(xml_string)
849class JSONParser(BaseJSONParser):
850 EVENT_STREAM_PARSER_CLS = EventStreamJSONParser
852 """Response parser for the "json" protocol."""
854 def _do_parse(self, response, shape):
855 parsed = {}
856 if shape is not None:
857 event_name = shape.event_stream_name
858 if event_name:
859 parsed = self._handle_event_stream(response, shape, event_name)
860 else:
861 parsed = self._handle_json_body(response['body'], shape)
862 self._inject_response_metadata(parsed, response['headers'])
863 return parsed
865 def _do_modeled_error_parse(self, response, shape):
866 return self._handle_json_body(response['body'], shape)
868 def _handle_event_stream(self, response, shape, event_name):
869 event_stream_shape = shape.members[event_name]
870 event_stream = self._create_event_stream(response, event_stream_shape)
871 try:
872 event = event_stream.get_initial_response()
873 except NoInitialResponseError:
874 error_msg = 'First event was not of type initial-response'
875 raise ResponseParserError(error_msg)
876 parsed = self._handle_json_body(event.payload, shape)
877 parsed[event_name] = event_stream
878 return parsed
880 def _handle_json_body(self, raw_body, shape):
881 # The json.loads() gives us the primitive JSON types,
882 # but we need to traverse the parsed JSON data to convert
883 # to richer types (blobs, timestamps, etc.).
884 parsed_json = self._parse_body_as_json(raw_body)
885 return self._parse_shape(shape, parsed_json)
888class BaseRestParser(ResponseParser):
889 def _do_parse(self, response, shape):
890 final_parsed = {}
891 final_parsed['ResponseMetadata'] = self._populate_response_metadata(
892 response
893 )
894 self._add_modeled_parse(response, shape, final_parsed)
895 return final_parsed
897 def _add_modeled_parse(self, response, shape, final_parsed):
898 if shape is None:
899 return final_parsed
900 member_shapes = shape.members
901 self._parse_non_payload_attrs(
902 response, shape, member_shapes, final_parsed
903 )
904 self._parse_payload(response, shape, member_shapes, final_parsed)
906 def _do_modeled_error_parse(self, response, shape):
907 final_parsed = {}
908 self._add_modeled_parse(response, shape, final_parsed)
909 return final_parsed
911 def _populate_response_metadata(self, response):
912 metadata = {}
913 headers = response['headers']
914 if 'x-amzn-requestid' in headers:
915 metadata['RequestId'] = headers['x-amzn-requestid']
916 elif 'x-amz-request-id' in headers:
917 metadata['RequestId'] = headers['x-amz-request-id']
918 # HostId is what it's called whenever this value is returned
919 # in an XML response body, so to be consistent, we'll always
920 # call it HostId.
921 metadata['HostId'] = headers.get('x-amz-id-2', '')
922 return metadata
924 def _parse_payload(self, response, shape, member_shapes, final_parsed):
925 if 'payload' in shape.serialization:
926 # If a payload is specified in the output shape, then only that
927 # shape is used for the body payload.
928 payload_member_name = shape.serialization['payload']
929 body_shape = member_shapes[payload_member_name]
930 if body_shape.serialization.get('eventstream'):
931 body = self._create_event_stream(response, body_shape)
932 final_parsed[payload_member_name] = body
933 elif body_shape.type_name in ['string', 'blob']:
934 # This is a stream
935 body = response['body']
936 if isinstance(body, bytes):
937 body = body.decode(self.DEFAULT_ENCODING)
938 final_parsed[payload_member_name] = body
939 else:
940 original_parsed = self._initial_body_parse(response['body'])
941 final_parsed[payload_member_name] = self._parse_shape(
942 body_shape, original_parsed
943 )
944 else:
945 original_parsed = self._initial_body_parse(response['body'])
946 body_parsed = self._parse_shape(shape, original_parsed)
947 final_parsed.update(body_parsed)
949 def _parse_non_payload_attrs(
950 self, response, shape, member_shapes, final_parsed
951 ):
952 headers = response['headers']
953 for name in member_shapes:
954 member_shape = member_shapes[name]
955 location = member_shape.serialization.get('location')
956 if location is None:
957 continue
958 elif location == 'statusCode':
959 final_parsed[name] = self._parse_shape(
960 member_shape, response['status_code']
961 )
962 elif location == 'headers':
963 final_parsed[name] = self._parse_header_map(
964 member_shape, headers
965 )
966 elif location == 'header':
967 header_name = member_shape.serialization.get('name', name)
968 if header_name in headers:
969 final_parsed[name] = self._parse_shape(
970 member_shape, headers[header_name]
971 )
973 def _parse_header_map(self, shape, headers):
974 # Note that headers are case insensitive, so we .lower()
975 # all header names and header prefixes.
976 parsed = {}
977 prefix = shape.serialization.get('name', '').lower()
978 for header_name in headers:
979 if header_name.lower().startswith(prefix):
980 # The key name inserted into the parsed hash
981 # strips off the prefix.
982 name = header_name[len(prefix) :]
983 parsed[name] = headers[header_name]
984 return parsed
986 def _initial_body_parse(self, body_contents):
987 # This method should do the initial xml/json parsing of the
988 # body. We still need to walk the parsed body in order
989 # to convert types, but this method will do the first round
990 # of parsing.
991 raise NotImplementedError("_initial_body_parse")
993 def _handle_string(self, shape, value):
994 parsed = value
995 if is_json_value_header(shape):
996 decoded = base64.b64decode(value).decode(self.DEFAULT_ENCODING)
997 parsed = json.loads(decoded)
998 return parsed
1000 def _handle_list(self, shape, node):
1001 location = shape.serialization.get('location')
1002 if location == 'header' and not isinstance(node, list):
1003 # A list in a header may be a comma-separated string, per RFC 7230
1004 node = [e.strip() for e in node.split(',')]
1005 return super()._handle_list(shape, node)
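# Illustrative sketch (not part of botocore): the header handling in
# BaseRestParser above can be exercised in isolation. The helper names
# parse_header_map and split_header_list are hypothetical, and a plain
# dict stands in for the response headers, which is an assumption about
# the input type. The logic mirrors _parse_header_map and the
# header branch of _handle_list.

```python
def parse_header_map(prefix, headers):
    """Collect headers whose names start with `prefix`, ignoring case.

    The returned keys have the prefix stripped but keep their original
    casing, as in the method above.
    """
    prefix = prefix.lower()
    parsed = {}
    for header_name, value in headers.items():
        if header_name.lower().startswith(prefix):
            parsed[header_name[len(prefix):]] = value
    return parsed


def split_header_list(node):
    """Split a comma-separated header value into a list of stripped items.

    A value that is already a list is returned unchanged.
    """
    if isinstance(node, list):
        return node
    return [item.strip() for item in node.split(',')]


example_headers = {
    'X-Amz-Meta-Owner': 'alice',
    'x-amz-meta-color': 'blue',
    'Content-Type': 'text/plain',
}
metadata = parse_header_map('x-amz-meta-', example_headers)
items = split_header_list('a, b ,c')
```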
1008class RestJSONParser(BaseRestParser, BaseJSONParser):
1009 EVENT_STREAM_PARSER_CLS = EventStreamJSONParser
1011 def _initial_body_parse(self, body_contents):
1012 return self._parse_body_as_json(body_contents)
1014 def _do_error_parse(self, response, shape):
1015 error = super()._do_error_parse(response, shape)
1016 self._inject_error_code(error, response)
1017 return error
1019 def _inject_error_code(self, error, response):
1020 # The "Code" value can come from either a response
1021 # header or a value in the JSON body.
1022 body = self._initial_body_parse(response['body'])
1023 if 'x-amzn-errortype' in response['headers']:
1024 code = response['headers']['x-amzn-errortype']
1025 # Could be:
1026 # x-amzn-errortype: ValidationException:
1027 code = code.split(':')[0]
1028 error['Error']['Code'] = code
1029 elif 'code' in body or 'Code' in body:
1030 error['Error']['Code'] = body.get('code', body.get('Code', ''))
1032 def _handle_integer(self, shape, value):
1033 return int(value)
1035 _handle_long = _handle_integer
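# Illustrative sketch (not part of botocore): the error-code lookup in
# RestJSONParser._inject_error_code above, reduced to a standalone
# function. extract_error_code is a hypothetical name, and plain dicts
# stand in for the response headers and the already-parsed JSON body.
# Returning None when no code is found is a choice made for this sketch;
# the method above simply leaves the existing code untouched.

```python
def extract_error_code(headers, body):
    """Return the error code from the header if present, else from the body."""
    if 'x-amzn-errortype' in headers:
        # The header value may carry a trailing colon or extra detail,
        # e.g. "ValidationException:", so keep only the first segment.
        return headers['x-amzn-errortype'].split(':')[0]
    if 'code' in body or 'Code' in body:
        return body.get('code', body.get('Code', ''))
    return None


from_header = extract_error_code(
    {'x-amzn-errortype': 'ValidationException:'}, {}
)
from_body = extract_error_code({}, {'Code': 'ThrottlingException'})
missing = extract_error_code({}, {})
```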
1038class RestXMLParser(BaseRestParser, BaseXMLResponseParser):
1039 EVENT_STREAM_PARSER_CLS = EventStreamXMLParser
1041 def _initial_body_parse(self, xml_string):
1042 if not xml_string:
1043 return ETree.Element('')
1044 return self._parse_xml_string_to_dom(xml_string)
1046 def _do_error_parse(self, response, shape):
1047 # We're trying to be service agnostic here, but S3 does have a slightly
1048 # different response structure for its errors compared to other
1049 # rest-xml services (route53/cloudfront). We handle this by just
1050 # trying to parse both forms.
1051 # First:
1052 # <ErrorResponse xmlns="...">
1053 # <Error>
1054 # <Type>Sender</Type>
1055 # <Code>InvalidInput</Code>
1056 # <Message>Invalid resource type: foo</Message>
1057 # </Error>
1058 # <RequestId>request-id</RequestId>
1059 # </ErrorResponse>
1060 if response['body']:
1061 # If the body ends up being invalid xml, the xml parser should not
1062 # blow up. It should at least try to pull information about the
1063 # error response from other sources like the HTTP status code.
1064 try:
1065 return self._parse_error_from_body(response)
1066 except ResponseParserError:
1067 LOG.debug(
1068 'Exception caught when parsing error response body:',
1069 exc_info=True,
1070 )
1071 return self._parse_error_from_http_status(response)
1073 def _parse_error_from_http_status(self, response):
1074 return {
1075 'Error': {
1076 'Code': str(response['status_code']),
1077 'Message': http.client.responses.get(
1078 response['status_code'], ''
1079 ),
1080 },
1081 'ResponseMetadata': {
1082 'RequestId': response['headers'].get('x-amz-request-id', ''),
1083 'HostId': response['headers'].get('x-amz-id-2', ''),
1084 },
1085 }
1087 def _parse_error_from_body(self, response):
1088 xml_contents = response['body']
1089 root = self._parse_xml_string_to_dom(xml_contents)
1090 parsed = self._build_name_to_xml_node(root)
1091 self._replace_nodes(parsed)
1092 if root.tag == 'Error':
1093 # This is an S3 error response. First we'll populate the
1094 # response metadata.
1095 metadata = self._populate_response_metadata(response)
1096 # The RequestId and the HostId are already in the
1097 # ResponseMetadata, but are also duplicated in the XML
1098 # body. We don't need these values in both places,
1099 # we'll just remove them from the parsed XML body.
1100 parsed.pop('RequestId', '')
1101 parsed.pop('HostId', '')
1102 return {'Error': parsed, 'ResponseMetadata': metadata}
1103 elif 'RequestId' in parsed:
1104 # Other rest-xml services:
1105 parsed['ResponseMetadata'] = {'RequestId': parsed.pop('RequestId')}
1106 default = {'Error': {'Message': '', 'Code': ''}}
1107 merge_dicts(default, parsed)
1108 return default
1110 @_text_content
1111 def _handle_string(self, shape, text):
1112 text = super()._handle_string(shape, text)
1113 return text
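# Illustrative sketch (not part of botocore): the fallback used by
# RestXMLParser above when an error body is empty or cannot be parsed.
# error_from_http_status is a hypothetical standalone version of
# _parse_error_from_http_status that takes the status code and a plain
# dict of headers (an assumption about the input type).

```python
import http.client


def error_from_http_status(status_code, headers):
    """Build an error dict from the HTTP status code and headers alone."""
    return {
        'Error': {
            'Code': str(status_code),
            # Standard reason phrase, or '' for an unknown status code.
            'Message': http.client.responses.get(status_code, ''),
        },
        'ResponseMetadata': {
            'RequestId': headers.get('x-amz-request-id', ''),
            'HostId': headers.get('x-amz-id-2', ''),
        },
    }


not_found = error_from_http_status(404, {'x-amz-request-id': 'abc123'})
unknown = error_from_http_status(599, {})
```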
1116PROTOCOL_PARSERS = {
1117 'ec2': EC2QueryParser,
1118 'query': QueryParser,
1119 'json': JSONParser,
1120 'rest-json': RestJSONParser,
1121 'rest-xml': RestXMLParser,
1122}