@@ -515,11 +515,11 @@ class StreamingDetectIntentResponse(proto.Message):
515515
516516 Multiple response messages can be returned in order:
517517
518- 1. If the input was set to streaming audio, the first one or more
519- messages contain ``recognition_result``. Each
520- ``recognition_result`` represents a more complete transcript of
521- what the user said. The last ``recognition_result`` has
522- ``is_final`` set to ``true``.
518+ 1. If the ``StreamingDetectIntentRequest.input_audio`` field was
519+ set, the ``recognition_result`` field is populated for one or
520+ more messages. See the
521+ [StreamingRecognitionResult][google.cloud.dialogflow.v2.StreamingRecognitionResult]
522+ message for details about the result message sequence.
523523
524524 2. The next message contains ``response_id``, ``query_result`` and
525525 optionally ``webhook_status`` if a WebHook was called.
@@ -570,33 +570,42 @@ class StreamingRecognitionResult(proto.Message):
570570 the audio that is currently being processed or an indication that
571571 this is the end of the single requested utterance.
572572
573- Example:
574-
575- 1. transcript: "tube"
576-
577- 2. transcript: "to be a"
578-
579- 3. transcript: "to be"
580-
581- 4. transcript: "to be or not to be" is_final: true
582-
583- 5. transcript: " that's"
584-
585- 6. transcript: " that is"
586-
587- 7. message_type: ``END_OF_SINGLE_UTTERANCE``
588-
589- 8. transcript: " that is the question" is_final: true
590-
591- Only two of the responses contain final results (#4 and #8 indicated
592- by ``is_final: true``). Concatenating these generates the full
593- transcript: "to be or not to be that is the question".
594-
595- In each response we populate:
596-
597- - for ``TRANSCRIPT``: ``transcript`` and possibly ``is_final``.
598-
599- - for ``END_OF_SINGLE_UTTERANCE``: only ``message_type``.
573+ While end-user audio is being processed, Dialogflow sends a series
574+ of results. Each result may contain a ``transcript`` value. A
575+ transcript represents a portion of the utterance. While the
576+ recognizer is processing audio, transcript values may be interim
577+ values or finalized values. Once a transcript is finalized, the
578+ ``is_final`` value is set to true and processing continues for the
579+ next transcript.
580+
581+ If
582+ ``StreamingDetectIntentRequest.query_input.audio_config.single_utterance``
583+ was true, and the recognizer has completed processing audio, the
584+ ``message_type`` value is set to ``END_OF_SINGLE_UTTERANCE`` and the
585+ following (last) result contains the last finalized transcript.
586+
587+ The complete end-user utterance is determined by concatenating the
588+ finalized transcript values received for the series of results.
589+
590+ In the following example, single utterance is enabled. In the case
591+ where single utterance is not enabled, result 7 would not occur.
592+
593+ ::
594+
595+ Num | transcript              | message_type            | is_final
596+ --- | ----------------------- | ----------------------- | --------
597+ 1   | "tube"                  | TRANSCRIPT              | false
598+ 2   | "to be a"               | TRANSCRIPT              | false
599+ 3   | "to be"                 | TRANSCRIPT              | false
600+ 4   | "to be or not to be"    | TRANSCRIPT              | true
601+ 5   | "that's"                | TRANSCRIPT              | false
602+ 6   | "that is"               | TRANSCRIPT              | false
603+ 7   | unset                   | END_OF_SINGLE_UTTERANCE | unset
604+ 8   | " that is the question" | TRANSCRIPT              | true
605+
606+ Concatenating the finalized transcripts (those with ``is_final``
607+ set to true) yields the complete utterance: "to be or not to be
608+ that is the question".
600609
601610 Attributes:
602611 message_type (google.cloud.dialogflow_v2.types.StreamingRecognitionResult.MessageType):
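The concatenation rule described in the new docstring can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Result` is a hypothetical stand-in dataclass rather than the library's `StreamingRecognitionResult` type, `complete_utterance` is a helper name invented here, and message types are modeled as plain strings instead of the real enum.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Result:
    """Hypothetical stand-in for StreamingRecognitionResult (not the library type)."""

    message_type: str
    transcript: Optional[str] = None
    is_final: Optional[bool] = None


def complete_utterance(results: Iterable[Result]) -> str:
    """Join the transcripts of finalized TRANSCRIPT results, in arrival order."""
    return "".join(
        r.transcript
        for r in results
        if r.message_type == "TRANSCRIPT" and r.is_final and r.transcript
    )


# The eight results from the docstring table (single utterance enabled).
EXAMPLE = [
    Result("TRANSCRIPT", "tube", False),
    Result("TRANSCRIPT", "to be a", False),
    Result("TRANSCRIPT", "to be", False),
    Result("TRANSCRIPT", "to be or not to be", True),
    Result("TRANSCRIPT", "that's", False),
    Result("TRANSCRIPT", "that is", False),
    Result("END_OF_SINGLE_UTTERANCE"),  # transcript and is_final are unset
    Result("TRANSCRIPT", " that is the question", True),
]

print(complete_utterance(EXAMPLE))
```

Interim results (1 to 3, 5 and 6) and the `END_OF_SINGLE_UTTERANCE` marker are skipped, so only results 4 and 8 contribute to the joined string. The leading space in result 8's transcript is what separates the two finalized segments.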