How to use document summarization

12/19/2023

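Document summarization is designed to shorten content that users consider too long to read. Both extractive and abstractive summarization condense articles, papers, or documents into key sentences.

Extractive summarization: Produces a summary by extracting sentences that collectively represent the most important or relevant information in the original content.

Abstractive summarization: Produces a summary by generating summarized sentences from the document that capture its main idea.

Query-focused summarization: Lets you use a query when summarizing.

Each of these capabilities can summarize around specific items of interest when you specify them.

The AI models used by the API are provided by the service, so you only need to send the content to be analyzed.

For easier navigation, here are links to the corresponding sections for each service: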

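Aspect	Section
Extractive	Extractive summarization
Abstractive	Abstractive summarization
Query-focused	Query-focused summarization

Features

Tip

You can get started with these features by following the quickstart article. You can also use Language Studio to create example requests without writing code.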

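The extractive summarization API uses natural language processing techniques to locate key sentences in an unstructured text document. Together, these sentences convey the main idea of the document.

Extractive summarization returns a rank score as part of the system response, along with the extracted sentences and their position in the original document. A rank score indicates how relevant the service judges a sentence to be to the main idea of the document. The model gives each sentence a score between 0 and 1 (inclusive) and returns the highest-scoring sentences for each request. For example, if you request a three-sentence summary, the service returns the three highest-scoring sentences.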
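The selection rule described above can be sketched in a few lines. This is only an illustration of "return the highest-scoring sentences", not the service's implementation; the sentence data and the field names `rankScore` and `offset` are assumptions made for the example.

```python
# Hypothetical scored sentences; the field names mirror the prose above
# and are not a guaranteed response schema.
sentences = [
    {"text": "First sentence.", "rankScore": 0.42, "offset": 0},
    {"text": "Second sentence.", "rankScore": 0.91, "offset": 16},
    {"text": "Third sentence.", "rankScore": 0.10, "offset": 33},
    {"text": "Fourth sentence.", "rankScore": 0.77, "offset": 49},
]


def top_sentences(scored, count=3):
    """Return the `count` highest-scoring sentences, best first."""
    if not all(0.0 <= s["rankScore"] <= 1.0 for s in scored):
        raise ValueError("rank scores are expected to lie in [0, 1]")
    return sorted(scored, key=lambda s: s["rankScore"], reverse=True)[:count]


summary = top_sentences(sentences, count=3)
print([s["text"] for s in summary])
```

Requesting three sentences drops the lowest-scoring one ("Third sentence.") and keeps the rest.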

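Azure AI Language has another feature, key phrase extraction, that can extract key information. When deciding between key phrase extraction and extractive summarization, consider the following:

Key phrase extraction returns phrases, while extractive summarization returns sentences.
Extractive summarization returns sentences together with a rank score, and returns the top-ranked sentences for each request.
Extractive summarization also returns the following positional information:
- Offset: The start position of each extracted sentence.
- Length: The length of each extracted sentence.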
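As a rough illustration of how offset and length locate a sentence, the sketch below slices a document string with hypothetical position values. It assumes that offsets count Python string indices (Unicode code points); the service may count in a different unit, so treat this as a sketch only.

```python
document = "Summaries save time. They keep the key ideas. Details can wait."

# Hypothetical position data of the kind described above.
extracted = [
    {"offset": 0, "length": 20},
    {"offset": 21, "length": 24},
]


def slice_sentence(text, offset, length):
    """Recover a sentence from its start position and length."""
    return text[offset:offset + length]


recovered = [slice_sentence(document, e["offset"], e["length"]) for e in extracted]
print(recovered)
```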

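Determine how to process the data (optional)

Submitting data

You submit documents to the API as strings of text. Analysis is performed when the request is received. Because the API is asynchronous, there might be a delay between sending an API request and receiving the results.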
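Because results arrive after a delay, a client typically polls the job until it finishes. The following is a minimal sketch of such a loop. `fetch_status` is a hypothetical callable that returns the parsed job JSON, and every status value other than `succeeded` is an assumption made for the example.

```python
import time


def wait_for_job(fetch_status, poll_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Poll until the job reaches a terminal state, then return the job body.

    `fetch_status` is a hypothetical callable returning the parsed JSON
    body of the job-status request as a dict.
    """
    terminal = {"succeeded", "failed", "cancelled"}  # assumed terminal states
    for _ in range(max_attempts):
        job = fetch_status()
        if job.get("status") in terminal:
            return job
        sleep(poll_seconds)
    raise TimeoutError("job did not finish within the polling budget")


# Simulated service: two in-progress replies, then success.
_replies = iter([{"status": "running"}, {"status": "running"}, {"status": "succeeded"}])
result = wait_for_job(lambda: next(_replies), sleep=lambda seconds: None)
print(result["status"])
```

Passing the fetch and sleep functions in as arguments keeps the loop testable without a network connection.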

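When you use this feature, the API results are available for 24 hours from the time the request was ingested, which is indicated in the response. After this period, the results are purged and can no longer be retrieved.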
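The 24-hour window can be checked on the client side. The sketch below assumes timestamps in the `YYYY-MM-DDTHH:MM:SSZ` form used by the `createdDateTime` field of the sample response later in this article; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # retention period stated above


def parse_utc(timestamp):
    """Parse a timestamp such as 2022-09-08T16:44:53Z as an aware UTC datetime."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def results_available(created, now):
    """Return True while `now` is still inside the retention window."""
    return now < parse_utc(created) + RETENTION


created = "2022-09-08T16:44:53Z"
expires = parse_utc(created) + RETENTION
print(expires.isoformat())
```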

Getting document summarization results

When you get results from document summarization, you can stream them to an application or save the output to a file on the local system.

The following is an example of content you might submit for summarization. It is extracted from the Microsoft blog article "A holistic representation toward integrative AI". This article is only an example; the API can accept long input text. For more information, see the section on data limits.

"At Microsoft, we have been on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding. As Chief Technology Officer of Azure AI services, I have been working with a team of amazing scientists and engineers to turn this quest into a reality. In my role, I enjoy a unique perspective in viewing the relationship among three attributes of human cognition: monolingual text (X), audio or visual sensory signals, (Y) and multilingual (Z). At the intersection of all three, there’s magic—what we call XYZ-code as illustrated in Figure 1—a joint representation to create more powerful AI that can speak, hear, see, and understand humans better. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today. Over the past five years, we have achieved human performance on benchmarks in conversational speech recognition, machine translation, conversational question answering, machine reading comprehension, and image captioning. These five breakthroughs provided us with strong signals toward our more ambitious aspiration to produce a leap in AI capabilities, achieving multi-sensory and multilingual learning that is closer in line with how humans learn and understand. I believe the joint XYZ-code is a foundational component of this aspiration, if grounded with external knowledge sources in the downstream AI tasks."

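A document summarization API request is processed by creating a job for the API backend when the request is received. If the job succeeds, the output of the API is returned. The output is available for retrieval for 24 hours, after which it is purged. Because of multilingual and emoji support, the response may contain text offsets. For more information, see the article on how to process offsets.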
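To see why offsets need care with multilingual text and emoji, the sketch below counts the same string in two units. Python's `len` counts Unicode code points, while UTF-16 uses two code units for characters outside the Basic Multilingual Plane. Which unit the service reports is not stated here, so check the offsets article before slicing; the sample string is an assumption made for the example.

```python
def utf16_length(text):
    """Number of UTF-16 code units needed to represent `text`."""
    return len(text.encode("utf-16-le")) // 2


# "AI", a thumbs-up emoji (U+1F44D), and two Japanese characters.
sample = "AI \U0001F44D です"

code_points = len(sample)           # Python counts code points
utf16_units = utf16_length(sample)  # the emoji takes two UTF-16 code units
print(code_points, utf16_units)
```

An offset computed in one unit points at the wrong position when applied in the other, starting at the first character after the emoji.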

Using the preceding example, the API might return the following summarized sentences:

Extractive summarization:

"At Microsoft, we have been on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding."
"We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages."
"The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today."

Abstractive summarization:

"Microsoft is taking a more holistic, human-centric approach to learning and understanding. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. Over the past five years, we have achieved human performance on benchmarks in."

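Try document extractive summarization

You can use document extractive summarization to get summaries of articles, papers, or documents. For an example, see the quickstart article.

You can use the sentenceCount parameter to specify how many sentences are returned. The default is 3, and the range is 1 to 20.

You can also use the sortby parameter to specify the order in which the extracted sentences are returned. The value is either Offset or Rank, and the default is Offset.

Parameter value	Description
Rank	Orders sentences by their relevance to the input document, as determined by the service.
Offset	Keeps the original order in which the sentences appear in the input document.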
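The two parameters above can be validated before a request is sent. The following sketch builds the task portion of a request body. The `kind` value matches the extractive request shown later in this article, but the helper name and the JSON key casing `sortBy` are assumptions, so check the REST reference before relying on them.

```python
import json


def extractive_task(sentence_count=3, sort_by="Offset"):
    """Build an extractive summarization task with validated parameters."""
    if not 1 <= sentence_count <= 20:
        raise ValueError("sentenceCount must be between 1 and 20")
    if sort_by not in ("Offset", "Rank"):
        raise ValueError("sort order must be 'Offset' or 'Rank'")
    return {
        "kind": "ExtractiveSummarization",
        "taskName": "Document Extractive Summarization Task 1",
        # Key casing is assumed; the prose above spells the parameter "sortby".
        "parameters": {"sentenceCount": sentence_count, "sortBy": sort_by},
    }


task = extractive_task(sentence_count=5, sort_by="Rank")
print(json.dumps(task, indent=2))
```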

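Try document abstractive summarization

The following example gets you started with document abstractive summarization.

Copy the command below into a text editor. The BASH example uses the \ line continuation character. If your console or terminal uses a different line continuation character, use that character instead.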

curl -i -X POST https://<your-language-resource-endpoint>/language/analyze-text/jobs?api-version=2022-10-01-preview \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-language-resource-key>" \
-d \
' 
{
  "displayName": "Document Abstractive Summarization Task Example",
  "analysisInput": {
    "documents": [
      {
        "id": "1",
        "language": "en",
        "text": "At Microsoft, we have been on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding. As Chief Technology Officer of Azure AI services, I have been working with a team of amazing scientists and engineers to turn this quest into a reality. In my role, I enjoy a unique perspective in viewing the relationship among three attributes of human cognition: monolingual text (X), audio or visual sensory signals, (Y) and multilingual (Z). At the intersection of all three, there’s magic—what we call XYZ-code as illustrated in Figure 1—a joint representation to create more powerful AI that can speak, hear, see, and understand humans better. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today. Over the past five years, we have achieved human performance on benchmarks in conversational speech recognition, machine translation, conversational question answering, machine reading comprehension, and image captioning. These five breakthroughs provided us with strong signals toward our more ambitious aspiration to produce a leap in AI capabilities, achieving multi-sensory and multilingual learning that is closer in line with how humans learn and understand. I believe the joint XYZ-code is a foundational component of this aspiration, if grounded with external knowledge sources in the downstream AI tasks."
      }
    ]
  },
  "tasks": [
    {
      "kind": "AbstractiveSummarization",
      "taskName": "Document Abstractive Summarization Task 1",
      "parameters": {
        "summaryLength": "short"
      }
    }
  ]
}
'

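If you don't specify sentenceCount, the model determines the summary length. Note that sentenceCount is an approximation of the number of sentences in the output summary, and its range is 1 to 20. Using sentenceCount with abstractive summarization is not recommended.

Make the following changes in the command where needed:
- Replace the value your-language-resource-key with your key.
- Replace the first part of the request URL (your-language-resource-endpoint) with your own endpoint URL.
Open a command prompt window (for example, BASH).
Paste the command from the text editor into the command prompt window, and then run the command.
Get the operation-location from the response header. The value looks similar to the following URL: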

https://<your-language-resource-endpoint>/language/analyze-text/jobs/12345678-1234-1234-1234-12345678?api-version=2022-10-01-preview

To get the results of the request, use the following cURL command. Be sure to replace <my-job-id> with the job ID value you received in the previous operation-location response header:

curl -X GET https://<your-language-resource-endpoint>/language/analyze-text/jobs/<my-job-id>?api-version=2022-10-01-preview \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-language-resource-key>"
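The `<my-job-id>` value is the last path segment of the operation-location URL. A small sketch of pulling it out with the standard library follows; the host name is a made-up placeholder and the helper name is hypothetical.

```python
from urllib.parse import urlsplit


def job_id_from_operation_location(url):
    """Return the last path segment of an operation-location URL."""
    path = urlsplit(url).path  # drops the ?api-version=... query string
    return path.rstrip("/").rsplit("/", 1)[-1]


# Placeholder host; the path and ID follow the example URL shown above.
operation_location = (
    "https://example.cognitiveservices.azure.com/language/analyze-text/jobs/"
    "12345678-1234-1234-1234-12345678?api-version=2022-10-01-preview"
)
job_id = job_id_from_operation_location(operation_location)
print(job_id)
```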

Abstractive document summarization example JSON response

{
    "jobId": "cd6418fe-db86-4350-aec1-f0d7c91442a6",
    "lastUpdateDateTime": "2022-09-08T16:45:14Z",
    "createdDateTime": "2022-09-08T16:44:53Z",
    "expirationDateTime": "2022-09-09T16:44:53Z",
    "status": "succeeded",
    "errors": [],
    "displayName": "Document Abstractive Summarization Task Example",
    "tasks": {
        "completed": 1,
        "failed": 0,
        "inProgress": 0,
        "total": 1,
        "items": [
            {
                "kind": "AbstractiveSummarizationLROResults",
                "taskName": "Document Abstractive Summarization Task 1",
                "lastUpdateDateTime": "2022-09-08T16:45:14.0717206Z",
                "status": "succeeded",
                "results": {
                    "documents": [
                        {
                            "summaries": [
                                {
                                    "text": "Microsoft is taking a more holistic, human-centric approach to AI. We've developed a joint representation to create more powerful AI that can speak, hear, see, and understand humans better. We've achieved human performance on benchmarks in conversational speech recognition, machine translation, ...... and image captions.",
                                    "contexts": [
                                        {
                                            "offset": 0,
                                            "length": 247
                                        }
                                    ]
                                }
                            ],
                            "id": "1"
                        }
                    ],
                    "errors": [],
                    "modelVersion": "latest"
                }
            }
        ]
    }
}
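Once the job has succeeded, the summaries can be read out of the nested structure shown above. The following sketch walks a trimmed copy of that response; the helper name is hypothetical, and real responses may carry additional fields or error entries that this example ignores.

```python
import json

# Trimmed copy of the sample response above; only the fields read below are kept.
response_text = """
{
  "status": "succeeded",
  "tasks": {
    "items": [
      {
        "kind": "AbstractiveSummarizationLROResults",
        "status": "succeeded",
        "results": {
          "documents": [
            {
              "id": "1",
              "summaries": [
                {
                  "text": "Microsoft is taking a more holistic, human-centric approach to AI.",
                  "contexts": [{"offset": 0, "length": 247}]
                }
              ]
            }
          ],
          "errors": []
        }
      }
    ]
  }
}
"""


def abstractive_summaries(job):
    """Map each document ID to the list of summary texts returned for it."""
    if job.get("status") != "succeeded":
        raise RuntimeError(f"job status is {job.get('status')!r}")
    found = {}
    for task in job["tasks"]["items"]:
        if task.get("kind") != "AbstractiveSummarizationLROResults":
            continue
        for doc in task["results"]["documents"]:
            found[doc["id"]] = [s["text"] for s in doc["summaries"]]
    return found


summaries = abstractive_summaries(json.loads(response_text))
print(summaries)
```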

Parameter	Description
`-X POST <endpoint>`	Specifies the endpoint for accessing the API.
`-H "Content-Type: application/json"`	The content type for sending JSON data.
`-H "Ocp-Apim-Subscription-Key: <key>"`	Specifies the key for accessing the API.
`-d <documents>`	The JSON containing the documents you want to send.

The cURL commands in this article are run from a BASH shell. Edit these commands with your own resource name, resource key, and JSON values.

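Query-based summarization

The query-based document summarization API is an extension of the existing document summarization API.

The biggest difference is a new query field in the request body (under tasks > parameters > query). In addition, there is a new way to specify the preferred summaryLength in "buckets" of short, medium, and long. We recommend using it instead of sentenceCount, especially for abstractive summarization. The following is an example request: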

curl -i -X POST https://<your-language-resource-endpoint>/language/analyze-text/jobs?api-version=2023-11-15-preview \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-language-resource-key>" \
-d \
' 
{
  "displayName": "Document Extractive Summarization Task Example",
  "analysisInput": {
    "documents": [
      {
        "id": "1",
        "language": "en",
        "text": "At Microsoft, we have been on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding. As Chief Technology Officer of Azure AI services, I have been working with a team of amazing scientists and engineers to turn this quest into a reality. In my role, I enjoy a unique perspective in viewing the relationship among three attributes of human cognition: monolingual text (X), audio or visual sensory signals, (Y) and multilingual (Z). At the intersection of all three, there’s magic—what we call XYZ-code as illustrated in Figure 1—a joint representation to create more powerful AI that can speak, hear, see, and understand humans better. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today. Over the past five years, we have achieved human performance on benchmarks in conversational speech recognition, machine translation, conversational question answering, machine reading comprehension, and image captioning. These five breakthroughs provided us with strong signals toward our more ambitious aspiration to produce a leap in AI capabilities, achieving multi-sensory and multilingual learning that is closer in line with how humans learn and understand. I believe the joint XYZ-code is a foundational component of this aspiration, if grounded with external knowledge sources in the downstream AI tasks."
      }
    ]
  },
  "tasks": [
    {
      "kind": "ExtractiveSummarization",
      "taskName": "Document Extractive Summarization Task 1",
      "parameters": {
        "query": "XYZ-code",
        "summaryLength": "short"
      }
    }
  ]
}
'
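The same request body can be assembled in code rather than typed by hand. The sketch below mirrors the field names of the request above and serializes it with `json.dumps`, which always emits string values such as the summaryLength bucket in quoted form. The function name and the short sample text are assumptions made for the example.

```python
import json


def query_summarization_body(text, query, summary_length="short", language="en"):
    """Build a request body shaped like the query-based example above."""
    return {
        "displayName": "Document Extractive Summarization Task Example",
        "analysisInput": {
            "documents": [{"id": "1", "language": language, "text": text}]
        },
        "tasks": [
            {
                "kind": "ExtractiveSummarization",
                "taskName": "Document Extractive Summarization Task 1",
                "parameters": {"query": query, "summaryLength": summary_length},
            }
        ],
    }


body = query_summarization_body("XYZ-code is a joint representation.", "XYZ-code")
payload = json.dumps(body)  # the string to send as the request body
print(payload)
```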

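Using the summaryLength parameter

The summaryLength parameter accepts three values:

short: Generates a summary of mostly 2 to 3 sentences, with around 120 tokens.
medium: Generates a summary of mostly 4 to 6 sentences, with around 170 tokens.
long: Generates a summary of mostly 7 or more sentences, with around 210 tokens.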
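If an application currently thinks in sentence counts, the buckets above can be used as a lookup. The sketch below encodes the approximate figures from the list and picks the bucket for a desired number of sentences. The mapping is only a convenience built from those figures, not a service-side rule, and all names in it are hypothetical.

```python
# Approximate figures taken from the list above.
SUMMARY_LENGTHS = {
    "short": {"sentences": (2, 3), "approx_tokens": 120},
    "medium": {"sentences": (4, 6), "approx_tokens": 170},
    "long": {"sentences": (7, None), "approx_tokens": 210},  # None means no upper bound
}


def bucket_for_sentences(count):
    """Pick the summaryLength bucket whose sentence range contains `count`."""
    for name, spec in SUMMARY_LENGTHS.items():
        low, high = spec["sentences"]
        if count >= low and (high is None or count <= high):
            return name
    raise ValueError(f"no bucket covers {count} sentence(s)")


print(bucket_for_sentences(5))
```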

Service and data limits

For information on the size and number of requests you can send per minute and per second, see the service limits article.
