|
| 1 | +--- |
| 2 | + |
| 3 | +api_name: |
| 4 | +- Microsoft.Office.DocumentFormat.OpenXML.Packaging |
| 5 | +api_type: |
| 6 | +- schema |
| 7 | +ms.assetid: 2f6f0f89-0ac0-4d40-9f1a-222caf074cf1 |
| 8 | +title: 'How to: Replace Text in a Word Document Using SAX (Simple API for XML)' |
| 9 | +description: 'Learn how to replace text in a Word document using SAX (Simple API for XML)' |
| 10 | +ms.suite: office |
| 11 | + |
| 12 | +ms.author: o365devx |
| 13 | +author: o365devx |
| 14 | +ms.topic: conceptual |
| 15 | +ms.date: 04/03/2025 |
| 16 | +ms.localizationpriority: high |
| 17 | +--- |
| 18 | +# Replace Text in a Word Document Using SAX (Simple API for XML) |
| 19 | + |
| 20 | +This topic shows how to use the Open XML SDK to search for and replace text in a Word document |
| 21 | +using the Simple API for XML (SAX) approach. For more information about the basic structure |
| 22 | +of a `WordprocessingML` document, see [Structure of a WordprocessingML document](./structure-of-a-wordprocessingml-document.md). |
| 23 | + |
| 24 | +## Why Use the SAX Approach? |
| 25 | + |
| 26 | +The Open XML SDK provides two ways to parse Office Open XML files: the Document Object Model (DOM) and the Simple API for XML (SAX). The DOM approach makes it easy to query and parse Open XML files by using strongly typed classes. However, it requires loading entire Open XML parts into memory, which can lead to slower processing and out-of-memory exceptions when working with very large parts. The SAX approach reads the XML in an Open XML part one element at a time, without loading the entire part into memory. This gives noncached, forward-only access to the XML data, which makes SAX a better choice for reading very large parts. |
| 27 | + |
| 28 | +## Accessing the MainDocumentPart |
| 29 | + |
| 30 | +The text of a Word document is stored in the <xref:DocumentFormat.OpenXml.Packaging.MainDocumentPart>, so the first step in |
| 31 | +finding and replacing text is to access the Word document's `MainDocumentPart`. To do that, we first call the `WordprocessingDocument.Open` |
| 32 | +method, passing in the path to the document as the first parameter and `true` as the second parameter to indicate that we |
| 33 | +are opening the file for editing. Then we make sure that the `MainDocumentPart` is not null. |
| 34 | + |
| 35 | +### [C#](#tab/cs-1) |
| 36 | +[!code-csharp[](../../samples/word/replace_text_with_sax/cs/Program.cs#snippet1)] |
| 37 | + |
| 38 | +### [Visual Basic](#tab/vb-1) |
| 39 | +[!code-vb[](../../samples/word/replace_text_with_sax/vb/Program.vb#snippet1)] |
| 40 | +*** |
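As a minimal sketch of the step described above, the following helper opens a document for editing and checks for the main document part. The helper name `HasMainDocumentPart` is hypothetical and used only for illustration; `WordprocessingDocument.Open` and the `MainDocumentPart` property are the SDK members named in the text.

```csharp
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper (not part of the SDK): returns true when the document
// at `path` opens for editing and contains a main document part.
bool HasMainDocumentPart(string path)
{
    // Passing true as the second argument opens the package for editing.
    using (WordprocessingDocument document = WordprocessingDocument.Open(path, true))
    {
        // MainDocumentPart can be null, so check before using it.
        return document.MainDocumentPart != null;
    }
}
```

In a real replace operation, the null check would guard the rest of the work inside the same `using` block, because the part is only usable while the document is open.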
| 41 | + |
| 42 | +## Create Memory Stream, OpenXmlReader, and OpenXmlWriter |
| 43 | + |
| 44 | +With the DOM approach to editing documents, the entire part is read into memory, so we can use the Open XML SDK's |
| 45 | +strongly typed <xref:DocumentFormat.OpenXml.Wordprocessing.Text> class to access the |
| 46 | +document's text and edit it. The SAX approach, however, uses the <xref:DocumentFormat.OpenXml.OpenXmlPartReader> |
| 47 | +and <xref:DocumentFormat.OpenXml.OpenXmlPartWriter> classes, which access a part's stream with forward-only |
| 48 | +access. The advantage is that the entire part does not need to be loaded into memory, which is faster |
| 49 | +and uses less memory. However, because the same part cannot be opened in multiple streams at the same time, we cannot create an |
| 50 | +<xref:DocumentFormat.OpenXml.OpenXmlReader> to read a part and an <xref:DocumentFormat.OpenXml.OpenXmlWriter> to edit |
| 51 | +the same part at the same time. The solution is to create an additional memory stream, write the |
| 52 | +updated part to it, and then use that stream to update the part after the `OpenXmlReader` and `OpenXmlWriter` |
| 53 | +have been disposed. In the code below, we create the `MemoryStream` to store the updated part, an |
| 54 | +`OpenXmlReader` for the `MainDocumentPart`, and an `OpenXmlWriter` to write to the `MemoryStream`. |
| 55 | + |
| 56 | +### [C#](#tab/cs-2) |
| 57 | +[!code-csharp[](../../samples/word/replace_text_with_sax/cs/Program.cs#snippet2)] |
| 58 | + |
| 59 | +### [Visual Basic](#tab/vb-2) |
| 60 | +[!code-vb[](../../samples/word/replace_text_with_sax/vb/Program.vb#snippet2)] |
| 61 | +*** |
| 62 | + |
| 63 | +## Reading the Part and Writing to the New Stream |
| 64 | + |
| 65 | +Now that we have an `OpenXmlReader` to read the part and an `OpenXmlWriter` to write to the new `MemoryStream`, |
| 66 | +we use the <xref:DocumentFormat.OpenXml.OpenXmlReader.Read*> method to read each element in the part. As |
| 67 | +each element is read, we check whether it is of type `Text`. If it is, we use the <xref:DocumentFormat.OpenXml.OpenXmlReader.GetText*> |
| 68 | +method to access the text and use <xref:System.String.Replace*> to update the text. If it is not a |
| 69 | +`Text` element, we write it to the stream unchanged. |
| 70 | + |
| 71 | +> [!NOTE] |
| 72 | +> In a Word document, text can be split across multiple `Text` elements, so if you are replacing a |
| 73 | +> phrase and not a single word, it's best to replace one word at a time. |
| 74 | + |
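To illustrate the note above without opening a document, the sketch below models two adjacent `Text` elements as plain strings. The fragment values and the helper name `ReplaceInEach` are made up for illustration. It assumes each word falls entirely within one element, which Word does not guarantee.

```csharp
using System;
using System.Linq;

// Hypothetical contents of two adjacent Text elements that display "Hello World".
string[] fragments = { "Hello ", "World" };

// Applies String.Replace to each fragment separately, the way an
// element-by-element pass sees the text.
string[] ReplaceInEach(string[] parts, string oldValue, string newValue) =>
    parts.Select(p => p.Replace(oldValue, newValue)).ToArray();

// A phrase that spans both fragments is never seen as a whole, so nothing changes.
string phraseResult = string.Concat(ReplaceInEach(fragments, "Hello World", "Goodbye Moon"));

// Replacing one word at a time works here because each word sits in one fragment.
string wordResult = string.Concat(
    ReplaceInEach(ReplaceInEach(fragments, "Hello", "Goodbye"), "World", "Moon"));

Console.WriteLine(phraseResult); // Hello World
Console.WriteLine(wordResult);   // Goodbye Moon
```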
| 75 | +### [C#](#tab/cs-3) |
| 76 | +[!code-csharp[](../../samples/word/replace_text_with_sax/cs/Program.cs#snippet3)] |
| 77 | + |
| 78 | +### [Visual Basic](#tab/vb-3) |
| 79 | +[!code-vb[](../../samples/word/replace_text_with_sax/vb/Program.vb#snippet3)] |
| 80 | +*** |
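The read-and-write loop described in this section can be sketched roughly as follows. The helper name `CopyWithReplacedText` and its parameters are hypothetical. The sketch assumes the caller supplies and owns the destination stream (for example, the `MemoryStream` created in the previous step), and it relies only on the `OpenXmlReader` and `OpenXmlWriter` members named in the text plus `IsStartElement`, `IsEndElement`, `WriteStartElement`, `WriteString`, and `WriteEndElement`.

```csharp
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Hypothetical helper: streams `part` into `destination`, replacing `oldText`
// with `newText` inside each Text element. The caller owns `destination`.
void CopyWithReplacedText(OpenXmlPart part, Stream destination, string oldText, string newText)
{
    using (OpenXmlReader reader = OpenXmlReader.Create(part))
    using (OpenXmlWriter writer = OpenXmlWriter.Create(destination))
    {
        // Write the XML declaration before any elements.
        writer.WriteStartDocument();

        while (reader.Read())
        {
            if (reader.IsStartElement)
            {
                // Copy the start tag, including its attributes.
                writer.WriteStartElement(reader);

                if (reader.ElementType == typeof(Text))
                {
                    // Only Text elements get their content rewritten.
                    writer.WriteString(reader.GetText().Replace(oldText, newText));
                }
            }
            else if (reader.IsEndElement)
            {
                writer.WriteEndElement();
            }
        }
    }
}
```

Both the reader and the writer are disposed when the `using` block ends, which is what allows the part to be overwritten afterward.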
| 81 | + |
| 82 | +## Writing the New Stream to the MainDocumentPart |
| 83 | + |
| 84 | +With the updated part written to the memory stream, the last step is to set the `MemoryStream`'s |
| 85 | +position to 0 and use the <xref:DocumentFormat.OpenXml.Packaging.OpenXmlPart.FeedData*> method |
| 86 | +to replace the `MainDocumentPart` with the updated stream. |
| 87 | + |
| 88 | +### [C#](#tab/cs-4) |
| 89 | +[!code-csharp[](../../samples/word/replace_text_with_sax/cs/Program.cs#snippet4)] |
| 90 | + |
| 91 | +### [Visual Basic](#tab/vb-4) |
| 92 | +[!code-vb[](../../samples/word/replace_text_with_sax/vb/Program.vb#snippet4)] |
| 93 | +*** |
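A minimal sketch of this final step, assuming the updated XML has already been written to a `MemoryStream` and that every reader on the part has been disposed. The helper name `ReplacePartContent` is hypothetical; `FeedData` is the SDK method named above.

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;

// Hypothetical helper: replaces the content of `part` with whatever was
// written to `updated`.
void ReplacePartContent(OpenXmlPart part, MemoryStream updated)
{
    // After writing, the stream's position is at its end. Rewind it so that
    // FeedData reads from the first byte.
    updated.Position = 0;
    part.FeedData(updated);
}
```

Skipping the position reset would leave `FeedData` with nothing to read, since it would start at the end of the stream.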
| 94 | + |
| 95 | +## Sample Code |
| 96 | + |
| 97 | +Below is the complete sample code to replace text in a Word document using the SAX (Simple API for XML) |
| 98 | +approach. |
| 99 | + |
| 100 | +### [C#](#tab/cs-0) |
| 101 | +[!code-csharp[](../../samples/word/replace_text_with_sax/cs/Program.cs#snippet0)] |
| 102 | + |
| 103 | +### [Visual Basic](#tab/vb-0) |
| 104 | +[!code-vb[](../../samples/word/replace_text_with_sax/vb/Program.vb#snippet0)] |
| 105 | +*** |