Friday, August 7, 2009

10+ tips for using Word as an XML editor

Author: Susan Harkins

XML makes it possible for you to extract, manipulate, store, and reuse data from any number of sources - and Word 2003 and 2007 provide tools for working with XML files. Here are a few pointers that will make the process go more smoothly.



Using XML, you can store data in a format that's easily available to other software. The resulting file contains not only the data, but also a description of the document in plain text. That makes reusing data much simpler because any software that can read plain text can read the data. Many will argue that Word isn't the right tool for editing XML files, but if Word is what you have, you want to get the job done efficiently and effectively.
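To make the "any software can read it" point concrete, here is a minimal sketch in Python, using only its standard library. The catalog data and its element names (catalog, product, name, price) are hypothetical, made up for illustration. They do not come from any particular Word file.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element names are illustrative only.
CATALOG_XML = """<catalog>
  <product>
    <name>Widget</name>
    <price>9.99</price>
  </product>
  <product>
    <name>Gadget</name>
    <price>24.50</price>
  </product>
</catalog>"""


def read_products(xml_text):
    """Return a list of (name, price) tuples found in the catalog text."""
    root = ET.fromstring(xml_text)
    products = []
    for product in root.findall("product"):
        name = product.findtext("name")
        price = float(product.findtext("price"))
        products.append((name, price))
    return products


if __name__ == "__main__":
    for name, price in read_products(CATALOG_XML):
        print(f"{name}: {price:.2f}")
```

The tags describe the data, so the script needs no knowledge of the program that created the file.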

Here are a few suggestions for simplifying your XML tasks in Word. (We're providing instructions for Word 2003, but the concepts are similar for Word 2007.)

Note: This article is also available as a PDF download.

1: Use All Word Files

Most applications default to their own format in the Open and Save dialog boxes. For instance, Word 2003 defaults to Word's .doc format. If you want to open a file type other than the default, you have to open the Files Of Type drop-down list, choose the file type, and then let Word update the list. You can avoid several clicks by selecting All Word Documents from the Files Of Type list. Once you do, Word will always display XML files in the Name list, as shown in Figure A. Word retains this setting until you change it.

Figure A

List all Word document types.

2: Save as XML

By default, Word saves files in its document format (.doc). If you work exclusively with XML files, you have to remember to change the file type every time you save a file. It may be more efficient to configure Word to save your documents as XML files automatically:

  1. Choose Options from the Tools menu.
  2. Click the Save tab.
  3. From the Save Word Files As drop-down list, choose XML Document (*.xml), as shown in Figure B, and click OK.

Figure B

Save Word files in XML format automatically.

3: Right-click for attributes

When you open an XML document, Word displays both tags and content. It also opens the XML Structure task pane (to the right). Where are the attributes? To see them, right-click an element and select Attributes from the resulting context menu. Doing so displays the Attributes For Item dialog box, shown in Figure C. To change a value, select an attribute in the Assigned Attributes list and edit its value in the Value control. (If the task pane isn't visible, press [Ctrl]+[F1].)

Figure C

View an element's attributes.

If the element has multiple instances, the dialog box won't indicate which one you're working with. To avoid confusion, highlight the specific element before opening the Attributes For Item dialog box.
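If you want to check attributes outside Word, a short script can list them for every element. The sketch below uses Python's standard library. The order and item elements and their attributes are hypothetical, invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; attributes live inside the start tag, not in the content.
SAMPLE = '<order id="1001" status="open"><item sku="A-1" qty="2">Widget</item></order>'


def list_attributes(xml_text):
    """Return (tag, attributes) pairs for every element, in document order."""
    root = ET.fromstring(xml_text)
    result = []
    for element in root.iter():
        result.append((element.tag, dict(element.attrib)))
    return result


if __name__ == "__main__":
    for tag, attributes in list_attributes(SAMPLE):
        print(tag, attributes)
```

Attributes are stored in the start tag rather than between the tags, which is why Word doesn't show them alongside the content.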

4: Find options

Word lets you control how it handles an XML file, but the configuration options can be difficult to find. You can take the traditional route to the options, as follows:

  1. From the Tools menu, select Templates And Add-Ins.
  2. Click the XML Schema tab.
  3. Click the XML Options button to open the XML Options dialog box, shown in Figure D.

Figure D

Control how Word handles an XML file.

There's an easier way to get to the options. Click the XML Options link at the bottom of the XML Structure task pane, shown in Figure E. Keep in mind that these options work with the current document only. You must reset them if you open a different XML file.

Figure E

Bypass the menus and click the XML Options link to open the XML Options dialog box.

5: Edit more easily

If you plan to edit actual content, you don't need the tags. In fact, if you display them, you could accidentally delete one. To turn off tags while editing, uncheck the Show XML Tags In The Document option in the XML Structure task pane. To change a value, simply type over it. To delete a value, select the entire element, including the start and end tags (as indicated by the red borders). If you delete a value without deleting the element tags, you leave an empty element.

If you prefer to use a shortcut, press [Ctrl]+[Shift]+X. This combination toggles between hiding and showing tags.

6: Display empty elements

Generally, you should avoid empty elements, but there are circumstances where they're acceptable. If the Show XML Tags In The Document option is disabled, you won't see them, though, which can present problems. If you want to hide the tags but still know when an element is empty, use placeholders as follows:

  1. From the Tools menu, choose Templates And Add-Ins.
  2. Click the XML Schema tab.
  3. Click XML Options.
  4. Check the Show Placeholder Text For All Empty Elements option in the XML View options, as shown in Figure F.
  5. Click OK twice.

Figure F

Word will display placeholders for empty elements.
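You can also check a saved file for empty elements outside Word. The Python sketch below assumes that "empty" means an element with no child elements and no text other than white space. The contact data is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with three empty elements: phone, email, and zip.
SAMPLE = """<contact>
  <name>Pat Lee</name>
  <phone></phone>
  <email/>
  <address>
    <city>Louisville</city>
    <zip>   </zip>
  </address>
</contact>"""


def find_empty_elements(xml_text):
    """Return the tags of elements that have no children and no real text."""
    root = ET.fromstring(xml_text)
    empty = []
    for element in root.iter():
        has_children = len(element) > 0
        has_text = bool(element.text and element.text.strip())
        if not has_children and not has_text:
            empty.append(element.tag)
    return empty


if __name__ == "__main__":
    print(find_empty_elements(SAMPLE))
```

An element that carries attributes but no content still counts as empty under this definition. Adjust the test if your data treats attributes as values.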

7: Avoid potential data loss

A transform determines the data that makes it into a Word document. If the transform doesn't accommodate data in the file you open, that data doesn't make it to the open file. In this case, the transform works like a filter of sorts. For instance, you might use one transform to produce a list of products and prices. Another transform might include product names, prices, and a description of each product. Instead of opening the original file and manually deleting the data you don't want, you apply the transform as you open the file, and it leaves that data out automatically.

To open a file using a transform, choose Open With Transform from the Open button's drop-down list (in the Open dialog box). You can also apply a transform when you save a file. In the Save As dialog box, choose XML Document (*.xml) from the Save As Type list, check the Apply Transform option, and then click the Transform button to choose the transform you want to apply.

It's important to realize that the transform doesn't affect only the open file; it can also change the original document. If you save the open file, the filtered result permanently replaces the original file's contents. If you apply a transform to the file you're saving, those changes also become permanent. To avoid losing data or otherwise altering the original file, save the transformed file using a new name. That simple step makes perfect sense, yet it's easy to forget, because you don't always realize that the transform is filtering data. Since the original data is out of sight, it's also out of mind, and it's easy to overlook that the original file might contain data not visible in the transformed file.
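The filtering effect is easy to demonstrate outside Word. The Python sketch below is not an XSLT transform and is not how Word applies one. It only imitates the result by copying selected elements into a new tree. The catalog data and the filter_products function are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical original data; the description is what the "transform" drops.
ORIGINAL = """<catalog>
  <product>
    <name>Widget</name>
    <price>9.99</price>
    <description>A small widget.</description>
  </product>
</catalog>"""


def filter_products(xml_text, keep=("name", "price")):
    """Build a new tree that contains only the child elements named in keep."""
    source = ET.fromstring(xml_text)
    result = ET.Element(source.tag)
    for product in source.findall("product"):
        copy = ET.SubElement(result, "product")
        for child in product:
            if child.tag in keep:
                ET.SubElement(copy, child.tag).text = child.text
    return result


# The filtered copy lacks the description; ORIGINAL itself is left untouched.
filtered = filter_products(ORIGINAL)
filtered_text = ET.tostring(filtered, encoding="unicode")

if __name__ == "__main__":
    print(filtered_text)
```

Writing filtered_text over the original file would lose the description for good. That is the reason to save the result under a new name.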

8: Download XML Reference schemas

If you plan to write code that manipulates Word's XML format, download Office 2003 XML Reference Schemas or 2007 Office System: XML Schema Reference. These downloads include Help (.chm) files that document the XML structure Word uses. Keep the .chm files open for easy reference while you work. If you need to share XML files with others, consider using Word 2003: XML Viewer.

9: Turn off namespace alias

Sometimes, element names in the XML Structure are long and rather meaningless, as shown in Figure G. That's because by default, the pane displays namespaces in element names.

Figure G

The namespaces, displayed by default, are confusing.

To inhibit the namespaces, do the following:

  1. Click the XML Options link at the bottom of the XML Structure task pane.
  2. Check Hide Namespace Alias In XML Structure Task Pane in the XML View options.
  3. Click OK.

As you can see in Figure H, the names are shorter and the list is much friendlier.

Figure H

Turn off the namespaces to view a friendlier list.
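Other XML tools show the same kind of clutter. Python's standard ElementTree module, for example, prefixes every qualified tag with its full namespace in braces. The sketch below strips that prefix for display, which is only loosely analogous to Word's option. The invoice namespace and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaced sample; the URI is a placeholder, not a real schema.
SAMPLE = """<inv:invoice xmlns:inv="http://example.com/invoice">
  <inv:customer>Pat Lee</inv:customer>
  <inv:total>42.00</inv:total>
</inv:invoice>"""


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree adds to qualified tags."""
    return tag.split("}", 1)[1] if tag.startswith("{") else tag


root = ET.fromstring(SAMPLE)
full_names = [element.tag for element in root.iter()]
short_names = [local_name(element.tag) for element in root.iter()]

if __name__ == "__main__":
    print(full_names)
    print(short_names)
```

As in Word, hiding the namespace changes only what is displayed. The elements still belong to their namespace.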

10: Prevent deletion of XML elements

You might not be the only person to edit an XML document. Fortunately, you can enable the document protection feature to protect XML tags while allowing others to edit the actual content. Just follow these steps:

  1. Check Show XML Tags In The Document in the XML Structure task pane.
  2. From the Tools menu, select Protect Document.
  3. Check Allow Only This Type Of Editing In The Document in the Editing Restrictions section of the Protect Document task pane.
  4. Select No Changes (Read Only). (That's the default, so you probably won't need to select it.)
  5. In the document, select the contents of an element.
  6. Then, check the Everyone option in the Exceptions (Optional) section in the Protect Document task pane, as shown in Figure I.
  7. Repeat steps 5 and 6 for each XML tag that contains data you want to allow others to edit.
  8. Click Yes, Start Enforcing Protection.
  9. To password-protect the document, enter the password twice; to encrypt the document, click User Authentication.
  10. Click OK.

Figure I

Allow others to edit content.

11: View files in Office 2007

The Office 2007 applications use Office Open XML Format files, which are packaged using ZIP compression. If you'd like to see the XML document parts for any Word, Excel, or PowerPoint 2007 file, change the file's four-character extension to ZIP. Then, open that file in Windows Explorer. You'll see a few folders:

  • The _rels folder contains a file named .rels that stores information about the relationships between items in the ZIP package. That's how Office 2007 knows where to find everything it needs when opening a document.
  • The main document folder (word in figure x) stores the document's content and media files (pictures and so on). It also stores various document elements, such as settings, headers, and themes.
  • The [Content_Types].xml file contains definitions of the types of content.

Using the ZIP extension method, you can quickly learn a lot about your document. Just be careful not to change the folder structure or alter filenames while you're exploring.
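If you'd rather not rename anything, a script can read the package directly. The Python sketch below lists the parts inside a package using the standard zipfile module. The list_parts and build_demo_package functions are hypothetical helpers, and the demo archive is a tiny stand-in built in memory, not a valid Word document.

```python
import io
import zipfile


def list_parts(package):
    """Return the names of the parts inside an Office Open XML package.

    package may be a file path (such as a .docx file) or a file-like object.
    The archive is opened read-only, so nothing in it is changed.
    """
    with zipfile.ZipFile(package) as archive:
        return archive.namelist()


def build_demo_package():
    """Build a small stand-in archive in memory; NOT a valid Word document."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("_rels/.rels", "<Relationships/>")
        archive.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for part in list_parts(build_demo_package()):
        print(part)
```

To inspect a real document, pass its path to list_parts. Reading the file this way leaves its extension and folder structure alone.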
