How to Convert Word Documents to HTML

If you’ve had a website for a while, it’s likely you have several Microsoft Word documents or Portable Document Format (PDF) files converted from Microsoft Word documents on your site.

With more people accessing the web on mobile devices, opening a Word document or PDF can be a problem.

Many users don’t have Mobile Viewers for Microsoft Office installed on their mobile devices, or they don’t want to download and install another app just to open Microsoft Word documents.

Text in the PDF or Word document may not reflow to fit the screen, so users have to scroll left and right, up and down, and pinch and zoom in order to read the content.

Instead of reading your content, your readers leave your website, frustrated with the process. Who knows if they’ll return to your site?

Reasons to Convert Word Documents

When you convert Word documents to HTML, you make your content available to everyone in a format that any device can read, since HTML is the standard language used to create web pages.

Content can be formatted and structured so it can be easily read, without requiring users to install additional software or download an application. Search engines can index your content so users can easily find it.

Converting Word Documents to HTML

I’ve used many methods to convert Word documents to HTML, but my favorite tool is Word2CleanHTML.

Word2CleanHTML

A free online site, Word2CleanHTML allows you to paste the content of your Word document and output clean HTML you can use on your website.

It offers several conversion options, which is one of the reasons I like to use it. And because Word2CleanHTML is online, I don’t need to download and install another app.

In addition, it strips out invalid and proprietary tags and allows you to specify whether you want to:

  • Remove empty paragraphs
  • Convert <b> tags to <strong> tags and <i> tags to <em> tags
  • Replace non-ASCII characters with HTML entities
  • Replace smart quotes with ASCII equivalents
  • Indent with tabs, instead of spaces
  • Replace non-breaking spaces with ordinary spaces
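
To make these options more concrete, here is a minimal Python sketch of what cleanup passes like these might look like. It is not Word2CleanHTML's code: the function name clean_html, the SMART_QUOTES table, and the regular expressions are all assumptions made for illustration, and a regex-based approach like this will not handle every document.

```python
import re

# Hypothetical mapping of curly quotes to plain ASCII quotes.
SMART_QUOTES = {
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
}


def clean_html(markup: str) -> str:
    """Apply a few cleanup passes similar in spirit to the options above."""
    # Replace non-breaking spaces (entity or literal) with ordinary spaces.
    markup = markup.replace("&nbsp;", " ").replace("\u00a0", " ")
    # Replace smart quotes with ASCII equivalents.
    for smart, plain in SMART_QUOTES.items():
        markup = markup.replace(smart, plain)
    # Convert <b> to <strong> and <i> to <em>, keeping any attributes.
    # The lookahead keeps tags such as <br> and <img> from matching.
    markup = re.sub(r"<(/?)b(?=[\s>])", r"<\1strong", markup, flags=re.IGNORECASE)
    markup = re.sub(r"<(/?)i(?=[\s>])", r"<\1em", markup, flags=re.IGNORECASE)
    # Remove paragraphs that contain only whitespace.
    markup = re.sub(r"<p(\s[^>]*)?>\s*</p>", "", markup, flags=re.IGNORECASE)
    # Replace any remaining non-ASCII characters with numeric HTML entities.
    return markup.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

The order of the passes matters in this sketch: non-breaking spaces are converted first so that a paragraph holding only &nbsp; is treated as empty, and the entity replacement runs last so the smart quotes have already become plain ASCII.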

What You Need to Know about Word2CleanHTML

I’ve used Word2CleanHTML successfully to convert Word documents without images to HTML. If you have lists in your Word document, you may need to edit some of the list markup in an HTML editor.

If your Word document has images with alternative text and you want to ensure the alternative text is retained, check out Terrill Thompson’s Converting Word to PDF or HTML: Options for Accessibility post.

Thompson describes several tools you can use to retain the alternative text when you convert from Word to HTML. Be aware that the output these tools create will contain many inline CSS styles, which you’ll need to edit.

I’ve also used Word2CleanHTML on Google Docs, but the converted content was filled with dir markup as well as extra paragraph tags. Your mileage may vary, depending on the structure of your document.
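
If you end up cleaning this kind of leftover markup by hand, a small script can help. The following is a rough Python sketch, assuming the dir markup takes the form of dir attributes on tags (such as dir="ltr"). The function name strip_attributes and its names parameter are hypothetical, and the same approach could be pointed at the inline style attributes mentioned earlier.

```python
import re


def strip_attributes(markup: str, names=("dir",)) -> str:
    """Remove the named attributes from every tag in the markup."""
    # Match one attribute with a double-quoted, single-quoted,
    # or unquoted value, including the whitespace before it.
    attribute = re.compile(
        r"\s+(?:%s)\s*=\s*(?:\"[^\"]*\"|'[^']*'|[^\s>]+)"
        % "|".join(re.escape(name) for name in names),
        re.IGNORECASE,
    )
    # Only rewrite text inside tags so that body text is left alone.
    return re.sub(
        r"<[^<>]+>",
        lambda tag: attribute.sub("", tag.group(0)),
        markup,
    )
```

For example, strip_attributes(html, names=("dir", "style")) would drop both kinds of attribute in one pass. This sketch assumes attribute values never contain a > character, so check the result in an HTML editor before publishing.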

Is It Worth It?

By converting Word documents to HTML, you provide content that is accessible to everyone and lets readers get the information they want quickly.

Why force users to spend so much time trying to read your content when you have an option to provide a better user experience?

About the Author

Deborah Edwards-Oñoro enjoys birding, gardening, taking photos, reading, and watching tennis. She's retired from a 25+ year career in web design, usability, and accessibility.