Simple Comparison of MS Word and Hand Made Web pages

In real life you should use whatever tools get the job done, but while you are learning you should do the work by hand. Examples of web editors include Microsoft Word (using Save as HTML), Dreamweaver and others.

This is a somewhat exaggerated example, but I wanted you to see the difference between HTML files written by hand and those generated by Microsoft Word. In this example, I created a Word document that simply had the phrase "Hello, comp313", centered in large bold type. The original .doc file is about 19KB long. I saved the file as HTML and the resulting file is 1.86KB. This is a considerable savings over the .doc file, but it may not be typical of larger, more complicated documents.

The hand made HTML file is about 200 bytes. Here is the hand made file:

<html>
<head>
	<title>Hello, comp313</title>
</head>

<body>

<h1 align="center">
Hello, comp313
</h1>

</body>
</html>

And this is the Word document saved as HTML.

<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:w="urn:schemas-microsoft-com:office:word"
xmlns="http://www.w3.org/TR/REC-html40">

<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<meta name=ProgId content=Word.Document>
<meta name=Generator content="Microsoft Word 9">
<meta name=Originator content="Microsoft Word 9">
<link rel=File-List href="./Hello_files/filelist.xml">
<title>Hello, comp313</title>
<!--[if gte mso 9]><xml>
 <o:DocumentProperties>
  <o:Author> kent archie</o:Author>
  <o:LastAuthor> kent archie</o:LastAuthor>
  <o:Revision>2</o:Revision>
  <o:TotalTime>1</o:TotalTime>
  <o:Created>2004-03-19T20:15:00Z</o:Created>
  <o:LastSaved>2004-03-19T20:15:00Z</o:LastSaved>
  <o:Pages>1</o:Pages>
  <o:Company> </o:Company>
  <o:Lines>1</o:Lines>
  <o:Paragraphs>1</o:Paragraphs>
  <o:Version>9.2720</o:Version>
 </o:DocumentProperties>
</xml><![endif]--><!--[if gte mso 9]><xml>
 <w:WordDocument>
  <w:AttachedTemplate
   HRef="C:\Documents and Settings\Owner\Application Data\Microsoft\Templates\Normal.dot"></w:AttachedTemplate>
 </w:WordDocument>
</xml><![endif]-->
<style>
<!--
 /* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{mso-style-parent:"";
	margin:0in;
	margin-bottom:.0001pt;
	mso-pagination:widow-orphan;
	font-size:12.0pt;
	font-family:"Times New Roman";
	mso-fareast-font-family:"Times New Roman";}
@page Section1
	{size:8.5in 11.0in;
	margin:1.0in 1.25in 1.0in 1.25in;
	mso-header-margin:.5in;
	mso-footer-margin:.5in;
	mso-paper-source:0;}
div.Section1
	{page:Section1;}
-->
</style>
</head>

<body lang=EN-US style='tab-interval:.5in'>

<div class=Section1>

<p class=MsoNormal align=center style='text-align:center'><b><span
style='font-size:14.0pt;mso-bidi-font-size:12.0pt'>Hello, comp313<o:p></o:p></span></b></p>

</div>

</body>

</html>

It isn't that the MS version is wrong; it just has a different goal. It is trying to create an HTML page that looks as much as possible like the original Word document. The hand made HTML page is not trying to look like anything else. Much of the 'extra' stuff in the Word page, such as the style rules, is the kind of thing we will use later to give us more precise control over the appearance of our web pages.
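
As a rough sketch of that kind of control, here is one way the hand made page could borrow just the style rules it needs. This is an illustration, not output from any tool. The 14pt size and the centering are copied from the Word version above, and moving them into a style block in the head is my assumption about how you might choose to write it.

```html
<html>
<head>
	<title>Hello, comp313</title>
	<style>
	/* Assumed values: centering and 14pt are taken from the Word output above */
	h1 {
		text-align: center;
		font-size: 14pt;
	}
	</style>
</head>

<body>

<h1>
Hello, comp313
</h1>

</body>
</html>
```

The heading no longer needs the align attribute because the style rule does the centering. Browsers already show h1 text in bold, so no extra rule is needed for that. The page is still only a few hundred bytes.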

If you have existing Word documents and want to display them on the web without expecting the user to have Word, then using the 'save as html' method is a good idea and a real time saver. But if you are creating new pages, don't write them in Word and then save as HTML.