I have a client who needs each page of a Microsoft Word (docx) document converted into its own html page, and each conversion needs to be as close to an exact copy as possible.
In my experience, the easiest approach is to have Word save the document as html and then tweak the html until everything matches (tweaking with Dreamweaver seemed to work very well). Ideally all the styling should be handled by css, but I imagine that would require a complete rewrite of the html produced by Word and would therefore be too costly? The two most annoying bits are:
1. Handling diagrams: if you try a direct conversion, the arrows will often misalign. I suggest either first grouping the diagram elements and hoping Word understands that it needs to convert the group to a single jpg, or taking a screenshot and re-inserting the diagram as an image once the page has been converted into html.
2. Handling the examples: the columns are often aligned with spaces or tabs. I suggest recreating such examples as a table. Using a table will also sort out the issue of placing a border around these examples (Word sometimes seems to do something weird with the border; see my sample p858).
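To illustrate point 2, here is a minimal Python sketch of how a tab- or space-aligned example could be turned into an html table. It is only a starting point, not a required method. The function name `rows_to_html_table` and the class names `example` and `num` are hypothetical, and it assumes that columns are separated by a tab or by two or more spaces.

```python
import html
import re

# Assumption: columns are separated by a tab or by a run of 2+ spaces,
# so a single space inside a cell (e.g. "Net income") is preserved.
COLUMN_GAP = re.compile(r"\s*\t\s*| {2,}")
# Assumption: a cell counts as numeric if it looks like 300, 1,200, -12.5 or 40%.
NUMERIC = re.compile(r"^[-+]?[\d,]*\.?\d+%?$")


def _cell(value):
    """Render one <td>, tagging numeric cells so css can right-align them."""
    cls = ' class="num"' if NUMERIC.match(value) else ""
    return f"<td{cls}>{html.escape(value)}</td>"


def rows_to_html_table(text):
    """Convert whitespace-aligned lines into a single html table."""
    rows = [COLUMN_GAP.split(line.strip())
            for line in text.splitlines() if line.strip()]
    width = max((len(row) for row in rows), default=0)
    out = ['<table class="example">']
    for row in rows:
        # Pad short rows so every row has the same number of cells.
        padded = row + [""] * (width - len(row))
        out.append("  <tr>" + "".join(_cell(c) for c in padded) + "</tr>")
    out.append("</table>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Sales\t1,200\nNet income   300\n"
    print(rows_to_html_table(sample))
```

The border and the number alignment would then be handled in css rather than by Word, for example with a rule such as `table.example { border: 1px solid #000; border-collapse: collapse; }` and `td.num { text-align: right; }` (again, these class names are only placeholders).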
I have attached an example page in docx format, together with Word's default conversion of it into html, which highlights the border and number-alignment issues (see 2010 Chapter [login to view URL]).
To check that you can do the conversion appropriately, I have included a document (Chpt 30 [login to view URL]). Please convert page 772 and page 773 into two separate html pages and attach them to your bid. Feel free to highlight any issues that you find relevant.
In total there are 1009 pages to be converted, split across 35 chapters. I'm looking to hire a couple of people to speed up the conversion process. After selection, I will ask for a one-on-one quote for each chapter, as the number of pages and the difficulty of converting them will vary.
Thanks