Basic MSWord to XML converter

Language: English
Programming language: Python
Published by: slimy [unregistered]
Last modified: 15.05.2006
Views: 977


Description

A Python COM object that performs a basic export from MS Word to XML. Each paragraph becomes an element named after its heading style (H1 to H5) or P for anything else. The author plans to extend it to handle lists, embedded images, and possibly other content.

Code

import win32com.client


class Word2XMLConverter:

    # COM registration details read by win32com.server.register
    _reg_clsid_ = "{1AC8A630-AB2A-11D5-A846-00902728E0D0}"
    _reg_desc_ = "MSWord to XML converter"
    _reg_progid_ = "Test.Word2XMLConverter"

    # only convert() is exposed to COM clients
    _public_methods_ = ["convert"]
    _public_attrs_ = []
    _readonly_attrs_ = []

    def __init__(self):
        pass

    def translateStyle(self, style):
        # map Word heading styles to element names; anything else becomes "P"
        if style == "Heading 1": return "H1"
        if style == "Heading 2": return "H2"
        if style == "Heading 3": return "H3"
        if style == "Heading 4": return "H4"
        if style == "Heading 5": return "H5"
        return "P"

    def convert(self, filename):
        # prepare output xml document
        outdoc = win32com.client.Dispatch("MSXML.DOMDocument")
        outxml = outdoc.createElement("Word2XMLConverter")
        outxml.setAttribute("srcfile", filename)
        outdoc.appendChild(outxml)
        # convert from msword to xml
        app = win32com.client.Dispatch("Word.Application")
        doc = app.Documents.Open(filename)
        try:
            para = doc.Paragraphs.First
            # Next() returns None after the last paragraph, so test para itself;
            # testing para.Next() in the loop condition skips the final paragraph
            while para is not None:
                style = str(para.Style)
                content = str(para.Range.Text)
                xmlpara = outdoc.createElement(self.translateStyle(style))
                xmlpara.appendChild(outdoc.createTextNode(content))
                outxml.appendChild(xmlpara)
                para = para.Next()
        finally:
            # close the source document without saving changes
            doc.Close(False)
        return outdoc


if __name__ == '__main__':
    # running the script registers the COM server
    import win32com.server.register
    win32com.server.register.UseCommandLine(Word2XMLConverter)
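
The style mapping and the shape of the output can be tried without Word or COM. The following is a minimal sketch, not part of the original snippet: it assumes the paragraphs have already been extracted as (style, text) pairs and uses the standard library's xml.etree.ElementTree instead of MSXML. The names STYLE_TO_TAG and paragraphs_to_xml, and the sample data, are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET

# Same style-to-tag mapping as the snippet: "Heading 1".."Heading 5" -> "H1".."H5".
# STYLE_TO_TAG is a hypothetical name introduced for this sketch.
STYLE_TO_TAG = {f"Heading {n}": f"H{n}" for n in range(1, 6)}


def paragraphs_to_xml(paragraphs, srcfile):
    """Build a <Word2XMLConverter> tree from (style, text) pairs (hypothetical helper)."""
    root = ET.Element("Word2XMLConverter", {"srcfile": srcfile})
    for style, text in paragraphs:
        # Unknown styles fall back to a plain paragraph element.
        elem = ET.SubElement(root, STYLE_TO_TAG.get(style, "P"))
        # Word terminates each paragraph's Range.Text with "\r"; drop it here.
        elem.text = text.rstrip("\r")
    return root


# Made-up sample input standing in for a Word document's paragraphs.
sample = [("Heading 1", "Title\r"), ("Normal", "Body text.\r")]
xml_text = ET.tostring(paragraphs_to_xml(sample, "demo.doc"), encoding="unicode")
print(xml_text)
```

Unlike the COM version, this sketch strips the trailing paragraph mark, which keeps stray carriage returns out of the text nodes.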
