Text Lib
Recent News

Mar. 28, 2013
In response to numerous requests, an MS Outlook parser has been released.

Oct. 16, 2009
New parsers for OpenDocument family documents are now available.

Aug. 31, 2007
New parsers for MS Office 2007 documents have been added to the collection.

Nov. 21, 2005
Docs2text 2.0 component released. Supported document formats are MS Word, MS Excel, MS PowerPoint, RTF, and Adobe Acrobat PDF.

Our partner

Full Text Indexing and Retrieval library with Approximate Search.

See also pdf2text, odf2text, xls2text, ppt2text and pst2text.

doc2text / doc2txt

doc2text is a component/library designed to convert MS Word documents into readable, easily editable plain text, which also simplifies indexing and searching your documents.


Below is a short list of the most important features of doc2text:

  • doesn't require MS Word to process documents;
  • very fast processing - up to 200-300 times faster than MS Word automation;
  • precise output - in most cases better than the output of MS Word «Save As Text»;
  • full extraction of tables, numbered and bulleted lists, headers and footers;
  • document summary extraction - author, title, keywords, etc.

docx2text / docx2txt

docx2text is a completely new library that converts the newer MS Word 2007 documents (docx) into plain text.

It provides the same conversion features as doc2text. Please refer to the feature list above for details.
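
To give a rough idea of what converting a docx file to text involves, here is a minimal Python sketch. It is not the docx2text API or its implementation; the function name docx_to_text is hypothetical and only the Python standard library is used. It relies on the fact that a docx file is a ZIP package whose main body is stored in word/document.xml, and it simply collects the text runs of each paragraph.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx package.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    Hypothetical helper for illustration only. `source` is a file path
    or a binary file-like object.
    """
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        parts = []
        for node in para.iter():
            if node.tag == f"{{{W_NS}}}t":      # text run
                parts.append(node.text or "")
            elif node.tag == f"{{{W_NS}}}tab":  # tab character
                parts.append("\t")
            elif node.tag in (f"{{{W_NS}}}br", f"{{{W_NS}}}cr"):  # line break
                parts.append("\n")
        paragraphs.append("".join(parts))
    return "\n".join(paragraphs)
```

This sketch reads only the main document body. Headers, footers, list numbering, table layout and the document summary live in other parts of the package or need extra handling, and paragraphs nested inside text boxes may be reported twice.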

Still have questions? Use our feedback form and we'll be happy to assist you.

Proceed to the Download page to get the doc2text demo, which is part of the docs2text demo.