
eDoc

Read eDoc White Paper in HTML or PDF

eDoc is a toolkit for developers working in document imaging, OCR, and document management. It draws on the author's many years of research in document image processing, including work at Tsinghua University, China, and Michigan State University. The engine is built using COM technology, and a client is provided for demonstration and testing; the client runs in either GUI or command-line mode. The primary functions include:

Form Identification and Dropout

  • Form registration that does not rely on pre-designed anchor points and works for any kind of form

  • Drop out form frames without use of a blank form

  • Drop out form frames and static form text with use of a blank form

  • Form identification against a set of pre-trained form templates

  • Automatic form template training on blank forms

  • Reconstruct character strokes broken by the removal of form frames

  • Barcode and checkbox location

  • De-skew the filled-in text

  • Output regions of interest (ROIs) for OCR/ICR/OMR

  • High tolerance for differences between the form template and the filled form:

    • Horizontal and vertical scale: ±5%

    • Horizontal and vertical shift: ±1 inch

    • Skew: ±10°

    • Resolution: > 150 DPI
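The tolerance limits listed above can be read as a simple acceptance test. The following is a minimal sketch, not part of eDoc: the `Deviation` type, its field names, and `tolerance_violations` are hypothetical, and the skew limit is assumed to be in degrees. It reports which limits a measured deviation exceeds.

```python
from dataclasses import dataclass


@dataclass
class Deviation:
    """Measured difference between a filled form and its template (hypothetical type)."""
    scale_x: float     # filled width / template width; 1.0 means identical
    scale_y: float     # filled height / template height
    shift_x_in: float  # horizontal shift in inches
    shift_y_in: float  # vertical shift in inches
    skew_deg: float    # skew in degrees (assumed unit)
    dpi: int           # scan resolution


def tolerance_violations(d: Deviation) -> list:
    """Return the names of the limits that the deviation exceeds."""
    checks = {
        "scale_x": abs(d.scale_x - 1.0) <= 0.05,  # scale within ±5%
        "scale_y": abs(d.scale_y - 1.0) <= 0.05,
        "shift_x": abs(d.shift_x_in) <= 1.0,      # shift within ±1 inch
        "shift_y": abs(d.shift_y_in) <= 1.0,
        "skew": abs(d.skew_deg) <= 10.0,          # skew within ±10 degrees
        "dpi": d.dpi > 150,                       # resolution above 150 DPI
    }
    return [name for name, ok in checks.items() if not ok]


if __name__ == "__main__":
    scan = Deviation(1.03, 0.98, 0.5, -0.2, 4.0, 300)
    print(tolerance_violations(scan))
```

An empty list means the scan falls inside every stated limit; any returned name identifies the limit that would need attention before matching against the template.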

Skew Detection

  • Detect skew angle in an arbitrary range up to 180 degrees 

  • Handle a wide variety of document images at any resolution

  • Configurable accuracy, from 5 degrees down to 0.01 degrees

  • High throughput 
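To illustrate how a configurable accuracy from 5 down to 0.01 degrees might be achieved, here is a minimal coarse-to-fine sketch based on the classical projection-profile approach. It is an assumption for illustration only and is not the eDoc engine or the algorithm from the publications below; the function names and the synthetic test data are hypothetical. Each candidate angle is scored by how sharply the foreground pixels concentrate into rows once the rotation is undone, and the search step shrinks around the best candidate.

```python
import math
from collections import Counter


def profile_score(points, angle_deg):
    """Sum of squared row counts after rotating the points by -angle_deg."""
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    rows = Counter(round(-x * s + y * c) for x, y in points)
    return sum(n * n for n in rows.values())


def estimate_skew(points, coarse_step=5.0, finest_step=0.01):
    """Coarse-to-fine search for the angle that maximizes the profile score."""
    # Coarse pass over [-90, 90) degrees.
    count = int(round(180 / coarse_step))
    candidates = [-90 + i * coarse_step for i in range(count)]
    best = max(candidates, key=lambda t: profile_score(points, t))
    step = coarse_step
    # Refine around the current best until the finest step is reached.
    while step > finest_step:
        half_width = step
        step = max(step / 10, finest_step)
        k = int(round(half_width / step))
        candidates = [best + i * step for i in range(-k, k + 1)]
        best = max(candidates, key=lambda t: profile_score(points, t))
    return best


def make_skewed_lines(angle_deg, lines=10, width=200, spacing=20):
    """Synthetic 'text lines': horizontal rows of points rotated by angle_deg."""
    a = math.radians(angle_deg)
    s, c = math.sin(a), math.cos(a)
    return [(x * c - y * s, x * s + y * c)
            for y in range(0, lines * spacing, spacing)
            for x in range(width)]


if __name__ == "__main__":
    pts = make_skewed_lines(3.7)
    print(estimate_skew(pts))
```

In this sketch the achievable accuracy is bounded by the one-pixel row bins and the width of the text lines, so the estimate lands near the true angle rather than exactly on it; wider lines or finer bins tighten the result.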

Related Publications

  • Bin Yu and A. K. Jain, A generic system for form dropout, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, 1996, pp. 1127-1134.
  • Bin Yu and A. K. Jain, A robust and fast skew detection algorithm for generic documents, Pattern Recognition, Vol. 29, 1996, pp. 1599-1629.

  • A. K. Jain and Bin Yu, Automatic text location in images and video frames, Pattern Recognition, Vol. 31, 1998, pp. 2055-2076.

  • A. K. Jain and Bin Yu, Document representation and its application to page decomposition, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, 1998, pp. 294-308.
  • Bin Yu and B. Yuan, A global optimum clustering algorithm, Engineering Applications of Artificial Intelligence, Vol. 8, 1995, pp. 223-227.
  • Bin Yu and B. Yuan, A more efficient branch and bound algorithm for feature selection, Pattern Recognition, Vol. 26, 1993, pp. 883-889.
  • Bin Yu and B. Yuan, A dynamic selection algorithm for globally optimal subset, Engineering Applications of Artificial Intelligence, Vol. 5, 1992, pp. 457-4628.
  • Bin Yu, X. Lin, Y. Wu and B. Yuan, Isothetic polygon representation for contours, CVGIP: Image Understanding, Vol. 56, 1992, pp. 264-268.
  • Bin Yu, X. Lin and Y. Wu, The tree representation of the graph used in binary image processing, Information Processing Letters, Vol. 37, 1991, pp. 55-59.
  • A. K. Jain and Bin Yu, Model-based document representation: application to page segmentation, in Proceedings of the 4th International Conference on Document Analysis and Recognition, Ulm, August 1997, pp. 34-38.
  • Bin Yu, A. K. Jain and M. Mohiuddin, Address block location on complex mail pieces, in Proceedings of the 4th International Conference on Document Analysis and Recognition, Ulm, August 1997, pp. 897-901.
  • Bin Yu and A. K. Jain, Lane boundary detection using a multiresolution Hough transform, in Proceedings of the IEEE International Conference on Image Processing, Santa Barbara, October 1997, pp. II 748-751.
  • Bin Yu and A. K. Jain, A form dropout system, in Proceedings of the 13th International Conference on Pattern Recognition, Vol. 3, Vienna, August 1996, pp. 701-705.
  • A. K. Jain, Bin Yu, Y. Zhong, O. Trier and N. Ratha, Document processing research in Michigan State University, in Proceedings of the Symposium on Document Image Understanding Technology, Maryland, October 1995, pp. 126-140.
  • Bin Yu, Automatic understanding of symbol connected diagrams, in Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, August 1995, pp. 803-806.
  • Bin Yu and B. Yuan, A feature selection method for multi-class-set classification, in Proceedings of the IEEE International Joint Conference on Neural Networks, Vol. 3, Baltimore, June 1992, pp. 567-572.
  • Bin Yu and X. Lin, The extended binary tree representation of binary image and its application to engineering drawing entry, in Proceedings of the 10th IEEE International Conference on Pattern Recognition, Atlantic City, June 1990, pp. 109-114.
  • Bin Yu, X. Lin and Y. Wu, An economical contour extraction algorithm for understanding large-size engineering drawings, in Proceedings of the 1st IEEE International Conference on Systems Integration, Morristown, April 1990, pp. 302-309.
  • Bin Yu, X. Lin and Y. Wu, A BAG-based vectorizer for automatic diagram reader, in Proceedings of the International Conference on CAD & CG, Beijing, August 1989, pp. 498-502.

©2005 Divinev. All rights reserved.