Migraph OCR/DTP/Commercial

From: JJL101@PSUVM.BITNET
Date: 07/18/92-02:19:14 PM Z


Taken from: Atari Explorer Online (#9202)
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-= 


 | | |  MIGRAPH OCR
 | | |  By John L. McLaughlin
 | | |  ---------------------------------------------------------------


 Requirements: Any ST/STe/TT computer with 2 MB or more RAM and hard
               disk.  Hand- or full-page scanner optional.

 Summary: Sophisticated, trainable optical character-recognition (OCR)
          package, capable of making short work of data-input.

 Manufacturer:  MiGraph, Inc., 32799 Pacific Highway S., Federal Way, WA
                98003 (206) 838-4677

 Price: $299.00


 Though paper provides a convenient and tangible medium for human
 communication, it's not great for talking to machines.  Scanning has
 solved the problem of how to get images from paper into computer
 memory.  But because computers store images and text in completely
 different ways, images of text, such as a scan of this magazine page,
 require further processing before the information they contain can be
 used by word processors, spreadsheets, and other "text-handling"
 applications.

 MiGraph OCR (short for "Optical Character Recognition") provides the
 missing link -- converting scanned text to ASCII files that can be used
 directly by a wide variety of applications.  The program can accept
 previously-scanned monochrome .IMG or TIFF files; or process input
 directly from a MiGraph or compatible hand-scanner.

 The OCR Process

 MiGraph OCR begins its job by methodically chopping up a scanned image:
 first into discrete lines of text, then into masses identified as words
 and subdivided into characters.  This, alone, is a fairly complicated
 process, involving raster image-processing (to remove spurious
 background shading and stray pixels, improve contrast and separate
 characters, etc.) and geometric analysis (to correct for text
 misalignment).  Next, using a font-recognition engine licensed from
 Omnifont (world leaders in OCR software design), MiGraph OCR turns the
 bitmapped image of each character into a vector expression describing
 its shape in terms unrelated to size or resolution.
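
 The segmentation step can be illustrated with a minimal sketch.  This is
 not MiGraph's actual algorithm, which the article does not detail; it
 assumes a clean, already-straightened monochrome image stored as rows of
 0/1 pixels, and splits it at blank rows (lines) and blank columns
 (characters).  The names runs and segment are hypothetical.

```python
def runs(flags):
    """Return (start, end) index pairs for each run of consecutive True values."""
    spans, start = [], None
    for i, on in enumerate(flags):
        if on and start is None:
            start = i
        elif not on and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(flags)))
    return spans


def segment(image):
    """Split a binary image (list of rows of 0/1) into lines, then characters.

    Returns one list per text line; each holds (top, bottom, left, right)
    boxes, with bottom and right exclusive.
    """
    # Any row containing ink belongs to a text line.
    row_has_ink = [any(row) for row in image]
    result = []
    for top, bottom in runs(row_has_ink):
        band = image[top:bottom]
        # Within a line, any column containing ink belongs to a character.
        col_has_ink = [any(r[x] for r in band) for x in range(len(band[0]))]
        result.append([(top, bottom, l, r) for l, r in runs(col_has_ink)])
    return result


page = [
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 1],
]
boxes = segment(page)  # two lines, two character boxes each
```

 A real scan would also need the noise removal and skew correction
 described above before a blank-row test like this could work reliably.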

 Characters are recognized by comparing their vector descriptions
 against a dictionary of character forms in different fonts and point
 sizes -- a process that yields a far higher percentage of "hits" than
 prior OCR techniques involving bitmap comparisons.  Additional
 refinement is obtained by referencing against a user dictionary,
 created by "training" the device on text with particular
 characteristics.
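
 As a rough illustration of size-independent matching, the sketch below
 swaps in a much simpler technique than the vector descriptions the
 program uses: each glyph is reduced to ink densities over a fixed grid
 of zones, and the closest dictionary entry wins.  The function names,
 the dictionary layout (character mapped to feature list), and the sample
 glyphs are all assumptions made for the example.

```python
def zone_features(glyph, grid=2):
    """Describe a glyph (list of rows of 0/1) by the ink density in each cell
    of a grid x grid layout, so the result does not depend on pixel size."""
    h, w = len(glyph), len(glyph[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            cells = [glyph[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            feats.append(sum(cells) / len(cells) if cells else 0.0)
    return feats


def distance(a, b):
    """Squared Euclidean distance between two feature lists."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def recognize(glyph, user_dict, base_dict):
    """Return the character whose stored features lie closest to the glyph,
    searching user-trained forms alongside the built-in ones."""
    feats = zone_features(glyph)
    candidates = list(user_dict.items()) + list(base_dict.items())
    return min(candidates, key=lambda kv: distance(feats, kv[1]))[0]


small_l = [[1, 0],
           [1, 1]]
big_l = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]
base = {"L": zone_features(small_l), "I": [1.0, 0.0, 1.0, 0.0]}
guess = recognize(big_l, {}, base)
```

 Here the small and large "L" shapes reduce to the same feature list, so
 the larger one is matched against an entry built from the smaller one.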

 As a last step, MiGraph OCR performs a complex lexical and syntactic
 analysis, using one of four supplemental dictionaries based on the
 Proximity/Merriam-Webster Linguibase.  This further assists the program
 in making intelligent "guesses" about characters whose forms remain
 ambiguous.
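
 The idea of letting a word list settle ambiguous characters can be
 sketched as follows.  This is only a toy stand-in for the lexical and
 syntactic analysis described above: it assumes the recognizer reports a
 set of candidate characters for each position, and the tiny lexicon and
 the resolve function are hypothetical.

```python
from itertools import product


def resolve(candidates, lexicon):
    """candidates: one string of possible characters per position.

    Returns the single lexicon word that some combination spells,
    or None if no word, or more than one word, fits.
    """
    spellings = {"".join(combo) for combo in product(*candidates)}
    matches = spellings & lexicon
    return matches.pop() if len(matches) == 1 else None


lexicon = {"clear", "dear", "scan"}
# The second character was read as either a lowercase L or the digit 1.
word = resolve(["c", "l1", "e", "a", "r"], lexicon)
```

 When no word, or more than one word, fits, the sketch returns None,
 leaving the character unresolved.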

 Using OCR

 Installing MiGraph OCR is simple.  An INSTALL program is included on
 the main disk that lets you specify the folder into which you want
 program files stored.  The utility also lets you identify which of the
 four supplemental dictionaries you wish installed: versions for
 English, German, French, and Dutch are included on two support disks.
 A minimum of 2 MB free space must exist on the target partition, prior
 to installation.

 OCR's main control screen is simple and well-designed, and a little
 random button-clicking quickly reveals how most of the program works.
 Nevertheless, to help get you started, the manual includes several
 step-by-step, hands-on tutorials.  The general control panel, accessed
 by clicking on the "hammer" icon, lets you specify input source
 (scanner or file), output format, and set refining parameters for the
 OCR process.  Selecting "scanner" as the input device brings up a
 secondary scanner-configuration dialog that lets you define the
 resolution, area, and direction of input scans.

 Select "Get Image," and you're flying.  If you've elected to scan, the
 hand scanner is activated and managed automatically -- all you have to
 do is move it down (or across) the page.  OCR performs best when
 presented with a straight scan, so a scanning tray is recommended.  The
 only glitch I noticed was caused, as it turned out, by the fact that I
 was running MiGraph OCR on a Mega STe at 16 MHz, with blitter and
 caches enabled.  Apparently, some combination of these features throws
 off the sample timing, so that illegible scans are produced.  The fix,
 at least until MiGraph issues an upgrade, is to use the Control Panel
 to turn off all enhancements while scanning is in progress.  They can
 (and should) be turned on again, afterwards, since OCR processing
 benefits from the increased system throughput.

 Once scanning is complete, the scanned image appears in OCR's work
 window.  Your first job is to assess the quality of the scan, to
 determine if it is appropriate for OCR processing.  Because low-quality
 scans take unnecessarily long to process, and produce a large number of
 errors, it's best to repeat doubtful scans at this point.

 The next step is to select regions of the scanned image for input to
 OCR.  This is done in very straightforward fashion, by dragging
 rectangles or drawing polyline boxes around desired portions of the
 image.  Multiple regions can be sorted so that they are processed in
 any desired order.  An added plus: to avoid having to make duplicate
 scans of the same material, MiGraph OCR also lets you define the
 graphic regions of any scan, saving them as .IMG or TIFF files.

 When OCR is initiated, the program performs several unattended passes:
 rectifying the image, segmenting it, and generating a first
 interpretation of its content.  Because the process can take a while,
 you are kept apprised of progress by a succession of dialog boxes.  If
 automatic processing has been selected, output text is then saved
 transparently to the designated file.  Otherwise, the interactive
 learning phase begins.

 During interactive learning, the system presents you with problem areas
 of your scan, in greatly enlarged form, and asks you to correct or
 approve of its interpretations.  The process is easily managed, though
 it can be time-consuming if many problems exist (the process can be
 aborted at any point, however, and the resulting text file saved to
 disk with markers inserted to indicate ambiguous characters).  When
 correcting a problem, it's important to determine whether it results
 from poor scan quality or from an unfamiliar font or point size.  When
 scan-quality is at fault, you should correct the problem in text,
 without updating the current user dictionary.

 Entering a correction is usually a matter of typing a single letter,
 though occasionally, the program will present you with groups of
 several adjacent letters for identification.  Very rarely, the program
 will assume that two adjacent characters are one, and will not accept
 multiple characters for insertion.

 Alternatively, when you've identified a legitimate "training" situation
 (i.e., the program has failed to recognize text because it contains an
 unfamiliar regular feature, such as a font, point size, or special
 letterform), you can "train" OCR to recognize the character in the
 future.  A vectorized image of the new letterform is added to the
 current user dictionary, which can be saved back to disk at the end of
 the session.  Over time, dictionaries can be developed and refined for
 each type of text you regularly use as input, and these can add
 remarkably to the accuracy of OCR's interpretation.

 When you tell OCR to "learn" a new character, you must take care to
 input the correction properly.  OCR immediately applies any corrected
 interpretation to similar ambiguities throughout the text -- a process
 designed to prevent your having to correct the same mistake more than
 once.  However, this also means that an erroneous correction can
 easily be propagated through your output, and -- if unrecognized at the
 end of the session -- perhaps even entered accidentally in the current
 dictionary when it is saved back to disk.  Unfortunately, there's no
 way to "edit" the updated dictionary after a training pass, nor to
 return to a problem area during the pass to re-enter a correction.  So
 a fair amount of dictionary-refinement can be lost, if you're not
 careful.
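
 To see why one wrong keystroke spreads, consider this small sketch.  It
 assumes, purely for illustration, that the program tags each ambiguous
 character with an identifier for its shape and fills in every unresolved
 character sharing that identifier; the data layout and function name
 are hypothetical.

```python
def apply_correction(chars, shape_id, corrected):
    """chars: list of (shape_id, text) pairs; text is None while unresolved.

    Fills in every unresolved character that shares the given shape.
    """
    return [
        (s, corrected if s == shape_id and t is None else t)
        for s, t in chars
    ]


scan = [(7, None), (3, "a"), (7, None)]
fixed = apply_correction(scan, 7, "g")
```

 Both characters with shape 7 receive the typed letter at once, which
 saves effort when the letter is right and multiplies the error when it
 is wrong.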

 While I've described using OCR to process only a single scanned unit of
 text, it's also very easy to append the results of several OCR sessions
 to the same output file, creating a single result document that can be
 imported to a word processor.  Alternatively, I've had good luck
 employing utilities such as WizWorks!' Scan-Lite to conjoin several
 scans into one uniform image before importing into OCR.
 Unfortunately, I have no means of testing how well MiGraph OCR would
 perform on input from a full-page flatbed scanner; but I suspect that
 for serious applications, this option should be thoroughly explored.

 Performance

 Once a sufficiently-refined user dictionary has been created for text
 from a particular source, MiGraph OCR is very accurate.  It's also
 fairly quick, at least when processing in automatic mode: a page of
 Courier 10-pitch type, scanned at 300 dpi, can be output as ASCII in
 something like three minutes, which is marginally faster than an
 average-to-good touch typist could enter the same material.  Naturally,
 text output by OCR must be further processed before it can be
 considered correct.  At least part of this process (i.e., spell-
 checking) can be automated, however.

 Because performance accuracy is so dependent on user dictionaries,
 MiGraph OCR is most useful when input is derived from only a limited
 range of text-types.  Even with this constraint, however, it's easy to
 imagine a broad range of applications.  Particularly intriguing is the
 idea of using MiGraph OCR to convert faxes, received via faxmodem, to
 ASCII files -- providing a wholly "paperless" solution to fax
 correspondence in the computer context.

 Only one significant feature is lacking: the ability to queue multiple
 files for input and unattended processing.  Hopefully, this feature
 will be added in a future upgrade, since it would make the program
 highly competitive with Kurzweil and other dedicated OCR systems,
 particularly in the small office environment.

