Lesson 8. Automatic vectorization, text recognition

Brief description: During this lesson you will learn how to tune vectorization and text recognition parameters, vectorize raster images according to these parameters, and correct the vectorization results.

For detailed information on these subjects, see the Tuning Vectorization, How to Vectorize, and Text Recognition topics in the Automatic Vectorization section of the Quick Start guide.

Tuning Vectorization

Create a new AutoCAD drawing.

Using the Open command from the rFile menu, open the Mech.tif file from the Lesson_8 folder.

You can use one of the pre-defined templates or tune the parameters yourself. During this lesson you'll tune the parameters yourself.

Automatic vectorization is tuned in the R2V Conversion Options dialog. To open this dialog, choose Conversion Options from the rConvert menu.

When tuning vectorization you need to specify the following parameters:

  1. The types of raster entities to recognize.
  2. The image geometry.
  3. Whether you want to separate vectors onto different layers.
  4. Text recognition parameters (if you use the OCR module, set word patterns for text recognition).

How to specify the types of entities to recognize       

  • Open the Recognition tab of the R2V Conversion Options dialog.
  • Select the checkboxes for the entities that you want to obtain after vectorizing the loaded image.
    Additional parameters for recognized objects, such as line type, arrows, and hatch angle, are located on the second level. Click "+" to access these parameters.
    Specify the following types of entities to recognize on the image - Sample1.tif:
    • Lines - Line types & Arrows (as there are different line types and dimension lines with arrows in this image). 
    • Arcs & Circles - Arrows (as there are dimension arcs with arrows in this image). 
    • Text Areas - OCR (as we intend to recognize texts using the OCR module).
    • Hatches - 45° (as there are hatches with an angle of 45° in this image)

Setting the geometry of the image

  • Open the Options tab of the R2V Conversion Options dialog.
  • Using the appropriate Measure buttons, specify:
    • Min Length - the minimum length of a raster object to be recognized.
    • Max Width - the maximum width of raster lines. Set this value slightly greater than the measured line width on your drawing.
    • Max Break - the maximum length of a break in a raster line to be ignored. Set this value slightly greater than the distance between dashes in dashed lines.
    • Text Height - the maximum height of upper case raster text symbols.
    • Arrow Size - the size of dimension arrows in your image. Outline an arrow of average size with a rectangle, as shown in the figure.
  • Move the Accuracy slider to the Low position. This makes the vectorization procedure less sensitive to drawing errors.
  • To make the lines orthogonal, select the Orthogonalization checkbox and set the value of Base Angle to 0°. 
The appearance of the Options tab after tuning is complete.
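The effect of Min Length, Max Break, and Orthogonalization can be pictured with a small Python sketch. This is only an illustration of what the parameters mean, not the algorithm WiseImage uses: the function names, the one-dimensional segment model, and the tolerance parameter are all assumptions made for the example.

```python
def merge_dashes(segments, min_length, max_break):
    """Join (start, end) pieces lying along one line.

    Gaps no longer than max_break are bridged (dashes become one line);
    results shorter than min_length are discarded as noise.
    Hypothetical helper, not part of WiseImage.
    """
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_break:
            # The gap is small enough to ignore: extend the previous piece.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged if e - s >= min_length]


def snap_angle(angle_deg, base_angle=0.0, tolerance=5.0):
    """Snap an angle to the nearest base_angle + k * 90 degrees.

    The angle is changed only if it lies within `tolerance` degrees of
    an orthogonal direction. `tolerance` is an assumed parameter.
    """
    offset = (angle_deg - base_angle) % 90.0
    if offset <= tolerance:
        return (angle_deg - offset) % 360.0
    if 90.0 - offset <= tolerance:
        return (angle_deg + (90.0 - offset)) % 360.0
    return angle_deg


# Three dashes separated by 1-unit gaps, plus a short speck of noise.
dashes = [(0, 3), (4, 7), (8, 11), (20, 20.5)]
print(merge_dashes(dashes, min_length=2, max_break=1.5))
print(snap_angle(88.5), snap_angle(45.0))
```

With Max Break set slightly above the dash spacing, the three dashes collapse into a single line, which is why the lesson recommends measuring the gap and adding a small margin.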

Separating vector objects by width to different layers and colors

The criterion for separating vector objects onto different layers and/or colors is the width of the original raster lines. For each range of raster line widths, you can define the width of the resulting vector objects and assign them to a layer and/or color.

In this example, we separate the resulting vector objects by width into different colors:

  • Raster lines narrower than 0.8mm produce vector objects with a width of 0.5mm and the color red.
  • Raster lines wider than 0.8mm produce vector objects with a width of 1mm and the color blue.

To set this up:

  • Open the Separate tab of the R2V Conversion Options dialog.
  • Select the Use Table checkbox.
  • Add a new separation interval by entering 0.8 in the New Interval field and pressing the button.       
  • Set the parameters for each interval:
    • In the Width field, enter 0.5 for thin lines and 1 for thick lines.
    • In the Color field, select red for thin lines and blue for thick lines.
  • Select the checkboxes for each interval.
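The separation table set up above amounts to an interval lookup on the measured raster line width. The following Python sketch shows the idea under stated assumptions: the names are hypothetical, and because the lesson does not say which interval a line of exactly 0.8mm belongs to, the sketch arbitrarily places it in the thick interval.

```python
from bisect import bisect_right

# Interval boundary entered in the New Interval field (mm).
BOUNDARIES = [0.8]

# Output settings per interval, in the same order as the table rows.
INTERVALS = [
    {"width": 0.5, "color": "red"},   # raster lines thinner than 0.8mm
    {"width": 1.0, "color": "blue"},  # raster lines of 0.8mm and thicker (assumed)
]


def classify(raster_width_mm):
    """Return the vector width and color for a measured raster line width."""
    return INTERVALS[bisect_right(BOUNDARIES, raster_width_mm)]


for measured in (0.3, 0.75, 1.2):
    print(measured, classify(measured))
```

Adding another value to BOUNDARIES, together with a matching entry in INTERVALS, corresponds to adding a further separation interval in the dialog.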

Tuning text recognition

WiseImage features various ways of handling raster texts - you can use either the built-in or external OCR modules, recognize raster text areas, or replace raster text with vector polylines and outlines.

In this example you will learn how to use the built-in OCR for recognizing text and creating the corresponding text objects.

  • In the Recognition tab, we have already selected the Text Areas checkbox and the OCR recognition method, and in the Options tab, we have specified Text Height.
  • Open the Texts tab of the R2V Conversion Options dialog.
  • In the Orientation field choose Horizontal and Vertical from the list.
  • Select the Standalone Letters checkbox, as standalone letters are present in this example.
  • Set patterns for recognizing the text inscriptions that are present in the drawing:
    • %D
    • %1E
    • %D%1S
    • %1S%D
    • 2x%2D%1S
    • M%2D
    • 2x%D
    • %E
   
  • Enter the patterns one by one in the Patterns field, and press the Add Pattern button after each one.

To help you, standard patterns can be selected from the right-button menu.
If you enter a pattern incorrectly, choose it from the list, and then press the Delete Pattern button.       

  • Select the Patterns checkbox to use the specified patterns when recognizing text.
  • If you want to define the height for recognized texts (e.g., 6mm), enter this value in the Height Table field and select the corresponding checkbox.
  • Choose default.ocr from the Template File list.
  • Select a special layer for recognized texts (e.g., Texts) in the Place to Layer field.
  • Press OK to save the vectorization settings.
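To get a feel for how patterns such as M%2D restrict what the OCR accepts, here is a Python sketch that converts a pattern into a regular expression. This lesson does not define the pattern syntax, so the interpretation below is an assumption made purely for illustration: %D is taken to mean one or more digits, %E one or more Latin letters, %S one or more symbols, and a number after the percent sign (as in %2D) an exact count. Check the Text Recognition section of the Quick Start guide for the actual meaning.

```python
import re

# Assumed character classes for the pattern codes (not confirmed by the lesson).
CLASSES = {"D": r"\d", "E": r"[A-Za-z]", "S": r"[^\w\s]"}
TOKEN = re.compile(r"%(\d*)([DES])")


def pattern_to_regex(pattern):
    """Translate a pattern such as '2x%2D%1S' into a compiled regex."""
    parts = []
    pos = 0
    for m in TOKEN.finditer(pattern):
        # Text between codes is matched literally.
        parts.append(re.escape(pattern[pos:m.start()]))
        count, kind = m.groups()
        quantifier = "{%s}" % count if count else "+"
        parts.append(CLASSES[kind] + quantifier)
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


def matches(pattern, text):
    """True if the whole text fits the pattern."""
    return pattern_to_regex(pattern).fullmatch(text) is not None


print(matches("M%2D", "M12"))             # two digits after a literal M
print(matches("2x%2D%1S", "2x45\u00b0"))  # e.g. a chamfer note with a degree sign
print(matches("%D", "12a"))               # a letter does not fit a digits-only pattern
```

Under this reading, a pattern acts as a filter: a recognized string that does not fit any listed pattern is a candidate for review in the OCR Text Corrector.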

Saving vectorization settings for future use

If you want to save the settings for future use:

  • Press the Template button and choose Save.
  • Specify a name for a template file in the File Name field of the Save Template File dialog.
  • To load a previously saved template file, press the Template button, choose Load and specify a file to open.

Running Vectorization

To run vectorization, choose Raster To Vector from the rConvert menu.

Original raster drawing
Vectorization result

Correcting recognized texts

To correct recognized text:

  • Choose OCR Text Corrector from the rConvert menu.
  • The first recognized text is displayed on screen. Check it and correct it in the OCR Text Corrector dialog, if necessary.       
  • Use the four buttons located on the left of the dialog to move between the recognized texts.
    • To change the text height and to correct its position, use the Height and Move buttons.
    • To accept the corrected text and move to the next one, press the Accept button.
    • To delete the recognized text, press the Delete button.
    • When you have finished checking all texts, the following message will be displayed in the AutoCAD command line:
      'No more objects to correct. Command completed'.
