OCR Invoices Pre-NER BIO Format
OCR text extraction and tokenization in BIO format for invoice documents. All tokens are initially tagged ‘O’ (Outside) so that entity tags can be assigned in a subsequent NER tagging step.
Labeling Configuration
<View>
  <!-- The image to annotate -->
  <Image name="image" value="$image" zoomControl="true"/>
  <!-- Bounding-box control that will receive the "rectanglelabels" results
       coming from your OCR model (from_name = "label") -->
  <RectangleLabels name="label" toName="image" choice="single">
    <!-- You only emit the generic "O" class, but feel free to add more labels -->
    <Label value="O" background="#FFA500"/>
  </RectangleLabels>
  <!-- Per-region transcription box (from_name = "transcription").
       Because perRegion="true", one TextArea is linked to each rectangle. -->
  <TextArea name="transcription"
            toName="image"
            perRegion="true"
            editable="true"
            rows="1"
            required="true"
            placeholder="Type or correct OCR text…"/>
</View>
About the labeling configuration
All labeling configurations must be wrapped in View tags.
This configuration uses the following tags: View, Image, RectangleLabels, Label, and TextArea.
Usage Instructions
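This configuration provides a streamlined interface for OCR text verification and correction. Each bounding box carries the ‘O’ label and is linked to its own one-line text area, where the recognized text can be typed or corrected.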
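The comments in the configuration refer to results arriving from an OCR model under from_name "label" and "transcription". As a rough sketch only, the Python below shows one way such output might be converted into a Label Studio pre-annotation task for this configuration. The function name ocr_to_prediction and the shape of the input tokens (text plus a pixel bounding box with keys x, y, w, h) are assumptions made for illustration; they are not part of the template. The sketch relies on two details of the pre-annotation format: box coordinates are expressed as percentages of the image size, and the rectangle and its transcription share the same region id so that they are shown as one region.

```python
import json


def ocr_to_prediction(image_url, image_width, image_height, tokens):
    """Build a Label Studio task with pre-annotations from OCR tokens.

    `tokens` is a hypothetical OCR output: a list of dicts with the
    recognized 'text' and a pixel bounding box ('x', 'y', 'w', 'h').
    """
    results = []
    for i, tok in enumerate(tokens):
        # Label Studio expects box coordinates as percentages (0-100).
        box = {
            "x": 100.0 * tok["x"] / image_width,
            "y": 100.0 * tok["y"] / image_height,
            "width": 100.0 * tok["w"] / image_width,
            "height": 100.0 * tok["h"] / image_height,
            "rotation": 0,
        }
        # Fields shared by both results of the same region.
        common = {
            "id": f"tok_{i}",  # same id links the box to its transcription
            "to_name": "image",
            "original_width": image_width,
            "original_height": image_height,
            "image_rotation": 0,
        }
        # Every token starts with the generic "O" (Outside) label.
        results.append({
            **common,
            "from_name": "label",
            "type": "rectanglelabels",
            "value": {**box, "rectanglelabels": ["O"]},
        })
        # Per-region transcription holding the OCR text.
        results.append({
            **common,
            "from_name": "transcription",
            "type": "textarea",
            "value": {**box, "text": [tok["text"]]},
        })
    return {"data": {"image": image_url}, "predictions": [{"result": results}]}


if __name__ == "__main__":
    example_tokens = [{"text": "Invoice", "x": 100, "y": 200, "w": 300, "h": 50}]
    task = ocr_to_prediction(
        "https://example.com/invoice.png", 1000, 2000, example_tokens
    )
    print(json.dumps(task, indent=2))
```

The resulting dictionary can be saved as JSON and imported as a task. The from_name and to_name values must match the name attributes in the configuration ("label", "transcription", and "image").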