Using the GOCR Command for Optical Character Recognition (with examples)

GOCR is an Optical Character Recognition (OCR) tool that converts images of text into machine-encoded text. It recognizes characters with its built-in OCR engine and can prompt the user to store unknown patterns in a database for future reference. This makes it useful for digitizing printed documents, extracting data from images, and gradually improving recognition as the pattern database grows.

Use case 1: Recognize Characters and Output to a File

Code:

gocr -m 130 -p path/to/db_directory -i path/to/input_image.png -o path/to/output_file.txt

Motivation:

This use case applies when you have an image containing text that you need to convert into a digital format. The command stores the recognized text in a file for easy access, editing, or sharing. Because GOCR can learn new characters and add them to its pattern database, recognition of similar images can improve over time.

Explanation:

  • gocr: This is the command to invoke the GOCR tool.
  • -m 130: Sets the operation mode. 130 combines mode 2 (use the pattern database) with mode 128 (extend the database, prompting you for characters GOCR cannot identify), so a database can be built up and reused across runs.
  • -p path/to/db_directory: Specifies the directory where the pattern database is stored or will be created. Make sure this directory exists; otherwise GOCR silently skips the database.
  • -i path/to/input_image.png: The path to the input image file containing the text that needs to be recognized.
  • -o path/to/output_file.txt: The path and name of the file where the recognized text will be saved.
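
Because a missing database directory is skipped silently, it can help to create the directory before invoking GOCR. The following is a minimal shell sketch, not part of GOCR itself: the helper name prepare_db_dir, the positional arguments, and the default paths are hypothetical. The trailing slash is added on the assumption, taken from the GOCR man page, that the database path should end with one. The sketch also spells out that mode 130 is the sum of the flags 2 and 128.

```shell
#!/bin/sh
# Hypothetical helper: create the pattern database directory if needed
# and print its path with exactly one trailing slash.
prepare_db_dir() {
    dir=${1%/}                      # drop one trailing slash, if present
    mkdir -p "$dir" || return 1     # create the directory (and parents)
    printf '%s/\n' "$dir"
}

# Mode 130 combines two mode flags: 2 (use database) + 128 (extend database).
mode=$((2 + 128))

# Run GOCR only when it is installed and an input image was given.
# Arguments (all hypothetical): $1 = input image, $2 = db directory, $3 = output file.
if command -v gocr >/dev/null 2>&1 && [ -n "${1:-}" ]; then
    db=$(prepare_db_dir "${2:-./db}") || exit 1
    gocr -m "$mode" -p "$db" -i "$1" -o "${3:-output.txt}"
fi
```

Saved as a script, this could be run with the image path as the first argument; without an argument it only defines the helper and does nothing else.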

Example Output:

Assuming the input image contains the text “Hello World”, the output file output_file.txt will have:

Hello World

Use case 2: Recognize Only Numerical Characters

Code:

gocr -m 130 -p path/to/db_directory -i path/to/input_image.png -o path/to/output_file.txt -C "0-9"

Motivation:

This use case is useful when the data of interest is numerical, as in receipts, invoices, or other documents where numbers predominate and need to be extracted. By restricting recognition to digits, the command reduces errors caused by misreading a digit as a similar-looking letter.

Explanation:

  • gocr: Initiates the GOCR tool.
  • -m 130: Allows the use, extension, and creation of the database.
  • -p path/to/db_directory: Specifies the database directory.
  • -i path/to/input_image.png: Input image file path.
  • -o path/to/output_file.txt: Output file path.
  • -C "0-9": Restricts recognition to the listed characters, here the digits 0 to 9. According to the GOCR man page, ranges are written with a hyphen, as in 0-9 or a-z.

Example Output:

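If the image contains the text “09/30/2022 Invoice Total: 1830”, only the digits are recognized. Characters outside the filter are treated as unrecognized, so they typically appear as the unknown-character marker (an underscore by default) rather than disappearing. With those markers stripped, the output reads:

09302022 1830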
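
When only the digits matter, the recognized text can be post-processed with standard tools. The sketch below is one possible cleanup step, not a GOCR feature: it assumes unrecognized characters were written as underscores (GOCR's default marker), and the function name keep_digits and the sample line are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: keep only digits, spaces, and newlines from stdin,
# then squeeze runs of spaces down to a single space.
keep_digits() {
    tr -cd '0-9 \n' | tr -s ' '
}

# Sample line standing in for raw GOCR output with unknown-character markers.
printf '%s\n' '09_30_2022 _______ ______ 1830' | keep_digits
# prints: 09302022 1830
```

On a real result file, the same helper could be applied as keep_digits < path/to/output_file.txt.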

Use case 3: Recognize Characters with 100% Certainty

Code:

gocr -m 130 -p path/to/db_directory -i path/to/input_image.png -o path/to/output_file.txt -a 100

Motivation:

This command suits cases where accuracy is paramount and you are either dealing with highly legible text or willing to manually confirm and add unknown characters to your database. The output then contains only characters GOCR is sure of, although the high certainty threshold may increase how often the tool asks for confirmation.

Explanation:

  • gocr: Starts the GOCR OCR tool.
  • -m 130: Directs the command to create, use, and extend the database.
  • -p path/to/db_directory: Indicates the directory containing the pattern database.
  • -i path/to/input_image.png: Specifies the input image location.
  • -o path/to/output_file.txt: Determines the output text file path.
  • -a 100: Sets the certainty threshold to 100 on a scale of 0 to 100 (the default is 95). Only characters recognized with that level of certainty are accepted; all others are treated as unknown.

Example Output:

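If the input image shows the text “Data Analysis” but one letter, such as the lowercase ‘a’ in “Analysis”, is imperfectly scanned, the output file may contain something like the following, with the default unknown-character marker (an underscore) in place of that letter:

Data An_lysis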

This indicates potential spots needing manual verification and input.
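
To find results that need manual verification, one option is to count the unknown-character markers in each output file. This is a minimal sketch under stated assumptions: it assumes the default marker, an underscore, is in use, and count_unknown is a hypothetical helper name. Genuine underscores in the source text would be counted as well.

```shell
#!/bin/sh
# Hypothetical helper: count underscore markers in a recognized-text file.
count_unknown() {
    # Delete everything except underscores, then count the remaining bytes.
    # The final tr removes padding that some wc implementations add.
    tr -cd '_' < "$1" | wc -c | tr -d ' '
}

# Example with a temporary sample file standing in for GOCR output.
sample=$(mktemp)
printf 'Data An_lysis\n' > "$sample"
echo "unknown characters: $(count_unknown "$sample")"
# prints: unknown characters: 1
rm -f "$sample"
```

A count of zero suggests the file needs no review for unknown characters; any other value points to positions worth checking against the image.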

Conclusion

The GOCR command offers a flexible and extensible approach to Optical Character Recognition. By adjusting its parameters, users can set the required level of certainty, restrict recognition to numerical data, and incrementally expand the pattern database, tailoring the tool to their needs and improving accuracy on their own material.
