How to Build an Image-to-Text Converter Using Python

August 4, 2023

Python is a flexible, general-purpose programming language that is used worldwide for a variety of purposes. It supports object-oriented programming and has a relatively simple syntax, which is one reason it is used so often in developing AI systems.

Today we are going to look at one such system: an image-to-text converter. The underlying technology is OCR (optical character recognition), an application of AI. In this tutorial, we will show you a very simple method of creating a basic image-to-text converter that extracts text from images.

How to Build an Image-To-Text Converter with Python?

We are going to use a Google Colaboratory (Colab) notebook for this exercise. This avoids most of the compatibility issues that may occur on your device and local environment. Google Colaboratory is free to use and only requires a free Google account.

For this tutorial, we will use the open-source code by Bhadresh Savani provided on GitHub.

1. Install Tesseract

Step 1 is to open a new notebook, add a code block, and install Tesseract. Tesseract is an open-source OCR engine that provides the text recognition functions we need to create an image-to-text converter. You can install Tesseract using the following code line.
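As a minimal sketch of this step, assuming a Debian-based Colab runtime where the engine is available as the `tesseract-ocr` apt package (in a notebook cell the same command is usually typed with a leading `!`). The helper names `tesseract_install_command` and `ensure_tesseract` are hypothetical and are not taken from the GitHub source.

```python
import shutil
import subprocess


def tesseract_install_command():
    """Return the apt command that installs the Tesseract OCR engine."""
    # Assumption: a Debian/Ubuntu runtime (such as Colab) with apt-get available.
    return ["apt-get", "install", "-y", "tesseract-ocr"]


def ensure_tesseract(run=subprocess.run):
    """Install Tesseract only if the `tesseract` binary is not already on PATH."""
    if shutil.which("tesseract") is not None:
        return False  # already installed, nothing to do
    run(tesseract_install_command(), check=True)
    return True  # an install was attempted
```

Calling `ensure_tesseract()` in a notebook cell would then run the install only when the engine is missing, so the cell can be re-run without repeating the download.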

This will start the installation process, which may take a while. Just let it finish, and once it is done, open a new code block.

You are not done installing Tesseract yet, though. You still need to install Pytesseract, the Python wrapper that lets your code call the Tesseract engine. Just type the following into the new code block.
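A minimal sketch of this step, assuming the wrapper is installed from PyPI under the name `pytesseract` (in a notebook cell this is usually just `!pip install pytesseract`). The helpers `pip_install_command` and `ensure_package` are hypothetical names used only for illustration.

```python
import importlib.util
import subprocess
import sys


def pip_install_command(package="pytesseract"):
    """Build a pip command bound to the interpreter that runs the notebook."""
    return [sys.executable, "-m", "pip", "install", package]


def ensure_package(module_name="pytesseract", run=subprocess.run):
    """Install the package only if Python cannot already find the module."""
    # Assumption: the importable module and the PyPI package share one name,
    # which holds for pytesseract but not for every library.
    if importlib.util.find_spec(module_name) is not None:
        return False  # already importable
    run(pip_install_command(module_name), check=True)
    return True  # an install was attempted
```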

2. Import Required Libraries

Now that the installation is done, open a new code block. This is where we will write the code for importing the other libraries that help us get an image and pass it to the system. The libraries we use are:

shutil. For importing, creating, and otherwise processing files.
os. For enabling the Python code to interface with the OS and to enable shutil to copy, create, and delete files in a folder/directory.
random. Helps to generate random numbers.
pytesseract. This is the library we just installed, and now we are calling it to use its OCR functions.

Here are the commands to import them.
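A short sketch of what such an import cell might look like. The `try`/`except` guard and the `OCR_AVAILABLE` flag are additions for illustration, assumed here so that the cell still runs if Pytesseract has not been installed yet.

```python
import os        # paths and directory handling
import random    # random numbers, e.g. for unique file names
import shutil    # copying, moving, and deleting files

try:
    import pytesseract  # Python wrapper around the Tesseract engine
    OCR_AVAILABLE = True
except ImportError:
    # Assumption: this fallback is illustrative, not part of the original code.
    pytesseract = None
    OCR_AVAILABLE = False
```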

Now, we have to create a function for importing images.
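One way such a function could look is sketched below. It is not the code from the GitHub repository: the name `image_to_text`, the `work_dir` folder, and the injectable `ocr` parameter are all assumptions made for illustration. The sketch copies the image into a working folder under a random name, runs OCR on the copy, and then deletes the copy.

```python
import os
import random
import shutil


def image_to_text(src_path, work_dir="ocr_inputs", ocr=None):
    """Copy an image into work_dir, run OCR on the copy, and return the text."""
    if ocr is None:
        # Imported lazily so the file handling can be tried without Tesseract.
        import pytesseract
        ocr = pytesseract.image_to_string  # accepts a file path or a PIL image

    os.makedirs(work_dir, exist_ok=True)
    ext = os.path.splitext(src_path)[1]
    # A random suffix keeps repeated runs from overwriting each other's copies.
    staged = os.path.join(work_dir, f"img_{random.randint(0, 10**6)}{ext}")
    shutil.copy(src_path, staged)
    try:
        # Strip the trailing whitespace that OCR output often ends with.
        return ocr(staged).strip()
    finally:
        os.remove(staged)  # clean up the temporary copy
```

Passing a different callable as `ocr` makes it possible to try the file handling on its own before Tesseract is installed.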

Now our image-to-text converter is ready, so it is time to test it.

3. Extract Text from the Image

To extract text from an image, we first have to provide one. To be able to upload or import a file, write the following code in a new block.
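In Colab, an upload button is typically created with `files.upload()` from the `google.colab` module. The wrapper below is a sketch under that assumption; the function name `upload_images` is hypothetical, and outside Colab it simply returns an empty list.

```python
def upload_images():
    """Open Colab's upload widget and return the names of the uploaded files."""
    try:
        from google.colab import files  # only importable inside Colab
    except ImportError:
        return []  # not running in Colab, so there is nothing to upload
    uploaded = files.upload()  # dictionary keyed by file name
    return list(uploaded.keys())
```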

Once you run it, a button for uploading a file will appear.

We used the following image for our testing.

To extract the information and display it, use these lines of code.
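A sketch of this last step, assuming `convert` is any function that takes a file name and returns the recognized text (for example, a thin wrapper around `pytesseract.image_to_string`). The name `print_extracted` and the file name in the usage note are hypothetical.

```python
def print_extracted(file_names, convert):
    """Run convert on each file, print the text, and return it keyed by name."""
    results = {}
    for name in file_names:
        text = convert(name)
        print(f"--- {name} ---")
        print(text)
        results[name] = text
    return results
```

With Pytesseract installed, a call could look like `print_extracted(["my_image.png"], pytesseract.image_to_string)`, where `my_image.png` stands in for the file you uploaded.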

The red text in the code is the name of the file that you uploaded, so it will look different for you. After you run the last code block, you will see your output underneath.

And that’s it for creating an image-to-text converter with Python.

Use Tools for More Convenience

Even though that method was simple and straightforward, it was still cumbersome. An even easier way is to use an OCR tool such as Imagetotext.io. This is a free tool that can extract text from your images, often with better accuracy as well.

Here is an example showing you how this is done.

1. We visit the website.
2. We upload the same image that we used for testing our Python code.
3. We simply press submit and wait for the tool.
4. After a few seconds, the tool generates a clean output that you can copy or download as a text file.

The entire process takes only a few seconds, which is much faster than the 10 or so minutes required for creating a Python program.

Conclusion

This concludes our tutorial on building an image-to-text converter using Python. We used open-source code from a GitHub repository and explained what each chunk of code does, which makes it easy to follow.

We also explored how a ready-made image-to-text tool can be better suited to the job than creating a program from scratch.