By Dmytro Sharapov, CV engineer @It-Jim

Tutorial for Installing Tesseract

You’ve undoubtedly seen OCR before. It’s widely used to process everything from scanned documents to the handwritten scribbles on your tablet PC, and it is used in Google Translate. Today you’ll create your first app for text recognition.

What is OCR?

Optical Character Recognition, or OCR, is the process of electronically extracting text from images and reusing it in a variety of ways such as document editing, free-text searches, or compression. In this tutorial, you’ll learn how to install Tesseract, an open-source OCR engine maintained by Google.

How to Install Tesseract for Microsoft Visual Studio?

Step 1:

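To install Tesseract you first need the following programs, all of which are used in the steps below: Git (which provides GIT CMD), a command-line SVN client, and Microsoft Visual Studio 2013.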





Step 2:

Next, create the folder where Tesseract will be installed. This can be any directory on your computer, for example “D:\Tesseract-files”.
After that, run GIT CMD and change to that folder. Your GIT CMD command line should look like this:


Fig. 1. GIT CMD example

Step 3:

Now you need to copy all of the dependencies from the GitHub repository to your computer. To do this, run the following command in GIT CMD:
git clone git://
In the GIT CMD console you will see something like this:


Fig. 2. Clone tesseract-vs2013.git

After executing this command, you will see the following in the console:


Fig. 3. Clone tesseract-vs2013 done

Step 4:

For the next step, run the VS2013 Developer Command Prompt. It is located in {directory of MS VS}\Common7\Tools\Shortcuts\Developer Command Prompt for VS2013. Then change to D:\Tesseract-files\tesseract-vs2013.


Fig. 4. Command prompt for VS2013

Now run the build with the command msbuild build.proj:


Fig. 5. Start performing build

After this step, the VS2013 command prompt can be closed.

Step 5:

Reopen GIT CMD and check the working directory; it must be “D:\Tesseract-files\”. After that, get the latest source using SVN (type in GIT CMD): svn checkout


Fig. 6. Checkout Tesseract

After this procedure, a new folder named Tesseract.git\ appears in D:\Tesseract-files\.
In GIT CMD, change to D:\Tesseract-files\Tesseract.git\trunk and apply the patch provided in tesseract-vs2013 (type in cmd): svn patch D:\Tesseract-files\tesseract-vs2013\vs2013+64bit_support.patch


Fig. 7. Patch provided in tesseract-vs2013

Copy both directories (lib and include) from D:\Tesseract-files\tesseract-vs2013\release into D:\Tesseract-files\Tesseract.git\trunk\
Open D:\Tesseract-files\Tesseract.git\trunk\vs2013\tesseract.sln with Visual Studio 2013.

Step 6:

Open the Property Pages of libtesseract304. In Configuration Properties->C/C++->General->Additional Include Directories, add D:\Tesseract-files\Tesseract.git\trunk\include\ and D:\Tesseract-files\Tesseract.git\trunk\include\leptonica\; in Linker->General->Additional Library Directories, add D:\Tesseract-files\Tesseract.git\trunk\lib\x64\;
Repeat this for both the Debug and Release configurations, then build the project in Release and Debug.

Step 7:

To recognize text, Tesseract needs trained data files. They can be found in: Download the necessary files and copy them to D:\Tesseract-files\Tesseract.git\trunk\tessdata\
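If the language file is missing, the Init call in Step 9 fails. As a minimal sketch, the following checks that the file for a language is present in the tessdata folder before the engine is initialized. Tesseract names these files <lang>.traineddata (for example eng.traineddata). The helper names trainedDataFile and hasTrainedData are hypothetical and are not part of the Tesseract API.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: builds the path of a language file,
// e.g. <tessdataDir>eng.traineddata.
// tessdataDir is assumed to end with a path separator.
std::string trainedDataFile(const std::string& tessdataDir, const std::string& lang)
{
    assert(!lang.empty());
    return tessdataDir + lang + ".traineddata";
}

// Hypothetical helper: returns true if the language file can be opened for reading.
bool hasTrainedData(const std::string& tessdataDir, const std::string& lang)
{
    std::ifstream file(trainedDataFile(tessdataDir, lang).c_str(), std::ios::binary);
    return file.good();
}
```

With the folder layout used in this tutorial, the check would be called as hasTrainedData("D:\\Tesseract-files\\Tesseract.git\\trunk\\tessdata\\", "eng").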

Step 8:

Copy Tesseract's .dll files to your project. From D:\Tesseract-files\Tesseract.git\lib copy libtesseract304.dll (or libtesseract304d.dll) to the Release (or Debug) folder of your project (the folder that contains the .exe file). From D:\Tesseract-files\tesseract-vs2013\lib\x64 (or X64) copy liblept171.dll (or liblept171d.dll) to the same Release (or Debug) folder.
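The DLL names differ only by the trailing “d” used for Debug builds. As a rough sketch, the following lists the DLLs expected for a configuration and reports which of them are not yet next to the executable. The function names requiredDlls and missingDlls are hypothetical, and the file names are the ones given in this step.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: names of the runtime DLLs for a configuration.
// Debug builds use the variants with a trailing "d".
std::vector<std::string> requiredDlls(bool debugBuild)
{
    const std::string suffix = debugBuild ? "d.dll" : ".dll";
    std::vector<std::string> names;
    names.push_back("libtesseract304" + suffix);
    names.push_back("liblept171" + suffix);
    return names;
}

// Hypothetical helper: returns the required DLLs that cannot be opened in exeDir.
// exeDir is assumed to end with a path separator.
std::vector<std::string> missingDlls(const std::string& exeDir, bool debugBuild)
{
    assert(!exeDir.empty());
    std::vector<std::string> missing;
    const std::vector<std::string> names = requiredDlls(debugBuild);
    for (std::size_t i = 0; i < names.size(); ++i) {
        std::ifstream dll((exeDir + names[i]).c_str(), std::ios::binary);
        if (!dll.good()) {
            missing.push_back(names[i]);
        }
    }
    return missing;
}
```

An empty result from missingDlls means both files were found in the given folder.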

Connect Tesseract to the project (this is necessary for both Debug and Release).

Set the properties of your project:

In C/C++ –> General –> Additional Include Directories:

In Linker –> General –> Additional Library Directories:

In Linker –> Input –> Additional Dependencies:

for Debug


for Release


Step 9:

Now create a new console application and paste this code:

#include <cstdio>
#include "baseapi.h"
#include "allheaders.h"

int main()
{
    char *outText;

    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

    // Initialize tesseract-ocr with English; the first argument is the folder that contains tessdata
    if (api->Init("D:\\Tesseract-files\\Tesseract.git\\trunk", "eng")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        return 1;
    }

    // Open input image and pass it to the engine
    Pix *image = pixRead("your_image.tif");
    api->SetImage(image);

    // Set list of allowed characters
    api->SetVariable("tessedit_char_whitelist", "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-.;,:/0123456789");

    // Get OCR result
    outText = api->GetUTF8Text();
    printf("OCR output:\n%s", outText);

    // Destroy used objects and release memory
    api->End();
    delete api;
    delete[] outText;
    pixDestroy(&image);

    return 0;
}
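A hand-typed list of allowed characters is easy to get wrong, and a single missing letter silently prevents that character from being recognized. As a small sketch, the string can instead be assembled from character ranges. The helper names appendRange and buildWhitelist are hypothetical, and the ranges assume an ASCII-compatible character set.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: appends every character from first to last (inclusive).
// Assumes the range is contiguous, as it is for letters and digits in ASCII.
void appendRange(std::string& out, char first, char last)
{
    assert(first <= last);
    for (int c = first; c <= last; ++c) {
        out += static_cast<char>(c);
    }
}

// Hypothetical helper: lowercase letters, uppercase letters,
// the punctuation used in this tutorial, and digits.
std::string buildWhitelist()
{
    std::string whitelist;
    appendRange(whitelist, 'a', 'z');
    appendRange(whitelist, 'A', 'Z');
    whitelist += "-.;,:/";
    appendRange(whitelist, '0', '9');
    return whitelist;
}
```

The result can then be passed to the engine with api->SetVariable("tessedit_char_whitelist", buildWhitelist().c_str());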


Then build and run the project.

As a result, you will get:


Fig. 8. Input image



Fig. 9. Output result


Congratulations! You have installed Tesseract and run your first text recognition program!
