Tutorial for installing Tesseract

         You’ve undoubtedly seen optical character recognition before. It’s widely used, from processing scanned documents and the handwritten scribbles on your tablet PC to Google Translate. Today you’ll create your first app for text recognition.

What is OCR?

         Optical Character Recognition, or OCR, is the process of electronically extracting text from images and reusing it in a variety of ways such as document editing, free-text searches, or compression. In this tutorial, you’ll learn how to install Tesseract, an open source OCR engine maintained by Google.

How to install Tesseract for Microsoft Visual Studio?

Step 1:

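           To install Tesseract you first need the following programs, each of which is used in the steps below:

  • Git for Windows (provides GIT CMD)

  • a command-line Subversion (SVN) client

  • Microsoft Visual Studio 2013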

Step 2:

      Next, create a folder where you want to install Tesseract. This can be any directory on your computer, for example: “D:\Tesseract-files”.
      After that, run GIT CMD and change to that folder. Your GIT command line should look like this:

Fig. 1. GIT CMD example


Step 3:

         Now you need to clone the dependencies from the GitHub repository to your computer. To do this, enter the following command in GIT CMD:
git clone git://github.com/pvorb/tesseract-vs2013.git
         GitHub no longer serves the git:// protocol, so if the clone fails, use https://github.com/pvorb/tesseract-vs2013.git instead. GIT CMD will show something like this:

Fig. 2. Clone tesseract-vs2013.git

         After executing this command, you will see the following in the console:

Fig. 3. Clone tesseract-vs2013 done


Step 4:

    For the next step, run the VS2013 Developer Command Prompt. You can find it at {directory of MS VS}\Common7\Tools\Shortcuts\Developer Command Prompt for VS2013. Then change to D:\Tesseract-files\tesseract-vs2013.

Fig. 4. Command prompt for VS2013

         Now you can build the dependencies with the command msbuild build.proj:

Fig. 5. Starting the build

After this step, the VS2013 command prompt can be closed.


Step 5:

         Reopen GIT CMD and check the working directory. It must be “D:\Tesseract-files\”. After that, get the latest source using SVN by entering the following in GIT CMD:
svn checkout https://github.com/svn2github/Tesseract.git

Fig. 6. Checkout Tesseract


       After the checkout finishes, a new folder named Tesseract.git\ appears in D:\Tesseract-files\.
       In GIT CMD, change to D:\Tesseract-files\Tesseract.git\trunk and apply the patch provided in tesseract-vs2013 by entering:
svn patch D:\Tesseract-files\tesseract-vs2013\vs2013+64bit_support.patch

Fig. 7. Patch provided in tesseract-vs2013


    Copy both directories (lib and include) from D:\Tesseract-files\tesseract-vs2013\release into D:\Tesseract-files\Tesseract.git\trunk\.
    Open D:\Tesseract-files\Tesseract.git\trunk\vs2013\tesseract.sln with Visual Studio 2013.


Step 6:

     Open the Property Pages of libtesseract304. In Configuration Properties->C/C++->General->Additional Include Directories, add D:\Tesseract-files\Tesseract.git\trunk\include\ and D:\Tesseract-files\Tesseract.git\trunk\include\leptonica\. In Linker->General->Additional Library Directories, add D:\Tesseract-files\Tesseract.git\trunk\lib\x64\.
     Repeat this for both the Debug and Release configurations, then build the project in both Release and Debug.

Step 7:

     To recognize text, Tesseract needs trained data files. They can be found at https://github.com/tesseract-ocr/tessdata. Download the files for the languages you need and copy them to D:\Tesseract-files\Tesseract.git\trunk\tessdata\


Step 8:

     Copy Tesseract’s .dll files into your project. From D:\Tesseract-files\Tesseract.git\lib, copy libtesseract304.dll (or libtesseract304d.dll) to the Release (or Debug) folder of your project; this is the folder that contains the .exe file. From D:\Tesseract-files\tesseract-vs2013\lib\x64 (or X64), copy liblept171.dll (or liblept171d.dll) to the same Release (or Debug) folder.

         Link Tesseract into your project (this is necessary for both Debug and Release).

Set the following properties of your project:

In C/C++ –> General –> Additional Include Directories:

In Linker –> General –> Additional Library Directories:

In Linker –> Input –> Additional Dependencies:

for Debug


for Release


Step 9:

Now create a new console application and paste this code:

#include <cstdio>
#include <cstdlib>

#include "baseapi.h"
#include "allheaders.h"

int main()
{
    char *outText;

    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();

    // Initialize tesseract-ocr with English; the first argument is the folder that contains tessdata
    if (api->Init("D:\\Tesseract-files\\Tesseract.git\\trunk", "eng")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }

    // Open input image and hand it to Tesseract
    Pix *image = pixRead("your_image.tif");
    api->SetImage(image);

    // Set list of allowed characters
    api->SetVariable("tessedit_char_whitelist", "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-.;,:/0123456789");

    // Get OCR result
    outText = api->GetUTF8Text();
    printf("OCR output:\n%s", outText);

    // Destroy used objects and release memory
    api->End();
    delete[] outText;
    pixDestroy(&image);

    return 0;
}

Then build and run the project.
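
A hand-typed whitelist is easy to get wrong, and a single missing letter silently prevents that character from being recognized. The following is a minimal sketch of building the same whitelist from character ranges instead. The helper names appendRange and buildWhitelist are hypothetical, and the sketch assumes an ASCII-compatible character set in which letters and digits are contiguous.

```cpp
#include <string>

// Hypothetical helper: appends every character from first to last
// (inclusive) to out, for example 'a' through 'z'.
void appendRange(std::string& out, char first, char last) {
    for (int c = first; c <= last; ++c) {
        out.push_back(static_cast<char>(c));
    }
}

// Builds the whitelist used in this tutorial: a-z, A-Z, a few
// punctuation marks, and 0-9.
std::string buildWhitelist() {
    std::string whitelist;
    appendRange(whitelist, 'a', 'z');
    appendRange(whitelist, 'A', 'Z');
    whitelist += "-.;,:/";
    appendRange(whitelist, '0', '9');
    return whitelist;
}
```

The result can then be passed to Tesseract as api->SetVariable("tessedit_char_whitelist", buildWhitelist().c_str()).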


As a result you will get:

Fig. 8. Input image


Fig. 9. Output result


Congratulations! You have installed Tesseract and run your first text recognition program!
