JavaScripting

The definitive source of the best
JavaScript libraries, frameworks, and plugins.



    Tesseract.js

    Pure Javascript OCR for 62 Languages 📖🎉🖥


    Version 2 beta is now available and under active development in the master branch. For background on the v2 beta, read the story "Why I refactor tesseract.js v2?"
    Check the support/1.x branch for version 1


    Tesseract.js is a JavaScript library that extracts words in almost any language from images. (Demo)

    Image Recognition

    [Figure: fancy demo GIF]

    Video Real-time Recognition

    [Figure: Tesseract.js video]

    Tesseract.js wraps an emscripten port of the Tesseract OCR Engine. It works in the browser using webpack or plain script tags with a CDN and on the server with Node.js. After you install it, using it is as simple as:

    import Tesseract from 'tesseract.js';
    
    Tesseract.recognize(
      'https://tesseract.projectnaptha.com/img/eng_bw.png',
      'eng',
      { logger: m => console.log(m) }
    ).then(({ data: { text } }) => {
      console.log(text);
    });
    

    Or, in a more imperative style:

    import { createWorker } from 'tesseract.js';
    
    const worker = createWorker({
      logger: m => console.log(m)
    });
    
    (async () => {
      await worker.load();
      await worker.loadLanguage('eng');
      await worker.initialize('eng');
      const { data: { text } } = await worker.recognize('https://tesseract.projectnaptha.com/img/eng_bw.png');
      console.log(text);
      await worker.terminate();
    })();
    

    Check out the docs for a full explanation of the API.
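
    Both snippets above pass a logger callback that simply prints each message. As a minimal sketch, the helper below turns such a message into a one-line progress string. The message shape it expects (a `status` string and a `progress` number between 0 and 1) is an assumption that the text above does not confirm, and `formatProgress` is a hypothetical name, not part of the library.

```javascript
// Hypothetical helper, not part of Tesseract.js.
// Assumed message shape: { status: string, progress: number in [0, 1] }.
function formatProgress(m) {
  const status = (m && typeof m.status === 'string') ? m.status : 'working';
  const progress = (m && Number.isFinite(m.progress)) ? m.progress : 0;
  // Clamp to [0, 1] so a malformed message cannot print something like 140%.
  const clamped = Math.min(1, Math.max(0, progress));
  return `${status}: ${Math.round(clamped * 100)}%`;
}

// Stand-alone demonstration with a hand-written message object.
console.log(formatProgress({ status: 'recognizing text', progress: 0.5 }));
```

    Under that assumption, it could be plugged in as `createWorker({ logger: m => console.log(formatProgress(m)) })`.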

    Major changes in v2 beta

    • Upgrade to tesseract v4.1 (using emscripten 1.38.45)
    • Support multiple languages at the same time, e.g. eng+chi_tra for English and Traditional Chinese
    • Supported image formats: png, jpg, bmp, pbm
    • Support WebAssembly (falls back to asm.js when the browser doesn't support it)
    • Support TypeScript
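
    To illustrate the multi-language item in the list above, here is a minimal sketch that builds the `+`-joined language string and passes it to a recognize function. `joinLanguages` and `recognizeMultilingual` are hypothetical names; the recognize function is passed in as a parameter so the sketch runs without the library installed.

```javascript
// Hypothetical helpers, not part of Tesseract.js.

// Join language codes with "+", the separator shown in the list above.
function joinLanguages(langs) {
  if (!Array.isArray(langs) || langs.length === 0) {
    throw new Error('expected a non-empty array of language codes');
  }
  return langs.map(l => String(l).trim()).join('+');
}

// `recognize` is injected by the caller; it is expected to take
// (image, languageString) and return a promise of { data: { text } },
// as Tesseract.recognize does in the example earlier in this README.
function recognizeMultilingual(recognize, image, langs) {
  return recognize(image, joinLanguages(langs))
    .then(({ data: { text } }) => text);
}

console.log(joinLanguages(['eng', 'chi_tra'])); // eng+chi_tra
```

    In real use this would presumably be called as `recognizeMultilingual(Tesseract.recognize, imageUrl, ['eng', 'chi_tra'])`. With the worker API, the same joined string would likely be passed to both `loadLanguage` and `initialize`.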

    Installation

    Tesseract.js works with a <script> tag via local copy or CDN, with webpack via npm and on Node.js with npm/yarn.

    CDN

    <!-- v2 -->
    <script src='https://unpkg.com/tesseract.js@v2.0.0-beta.1/dist/tesseract.min.js'></script>
    
    <!-- v1 -->
    <script src='https://unpkg.com/tesseract.js@1.0.19/src/index.js'></script>
    

    After including the script, the Tesseract variable will be globally available.
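
    As a rough sketch of what using that global might look like, the function below reads `Tesseract` from the global scope and fails with a clear error if the script tag has not loaded. `recognizeWithGlobal` is a hypothetical name, and the `scope` parameter exists only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper, not part of Tesseract.js.
// Looks up the global `Tesseract` object that the <script> tag is said to define.
function recognizeWithGlobal(image, scope = globalThis) {
  const T = scope.Tesseract;
  if (!T || typeof T.recognize !== 'function') {
    throw new Error('Tesseract global not found; was the <script> tag loaded?');
  }
  // Same call shape as the Tesseract.recognize example earlier in this README.
  return T.recognize(image, 'eng').then(({ data: { text } }) => text);
}

// Only runs where the global actually exists (e.g. a page with the CDN script).
if (typeof Tesseract !== 'undefined') {
  recognizeWithGlobal('https://tesseract.projectnaptha.com/img/eng_bw.png')
    .then(text => console.log(text));
}
```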

    Node.js

    Tesseract.js currently requires Node.js v6.8.0 or higher.
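
    One way to check that requirement before installing is a small script along these lines. It is only a sketch: `parseVersion` and `meetsMinimum` are hypothetical helpers, and the comparison covers plain `major.minor.patch` versions only.

```javascript
// Hypothetical helpers for comparing "major.minor.patch" version strings.
function parseVersion(v) {
  return String(v).replace(/^v/, '').split('.').map(n => parseInt(n, 10) || 0);
}

function meetsMinimum(current, minimum) {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  // Compare numerically, part by part, so that 6.10.0 counts as newer than 6.8.0.
  for (let i = 0; i < 3; i++) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true; // all three parts are equal
}

if (typeof process !== 'undefined' && process.versions && process.versions.node) {
  const ok = meetsMinimum(process.versions.node, '6.8.0');
  console.log(ok ? 'Node.js version is new enough' : 'Node.js v6.8.0 or higher is required');
}
```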

    # For v2
    npm install tesseract.js@next
    yarn add tesseract.js@next
    
    # For v1
    npm install tesseract.js
    yarn add tesseract.js
    

    Documentation

    Use tesseract.js the way you like!

    Contributing

    Development

    To run a development copy of Tesseract.js do the following:

    # First we clone the repository
    git clone https://github.com/naptha/tesseract.js.git
    cd tesseract.js
    
    # Then we install the dependencies
    npm install
    
    # And finally we start the development server
    npm start
    

    The development server will be available at http://localhost:3000/examples/browser/demo.html in your favorite browser. It will automatically rebuild tesseract.dev.js and worker.dev.js when you change files in the src folder.

    You can also run the development server in Gitpod (a free online IDE and dev environment for GitHub that will automate your dev setup) with a single click.


    Building Static Files

    To build the compiled static files just execute the following:

    npm run build
    

    This will output the files into the dist directory.

    Contributors

    Code Contributors

    This project exists thanks to all the people who contribute. [Contribute].

    Financial Contributors

    Become a financial contributor and help us sustain our community. [Contribute]


    Organizations

    Support this project with your organization. Your logo will show up here with a link to your website. [Contribute]
