Text Recognition App with MLKit Android [PROJECT]

ibrahimcanerdogan
5 min read · Mar 8, 2023

Text recognition technology has come a long way in recent years and has opened up many possibilities for developers to create useful applications. Text recognition now runs directly on smartphones, which brings the technology to a much wider audience. In this article, we will build a text recognition app for Android in Kotlin using ML Kit.

What is MLKit?

MLKit is a mobile SDK that provides developers with pre-built machine learning models and tools to easily integrate them into their applications. It is a powerful tool that can help developers create intelligent and responsive apps without having to become machine learning experts themselves.

What is Text Recognition?

The ML Kit Text Recognition API can recognize text in any Latin-based character set. It can also be used to automate data-entry tasks such as processing credit cards, receipts, and business cards.

Key capabilities

  • Recognize text across Latin-based languages: Supports recognizing text using Latin script
  • Analyze structure of text: Supports detection of words/elements, lines and paragraphs
  • Identify language of text: Identifies the language of the recognized text
  • Small application footprint: On Android, the API is offered as an unbundled library through Google Play Services
  • Real-time recognition: Can recognize text in real-time on a wide range of devices

Text structure

The Text Recognizer segments text into blocks, lines, elements and symbols. Roughly speaking:

  • a Block is a contiguous set of text lines, such as a paragraph or column,
  • a Line is a contiguous set of words on the same axis,
  • an Element is a contiguous set of alphanumeric characters (“word”) on the same axis in most Latin languages, or a word in others, and
  • a Symbol is a single alphanumeric character on the same axis in most Latin languages, or a character in others.

The image below highlights examples of each of these in descending order. The first highlighted block, in cyan, is a Block of text. The second set of highlighted blocks, in blue, are Lines of text. Finally, the third set of highlighted blocks, in dark blue, are Words.

For all detected blocks, lines, elements and symbols, the API returns the bounding boxes, corner points, rotation information, confidence score, recognized languages and recognized text.
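
To make the hierarchy concrete, here is a minimal plain-Kotlin sketch of the block → line → element nesting. The Block, Line and Element classes below are hypothetical stand-ins written for illustration, not the ML Kit types, and allWords is an assumed helper name.

```kotlin
// Hypothetical stand-ins for the recognizer's result hierarchy (not the ML Kit classes).
data class Element(val text: String)

data class Line(val elements: List<Element>) {
    // A line's text is its words joined by spaces.
    val text: String get() = elements.joinToString(" ") { it.text }
}

data class Block(val lines: List<Line>) {
    // A block's text is its lines joined by newlines.
    val text: String get() = lines.joinToString("\n") { it.text }
}

// Collect every word in reading order by walking block -> line -> element.
fun allWords(blocks: List<Block>): List<String> =
    blocks.flatMap { block -> block.lines.flatMap { line -> line.elements.map { it.text } } }

fun main() {
    val receipt = listOf(
        Block(listOf(Line(listOf(Element("Coffee"), Element("Shop"))))),
        Block(
            listOf(
                Line(listOf(Element("Total"), Element("4.50"))),
                Line(listOf(Element("Thank"), Element("you")))
            )
        )
    )
    println(allWords(receipt)) // [Coffee, Shop, Total, 4.50, Thank, you]
    println(receipt[1].text)   // two lines: "Total 4.50" and "Thank you"
}
```

The same three-level walk is what the app performs on the recognizer's result in MainViewModel.kt.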

Build Text Recognition App

build.gradle

Add the ML Kit dependency to your module’s app-level Gradle file, which is usually app/build.gradle. We will also use view binding, so enable it in the same file.

android {
    namespace 'com.ibrahimcanerdogan.textrecognitionapp'
    compileSdk 33

    ...
    buildFeatures {
        viewBinding true
    }
}

dependencies {
    // Text Recognition
    implementation 'com.google.android.gms:play-services-mlkit-text-recognition:18.0.2'
}

AndroidManifest.xml

The following permissions must be added for gallery and camera access.

<uses-permission android:name="android.permission.CAMERA" />
<uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE" />
<uses-permission
    android:name="android.permission.WRITE_EXTERNAL_STORAGE"
    android:maxSdkVersion="29"
    tools:ignore="ScopedStorage" />

Set “android:theme” to a NoActionBar theme.

android:theme="@style/Theme.MaterialComponents.Light.NoActionBar"

MainActivity.kt

private lateinit var binding: ActivityMainBinding

private var viewModel: MainViewModel = MainViewModel()

// Last image chosen by the user, as a Bitmap and (for gallery picks) a Uri.
private var detectImage: Bitmap? = null
private var detectImageUri: Uri? = null

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    binding = ActivityMainBinding.inflate(layoutInflater)
    setContentView(binding.root)

    // Make the TextView scrollable and bring the copy button to the front.
    binding.textView.movementMethod = ScrollingMovementMethod()
    binding.buttonCopy.bringToFront()

    copy()
    buttonClick()
}

Copy the text shown in the TextView to the clipboard:

private fun copy() {
    binding.buttonCopy.setOnClickListener {
        // Check the text at click time; when onCreate runs, the TextView may still be empty.
        val recognizedText = binding.textView.text
        if (recognizedText.isNullOrEmpty()) return@setOnClickListener

        val clipboard = getSystemService(CLIPBOARD_SERVICE) as ClipboardManager
        val clip = ClipData.newPlainText("copied", recognizedText)
        clipboard.setPrimaryClip(clip)
        Toast.makeText(this, "Text Copied!", Toast.LENGTH_LONG).show()
    }
}

Click handlers for all buttons on the screen:

private fun buttonClick() {
    // Camera
    binding.buttonCamera.setOnClickListener {
        controlCameraPermission()
    }
    // Gallery
    binding.buttonGallery.setOnClickListener {
        controlGalleryPermission()
    }
    // Run recognition again on the current image.
    binding.buttonSearch.setOnClickListener {
        detectImage?.let {
            Toast.makeText(this, "Re Recognition Image!", Toast.LENGTH_SHORT).show()
            setRecognitionTextFromImageView(it)
        }
    }
    // Send the Uri (gallery image) or the Bitmap (camera image) to the second activity.
    binding.buttonShowImage.setOnClickListener {
        if (detectImage != null && detectImageUri != null) {
            val intent = Intent(this, MainActivity2::class.java)
            intent.putExtra("uriimage", detectImageUri.toString())
            startActivity(intent)
        } else if (detectImage != null) {
            val intent = Intent(this, MainActivity2::class.java)
            intent.putExtra("bitmapimage", detectImage)
            startActivity(intent)
        } else {
            Toast.makeText(this, "No images have been selected!", Toast.LENGTH_SHORT).show()
        }
    }
}

Camera launcher for the image capture action:

private fun openCamera() {
    val intent = Intent(MediaStore.ACTION_IMAGE_CAPTURE)
    cameraLauncher.launch(intent)
}

private val cameraLauncher = registerForActivityResult(ActivityResultContracts.StartActivityForResult()) { result ->
    if (result.resultCode == RESULT_OK) {
        // The "data" extra holds the captured image as a Bitmap.
        // The safe cast avoids a crash if the extra is missing.
        val captured = result.data?.extras?.get("data") as? Bitmap
        if (captured != null) {
            detectImage = captured
            // Clear any earlier gallery Uri so the second activity shows this image.
            detectImageUri = null
            setRecognitionTextFromImageView(captured)
        }
    }
}

Gallery launcher for the image pick action:

private fun openGallery() {
    val intent = Intent(Intent.ACTION_PICK, MediaStore.Images.Media.EXTERNAL_CONTENT_URI)
    galleryLauncher.launch(intent)
}

private val galleryLauncher = registerForActivityResult(ActivityResultContracts.StartActivityForResult()) { result ->
    // Get the picked image from the gallery as a Uri.
    val uri: Uri? = result.data?.data

    if (result.resultCode == RESULT_OK && uri != null) {
        detectImageUri = uri
        // Decode the Uri into a Bitmap; use {} closes the stream afterwards.
        val bitmap = contentResolver.openInputStream(uri)?.use { BitmapFactory.decodeStream(it) }
        if (bitmap != null) {
            detectImage = bitmap
            setRecognitionTextFromImageView(bitmap)
        }
    }
}

Camera permission:

private fun controlCameraPermission() {
    // check and request permission
    if (ContextCompat.checkSelfPermission(this, android.Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(this, arrayOf(android.Manifest.permission.CAMERA), CAMERA_PERMISSION_CODE)
    } else {
        // if app has permission open camera
        openCamera()
    }
}

Gallery permission:

private fun controlGalleryPermission() {
    // check and request permission
    if (ContextCompat.checkSelfPermission(this, android.Manifest.permission.READ_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(
            this,
            arrayOf(android.Manifest.permission.READ_EXTERNAL_STORAGE, android.Manifest.permission.WRITE_EXTERNAL_STORAGE),
            GALLERY_PERMISSION_CODE
        )
    } else {
        // if app has permission open gallery
        openGallery()
    }
}
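
One caveat, given that the project compiles against SDK 33: on Android 13 (API 33) and later, an app that also targets API 33 is no longer granted READ_EXTERNAL_STORAGE and is expected to request READ_MEDIA_IMAGES instead. The sketch below shows one way to choose the permission name by SDK level. galleryPermissionFor is a hypothetical helper, not part of the original project, and the permission strings are written out as literals (they are the values of the matching android.Manifest.permission constants) so the snippet runs without the Android SDK.

```kotlin
// Hypothetical helper: picks the permission needed to read images at a given API level.
// 33 is the value of Build.VERSION_CODES.TIRAMISU (Android 13).
fun galleryPermissionFor(sdkInt: Int): String =
    if (sdkInt >= 33) "android.permission.READ_MEDIA_IMAGES"
    else "android.permission.READ_EXTERNAL_STORAGE"

fun main() {
    println(galleryPermissionFor(29)) // android.permission.READ_EXTERNAL_STORAGE
    println(galleryPermissionFor(33)) // android.permission.READ_MEDIA_IMAGES
}
```

In the activity, the call would be galleryPermissionFor(Build.VERSION.SDK_INT), and the manifest would also need a uses-permission entry for READ_MEDIA_IMAGES.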

Handle the result of the permission requests in onRequestPermissionsResult:

override fun onRequestPermissionsResult(
    requestCode: Int,
    permissions: Array<out String>,
    grantResults: IntArray
) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults)

    when (requestCode) {
        CAMERA_PERMISSION_CODE -> {
            if (grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                openCamera()
            } else {
                Toast.makeText(this, "Camera Permission Denied!", Toast.LENGTH_SHORT).show()
            }
        }
        GALLERY_PERMISSION_CODE -> {
            if (grantResults.isNotEmpty() && grantResults[0] == PackageManager.PERMISSION_GRANTED) {
                openGallery()
            } else {
                Toast.makeText(this, "Gallery Permission Denied!", Toast.LENGTH_SHORT).show()
            }
        }
    }
}

companion object {
    private const val CAMERA_PERMISSION_CODE: Int = 0
    private const val GALLERY_PERMISSION_CODE: Int = 1
}

MainActivity2.kt

Get the image from the intent and show it in the ImageView.

private lateinit var binding: ActivityMain2Binding

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    binding = ActivityMain2Binding.inflate(layoutInflater)
    setContentView(binding.root)

    val (bitmap, uri) = getImageFormatFromIntent()
    setImage(uri, bitmap)

    binding.buttonBack.setOnClickListener {
        onBackPressed()
    }
}

Set the image:

// Show whichever value is non-null; the Uri takes precedence over the Bitmap.
private fun setImage(uri: String?, bitmap: Bitmap?) {
    binding.imageView.setImageDrawable(null)
    if (uri != null) {
        binding.imageView.setImageURI(Uri.parse(uri))
    } else {
        binding.imageView.setImageBitmap(bitmap)
    }
}

Get the image from the intent:

private fun getImageFormatFromIntent(): Pair<Bitmap?, String?> {
    // Use the type-safe getParcelableExtra overload on API 33+ and the older one below it.
    val bitmap: Bitmap? = if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.TIRAMISU) {
        intent.getParcelableExtra("bitmapimage", Bitmap::class.java)
    } else {
        @Suppress("DEPRECATION")
        intent.getParcelableExtra<Bitmap>("bitmapimage")
    }

    val uri = intent.getStringExtra("uriimage")

    return Pair(bitmap, uri)
}

MainViewModel.kt

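Pass the bitmap to the recognizer:

import com.google.mlkit.vision.common.InputImage
import com.google.mlkit.vision.text.Text
import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions

fun textRecognizer(context: Context, textView: TextView, bitmap: Bitmap) {
    // The second argument is the image rotation in degrees.
    val image = InputImage.fromBitmap(bitmap, 0)
    // Recognizer client.
    val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
    recognizer.process(image)
        .addOnSuccessListener { visionText ->
            processText(context, textView, visionText)
        }
        .addOnFailureListener {
            Toast.makeText(context, "Text could not be read!", Toast.LENGTH_SHORT).show()
        }
}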

Iterate over the blocks, lines and elements, and append every element to a StringBuilder.

private fun processText(context: Context, textView: TextView, visionText: Text) {
    val blocks = visionText.textBlocks
    if (blocks.isEmpty()) {
        Toast.makeText(context, "No Text Detected In Image!", Toast.LENGTH_SHORT).show()
    }

    val text = StringBuilder()

    for (block in blocks) {
        for (line in block.lines) {
            for (element in line.elements) {
                // Separate words with a single space.
                text.append(element.text).append(" ")
            }
        }
    }

    textView.text = text.toString()
}
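
The loop above flattens the whole result into one space-separated line. If the original layout matters, one alternative is to join words with spaces, lines with newlines and blocks with blank lines. The sketch below shows the idea on plain nested lists (blocks → lines → words) so that it runs without ML Kit. joinRecognizedText is a hypothetical name, and in the app the input would have to be built from visionText.textBlocks.

```kotlin
// Hypothetical helper: input is blocks -> lines -> words, given as nested lists of strings.
fun joinRecognizedText(blocks: List<List<List<String>>>): String =
    blocks.joinToString("\n\n") { lines ->
        lines.joinToString("\n") { words -> words.joinToString(" ") }
    }

fun main() {
    val blocks = listOf(
        listOf(listOf("Coffee", "Shop")),
        listOf(listOf("Total", "4.50"), listOf("Thank", "you"))
    )
    // Prints "Coffee Shop", a blank line, then "Total 4.50" and "Thank you" on separate lines.
    println(joinRecognizedText(blocks))
}
```

Keeping the line breaks makes copied text from receipts or business cards easier to read, at the cost of a slightly longer function.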

Preview App

All Source Code

Github Link
