Google ML Kit Natural Language Identification App | Android Development

ibrahimcanerdogan
6 min read · Feb 26, 2024


You can use ML Kit to identify the language of a string of text. You can get the string’s most likely language as well as confidence scores for all of the string’s possible languages.

ML Kit recognizes text in more than 100 different languages in their native scripts. In addition, romanized text can be recognized for Arabic, Bulgarian, Chinese, Greek, Hindi, Japanese, and Russian. See the complete list of supported languages and scripts.
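
The listings in this article need the ML Kit language identification library, view binding, and a few AndroidX and Material artifacts on the classpath. A minimal sketch of the module-level build.gradle.kts is shown here; the version numbers are assumptions, so check the current release notes before copying them.

```kotlin
// app/build.gradle.kts (module level). All version numbers are illustrative assumptions.
android {
    buildFeatures {
        // Generates FragmentLanguageIdentificationBinding from the layout file.
        viewBinding = true
    }
}

dependencies {
    // On-device language identification (model bundled with the app).
    implementation("com.google.mlkit:language-id:17.0.4")
    // Provides the `by viewModels()` delegate used in the Fragment.
    implementation("androidx.fragment:fragment-ktx:1.6.2")
    // ViewModel and LiveData.
    implementation("androidx.lifecycle:lifecycle-viewmodel-ktx:2.7.0")
    implementation("androidx.lifecycle:lifecycle-livedata-ktx:2.7.0")
    // ExtendedFloatingActionButton used in the layout.
    implementation("com.google.android.material:material:1.11.0")
}
```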

Model

This Kotlin code defines a data class named IdentificationLanguage with three properties: languageEnglish, languageNative, and languageConfidence. It is designed to represent information related to language identification, where:

  • languageEnglish is a nullable String representing the English name of the language.
  • languageNative is a nullable String representing the script the language is written in (for example, Latin or Cyrillic).
  • languageConfidence is a nullable Float representing the confidence level in the language identification.

data class IdentificationLanguage(
    val languageEnglish: String? = null,
    val languageNative: String? = null,
    val languageConfidence: Float? = null
) {

    // One display line per language: English name, script, confidence.
    override fun toString(): String {
        return "$languageEnglish - $languageNative - $languageConfidence \n"
    }
}
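
As a small illustration of how the model could be rendered for users, the sketch below formats the confidence as a percentage. The class is repeated so the snippet runs on its own, and toDisplayLine is a hypothetical helper that is not part of the original app.

```kotlin
import java.util.Locale

// Same shape as the article's model, repeated so the snippet is self-contained.
data class IdentificationLanguage(
    val languageEnglish: String? = null,
    val languageNative: String? = null,
    val languageConfidence: Float? = null
)

// Hypothetical helper: shows the confidence as a percentage, or "n/a" when it is missing.
fun IdentificationLanguage.toDisplayLine(): String {
    val percent = languageConfidence
        ?.let { String.format(Locale.ROOT, "%.1f%%", it * 100) }
        ?: "n/a"
    return "${languageEnglish ?: "Unknown"} (${languageNative ?: "?"}): $percent"
}

fun main() {
    println(IdentificationLanguage("English", "Latin", 0.5f).toDisplayLine()) // English (Latin): 50.0%
    println(IdentificationLanguage("Turkish", "Latin").toDisplayLine())       // Turkish (Latin): n/a
}
```

Using Locale.ROOT keeps the decimal separator stable regardless of the device language.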

This Kotlin code defines an enum class named LanguageIdentificationSupportedLanguages. Each entry is a language that ML Kit can identify, with its English name (langEnglish), the script it is written in (langNative), and its BCP-47 language code (langCode). Romanized variants such as ar-Latn have their own entries.

enum class LanguageIdentificationSupportedLanguages (val langEnglish: String, val langNative: String, val langCode: String) {
    AFRIKAANS("Afrikaans", "Latin", "af"),
    AMHARIC("Amharic", "Ge'ez", "am"),
    ARABIC("Arabic", "Arabic", "ar"),
    ARABIC_LATIN("Arabic", "Latin", "ar-Latn"),
    AZERBAIJANI("Azerbaijani", "Latin", "az"),
    BELARUSIAN("Belarusian", "Cyrillic", "be"),
    BULGARIAN("Bulgarian", "Cyrillic", "bg"),
    BULGARIAN_LATIN("Bulgarian", "Latin", "bg-Latn"),
    BENGALI("Bengali", "Bengali", "bn"),
    BOSNIAN("Bosnian", "Latin", "bs"),
    CATALAN("Catalan", "Latin", "ca"),
    CEBUANO("Cebuano", "Latin", "ceb"),
    CORSICAN("Corsican", "Latin", "co"),
    CZECH("Czech", "Latin", "cs"),
    WELSH("Welsh", "Latin", "cy"),
    DANISH("Danish", "Latin", "da"),
    GERMAN("German", "Latin", "de"),
    GREEK("Greek", "Greek", "el"),
    GREEK_LATIN("Greek", "Latin", "el-Latn"),
    ENGLISH("English", "Latin", "en"),
    ESPERANTO("Esperanto", "Latin", "eo"),
    SPANISH("Spanish", "Latin", "es"),
    ESTONIAN("Estonian", "Latin", "et"),
    BASQUE("Basque", "Latin", "eu"),
    PERSIAN("Persian", "Arabic", "fa"),
    FINNISH("Finnish", "Latin", "fi"),
    FILIPINO("Filipino", "Latin", "fil"),
    FRENCH("French", "Latin", "fr"),
    WESTERN_FRISIAN("Western Frisian", "Latin", "fy"),
    IRISH("Irish", "Latin", "ga"),
    SCOTS_GAELIC("Scots Gaelic", "Latin", "gd"),
    GALICIAN("Galician", "Latin", "gl"),
    GUJARATI("Gujarati", "Gujarati", "gu"),
    HAUSA("Hausa", "Latin", "ha"),
    HAWAIIAN("Hawaiian", "Latin", "haw"),
    HEBREW("Hebrew", "Hebrew", "he"),
    HINDI("Hindi", "Devanagari", "hi"),
    HINDI_LATIN("Hindi", "Latin", "hi-Latn"),
    HMONG("Hmong", "Latin", "hmn"),
    CROATIAN("Croatian", "Latin", "hr"),
    HAITIAN("Haitian", "Latin", "ht"),
    HUNGARIAN("Hungarian", "Latin", "hu"),
    ARMENIAN("Armenian", "Armenian", "hy"),
    INDONESIAN("Indonesian", "Latin", "id"),
    IGBO("Igbo", "Latin", "ig"),
    ICELANDIC("Icelandic", "Latin", "is"),
    ITALIAN("Italian", "Latin", "it"),
    JAPANESE("Japanese", "Japanese", "ja"),
    JAPANESE_LATIN("Japanese", "Latin", "ja-Latn"),
    JAVANESE("Javanese", "Latin", "jv"),
    GEORGIAN("Georgian", "Georgian", "ka"),
    KAZAKH("Kazakh", "Cyrillic", "kk"),
    KHMER("Khmer", "Khmer", "km"),
    KANNADA("Kannada", "Kannada", "kn"),
    KOREAN("Korean", "Korean", "ko"),
    KURDISH("Kurdish", "Latin", "ku"),
    KYRGYZ("Kyrgyz", "Cyrillic", "ky"),
    LATIN("Latin", "Latin", "la"),
    LUXEMBOURGISH("Luxembourgish", "Latin", "lb"),
    LAO("Lao", "Lao", "lo"),
    LITHUANIAN("Lithuanian", "Latin", "lt"),
    LATVIAN("Latvian", "Latin", "lv"),
    MALAGASY("Malagasy", "Latin", "mg"),
    MAORI("Maori", "Latin", "mi"),
    MACEDONIAN("Macedonian", "Cyrillic", "mk"),
    MALAYALAM("Malayalam", "Malayalam", "ml"),
    MONGOLIAN("Mongolian", "Cyrillic", "mn"),
    MARATHI("Marathi", "Devanagari", "mr"),
    MALAY("Malay", "Latin", "ms"),
    MALTESE("Maltese", "Latin", "mt"),
    BURMESE("Burmese", "Myanmar", "my"),
    NEPALI("Nepali", "Devanagari", "ne"),
    DUTCH("Dutch", "Latin", "nl"),
    NORWEGIAN("Norwegian", "Latin", "no"),
    NYANJA("Nyanja", "Latin", "ny"),
    PUNJABI("Punjabi", "Gurmukhi", "pa"),
    POLISH("Polish", "Latin", "pl"),
    PASHTO("Pashto", "Arabic", "ps"),
    PORTUGUESE("Portuguese", "Latin", "pt"),
    ROMANIAN("Romanian", "Latin", "ro"),
    RUSSIAN("Russian", "Cyrillic", "ru"),
    RUSSIAN_LATIN("Russian", "Latin", "ru-Latn"),
    SINDHI("Sindhi", "Arabic", "sd"),
    SINHALA("Sinhala", "Sinhala", "si"),
    SLOVAK("Slovak", "Latin", "sk"),
    SLOVENIAN("Slovenian", "Latin", "sl"),
    SAMOAN("Samoan", "Latin", "sm"),
    SHONA("Shona", "Latin", "sn"),
    SOMALI("Somali", "Latin", "so"),
    ALBANIAN("Albanian", "Latin", "sq"),
    SERBIAN("Serbian", "Cyrillic", "sr"),
    SESOTHO("Sesotho", "Latin", "st"),
    SUNDANESE("Sundanese", "Latin", "su"),
    SWEDISH("Swedish", "Latin", "sv"),
    SWAHILI("Swahili", "Latin", "sw"),
    TAMIL("Tamil", "Tamil", "ta"),
    TELUGU("Telugu", "Telugu", "te"),
    TAJIK("Tajik", "Cyrillic", "tg"),
    THAI("Thai", "Thai", "th"),
    TURKISH("Turkish", "Latin", "tr"),
    UKRAINIAN("Ukrainian", "Cyrillic", "uk"),
    URDU("Urdu", "Arabic", "ur"),
    UZBEK("Uzbek", "Latin", "uz"),
    VIETNAMESE("Vietnamese", "Latin", "vi"),
    XHOSA("Xhosa", "Latin", "xh"),
    YIDDISH("Yiddish", "Hebrew", "yi"),
    YORUBA("Yoruba", "Latin", "yo"),
    CHINESE("Chinese", "Chinese", "zh"),
    CHINESE_LATIN("Chinese", "Latin", "zh-Latn"),
    ZULU("Zulu", "Latin", "zu")
}
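
Because the app looks up an entry by language code for every identified language, one possible refinement is to build a code-to-entry map once instead of scanning all entries each time. The sketch below uses a trimmed, hypothetical copy of the enum (SupportedLanguage, four entries) so that it runs on its own; fromCode is an illustrative helper, not part of the original code.

```kotlin
// Trimmed, hypothetical copy of the enum so the snippet is self-contained.
enum class SupportedLanguage(val langEnglish: String, val script: String, val langCode: String) {
    ENGLISH("English", "Latin", "en"),
    RUSSIAN("Russian", "Cyrillic", "ru"),
    RUSSIAN_LATIN("Russian", "Latin", "ru-Latn"),
    TURKISH("Turkish", "Latin", "tr");

    companion object {
        // Built once on first use; maps a BCP-47 tag to its enum entry.
        private val byCode: Map<String, SupportedLanguage> = entries.associateBy { it.langCode }

        // Returns null for tags that are not listed, such as "und" (undetermined).
        fun fromCode(code: String): SupportedLanguage? = byCode[code]
    }
}

fun main() {
    println(SupportedLanguage.fromCode("ru-Latn")?.langEnglish) // Russian
    println(SupportedLanguage.fromCode("und"))                  // null
}
```

The entries property requires Kotlin 1.9 or later, which the Fragment code in this article already assumes.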

ViewModel

This Kotlin class is an Android ViewModel built for language identification. Let’s break down the key components:

private val language = MutableLiveData<List<IdentifiedLanguage>?>(): This creates a MutableLiveData instance that holds the list of identified languages (IdentifiedLanguage), or null after a failure. It is private, so only the ViewModel can update it.

val languageData: LiveData<List<IdentifiedLanguage>?> get() = language: This public property exposes a read-only LiveData view of the MutableLiveData. UI components observe it so that the UI stays in sync with the underlying data.

private val languageIdentifier = LanguageIdentification.getClient(...): This initializes a LanguageIdentification client using LanguageIdentificationOptions with a confidence threshold of 0.2. The client identifies the languages of the provided text and leaves out candidates whose confidence falls below the threshold.
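
To make the effect of the threshold concrete, here is a plain-Kotlin sketch of the filtering idea. It is not ML Kit's implementation: Candidate and applyThreshold are hypothetical names, the inclusive comparison is an assumption, and the confidence attached to the "und" (undetermined) fallback is only a placeholder.

```kotlin
// Hypothetical stand-in for ML Kit's IdentifiedLanguage (language tag plus confidence).
data class Candidate(val languageTag: String, val confidence: Float)

// Keeps candidates at or above the threshold, highest confidence first.
// Falls back to a single "und" entry when nothing qualifies.
fun applyThreshold(candidates: List<Candidate>, threshold: Float): List<Candidate> {
    val kept = candidates
        .filter { it.confidence >= threshold }
        .sortedByDescending { it.confidence }
    return kept.ifEmpty { listOf(Candidate("und", 0f)) } // placeholder confidence
}

fun main() {
    val raw = listOf(Candidate("en", 0.75f), Candidate("nl", 0.25f), Candidate("af", 0.05f))
    println(applyThreshold(raw, 0.2f).map { it.languageTag }) // [en, nl]
    println(applyThreshold(raw, 0.9f).map { it.languageTag }) // [und]
}
```

A lower threshold returns more candidates to display, while a higher one returns fewer and makes the "und" result more likely.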

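  • fun processIdentification(identificationText: String): This function takes a text input and asks the languageIdentifier for the possible languages.
  • On success, addOnSuccessListener posts the identified languages to the language MutableLiveData.
  • On failure, addOnFailureListener posts null and logs the error. The listener does not rethrow: it runs after processIdentification has already returned, so no caller could catch the exception and the app would crash.
  • onCleared closes the identifier so that its resources are released together with the ViewModel.

import android.util.Log
import androidx.lifecycle.LiveData
import androidx.lifecycle.MutableLiveData
import androidx.lifecycle.ViewModel
import com.google.mlkit.nl.languageid.IdentifiedLanguage
import com.google.mlkit.nl.languageid.LanguageIdentification
import com.google.mlkit.nl.languageid.LanguageIdentificationOptions

class LanguageIdentificationViewModel : ViewModel() {

    // Backing field; only the ViewModel may modify it.
    private val language = MutableLiveData<List<IdentifiedLanguage>?>()
    val languageData: LiveData<List<IdentifiedLanguage>?>
        get() = language

    // Candidates with a confidence below 0.2 are left out of the results.
    private val languageIdentifier = LanguageIdentification
        .getClient(
            LanguageIdentificationOptions.Builder().setConfidenceThreshold(0.2f).build()
        )

    fun processIdentification(identificationText: String) {
        languageIdentifier.identifyPossibleLanguages(identificationText)
            .addOnSuccessListener { identifiedLanguages ->
                language.postValue(identifiedLanguages)
            }
            .addOnFailureListener { exception ->
                // Report the failure as null; do not rethrow from this asynchronous callback.
                language.postValue(null)
                Log.e(TAG, "Language identification failed", exception)
            }
    }

    override fun onCleared() {
        // Release the identifier's resources together with the ViewModel.
        languageIdentifier.close()
        super.onCleared()
    }

    companion object {
        private val TAG = LanguageIdentificationViewModel::class.java.simpleName
    }
}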
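
Signalling a failure with null works, but the observer cannot tell what went wrong. One alternative, sketched below in plain Kotlin, is to publish a sealed state instead of a nullable list. IdentificationState and describe are hypothetical names that do not appear in the original app; in the ViewModel, the success listener would post Success and the failure listener would post Failure.

```kotlin
// Hypothetical UI state: either the identified language tags or a failure message.
sealed interface IdentificationState {
    data class Success(val languageTags: List<String>) : IdentificationState
    data class Failure(val message: String) : IdentificationState
}

// The compiler checks that every state is handled, so no case is silently ignored.
fun describe(state: IdentificationState): String = when (state) {
    is IdentificationState.Success -> "Identified: ${state.languageTags.joinToString()}"
    is IdentificationState.Failure -> "Error: ${state.message}"
}

fun main() {
    println(describe(IdentificationState.Success(listOf("en", "nl"))))  // Identified: en, nl
    println(describe(IdentificationState.Failure("model unavailable"))) // Error: model unavailable
}
```

With this shape the Fragment could show the failure message in a toast while leaving the previous result on screen.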

Fragment

The LanguageIdentificationFragment is the Android Fragment that hosts the language identification screen. Key features include:

  • Uses view binding to interact with UI elements.
  • Obtains a LanguageIdentificationViewModel through the viewModels() delegate, which connects the UI to the identification logic.
  • Inflates the layout in onCreateView and sets up UI interactions in onViewCreated.
  • The PROCESS button (buttonLanguage) passes the entered text to the ViewModel for identification.
  • Watches textViewLanguage for changes and shows buttonLanguageCopy only when there is text to copy.
  • Observes languageData and updates the UI with each identified language's English name, script, and confidence level.
  • Logs each identified language tag and its confidence level.
  • Copies the displayed result to the clipboard and shows a toast message on success.
  • Clears the binding reference in onDestroyView.

import android.content.ClipData
import android.content.ClipboardManager
import android.content.Context
import android.os.Bundle
import android.text.Editable
import android.text.TextWatcher
import android.util.Log
import android.view.LayoutInflater
import android.view.View
import android.view.ViewGroup
import android.widget.Toast
import androidx.fragment.app.Fragment
import androidx.fragment.app.viewModels
import com.google.mlkit.nl.languageid.IdentifiedLanguage
// FragmentLanguageIdentificationBinding is generated by view binding in the app's own package.

class LanguageIdentificationFragment : Fragment() {

    private var _binding: FragmentLanguageIdentificationBinding? = null
    // Only valid between onCreateView and onDestroyView.
    private val binding get() = _binding!!

    private val viewModel: LanguageIdentificationViewModel by viewModels()

    override fun onCreateView(
        inflater: LayoutInflater, container: ViewGroup?,
        savedInstanceState: Bundle?
    ): View {
        _binding = FragmentLanguageIdentificationBinding.inflate(inflater, container, false)
        return binding.root
    }

    override fun onViewCreated(view: View, savedInstanceState: Bundle?) {
        super.onViewCreated(view, savedInstanceState)

        with(binding) {
            buttonLanguage.setOnClickListener {
                // The result (or null on failure) arrives asynchronously through languageData,
                // so a try/catch around this call would have nothing to catch.
                viewModel.processIdentification(editTextLanguage.text?.toString().orEmpty())
            }

            // Show the copy button only while there is a result to copy.
            textViewLanguage.addTextChangedListener(object : TextWatcher {
                override fun beforeTextChanged(s: CharSequence?, start: Int, count: Int, after: Int) {}
                override fun afterTextChanged(s: Editable?) {}

                override fun onTextChanged(s: CharSequence?, start: Int, before: Int, count: Int) {
                    buttonLanguageCopy.visibility = if (s.isNullOrEmpty()) View.INVISIBLE else View.VISIBLE
                }
            })

            buttonLanguageCopy.setOnClickListener { copyTextToClipboard() }
        }

        viewModel.languageData.observe(viewLifecycleOwner, ::getLanguageData)
    }

    private fun getLanguageData(languages: List<IdentifiedLanguage>?) {
        // A null list means identification failed; clear any stale result.
        if (languages == null) {
            binding.textViewLanguage.text = ""
            return
        }

        val langList = buildString {
            for (language in languages) {
                val languageTag = language.languageTag
                val confidenceLevel = language.confidence
                // Map the tag to its display entry; tags without an entry (such as "und") are skipped.
                LanguageIdentificationSupportedLanguages.entries
                    .firstOrNull { it.langCode == languageTag }
                    ?.let { append(IdentificationLanguage(it.langEnglish, it.langNative, confidenceLevel).toString()) }
                Log.i(TAG, "Language: $languageTag - Confidence: $confidenceLevel")
            }
        }

        binding.textViewLanguage.text = langList
    }

    private fun copyTextToClipboard() {
        val clipboardManager = requireContext().getSystemService(Context.CLIPBOARD_SERVICE) as ClipboardManager
        val clip = ClipData.newPlainText("Copied Text", binding.textViewLanguage.text.toString())
        clipboardManager.setPrimaryClip(clip)

        Toast.makeText(requireContext(), "Text copied to clipboard!", Toast.LENGTH_SHORT).show()
    }

    override fun onDestroyView() {
        super.onDestroyView()
        // Drop the binding so the destroyed view hierarchy can be garbage collected.
        _binding = null
    }

    companion object {
        private val TAG = LanguageIdentificationFragment::class.java.simpleName
    }
}
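
The text-building logic in getLanguageData depends only on language tags and confidences, so it could be pulled out into a pure function and unit tested without an Android device. The following is a minimal sketch under that assumption; formatResults and its nameOf parameter are hypothetical, and the fallback message is illustrative.

```kotlin
// Hypothetical pure function: turns (tag, confidence) pairs into display text.
// `nameOf` stands in for the enum lookup and returns null for unknown tags.
fun formatResults(
    results: List<Pair<String, Float>>,
    nameOf: (String) -> String?
): String {
    val lines = results
        .sortedByDescending { it.second } // most confident first
        .mapNotNull { (tag, confidence) ->
            nameOf(tag)?.let { name -> "$name [$tag] $confidence" }
        }
    // Nothing recognizable (for example only "und") yields a fallback message.
    return if (lines.isEmpty()) "Language could not be determined." else lines.joinToString("\n")
}

fun main() {
    val names = mapOf("en" to "English", "nl" to "Dutch")
    println(formatResults(listOf("nl" to 0.25f, "en" to 0.75f)) { names[it] })
    println(formatResults(listOf("und" to 1.0f)) { names[it] })
}
```

The Fragment would then only assign the returned string to textViewLanguage, which keeps the UI code thin.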

Design

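This XML layout file defines the user interface for the LanguageIdentificationFragment. A vertical LinearLayout holds the input field (editTextLanguage), followed by a FrameLayout that stacks the copy button (buttonLanguageCopy), the result text (textViewLanguage), and the PROCESS button (buttonLanguage):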

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    android:layout_marginBottom="?android:actionBarSize"
    android:orientation="vertical"
    tools:context=".identification.LanguageIdentificationFragment">

    <!-- Input: the text whose language should be identified. -->
    <androidx.appcompat.widget.AppCompatEditText
        android:id="@+id/editTextLanguage"
        android:layout_width="match_parent"
        android:layout_height="350dp"
        android:fontFamily="@font/poppins_regular"
        android:padding="20dp"
        android:textSize="20sp"
        android:hint="Enter the text whose language you want to identify."
        tools:text="My hovercraft is full of eels." />

    <FrameLayout
        android:layout_width="match_parent"
        android:layout_height="match_parent">

        <!-- Hidden until there is a result to copy. -->
        <androidx.appcompat.widget.AppCompatImageButton
            android:id="@+id/buttonLanguageCopy"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_gravity="end"
            android:padding="10dp"
            android:src="@drawable/icon_copy"
            android:contentDescription="Copy identified languages"
            android:visibility="invisible"
            android:background="@android:color/transparent"
            android:tint="@color/material_dynamic_tertiary60"/>

        <!-- Output: one line per identified language. -->
        <androidx.appcompat.widget.AppCompatTextView
            android:id="@+id/textViewLanguage"
            android:layout_width="match_parent"
            android:layout_height="match_parent"
            android:layout_gravity="center"
            android:fontFamily="@font/poppins_medium"
            android:gravity="center"
            android:padding="20dp"
            android:textSize="15sp"
            android:hint="The identified languages will appear here."
            tools:text="en (English)" />

        <com.google.android.material.floatingactionbutton.ExtendedFloatingActionButton
            android:id="@+id/buttonLanguage"
            android:layout_width="wrap_content"
            android:layout_height="wrap_content"
            android:layout_margin="16dp"
            android:layout_gravity="bottom|end"
            android:text="PROCESS"
            android:fontFamily="@font/poppins_bold"
            app:icon="@drawable/icon_search"/>
    </FrameLayout>
</LinearLayout>

İbrahim Can Erdoğan
