Feature #46

identify-non-translatable-patterns

Added by Pawan Kumar about 2 months ago. Updated about 2 months ago.

Status: Closed
Priority: High
Assignee:
Start date: 02/01/2026
Due date: 03/01/2026
% Done: 100%
Estimated time:

Description

Identify non-translatable patterns in the collected text.
These patterns need to be handled differently, or need not be translated at all.
Examples of such patterns are:

  • Numbers (1, 2,302)
  • Numbers with thousand separators
  • Sections (1.3c, 1.3.4(b), 1.3.4-1, etc.)
  • Versions (0.3p, 1.4.9-H, etc.)
  • Decimals (1.43679)
  • bullets (1.1, a.1, 3.c, -, *,
  • Numbers with prefix (Size 4.2
  • Numbers with suffix (4.3 Kg, 78 Lit, 400gm, etc.)
  • Numbers enclosed with brackets ([123], (239.3))
  • Numbers enclosed with starting brackets ([12 )
  • Special Tokens with numbers (Size: 5.2 Mb, etc.)
  • emails like (email: , , etc)
  • Phone numbers (various combinations: Ph: 86754321, Mob. 9633 638 382, M. +91-3456-003-32, contact: 2345678)
  • WhatsApp
  • Acronyms (standalone acronyms)
  • Standalone punctuation (.,",",`,-,+,=,-,@#$%^&*())
  • Protocols (http://, https://, sftp://, ftp://, etc.)
  • Domain names (http://www.abc.com, https://abc.com, https://abc.com/index.html, etc.)
  • Patterns like: [Dated: 17-09-2024]
  • Year (like "Year, 2005", 2025, 2024-25, 2024-2025, etc.)
  • Month: (Sep, Oct, January,
  • Dates: like ("Mar, 5, 2004", "25 April 2003", "5 May 04", "1 jun'4"
  • References: like, E-12020/03/2015-E&A [Updated on
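
One possible way to handle a few of the patterns listed above is to mask them with placeholders before translation and restore them afterwards. The sketch below is only an illustration under stated assumptions: the pattern table, the labels (URL, EMAIL, DATE, NUM_UNIT, NUMBER), the placeholder format, and the function names are all hypothetical and are not taken from this ticket. It covers only a small subset of the list, and real data would likely need broader rules.

```python
import re

# Hypothetical pattern table (an assumption, not the project's actual rules).
# Order matters: a span claimed by an earlier pattern is not re-matched later.
PATTERNS = [
    ("URL", re.compile(r"\b(?:https?|s?ftp)://[^\s<>\"]+")),
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
    ("DATE", re.compile(r"\b\d{1,2}[-/]\d{1,2}[-/]\d{2,4}\b")),
    ("NUM_UNIT", re.compile(r"\b\d+(?:\.\d+)?\s?(?:Kg|Lit|gm|Mb)\b")),
    # Numbers with thousand separators first, then plain integers/decimals.
    ("NUMBER", re.compile(r"\b\d{1,3}(?:,\d{3})+(?:\.\d+)?\b|\b\d+(?:\.\d+)?\b")),
]


def find_spans(text):
    """Return non-overlapping (start, end, label) spans, sorted by position."""
    spans = []
    for label, rx in PATTERNS:
        for m in rx.finditer(text):
            # Keep the match only if it does not overlap an accepted span.
            if all(m.end() <= s or m.start() >= e for s, e, _ in spans):
                spans.append((m.start(), m.end(), label))
    return sorted(spans)


def mask(text):
    """Replace each span with <LABEL_i>; return the masked text and originals."""
    out, kept, pos = [], [], 0
    for i, (s, e, label) in enumerate(find_spans(text)):
        out.append(text[pos:s])
        out.append(f"<{label}_{i}>")
        kept.append(text[s:e])
        pos = e
    out.append(text[pos:])
    return "".join(out), kept


def unmask(masked, kept):
    """Put the original fragments back in place of their placeholders."""
    return re.sub(r"<[A-Z_]+_(\d+)>", lambda m: kept[int(m.group(1))], masked)


if __name__ == "__main__":
    sample = "[Dated: 17-09-2024] Size: 5.2 Mb, see https://abc.com/index.html"
    masked, kept = mask(sample)
    print(masked)
    print(kept)
    print(unmask(masked, kept) == sample)
```

The ordering of the table is a design choice in this sketch: more specific patterns (dates, numbers with units) are tried before the generic number rule, so that "17-09-2024" is kept as one unit instead of being split into three numbers.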