master-wayne7/safe_text

License: MIT

A high-performance Flutter package for filtering offensive language (profanity) and detecting phone numbers. Powered by the Aho-Corasick algorithm for O(N) single-pass scanning across 80+ languages and 55,000+ curated words.


What's New in 2.0.0

  • Aho-Corasick algorithm: multi-pattern search in a single O(N) pass over the input.
  • Up to 20x faster than the legacy regex-loop approach.
  • 80+ languages: full human-readable enum names (e.g., Language.hindi, Language.spanish).
  • Modular API: SafeTextFilter for profanity, PhoneNumberChecker for phone numbers.
  • Memory efficient: single-pass string building via StringBuffer.
  • Leet-speak normalization: catches bypasses like f@ck or b4d with zero extra overhead.

Features

  • Scans thousands of bad words in a single pass of the input text.
  • Catches common character substitutions: @→a, 4→a, 3→e, 0→o, $→s, and more.
  • Detects phone numbers in digits, words, mixed formats, and multiplier words (e.g., "triple five").
  • Multiple masking strategies — full (******), partial (f**k), or custom replacement ([censored]).
  • Customizable — add your own words or exclude specific phrases.
  • Non-blocking — PhoneNumberChecker runs in a separate isolate via compute.
  • Works on Android, iOS, Web, macOS, Linux, and Windows.
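
The character substitutions listed above can be pictured as a one-to-one mapping applied before matching. The following is a minimal sketch in plain Dart, not the package's actual normalizer: the names leetMap and normalizeLeet are hypothetical, and the table contains only the substitutions named in this README.

```dart
/// Hypothetical substitution table (illustration only). It holds just the
/// pairs listed in the README; the package's real table may be larger.
const Map<String, String> leetMap = {
  '@': 'a',
  '4': 'a',
  '3': 'e',
  '0': 'o',
  r'$': 's',
};

/// Replaces each mapped symbol with its letter and lowercases everything else.
String normalizeLeet(String input) {
  final buffer = StringBuffer();
  for (final rune in input.runes) {
    final ch = String.fromCharCode(rune);
    buffer.write(leetMap[ch] ?? ch.toLowerCase());
  }
  return buffer.toString();
}

void main() {
  print(normalizeLeet(r'b4d4$$')); // badass
  print(normalizeLeet('H3LL0')); // hello
}
```

Because every substitution swaps one character for one character, match offsets found in the normalized text of ASCII input can be applied directly to the original string.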

Installation

Add safe_text to your project using the Flutter CLI:

flutter pub add safe_text

Or manually add it to your pubspec.yaml:

dependencies:
  safe_text: ^2.1.2

Then run:

flutter pub get

Quick Start

import 'package:safe_text/safe_text.dart';

void main() async {
  // Initialize once at app startup
  await SafeTextFilter.init(language: Language.english);

  // Filter profanity (full masking — default)
  final clean = SafeTextFilter.filterText(text: "What the f@ck!");
  print(clean); // "What the ****!"

  // Partial masking — keeps first & last characters for 4+ letter words
  final partial = SafeTextFilter.filterText(
    text: "What the f@ck!",
    strategy: const MaskStrategy.partial(),
  );
  print(partial); // "What the f**k!"

  // Custom replacement
  final custom = SafeTextFilter.filterText(
    text: "What the f@ck!",
    strategy: const MaskStrategy.custom(replacement: '[censored]'),
  );
  print(custom); // "What the [censored]!"

  // Check for bad words
  final hasBad = await SafeTextFilter.containsBadWord(text: "Some bad input");
  print(hasBad); // true or false

  // Detect phone numbers
  final hasPhone = await PhoneNumberChecker.containsPhoneNumber(
    text: "Call me at nine 7 eight 3 triple four",
  );
  print(hasPhone); // true
}

API Reference

SafeTextFilter.init

Must be called once before using filterText or containsBadWord. Builds the Aho-Corasick trie from the selected word list(s).

// Single language
await SafeTextFilter.init(language: Language.english);

// Custom combination
await SafeTextFilter.init(languages: [Language.english, Language.hindi, Language.spanish]);

// All 80+ languages
await SafeTextFilter.init(language: Language.all);

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| language | Language? | Language.english | A single language to load. Use Language.all to load every language. Ignored when languages is provided. |
| languages | List<Language>? | null | A custom list of languages. Takes precedence over language. |

Note: If neither parameter is provided, the filter defaults to Language.english.


SafeTextFilter.filterText

Synchronous. Returns the input text with matched bad words masked according to the chosen MaskStrategy.

// Full masking (default)
String result = SafeTextFilter.filterText(
  text: "Hello b4dass world!",
  extraWords: ["badterm"],      // optional: add custom words
  excludedWords: ["bass"],      // optional: never filter these
  useDefaultWords: true,        // use the built-in word list
);
// Result: "Hello ****** world!"

// Partial masking
String partial = SafeTextFilter.filterText(
  text: "Hello b4dass world!",
  strategy: const MaskStrategy.partial(),
);
// Result: "Hello b****s world!"

// Custom replacement
String custom = SafeTextFilter.filterText(
  text: "Hello b4dass world!",
  strategy: const MaskStrategy.custom(), // defaults to "[censored]"
);
// Result: "Hello [censored] world!"

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | String | required | The input string to process. |
| extraWords | List<String>? | null | Additional words to filter on top of (or instead of) the built-in list. |
| excludedWords | List<String>? | null | Words that must never be filtered, even if they appear in the list. |
| useDefaultWords | bool | true | Include the built-in language word list. Set to false to use only extraWords. |
| strategy | MaskStrategy? | null (defaults to MaskStrategy.full()) | Masking strategy. See Masking Strategies below. |
| fullMode | bool | true | Deprecated. Use strategy instead. true maps to MaskStrategy.full(), false maps to MaskStrategy.partial(). |
| obscureSymbol | String | '*' | Deprecated. Pass obscureSymbol via MaskStrategy.full() or MaskStrategy.partial() instead. |

Precedence: When strategy is provided, it takes full precedence over the deprecated fullMode and obscureSymbol parameters. When strategy is omitted, fullMode: true maps to MaskStrategy.full(obscureSymbol: obscureSymbol) and fullMode: false maps to MaskStrategy.partial(obscureSymbol: obscureSymbol).
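
The precedence rule can be read as a small resolution function. The sketch below is an illustration in plain Dart under stated assumptions: Mask is a hypothetical stand-in for the package's MaskStrategy, and resolveStrategy is not part of the package API.

```dart
/// Hypothetical stand-in for MaskStrategy (illustration only).
class Mask {
  final String kind; // 'full' or 'partial'
  final String symbol;
  const Mask.full({this.symbol = '*'}) : kind = 'full';
  const Mask.partial({this.symbol = '*'}) : kind = 'partial';

  @override
  String toString() => '$kind($symbol)';
}

/// An explicit strategy wins; otherwise the deprecated parameters are mapped
/// onto a full or partial strategy that carries the obscure symbol.
Mask resolveStrategy({
  Mask? strategy,
  bool fullMode = true,
  String obscureSymbol = '*',
}) {
  if (strategy != null) return strategy;
  return fullMode
      ? Mask.full(symbol: obscureSymbol)
      : Mask.partial(symbol: obscureSymbol);
}

void main() {
  print(resolveStrategy()); // full(*)
  print(resolveStrategy(fullMode: false, obscureSymbol: '#')); // partial(#)
  // The deprecated arguments are ignored once a strategy is given.
  print(resolveStrategy(
    strategy: const Mask.partial(),
    fullMode: true,
    obscureSymbol: '#',
  )); // partial(*)
}
```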

Masking Strategies

| Strategy | Constructor | Output Example | Description |
| --- | --- | --- | --- |
| Full | MaskStrategy.full(obscureSymbol: '*') | badass → ****** | Replaces every character with the obscure symbol. |
| Partial | MaskStrategy.partial(obscureSymbol: '*') | damn → d**n, ass → a** | Keeps the first character visible. For 4+ letter words, also keeps the last character. |
| Custom | MaskStrategy.custom(replacement: '[censored]') | badass → [censored] | Replaces the entire word with a fixed string. |

Note: obscureSymbol must be exactly one character. This is enforced via assert in debug mode — a multi-character string will trigger an AssertionError during development.
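
To make the three behaviours concrete, here is a minimal sketch of the masking rules as plain Dart functions. The function names maskFull, maskPartial, and maskCustom are hypothetical; only the rules themselves (every character, first and last kept for 4+ letters, fixed replacement) come from the descriptions above.

```dart
/// Full: every character becomes the obscure symbol.
String maskFull(String word, {String symbol = '*'}) => symbol * word.length;

/// Partial: the first character stays visible; words of four or more
/// characters also keep their last character.
String maskPartial(String word, {String symbol = '*'}) {
  if (word.isEmpty) return word;
  if (word.length < 4) {
    return word[0] + symbol * (word.length - 1);
  }
  return word[0] + symbol * (word.length - 2) + word[word.length - 1];
}

/// Custom: the whole word is swapped for a fixed string.
String maskCustom(String word, {String replacement = '[censored]'}) =>
    replacement;

void main() {
  print(maskFull('badass')); // ******
  print(maskPartial('damn')); // d**n
  print(maskPartial('ass')); // a**
  print(maskCustom('badass')); // [censored]
}
```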


SafeTextFilter.containsBadWord

Asynchronous. Returns true if the text contains at least one filtered word.

bool hasBadWord = await SafeTextFilter.containsBadWord(
  text: "Don't be a pendejo",
  extraWords: ["badterm"],   // optional
  excludedWords: ["pend"],   // optional
  useDefaultWords: true,     // optional
);

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | String | required | The input string to check. |
| extraWords | List<String>? | null | Additional words to check against. |
| excludedWords | List<String>? | null | Words to ignore even if matched. |
| useDefaultWords | bool | true | Include the built-in word list in the check. |

PhoneNumberChecker.containsPhoneNumber

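Asynchronous. Runs in a separate isolate via Flutter's compute function so it does not block the UI thread. On the web, where isolates are not available, compute runs the callback on the main thread instead.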

Detects phone numbers expressed as:

  • Pure digits: 9783444
  • Word-based: nine seven eight three four four four
  • Mixed: 9 seven 8 3444
  • Multiplier words: nine seven eight three triple four

Supported multiplier words: double, triple, quadruple, quintuple, sextuple, septuple, octuple, nonuple, decuple.

bool hasPhone = await PhoneNumberChecker.containsPhoneNumber(
  text: "Call me at nine 7 eight 3 triple four",
  minLength: 7,   // minimum digit count to be considered a phone number
  maxLength: 15,  // maximum digit count
);

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| text | String | required | The input string to check. |
| minLength | int | 7 | Minimum number of digits for a valid phone number. |
| maxLength | int | 15 | Maximum number of digits for a valid phone number. |
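
One way to think about the detection formats above is as a conversion of every token into digits, followed by a length check. The sketch below is an assumption about how such a detector could work, not the package's implementation: toDigitRuns and looksLikePhoneNumber are hypothetical names, and the sketch runs synchronously rather than through compute.

```dart
const digitWords = {
  'zero': '0', 'one': '1', 'two': '2', 'three': '3', 'four': '4',
  'five': '5', 'six': '6', 'seven': '7', 'eight': '8', 'nine': '9',
};

const multipliers = {
  'double': 2, 'triple': 3, 'quadruple': 4, 'quintuple': 5, 'sextuple': 6,
  'septuple': 7, 'octuple': 8, 'nonuple': 9, 'decuple': 10,
};

/// Turns digit words, digits, and multiplier words into digit runs.
/// Any other token becomes a space that ends the current run.
String toDigitRuns(String text) {
  final tokens = text.toLowerCase().split(RegExp(r'[^a-z0-9]+'));
  final out = StringBuffer();
  var repeat = 1;
  for (final token in tokens) {
    if (token.isEmpty) continue;
    if (multipliers.containsKey(token)) {
      repeat = multipliers[token]!; // applies to the next digit token
      continue;
    }
    final digits = digitWords[token] ??
        (RegExp(r'^[0-9]+$').hasMatch(token) ? token : null);
    if (digits == null) {
      out.write(' ');
    } else {
      out.write(digits * repeat);
    }
    repeat = 1;
  }
  return out.toString();
}

/// True when some digit run has between minLength and maxLength digits.
bool looksLikePhoneNumber(String text,
    {int minLength = 7, int maxLength = 15}) {
  return toDigitRuns(text)
      .split(' ')
      .any((run) => run.length >= minLength && run.length <= maxLength);
}

void main() {
  print(toDigitRuns('nine 7 eight 3 triple four')); // 9783444
  print(looksLikePhoneNumber('Call me at nine 7 eight 3 triple four')); // true
  print(looksLikePhoneNumber('I have 2 cats')); // false
}
```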

Supported Languages

Pass any of these Language enum values to SafeTextFilter.init. Use Language.all to load every language simultaneously.

All 82 supported Language values (81 languages plus Language.all):
Enum Language
Language.afrikaans Afrikaans
Language.amharic Amharic
Language.arabic Arabic
Language.azerbaijani Azerbaijani
Language.belarusian Belarusian
Language.bulgarian Bulgarian
Language.catalan Catalan
Language.cebuano Cebuano
Language.czech Czech
Language.welsh Welsh
Language.danish Danish
Language.german German
Language.dzongkha Dzongkha
Language.greek Greek
Language.english English
Language.esperanto Esperanto
Language.spanish Spanish
Language.estonian Estonian
Language.basque Basque
Language.persian Persian
Language.finnish Finnish
Language.filipino Filipino
Language.french French
Language.scottishGaelic Scottish Gaelic
Language.galician Galician
Language.hindi Hindi
Language.croatian Croatian
Language.hungarian Hungarian
Language.armenian Armenian
Language.indonesian Indonesian
Language.icelandic Icelandic
Language.italian Italian
Language.japanese Japanese
Language.kabyle Kabyle
Language.kannada Kannada
Language.khmer Khmer
Language.korean Korean
Language.latin Latin
Language.lithuanian Lithuanian
Language.latvian Latvian
Language.maori Maori
Language.macedonian Macedonian
Language.malayalam Malayalam
Language.mongolian Mongolian
Language.marathi Marathi
Language.malay Malay
Language.maltese Maltese
Language.burmese Burmese
Language.dutch Dutch
Language.norwegian Norwegian
Language.norfuk Norfuk / Pitcairn
Language.piapoco Piapoco
Language.polish Polish
Language.portuguese Portuguese
Language.romanian Romanian
Language.kriol Kriol
Language.russian Russian
Language.slovak Slovak
Language.slovenian Slovenian
Language.samoan Samoan
Language.albanian Albanian
Language.serbian Serbian
Language.swedish Swedish
Language.tamil Tamil
Language.telugu Telugu
Language.tetum Tetum
Language.thai Thai
Language.klingon Klingon
Language.tongan Tongan
Language.turkish Turkish
Language.ukrainian Ukrainian
Language.uzbek Uzbek
Language.vietnamese Vietnamese
Language.yiddish Yiddish
Language.chinese Chinese
Language.zulu Zulu
Language.bengali Bengali
Language.gujarati Gujarati
Language.punjabi Punjabi
Language.swahili Swahili
Language.urdu Urdu
Language.all All of the above

How it Works

Legacy approach (v1.x): For each bad word in a list of 10,000+ words, run a separate regex scan over the entire input — O(W × N) where W is the word count.

v2.0.0 approach: The Aho-Corasick algorithm builds a finite-state automaton (a trie augmented with failure links) once from the entire word list. The engine then scans the input exactly once, matching all patterns simultaneously in O(N) time plus the number of matches reported, where N is the length of the text, regardless of how many words are in the list.
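
For readers unfamiliar with the algorithm, the following is a compact, textbook-style Aho-Corasick sketch in plain Dart 3 (it uses records). It is an illustration of the technique, not the package's engine; the class and method names are hypothetical.

```dart
import 'dart:collection';

class _Node {
  final Map<int, _Node> next = {};
  _Node? fail; // longest proper suffix that is also a trie prefix
  final List<int> outputs = []; // lengths of patterns ending at this state
}

class AhoCorasick {
  final _Node _root = _Node();

  AhoCorasick(Iterable<String> patterns) {
    // Phase 1: insert every pattern into the trie.
    for (final p in patterns) {
      var node = _root;
      for (final unit in p.codeUnits) {
        node = node.next.putIfAbsent(unit, () => _Node());
      }
      node.outputs.add(p.length);
    }
    // Phase 2: breadth-first pass that sets failure links.
    final queue = Queue<_Node>();
    for (final child in _root.next.values) {
      child.fail = _root;
      queue.add(child);
    }
    while (queue.isNotEmpty) {
      final node = queue.removeFirst();
      node.next.forEach((unit, child) {
        var f = node.fail;
        while (f != null && !f.next.containsKey(unit)) {
          f = f.fail;
        }
        child.fail = f == null ? _root : f.next[unit];
        child.outputs.addAll(child.fail!.outputs);
        queue.add(child);
      });
    }
  }

  /// Scans [text] once and returns (start, end) ranges, end exclusive.
  List<(int, int)> search(String text) {
    final matches = <(int, int)>[];
    var node = _root;
    for (var i = 0; i < text.length; i++) {
      final unit = text.codeUnitAt(i);
      while (node != _root && !node.next.containsKey(unit)) {
        node = node.fail!;
      }
      node = node.next[unit] ?? _root;
      for (final len in node.outputs) {
        matches.add((i + 1 - len, i + 1));
      }
    }
    return matches;
  }
}

void main() {
  const text = 'ushers';
  final automaton = AhoCorasick(['he', 'she', 'his', 'hers']);
  for (final (start, end) in automaton.search(text)) {
    print('${text.substring(start, end)} at $start..$end');
  }
  // she at 1..4
  // he at 2..4
  // hers at 2..6
}
```

The automaton is built once; each call to search then walks the text a single time, following failure links instead of restarting the scan for every pattern.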

Input text  ──► [Normalizer] ──► [Aho-Corasick FSA] ──► Match ranges ──► [StringBuffer] ──► Filtered text
                  (leet-speak)     (single O(N) pass)    (merged)         (single-pass)
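
The last two stages of the pipeline, merging match ranges and rebuilding the string with a StringBuffer, can be sketched as follows. This is a hedged illustration in plain Dart 3, assuming matches arrive as (start, end) pairs with an exclusive end; mergeRanges and maskRanges are hypothetical names, and only full masking is shown.

```dart
/// Merges overlapping or adjacent (start, end) ranges; end is exclusive.
List<(int, int)> mergeRanges(List<(int, int)> ranges) {
  final sorted = [...ranges]..sort((a, b) => a.$1.compareTo(b.$1));
  final merged = <(int, int)>[];
  for (final (start, end) in sorted) {
    if (merged.isNotEmpty && start <= merged.last.$2) {
      final last = merged.removeLast();
      merged.add((last.$1, end > last.$2 ? end : last.$2));
    } else {
      merged.add((start, end));
    }
  }
  return merged;
}

/// Copies unmatched text as-is and writes the symbol over each merged range,
/// building the result in one left-to-right pass.
String maskRanges(String text, List<(int, int)> ranges,
    {String symbol = '*'}) {
  final out = StringBuffer();
  var cursor = 0;
  for (final (start, end) in mergeRanges(ranges)) {
    out.write(text.substring(cursor, start));
    out.write(symbol * (end - start));
    cursor = end;
  }
  out.write(text.substring(cursor));
  return out.toString();
}

void main() {
  // Overlapping matches collapse into one masked region.
  print(maskRanges('ushers!', [(1, 4), (2, 4), (2, 6)])); // u*****!
  print(maskRanges('one bad, two bad', [(4, 7), (13, 16)])); // one ***, two ***
}
```

Merging first means each character of the input is written exactly once, even when several patterns overlap on the same span.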

Migrating from v1.x

The original SafeText class is still available but marked @Deprecated. It internally delegates to the new classes. Migrate when ready:

| v1.x | v2.0.0 |
| --- | --- |
| (no equivalent) | await SafeTextFilter.init(...) (required; call once at startup) |
| SafeText.filterText(text: ...) | SafeTextFilter.filterText(text: ...) |
| await SafeText.containsBadWord(text: ...) | await SafeTextFilter.containsBadWord(text: ...) |
| await SafeText.containsPhoneNumber(text: ...) | await PhoneNumberChecker.containsPhoneNumber(text: ...) |

Before:

// v1.x — no init required, but slow
bool bad = await SafeText.containsBadWord(text: "some input");

After:

// v2.0.0 — init once, then use anywhere
await SafeTextFilter.init(language: Language.english); // once, e.g. in main()
bool bad = await SafeTextFilter.containsBadWord(text: "some input");
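
If several parts of an app might trigger initialization, one common pattern is to cache the Future so the setup runs only once. The sketch below shows that pattern in plain Dart; loadWordLists is a hypothetical stand-in for the call to SafeTextFilter.init, and ensureFilterReady is not part of the package.

```dart
/// Counts how often the expensive setup actually runs (for demonstration).
var initCalls = 0;

/// Hypothetical stand-in for `SafeTextFilter.init(...)`.
Future<void> loadWordLists() async {
  initCalls++;
  await Future<void>.delayed(Duration.zero); // pretend to load word lists
}

Future<void>? _ready;

/// Starts initialization on first use and returns the same Future afterwards,
/// so concurrent callers all wait on a single setup run.
Future<void> ensureFilterReady() => _ready ??= loadWordLists();

Future<void> main() async {
  await Future.wait([ensureFilterReady(), ensureFilterReady()]);
  await ensureFilterReady();
  print(initCalls); // 1
}
```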

Limitations

  • SafeTextFilter.init must be called before use. Calling filterText or containsBadWord before init will fall back to a small built-in word list without the full multilingual dataset.
  • Phone number detection is English-word based. Words like "nine", "triple", etc. are English only — the detector does not parse written numbers in other languages.
  • False positives on technical terms. Short words in the filter list may match substrings of unrelated technical terms. Use excludedWords to suppress known false positives.
  • Language.all increases init time. Loading all 80+ language files is I/O-heavy. For most apps a targeted language list is faster.

Contributing

Contributions are welcome! Please read CONTRIBUTING.md for the full guidelines. The short version:

  1. Clone the repo and check out the develop branch.
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Add tests for any new behaviour.
  4. Run checks before submitting:
    flutter analyze
    flutter test
  5. Open a pull request targeting develop. Ensure CI passes.

For major changes, please open an issue first to discuss the approach.


Data Source

SafeText uses the List of Dirty, Naughty, Obscene, and Otherwise Bad Words dataset:

  • 80+ dialects and languages
  • 55,000+ curated words

We are grateful to the contributors of this dataset for providing a robust multilingual foundation.


Authors

Ronit Rameja


Contributors

Thanks to everyone who has contributed to SafeText!

