-
Notifications
You must be signed in to change notification settings - Fork 0
Enhance email normalization providers by adding custom behavior for h… #6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,16 +7,26 @@ | |
| /** | ||
| * Outlook | ||
| * | ||
| * Handles Outlook, Hotmail, and Live email normalization | ||
| * - TODO: Plus addressing removal commented out until manual confirmation | ||
| * Handles Outlook, Hotmail, and Live email normalization based on validator.js rules | ||
| * - Removes plus addressing (subaddress) | ||
| * - Preserves dots in local part | ||
| * - Normalizes to outlook.com domain | ||
| */ | ||
| class Outlook extends Provider | ||
| { | ||
| private const SUPPORTED_DOMAINS = [ | ||
| 'outlook.com', 'hotmail.com', 'live.com', | ||
| 'outlook.co.uk', 'hotmail.co.uk', 'live.co.uk', | ||
| 'outlook.com', 'outlook.at', 'outlook.be', 'outlook.cl', 'outlook.co.il', 'outlook.co.nz', 'outlook.co.th', 'outlook.co.uk', | ||
| 'outlook.com.ar', 'outlook.com.au', 'outlook.com.br', 'outlook.com.gr', 'outlook.com.pe', 'outlook.com.tr', 'outlook.com.vn', | ||
| 'outlook.cz', 'outlook.de', 'outlook.dk', 'outlook.es', 'outlook.fr', 'outlook.hu', 'outlook.id', 'outlook.ie', | ||
| 'outlook.in', 'outlook.it', 'outlook.jp', 'outlook.kr', 'outlook.lv', 'outlook.my', 'outlook.ph', 'outlook.pt', | ||
| 'outlook.sa', 'outlook.sg', 'outlook.sk', | ||
| 'hotmail.com', 'hotmail.at', 'hotmail.be', 'hotmail.ca', 'hotmail.cl', 'hotmail.co.il', 'hotmail.co.nz', 'hotmail.co.th', 'hotmail.co.uk', | ||
| 'hotmail.com.ar', 'hotmail.com.au', 'hotmail.com.br', 'hotmail.com.gr', 'hotmail.com.mx', 'hotmail.com.pe', 'hotmail.com.tr', 'hotmail.com.vn', | ||
| 'hotmail.cz', 'hotmail.de', 'hotmail.dk', 'hotmail.es', 'hotmail.fr', 'hotmail.hu', 'hotmail.id', 'hotmail.ie', | ||
| 'hotmail.in', 'hotmail.it', 'hotmail.jp', 'hotmail.kr', 'hotmail.lv', 'hotmail.my', 'hotmail.ph', 'hotmail.pt', | ||
| 'hotmail.sa', 'hotmail.sg', 'hotmail.sk', | ||
| 'live.com', 'live.be', 'live.co.uk', 'live.com.ar', 'live.com.mx', 'live.de', 'live.es', 'live.eu', 'live.fr', 'live.it', 'live.nl', | ||
| 'msn.com', 'passport.com', | ||
| ]; | ||
|
Comment on lines
17
to
30
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧩 Analysis chainVerify the expanded domain list is complete and correct. The SUPPORTED_DOMAINS list expanded significantly from ~6 to 60+ domains. While this appears comprehensive for Outlook/Hotmail/Live variants, it should be validated against official Microsoft documentation. 🌐 Web query: 💡 Result: Short answer — Microsoft supports addresses on these official Microsoft-owned domains (including many country-code variants of each):
Notes:
Sources: Microsoft Support (Outlook/Hotmail/MSN support) [1]; Outlook.com / Microsoft account pages and historical domain lists compiled (examples) [2][3]. Add missing @windowslive.com legacy domain to SUPPORTED_DOMAINS list. Microsoft's documentation treats @windowslive.com as a supported legacy Microsoft account domain still maintained for existing accounts. The current list is missing Add 🤖 Prompt for AI Agents |
||
|
|
||
| private const CANONICAL_DOMAIN = 'outlook.com'; | ||
|
|
@@ -31,9 +41,13 @@ public function getCanonical(string $local, string $domain): array | |
| // Convert to lowercase | ||
| $normalizedLocal = $this->toLowerCase($local); | ||
|
|
||
| // TODO: Commented out until manual confirmation of Outlook's plus addressing support | ||
| // Remove plus addressing (everything after +) | ||
| // $normalizedLocal = $this->removePlusAddressing($normalizedLocal); | ||
| // Remove plus addressing (subaddress) - everything after + | ||
| $normalizedLocal = $this->removePlusAddressing($normalizedLocal); | ||
|
|
||
| // Ensure local part is not empty after normalization | ||
| if (empty($normalizedLocal)) { | ||
| throw new \InvalidArgumentException('Email local part cannot be empty after normalization'); | ||
| } | ||
|
|
||
| return [ | ||
| 'local' => $normalizedLocal, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,15 +7,15 @@ | |
| /** | ||
| * Yahoo | ||
| * | ||
| * Handles Yahoo email normalization | ||
| * - TODO: Plus addressing, dots, and hyphens removal commented out until manual confirmation | ||
| * - Preserves dots and hyphens in local part | ||
| * Handles Yahoo email normalization based on validator.js rules | ||
| * - Removes hyphen-based subaddress (everything after last -) | ||
| * - Preserves dots in local part | ||
| * - Normalizes to yahoo.com domain | ||
| */ | ||
| class Yahoo extends Provider | ||
| { | ||
| private const SUPPORTED_DOMAINS = [ | ||
| 'yahoo.com', 'yahoo.co.uk', 'yahoo.ca', | ||
| 'yahoo.com', 'yahoo.co.uk', 'yahoo.ca', 'yahoo.de', 'yahoo.fr', 'yahoo.in', 'yahoo.it', | ||
| 'ymail.com', 'rocketmail.com', | ||
| ]; | ||
|
|
||
|
|
@@ -31,20 +31,13 @@ public function getCanonical(string $local, string $domain): array | |
| // Convert to lowercase | ||
| $normalizedLocal = $this->toLowerCase($local); | ||
|
|
||
| // TODO: Commented out until manual confirmation of Yahoo's plus addressing, dots, and hyphens support | ||
| // Check if there's plus addressing | ||
| // $hasPlus = strpos($normalizedLocal, '+') !== false && strpos($normalizedLocal, '+') > 0; | ||
| // Remove hyphen-based subaddress (everything after last -) | ||
| $normalizedLocal = $this->removeHyphenSubaddress($normalizedLocal); | ||
|
|
||
| // Remove plus addressing (everything after +) | ||
| // $normalizedLocal = $this->removePlusAddressing($normalizedLocal); | ||
|
|
||
| // Remove dots only if there was plus addressing (Yahoo treats dots as aliases only with plus) | ||
| // if ($hasPlus) { | ||
| // $normalizedLocal = $this->removeDots($normalizedLocal); | ||
| // } | ||
|
|
||
| // Remove hyphens (Yahoo treats hyphens as aliases) | ||
| // $normalizedLocal = $this->removeHyphens($normalizedLocal); | ||
| // Ensure local part is not empty after normalization | ||
| if (empty($normalizedLocal)) { | ||
| throw new \InvalidArgumentException('Email local part cannot be empty after normalization'); | ||
| } | ||
|
Comment on lines
+34
to
+40
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🧩 Analysis chainVerify test coverage for empty local part validation. The implementation correctly validates that the local part is not empty after hyphen subaddress removal. However, I don't see test cases in YahooTest.php that verify this exception is thrown for edge cases like:
Run the following script to check if the exception scenario is tested: 🏁 Script executed: #!/bin/bash
# Description: Search for test cases that verify the InvalidArgumentException
# Check if YahooTest.php tests the exception
rg -n "expectException|InvalidArgumentException|^\s*\['-'" tests/Canonicals/Providers/YahooTest.phpLength of output: 0 Length of output: 40 🏁 Script executed: #!/bin/bash
# Check tests directory structure
find tests/ -type d | head -20Length of output: 66 🏁 Script executed: #!/bin/bash
# Search for any test files in Canonicals/Providers directory
fd --type f tests/Canonicals/Providers/ 2>/dev/null || find tests/Canonicals/Providers/ -type f 2>/dev/nullLength of output: 382 🏁 Script executed: #!/bin/bash
# Broader search for InvalidArgumentException in all test files
rg -n "InvalidArgumentException" tests/ --type phpLength of output: 0 Length of output: 4356 🏁 Script executed: #!/bin/bash
# Display Yahoo.php implementation
cat src/Emails/Canonicals/Providers/Yahoo.phpLength of output: 1560 🏁 Script executed: #!/bin/bash
# Search for test data or fixtures related to Yahoo tests
rg -n "dataProvider|@test" tests/Canonicals/Providers/YahooTest.php -A 2Length of output: 0 Add test cases to verify InvalidArgumentException for empty local parts. The implementation correctly throws Current test cases do not include edge cases that would result in an empty local part after normalization:
Add test cases to verify the exception is thrown for these inputs. 🤖 Prompt for AI Agents |
||
|
|
||
| return [ | ||
| 'local' => $normalizedLocal, | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,50 @@ | ||
| <?php | ||
|
|
||
| namespace Utopia\Emails\Canonicals\Providers; | ||
|
|
||
| use Utopia\Emails\Canonicals\Provider; | ||
|
|
||
| /** | ||
| * Yandex | ||
| * | ||
| * Handles Yandex email normalization based on validator.js rules | ||
| * - Preserves all characters in local part (no subaddress removal) | ||
| * - Normalizes to yandex.ru domain | ||
| */ | ||
| class Yandex extends Provider | ||
| { | ||
| private const SUPPORTED_DOMAINS = [ | ||
| 'yandex.ru', 'yandex.ua', 'yandex.kz', 'yandex.com', 'yandex.by', 'ya.ru', | ||
| ]; | ||
|
|
||
| private const CANONICAL_DOMAIN = 'yandex.ru'; | ||
|
|
||
| public function supports(string $domain): bool | ||
| { | ||
| return in_array($domain, self::SUPPORTED_DOMAINS, true); | ||
| } | ||
|
|
||
| public function getCanonical(string $local, string $domain): array | ||
| { | ||
| // Convert to lowercase | ||
| $normalizedLocal = $this->toLowerCase($local); | ||
|
|
||
| // Yandex doesn't remove subaddresses or dots | ||
| // Just normalize case and domain | ||
|
|
||
| return [ | ||
| 'local' => $normalizedLocal, | ||
| 'domain' => self::CANONICAL_DOMAIN, | ||
| ]; | ||
| } | ||
|
|
||
| public function getCanonicalDomain(): string | ||
| { | ||
| return self::CANONICAL_DOMAIN; | ||
| } | ||
|
|
||
| public function getSupportedDomains(): array | ||
| { | ||
| return self::SUPPORTED_DOMAINS; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Verify the stability of the utopia-php/validators dependency.
The version constraint
^0.0.1indicates an early-stage package (0.x.x semantics), which typically implies active development and potential breaking changes. Confirm that this package version is stable enough for production use and that the replacement ofutopia-php/frameworkis intentional and comprehensive.Please verify the following:
utopia-php/validatorspackage exists and is appropriate for the enhanced email normalization rules mentioned in the PR.^0.0.1is intentional for this library, especially given the 0.x.x versioning.utopia-php/frameworkpackage is no longer needed elsewhere in the codebase.Run the following script to check if
utopia-php/frameworkis referenced in other code files:🏁 Script executed:
Length of output: 30
🌐 Web query:
💡 Result:
Short answer: No — there is no utopia-php/validators package listed on Packagist, and therefore there is no published version 0.0.1 on Packagist. Packagist's utopia-php package index (which lists the available utopia-php packages) does not include a "validators" package. [1]
If you have a reference (composer.json, repo, or error) that mentions utopia-php/validators v0.0.1, paste it and I can check whether it's a private package, a typo, or a placeholder/dev artifact. [1]
Non-existent dependency: "utopia-php/validators" does not exist on Packagist.
The utopia-php organization publishes packages like utopia-php/emails (for parsing and validating email addresses), but there is no "validators" package. The
composer.jsonreferences a package that cannot be installed, causing composer to fail. Verify the correct package name—given the PR context (email normalization), this may be intended asutopia-php/emailsinstead.🤖 Prompt for AI Agents