-
Notifications
You must be signed in to change notification settings - Fork 13
Handle Unicode BOM in the session file #1233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
See how PowerShell creates a file with that when redirecting with '>' See: PowerShell/PowerShell#8592
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Adds support for reading session IDs from files that include a Unicode BOM (notably produced by PowerShell redirection), ensuring --session @file works across common Windows/PowerShell encodings.
Changes:
- Update
SessionIdto try decoding@filecontents withutf-8-sig, then fall back toutf-16. - Add unit tests covering session ID files encoded as UTF-8 (with/without BOM) and UTF-16 (with BOM).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
smart_tests/utils/session.py |
Adds BOM-aware decoding strategy when reading session IDs from @path files. |
tests/utils/test_session.py |
Introduces tests to validate session file decoding across UTF-8/UTF-16 variants. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| except UnicodeDecodeError: | ||
| with open(file_path, 'r', encoding='utf-16') as f: | ||
| id = f.read().strip() |
Copilot
AI
Feb 9, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the file isn’t valid UTF-8 or UTF-16 (e.g., latin-1, UTF-16 without BOM, or corrupted), the UTF-16 fallback can raise UnicodeError (often “UTF-16 stream does not start with BOM”), which currently bypasses the outer FileNotFoundError/IOError handling and will surface as an unhandled exception. Consider catching UnicodeError around the decoding attempts and re-raising as BadCmdLineException with a clear message (or iterating through candidate encodings and failing gracefully).
| except UnicodeDecodeError: | |
| with open(file_path, 'r', encoding='utf-16') as f: | |
| id = f.read().strip() | |
| except UnicodeError: | |
| try: | |
| with open(file_path, 'r', encoding='utf-16') as f: | |
| id = f.read().strip() | |
| except UnicodeError as e: | |
| raise BadCmdLineException( | |
| f"Session file '{file_path}' is not a valid UTF-8 or UTF-16 text file: {e}" | |
| ) |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
See how PowerShell creates a file with that when redirecting with '>'
See: PowerShell/PowerShell#8592