feat(auth): Implement secure httpOnly cookie authentication for Okta #1920
arjunp99 wants to merge 3 commits into data-dot-all:main
Replace localStorage token storage with httpOnly cookies to prevent XSS attacks. Implements custom PKCE flow for Okta authentication while maintaining existing Cognito/Amplify behavior unchanged.

Changes:
- Add PKCE utility for secure OAuth code exchange
- Add Callback view for handling OAuth redirects
- Add backend auth_handler for token exchange endpoints
- Update GenericAuthContext with cookie-based auth for Okta
- Update useClient to work without Authorization header for Okta
- Configure CloudFront to proxy /auth/*, /graphql/*, /search/* paths
- Update Lambda API with auth endpoints and CORS for cookies
- Update custom authorizer to read tokens from Cookie header

Security improvements:
- Tokens stored in httpOnly cookies (not accessible via JavaScript)
- SameSite=Lax prevents CSRF while allowing OAuth redirects
- Secure flag ensures HTTPS-only transmission
Security improvements:
- Add structured logging with sanitized error messages
- Remove hardcoded CloudFront URL fallback (requires proper config)
- Move SimpleCookie import to module level for better performance

Frontend enhancements:
- Add 30-second timeout to token exchange requests
- Fix useEffect dependency array in useClient hook
- Implement OAuth callback handler with PKCE validation

Infrastructure updates:
- Configure auth handler Lambda for cookie-based authentication
- Add API Gateway routes for token exchange, logout, and userinfo
- Improve CloudFront URL parsing documentation

All changes pass Ruff linting and formatting checks.
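For readers unfamiliar with the PKCE step the commit messages mention, here is a minimal Python sketch of the S256 method from RFC 7636. It is not the PR's `frontend/src/utils/pkce.js`; the function names `challenge_from_verifier`, `generate_pkce_pair`, and `generate_state` are hypothetical and only illustrate the general technique.

```python
import base64
import hashlib
import secrets


def challenge_from_verifier(code_verifier):
    """Derive the S256 code challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode('ascii')).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b'=').decode('ascii')


def generate_pkce_pair():
    """Return (code_verifier, code_challenge) for one authorization request."""
    # 32 random bytes encode to a 43-character URL-safe string,
    # the minimum verifier length RFC 7636 allows (43 to 128 characters).
    code_verifier = secrets.token_urlsafe(32)
    return code_verifier, challenge_from_verifier(code_verifier)


def generate_state():
    """Return an opaque random value that ties the callback to the original request."""
    return secrets.token_urlsafe(16)
```

The client keeps the verifier, sends only the challenge with the authorization request, and presents the verifier during the code exchange so the server can recompute and compare the challenge.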
    def logout_handler(event):
        """Clear all auth cookies"""
        cookies = []
        for cookie_name in ['access_token', 'id_token', 'refresh_token']:
Should we also log out from Okta? Here we are deleting the cookies, but that doesn't mean the user is logged out of Okta. Should we call the Okta endpoint to let Okta know that we want to log out?
Yes, good point! I'll implement Okta logout by redirecting to Okta's /v1/logout endpoint from the frontend after clearing cookies. This fully ends the Okta session so the user must re-authenticate on next login.
The flow will be:
1. Frontend calls /auth/logout to clear cookies
2. Frontend redirects to {okta_url}/v1/logout?id_token_hint=...&post_logout_redirect_uri=...
This is handled on the frontend side since it requires a browser redirect.
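The two steps of that flow can be sketched in Python as follows. This is only an illustration under assumptions: the helper names are hypothetical, the cookie attributes (`Path=/`, `Max-Age=0`) are a guess at how the cookies would be expired, and the real redirect would be issued by the frontend in the browser.

```python
from urllib.parse import urlencode


def expired_cookie_header(cookie_name):
    """Build a Set-Cookie value that tells the browser to drop the cookie."""
    # Max-Age=0 expires the cookie immediately; the other attributes are assumed
    # to mirror the ones used when the cookie was set.
    return f'{cookie_name}=; Path=/; Max-Age=0; HttpOnly; Secure; SameSite=Lax'


def build_okta_logout_url(okta_url, id_token, post_logout_redirect_uri):
    """Build the {okta_url}/v1/logout redirect URL with URL-encoded parameters."""
    query = urlencode(
        {
            'id_token_hint': id_token,
            'post_logout_redirect_uri': post_logout_redirect_uri,
        }
    )
    return f"{okta_url.rstrip('/')}/v1/logout?{query}"
```

Note that with httpOnly cookies the frontend cannot read the id_token itself, so the value for `id_token_hint` would have to come from somewhere the frontend can reach; how the PR handles that is not shown in this conversation.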
    elif path == '/auth/userinfo' and method == 'GET':
        return userinfo_handler(event)
    else:
        return error_response(404, 'Not Found', event)
Instead of 'Not Found', can we say something more descriptive, like 'Incorrect route for authentication'?
Changed to: 'Auth endpoint not found. Valid routes: /auth/token-exchange, /auth/logout, /auth/userinfo'
    def userinfo_handler(event):
        """Return user info from id_token cookie"""
        try:
            cookie_header = event.get('headers', {}).get('Cookie') or event.get('headers', {}).get('cookie', '')
Do the headers change between browsers? I'm asking because you are fetching both the Cookie and cookie keys.
Yes, HTTP header names can be passed with different casing depending on the proxy/gateway configuration. API Gateway sometimes normalizes headers to lowercase (cookie), while the conventional spelling is Cookie (HTTP header names are case-insensitive). Checking both ensures we handle all cases reliably.
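An alternative to checking two spellings is a case-insensitive lookup, sketched below. The helpers `get_header` and `get_cookie` are hypothetical names, not functions from the PR; the event shape assumed here is the API Gateway proxy event with a `headers` mapping.

```python
from http.cookies import SimpleCookie


def get_header(event, name):
    """Return a header value regardless of how the gateway cased its name."""
    headers = event.get('headers') or {}
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value or ''
    return ''


def get_cookie(event, cookie_name):
    """Return one cookie value from the Cookie header, or None if it is absent."""
    jar = SimpleCookie()
    jar.load(get_header(event, 'Cookie'))
    morsel = jar.get(cookie_name)
    return morsel.value if morsel is not None else None
```

This also covers events where `headers` is present but set to `None`, which the chained `.get('headers', {})` form would not.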
            return error_response(401, 'Invalid token format', event)

        payload = parts[1]
        padding = 4 - len(payload) % 4
Can you please add a comment documenting what you are trying to do here?
Added comments on this logic.
    # JWT format: header.payload.signature (base64url encoded)
    payload = parts[1]
    # Base64 requires padding to be multiple of 4 characters
    # URL-safe base64 in JWTs often omits padding, so we add it back
    padding = 4 - len(payload) % 4
    if padding != 4:
        payload += '=' * padding
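As a self-contained illustration of the padding logic above, the following sketch decodes the payload segment of a JWT. The function name `decode_jwt_payload` is hypothetical, and the sketch only reads the claims; it does not verify the token's signature.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the claims in a JWT's payload segment without verifying the signature."""
    parts = token.split('.')
    if len(parts) != 3:
        raise ValueError('Invalid token format')
    payload = parts[1]
    # -len % 4 yields 0..3, so no special case is needed when the length
    # is already a multiple of 4.
    payload += '=' * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

The expression `-len(payload) % 4` gives the same result as the `4 - len(payload) % 4` form with its `!= 4` guard, in a single step.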
                ),
            }

        except Exception as e:
If there are any specific exceptions raised by base64.urlsafe_b64decode or other packages you are using, catch the important ones.
Added specific exception handling:
- binascii.Error, ValueError - for base64 decode failures
- json.JSONDecodeError - for invalid JSON in JWT payload
- Generic Exception kept as fallback for unexpected errors
All errors are logged with details but return generic messages to clients.
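One way the handling described above could be laid out is sketched here; it is an assumption about structure, not the PR's code, and `safe_decode_payload` is a hypothetical name. The ordering matters because `json.JSONDecodeError` and `binascii.Error` are both subclasses of `ValueError`.

```python
import base64
import binascii
import json
import logging

logger = logging.getLogger(__name__)


def safe_decode_payload(payload_b64):
    """Return (claims, error); error is a generic client-facing message on failure."""
    try:
        padded = payload_b64 + '=' * (-len(payload_b64) % 4)
        return json.loads(base64.urlsafe_b64decode(padded)), None
    except json.JSONDecodeError as e:
        # Listed first: a broader ValueError clause above it would swallow this case.
        logger.warning('JWT payload is not valid JSON: %s', e)
        return None, 'Invalid token'
    except (binascii.Error, ValueError) as e:
        # Covers malformed base64 and undecodable bytes.
        logger.warning('JWT payload could not be decoded: %s', e)
        return None, 'Invalid token'
    except Exception:
        # Fallback for anything unexpected; details go to the log only.
        logger.exception('Unexpected error while decoding JWT payload')
        return None, 'Internal error'
```

The detailed exception text is written to the log, while the caller only receives the generic message to return to the client.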
    )

    # Add API Gateway behaviors for cookie-based authentication (when using custom_auth)
    if custom_auth and backend_region:
What's the backend_region check for?
The backend_region check is redundant - looking at pipeline.py, it's always set via backend_region=target_env.get('region', self.region), so it will never be None. Removed the unnecessary check.
    cloudfront_distribution.add_behavior(
        path_pattern='/auth/*',
        origin=api_gateway_origin,
        cache_policy=cloudfront.CachePolicy.CACHING_DISABLED,
Is it typical for auth endpoints to have caching disabled?
Yes, this is standard practice. Auth endpoints should never be cached - each request is unique (one-time auth codes, session-specific cookies, user-specific data). Caching would return stale data and break login/logout.
    # Add behavior for /graphql/* routes
    cloudfront_distribution.add_behavior(
        path_pattern='/graphql/*',
We didn't have any CloudFront distribution behaviour for graphql or search before. What's the benefit of adding it here?
This is required for httpOnly cookies to work. Browsers only send cookies to same-origin requests. Before, tokens were in localStorage and sent via Authorization header (works cross-origin). Now with httpOnly cookies, the browser won't send them to a different origin (API Gateway). Routing through CloudFront makes frontend and API same-origin, so cookies are sent automatically.
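To make the cookie attributes discussed in this PR concrete, here is a small sketch that builds a `Set-Cookie` value with the standard library. The function name `build_auth_cookie`, the `Path=/` scope, and the one-hour default lifetime are assumptions for illustration, not values taken from the PR.

```python
from http.cookies import SimpleCookie


def build_auth_cookie(name, value, max_age=3600):
    """Build a Set-Cookie header value with HttpOnly, Secure, and SameSite=Lax set."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel['httponly'] = True   # not readable from JavaScript
    morsel['secure'] = True     # only sent over HTTPS
    morsel['samesite'] = 'Lax'  # sent on top-level navigations such as OAuth redirects
    morsel['path'] = '/'        # assumed scope: the whole site
    morsel['max-age'] = max_age
    return morsel.OutputString()
```

Because no `Domain` attribute is set, the browser scopes the cookie to the host that issued it, which is why serving the frontend and the API behind the same CloudFront host matters here.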
        api_handler_env['frontend_domain_url'] = f'https://{custom_domain.get("hosted_zone_name", None)}'
    if custom_auth:
        api_handler_env['custom_auth'] = custom_auth.get('provider', None)
        api_handler_env['custom_auth_url'] = custom_auth.get('url', None)
    vpc=vpc,
    security_groups=[auth_handler_sg],
    memory_size=512 if prod_sizing else 256,
    timeout=Duration.seconds(30),
Does this set the timeout for the AWS Lambda?
    # Initialize Klayers
    klayers = Klayers(self, python_version=PYTHON_LAMBDA_RUNTIME, region=self.region)
    runtime = _lambda.Runtime.PYTHON_3_12
Why are we not using the config-defined Python runtime, PYTHON_LAMBDA_RUNTIME?
    handler=self.authorizer_fn,
    identity_sources=[apigw.IdentitySource.header('Authorization')],
    # Empty identity_sources allows Lambda to be invoked without specific headers
    # This enables cookie-based auth where tokens come from Cookie header
nit: the comment could be extended to: "This enables cookie-based auth where tokens come from the Cookie header, and also auth with the Authorization header."
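An authorizer that accepts both sources, as the suggested comment describes, might pick the token as sketched below. This is an assumption about the behaviour, not the PR's custom_authorizer_lambda.py: the function name `extract_token` is hypothetical, as is the choice to prefer the Authorization header over the `access_token` cookie.

```python
def extract_token(headers):
    """Return the bearer token from Authorization, else the access_token cookie, else None."""
    # Lowercase the names so 'Cookie', 'cookie', and 'COOKIE' are treated alike.
    normalized = {key.lower(): value for key, value in (headers or {}).items()}

    auth = normalized.get('authorization') or ''
    if auth.lower().startswith('bearer '):
        return auth[len('bearer '):].strip()

    for part in (normalized.get('cookie') or '').split(';'):
        name, _, value = part.strip().partition('=')
        if name == 'access_token':
            return value
    return None
```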
    if custom_domain and custom_domain.get('hosted_zone_name'):
        cors_origin = f'https://{custom_domain.get("hosted_zone_name")}'
    else:
        cors_origin = ''  # Must be configured via custom_domain in cdk.json
If the custom domain is not present, should we default to the domain URL provided by CloudFront and add it here?
Can you include this file in index.js and then import it in frontend/src/authentication/contexts/GenericAuthContext.js?
    // Use relative URL for custom auth (CloudFront proxy), otherwise use env var
    const graphqlUri = CUSTOM_AUTH
      ? '/graphql/api'
I don't understand why we need the CloudFront URL instead of the API Gateway URL. Can you explain why we created the CloudFront distribution?
      signInWithRedirect,
      signOut
    } from 'aws-amplify/auth';
    import { generatePKCE, generateState } from '../../utils/pkce';
After this change (https://github.com/data-dot-all/dataall/pull/1920/changes#r2886370803), you can import it like this: import { generatePKCE, generateState } from 'utils/pkce';
        requestInfo: null
      }
    });
    await logout();
Can you let me know whether you tested the ReAuth flow?
TejasRGitHub left a comment
Hey @arjunp99, the changes you made look very solid. I have added comments, some of which are for clarification and some cosmetic, but mostly everything looks solid.
There are a few things which I think are missing:
- What happens when the user token expires? In the current implementation, the webapp automatically resets to the login page. There is an internal event set when the token reaches expiration. I think we should mimic that if possible.
- The logout flow currently only clears the cookies, but it should also log out from Okta, if such an endpoint is present, to invalidate the tokens in Okta at the same time the user logs out.
Summary
Replace localStorage token storage with httpOnly cookies to prevent XSS attacks for Okta authentication. This implements a custom PKCE flow while maintaining existing Cognito/Amplify behavior unchanged.
Security Improvements
Changes
Frontend
- frontend/src/utils/pkce.js - PKCE utility for secure OAuth code exchange
- frontend/src/authentication/views/Callback.js - OAuth callback handler
- frontend/src/authentication/contexts/GenericAuthContext.js - Cookie-based auth for Okta
- frontend/src/services/hooks/useClient.js - Relative URLs + credentials for cookies
- frontend/src/routes.js - Added /callback route
Backend
- backend/auth_handler.py - Token exchange, userinfo, logout endpoints
- deploy/stacks/lambda_api.py - Auth handler Lambda + API routes
- deploy/stacks/cloudfront.py - Proxy /auth/*, /graphql/*, /search/* to API Gateway
- deploy/custom_resources/custom_authorizer/custom_authorizer_lambda.py - Read tokens from Cookie header
How It Works
Backward Compatibility