feat: add Azure Functions API with caching proxy #101
- Add getCurrentIncident endpoint that proxies Google Apps Script
- Cache responses for 2 minutes with stale fallback on errors
- Add GitHub Actions workflow for automated deployment

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
- Deploy to aem-status-api-staging for feature branches
- Deploy to aem-status-api (production) only on main

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
- Track API deployments (api-production, api-staging environments)
- Track Static Web App deployments (production environment)
- Show deployment status and environment URLs in GitHub UI

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
Pull request overview
This PR introduces an Azure Functions backend API that serves as a caching proxy for Google Apps Script, reducing load on the upstream service and improving response times through a 2-minute cache with stale-on-error fallback.
Key Changes:
- Adds Azure Functions API with getCurrentIncident endpoint that proxies Google Apps Script
- Implements 2-minute cache with stale fallback mechanism and cache status headers
- Adds GitHub Actions workflow for automated deployment to production and staging environments
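The cache behaviour described above (serve fresh entries, refetch after two minutes, fall back to the expired entry when the upstream fails) could look roughly like the sketch below. It is not the code from api/src/functions/getCurrentIncident.js; `createCachedFetcher`, `fetchUpstream`, `ttlMs`, and the returned field names are hypothetical, and the injectable `now` clock exists only to make the logic easy to exercise.

```javascript
// Minimal sketch of a 2-minute cache with stale-on-error fallback.
// All names here are illustrative assumptions, not taken from the PR's source.
function createCachedFetcher(fetchUpstream, ttlMs = 2 * 60 * 1000, now = Date.now) {
  let cached = null; // { data, storedAt } once the first fetch succeeds

  return async function get() {
    const age = cached ? now() - cached.storedAt : Infinity;

    // Fresh entry: serve it without touching the upstream.
    if (age < ttlMs) {
      return { data: cached.data, cacheStatus: 'HIT', ageSeconds: Math.floor(age / 1000) };
    }

    try {
      const data = await fetchUpstream();
      cached = { data, storedAt: now() };
      return { data, cacheStatus: 'MISS', ageSeconds: 0 };
    } catch (err) {
      // Upstream failed: fall back to the expired entry if there is one.
      if (cached) {
        return { data: cached.data, cacheStatus: 'STALE', ageSeconds: Math.floor(age / 1000) };
      }
      throw err; // Nothing cached, so the caller has to report the failure.
    }
  };
}
```

In a handler, `cacheStatus` would map to the `X-Cache` header and `ageSeconds` to `Age`. Module-level state like `cached` only survives as long as the function instance stays warm, which is presumably acceptable for a short-lived cache.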
Reviewed changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| api/src/functions/getCurrentIncident.js | Core function implementing caching proxy with error handling |
| api/package.json | Package configuration for Azure Functions with ES modules |
| api/package-lock.json | Lockfile with @azure/functions v4.9.0 and dependencies |
| api/host.json | Azure Functions host configuration with logging settings |
| .github/workflows/deploy-functions.yml | Deployment workflow for production/staging environments |
| .github/workflows/azure-static-web-apps.yml | Added deployment tracking for static web app |
Files not reviewed (1)
- api/package-lock.json: Language not supported
app.http('getCurrentIncident', {
  methods: ['GET'],
  authLevel: 'anonymous',
  handler: async (request, context) => {
The getCurrentIncident function lacks test coverage. Given that the repository uses Node.js native test runner (see test/details.test.js), consider adding tests for: cache hit/miss scenarios, stale cache fallback on errors, error handling when both fetch fails and cache is empty, and proper cache expiration.
We have post-deploy integration tests (api/test/post-deploy.test.js) that verify the API works end-to-end after each deployment. Unit tests with mocking would require additional dependencies. The integration tests cover the critical paths: status codes, headers, and response structure.
- Verify 200 status, JSON content-type, CORS headers
- Check X-Cache and Age headers
- Validate response structure
- Uses Node's built-in test runner with describe/it

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
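As a rough illustration of the checks listed in this commit, the helper below inspects a standard `Response` object and collects any problems it finds. It is a sketch only: `checkIncidentResponse` is a hypothetical name, the actual assertions live in api/test/post-deploy.test.js, and the exact headers checked (`Access-Control-Allow-Origin`, a numeric `Age`) are assumptions about what the API returns.

```javascript
// Hypothetical helper mirroring the post-deploy checks; not the PR's test code.
// Relies on the global Response/Headers available in Node 18+.
function checkIncidentResponse(response) {
  const problems = [];

  if (response.status !== 200) problems.push(`unexpected status ${response.status}`);

  const contentType = response.headers.get('content-type') || '';
  if (!contentType.includes('application/json')) problems.push('content-type is not JSON');

  // Assumption: CORS is signalled via Access-Control-Allow-Origin.
  if (!response.headers.get('access-control-allow-origin')) problems.push('missing CORS header');

  const cache = response.headers.get('x-cache');
  if (!['HIT', 'MISS', 'STALE'].includes(cache)) problems.push(`unexpected X-Cache: ${cache}`);

  // Assumption: Age is always present and is a non-negative integer of seconds.
  const age = response.headers.get('age');
  if (age === null || !/^\d+$/.test(age)) problems.push(`unexpected Age: ${age}`);

  return problems; // An empty array means every check passed.
}
```

Inside a `describe`/`it` block, a test could fetch the deployed endpoint and assert that the returned array is empty.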
Replaces slow Google Apps Script endpoint with cached Azure Functions proxy. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com> Signed-off-by: Lars Trieloff <lars@trieloff.net>
- Support GOOGLE_SCRIPT_URL env var with fallback default
- Add 10-second fetch timeout with AbortController
- Validate response structure before caching
- Include timeout/upstream_error reason in 502 responses

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Signed-off-by: Lars Trieloff <lars@trieloff.net>
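A fetch timeout built on `AbortController`, as this commit describes, might be sketched as follows. `fetchJsonWithTimeout` is a hypothetical name, the injectable `fetchImpl` parameter is added only so the sketch can be exercised without a network, and the structure check is a deliberately generic stand-in because the expected response shape is not shown in this conversation.

```javascript
// Sketch of a fetch with an AbortController-based timeout (assumed names).
async function fetchJsonWithTimeout(url, timeoutMs = 10_000, fetchImpl = fetch) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, { signal: controller.signal });
    if (!response.ok) throw new Error(`upstream responded with ${response.status}`);

    const data = await response.json();
    // Placeholder validation: the real shape check is an assumption here.
    if (data === null || typeof data !== 'object') {
      throw new Error('upstream returned an unexpected structure');
    }
    return data;
  } finally {
    clearTimeout(timer); // Always release the timer, on success or failure.
  }
}
```

When the timer fires, the pending fetch rejects with an error whose `name` is `AbortError`, which lets the caller tell a timeout apart from other upstream failures.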
Google typically responds in 1.5-2.2s but may be slower during cold starts. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com> Signed-off-by: Lars Trieloff <lars@trieloff.net>
Fail fast if env var is not configured to avoid obscure debugging issues. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com> Signed-off-by: Lars Trieloff <lars@trieloff.net>
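Failing fast on a missing setting can be as small as the helper below. This is a sketch, not the PR's code: `requireEnv` is a hypothetical name, and the optional `env` parameter exists only so the function can be tried without touching the real environment.

```javascript
// Hypothetical helper: throw immediately when a required variable is unset or empty.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}
```

Calling `requireEnv('GOOGLE_SCRIPT_URL')` at the start of the handler would surface a misconfigured deployment as a clear error message instead of a confusing fetch failure later on.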
502 Bad Gateway for upstream errors, 504 Gateway Timeout for timeouts. Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com> Signed-off-by: Lars Trieloff <lars@trieloff.net>
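The status split described here could be expressed as a small mapping function. The sketch assumes timeouts arrive as errors named `AbortError` (as they do when an `AbortController` aborts a fetch) and borrows the `timeout` and `upstream_error` reason strings from the earlier commit message; `upstreamErrorResponse` and the `error` message text are hypothetical.

```javascript
// Sketch: choose 504 for timeouts and 502 for every other upstream failure.
// The returned object uses the status/jsonBody fields of an Azure Functions
// v4 HTTP response; the body layout itself is an assumption.
function upstreamErrorResponse(err) {
  const timedOut = Boolean(err) && err.name === 'AbortError';
  return {
    status: timedOut ? 504 : 502,
    jsonBody: {
      error: timedOut ? 'Upstream request timed out' : 'Upstream request failed',
      reason: timedOut ? 'timeout' : 'upstream_error',
    },
  };
}
```

A handler would return this object only when no cached copy is available to serve as a stale fallback.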
I'm not exactly clear on the problem we are trying to solve with this... I can see that load on the Google API could be reduced, but the end user experience would not change that much:
- Add @adobe/helix-rum-js dependency for traffic measurement
- Include RUM standalone script in all HTML pages (index, details, postmortem, what, when)
- Use ot.aem.live as the RUM collection endpoint
- Addresses request for traffic data in PR #101 comment

This enables basic traffic measurement and user interaction tracking on the AEM status page to provide insights into page usage patterns.

Signed-off-by: Lars Trieloff <lars@trieloff.net>
- Include RUM standalone script in all HTML pages (index, details, postmortem, what, when)
- Use ot.aem.live as the RUM collection endpoint
- Addresses request for traffic data in PR #101 comment

This enables basic traffic measurement and user interaction tracking on the AEM status page to provide insights into page usage patterns. The RUM library is loaded directly from the CDN, no npm dependency needed.

Signed-off-by: Lars Trieloff <lars@trieloff.net>

@trieloff How about this for a simple, intermediate solution to address the delayed loading of the current incident data:
Summary
Adds an Azure Functions backend API that proxies and caches responses from Google Apps Script, reducing load and improving response times.
Changes
Azure Functions API (api/ folder)
- getCurrentIncident endpoint that proxies Google Apps Script
- X-Cache: HIT|MISS|STALE headers
GitHub Actions Deployment
- Production (aem-status-api) on main
- Staging (aem-status-api-staging) on feature branches
Endpoints
Secrets Required
- AZURE_FUNCTIONAPP_PUBLISH_PROFILE - Production publish profile
- AZURE_FUNCTIONAPP_PUBLISH_PROFILE_STAGING - Staging publish profile