
1. HackBot server core#440

Open
ReehalS wants to merge 10 commits into main from hackbot-server-core

Conversation

@ReehalS ReehalS commented Mar 12, 2026

HackBot server core: API route, actions, utils, types, data, and CI/CD

What's Added:

  • OpenAI streaming chat endpoint with get_events and provide_links tools
  • Server actions for knowledge CRUD, reseed, import, usage metrics
  • Event filtering/formatting utilities with timezone-aware LA time handling
  • System prompt builder with profile-aware personalization and prefix caching
  • Vector search context retrieval with retry/backoff
  • Knowledge base JSON (55 entries: FAQ, tracks, judging, submission, general)
  • CI/CD seed scripts for hackbot_knowledge to hackbot_docs
  • Auth session extended with position, is_beginner, name fields
  • Tailwind hackbot-slide-in animation keyframe (for frontend)
  • Dependencies: ai@6, @ai-sdk/openai

All work was done on the hackbot branch and moved to this one for PR purposes.

@ReehalS ReehalS changed the title Add HackBot server core: API route, actions, utils, types, data, and CI/CD HackBot server core: API route, actions, utils, types, data, and CI/CD Mar 12, 2026

ReehalS commented Mar 12, 2026

Closes #441

@ReehalS ReehalS changed the title HackBot server core: API route, actions, utils, types, data, and CI/CD 1. HackBot server core Mar 12, 2026
@ReehalS ReehalS linked an issue Mar 12, 2026 that may be closed by this pull request

Copilot AI left a comment

Pull request overview

Introduces the initial “HackBot” server core: a streaming chat API route backed by OpenAI + MongoDB vector search, plus supporting hackbot utilities/actions and deployment seeding.

Changes:

  • Added /api/hackbot/stream streaming endpoint with get_events / provide_links tools and custom data-stream output.
  • Implemented hackbot utilities (system prompt builder, event filtering/formatting, retry/backoff, embeddings) + server actions for knowledge/metrics.
  • Added CI/CD seed scripts and workflow steps to embed knowledge docs into hackbot_docs; added new AI SDK dependencies.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| tailwind.config.ts | Adds hackbot-slide-in keyframe + animation for HackBot UI. |
| scripts/hackbotSeedCI.mjs | CI seeding script to upsert knowledge docs into hackbot_docs with embeddings. |
| scripts/hackbotSeed.mjs | Local interactive seeding script for hackbot docs. |
| package.json | Adds hackbot:seed script + AI SDK dependencies. |
| package-lock.json | Locks new dependency graph for ai / @ai-sdk/openai and transitive deps. |
| auth.ts | Extends NextAuth user/session/jwt fields for HackBot personalization. |
| app/_types/hackbot.ts | Adds shared HackBot types for docs/messages/events/links. |
| app/_data/hackbot_knowledge_import.json | Adds initial knowledge base content for importing. |
| app/(api)/api/hackbot/stream/route.ts | Implements streaming HackBot endpoint + get_events and provide_links tools. |
| app/(api)/_utils/hackbot/systemPrompt.ts | Adds system prompt builder with profile/page-context personalization and caching strategy. |
| app/(api)/_utils/hackbot/retryWithBackoff.ts | Adds retry/backoff helper used by vector search embedding step. |
| app/(api)/_utils/hackbot/eventFormatting.ts | Adds LA-timezone-aware date parsing/formatting helpers. |
| app/(api)/_utils/hackbot/eventFiltering.ts | Adds profile relevance/recommendation and time-filtering helpers. |
| app/(api)/_utils/hackbot/embedText.ts | Adds embedding helper using the ai SDK + OpenAI embedding model. |
| app/(api)/_datalib/hackbot/getHackbotContext.ts | Adds vector-search context retrieval from hackbot_docs. |
| app/(api)/_actions/hackbot/saveKnowledgeDoc.ts | Adds server action to create/update knowledge docs + embeddings. |
| app/(api)/_actions/hackbot/reseedHackbot.ts | Adds server action to re-embed all knowledge docs into hackbot_docs. |
| app/(api)/_actions/hackbot/importKnowledgeDocs.ts | Adds server action to bulk import knowledge docs + embeddings. |
| app/(api)/_actions/hackbot/getUsageMetrics.ts | Adds server action to aggregate token usage metrics. |
| app/(api)/_actions/hackbot/getKnowledgeDocs.ts | Adds server action to list knowledge docs for admin UI. |
| app/(api)/_actions/hackbot/getHackerProfile.ts | Adds server action to read profile fields from session for prompt personalization. |
| app/(api)/_actions/hackbot/deleteKnowledgeDoc.ts | Adds server action to delete knowledge docs and embedded docs. |
| app/(api)/_actions/hackbot/clearKnowledgeDocs.ts | Adds server action to clear knowledge + embedded docs. |
| .github/workflows/staging.yaml | Runs hackbot seeding during deploy and syncs OpenAI-related env vars. |
| .github/workflows/production.yaml | Runs hackbot seeding during deploy and syncs OpenAI-related env vars. |

@HackDavis HackDavis deleted a comment from gitguardian bot Mar 31, 2026

ReehalS commented Mar 31, 2026

Refactored app/(api)/api/hackbot/stream/route.ts into modular stream utils in _utils/hackbot/stream/ and addressed the Copilot comments: hardened request/auth/tool behavior (message sanitization, 401 gating, better event/tag/role handling, including lunch -> brunch mapping and designer intent), improved retrieval efficiency and tests, and added local dev stream pretty-printer scripts for pre-UI validation.

How to run the no-UI test:

Start the app:

npm run dev

In a new terminal, run the pretty stream test:

node scripts/dev/runHackbotPretty.mjs --prompt "What can a developer do at HackDavis?" --path / --base http://localhost:3000

The script needs

ADMIN_EMAIL=<admin email here>
ADMIN_PASSWORD=<admin pwd here>

in the .env to work. To use a different email/password, pass them as extra args: --email "<email>" --password "<password>".

gitguardian bot commented Apr 1, 2026

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


@michelleyeoh michelleyeoh deployed to development April 1, 2026 01:14 — with GitHub Actions Active
@michelleyeoh michelleyeoh requested a review from Copilot April 1, 2026 01:19

Copilot AI left a comment

Pull request overview

Copilot reviewed 34 out of 35 changed files in this pull request and generated 7 comments.


Comment on lines +18 to +36
const sanitizedMessages: HackbotClientMessage[] = [];
for (const message of messages) {
  const role = message?.role;
  const content = message?.content;
  if (
    !ALLOWED_MESSAGE_ROLES.has(role) ||
    typeof content !== 'string' ||
    !content.trim()
  ) {
    return Response.json(
      { error: 'Invalid message history format.' },
      { status: 400 }
    );
  }
  sanitizedMessages.push({
    role: role as 'user' | 'assistant',
    content,
  });
}

Copilot AI Apr 1, 2026

Only the last message is capped by length, but earlier messages entries can be arbitrarily long / numerous. That allows a client to send a huge history that is fully parsed/validated server-side even though you later slice to the last N messages, creating an avoidable memory/CPU/latency vector. Consider enforcing (1) a max messages.length and (2) a max content length for every message (or at least a total payload char budget) during validation.
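
A minimal sketch of the validation this comment describes, assuming hypothetical limits (`MAX_MESSAGES`, `MAX_CONTENT_CHARS`) and a standalone `validateMessages` helper that are not part of the PR. It rejects an oversized history before inspecting individual entries and caps the length of every message, not just the last one:

```typescript
// Hypothetical limits; the PR does not specify values.
const MAX_MESSAGES = 40;
const MAX_CONTENT_CHARS = 2000;
const ALLOWED_ROLES = new Set(['user', 'assistant']);

type ClientMessage = { role: 'user' | 'assistant'; content: string };

type ValidationResult =
  | { ok: true; messages: ClientMessage[] }
  | { ok: false; error: string };

function validateMessages(input: unknown): ValidationResult {
  if (!Array.isArray(input) || input.length === 0) {
    return { ok: false, error: 'Invalid message history format.' };
  }
  // Reject oversized histories before touching individual entries.
  if (input.length > MAX_MESSAGES) {
    return { ok: false, error: 'Too many messages.' };
  }
  const messages: ClientMessage[] = [];
  for (const raw of input) {
    const role = (raw as { role?: unknown } | null)?.role;
    const content = (raw as { content?: unknown } | null)?.content;
    if (
      typeof role !== 'string' ||
      !ALLOWED_ROLES.has(role) ||
      typeof content !== 'string' ||
      !content.trim()
    ) {
      return { ok: false, error: 'Invalid message history format.' };
    }
    // Cap every message, not only the most recent one.
    if (content.length > MAX_CONTENT_CHARS) {
      return { ok: false, error: 'Message too long.' };
    }
    messages.push({ role: role as ClientMessage['role'], content });
  }
  return { ok: true, messages };
}
```

In the route handler, a failed result would presumably map to the same 400 response used for the existing format check.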

}

export function isSimpleGreetingMessage(content: string): boolean {
  return /^(hi|hello|hey|thanks|thank you|ok|okay)\b/i.test(content.trim());

Copilot AI Apr 1, 2026

isSimpleGreetingMessage matches any message that merely starts with "thanks"/"ok"/etc (e.g. "ok what’s the judging rubric?"), which will be misclassified as a greeting and will skip context retrieval. Tighten this to only treat messages as simple greetings when the entire content is just a greeting/ack (optionally with punctuation), or add a max-word/length constraint.

Suggested change
- return /^(hi|hello|hey|thanks|thank you|ok|okay)\b/i.test(content.trim());
+ const trimmed = content.trim();
+ const normalized = trimmed.replace(/[.!?]+$/, '');
+ return /^(hi|hello|hey|thanks|thank you|ok|okay)$/i.test(normalized);
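
To illustrate how the tightened check behaves, here is a small standalone sketch that wraps the suggested lines in a function and runs them against a few sample inputs (the samples are made up for illustration):

```typescript
// Standalone copy of the suggested logic, for illustration only.
function isSimpleGreetingMessage(content: string): boolean {
  // Drop trailing punctuation so "Thanks!" still counts as a bare ack.
  const normalized = content.trim().replace(/[.!?]+$/, '');
  // Anchoring with $ requires the whole message to be a greeting/ack.
  return /^(hi|hello|hey|thanks|thank you|ok|okay)$/i.test(normalized);
}

const samples = ['hi', 'Thanks!', "ok what's the judging rubric?", 'hey there'];
for (const sample of samples) {
  console.log(JSON.stringify(sample), isSimpleGreetingMessage(sample));
}
```

One trade-off worth noting: a multi-word greeting such as "hey there" no longer matches, so it would go through context retrieval like any other message.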

Comment on lines +28 to +38
function buildSearchPattern(search: string): string {
  const q = search.trim();
  if (!q) return q;

  // Day 2 meal phrasing is often "lunch" in user language, but schedule uses
  // "brunch". Include both so meal queries still resolve correctly.
  if (/\blunch\b/i.test(q)) {
    return q.replace(/\blunch\b/gi, '(lunch|brunch)');
  }

  return q;

Copilot AI Apr 1, 2026

buildSearchPattern returns user-/LLM-controlled text that is passed directly into MongoDB $regex. This allows regex meta-characters (and potentially catastrophic backtracking patterns) to be injected into the query, which can become a DoS/perf issue. Escape regex special characters in search before building the pattern (and then apply the lunch→(lunch|brunch) expansion on the escaped form).

Suggested change
- function buildSearchPattern(search: string): string {
-   const q = search.trim();
-   if (!q) return q;
-   // Day 2 meal phrasing is often "lunch" in user language, but schedule uses
-   // "brunch". Include both so meal queries still resolve correctly.
-   if (/\blunch\b/i.test(q)) {
-     return q.replace(/\blunch\b/gi, '(lunch|brunch)');
-   }
-   return q;
+ function escapeRegex(input: string): string {
+   // Escape characters with special meaning in regular expressions so that
+   // user-controlled input is treated as literal text in MongoDB $regex queries.
+   return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
+ }
+ function buildSearchPattern(search: string): string {
+   const trimmed = search.trim();
+   if (!trimmed) return trimmed;
+   // First escape all regex meta-characters so the user input is treated literally.
+   const escaped = escapeRegex(trimmed);
+   // Day 2 meal phrasing is often "lunch" in user language, but schedule uses
+   // "brunch". Include both so meal queries still resolve correctly.
+   if (/\blunch\b/i.test(escaped)) {
+     // Apply the lunch→(lunch|brunch) expansion on the escaped form.
+     return escaped.replace(/\blunch\b/gi, '(lunch|brunch)');
+   }
+   return escaped;
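
As a quick check of the escape-then-expand ordering, the following self-contained sketch re-declares the two helpers (condensed from the suggestion) and feeds them made-up inputs. It uses plain JavaScript `RegExp` as a stand-in; how MongoDB evaluates the same pattern in `$regex` is assumed to be equivalent for these simple cases:

```typescript
function escapeRegex(input: string): string {
  // Treat every regex meta-character in the input as literal text.
  return input.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function buildSearchPattern(search: string): string {
  const trimmed = search.trim();
  if (!trimmed) return trimmed;
  const escaped = escapeRegex(trimmed);
  // The expansion is applied after escaping, so only the parentheses and
  // pipe added here keep their special meaning.
  return escaped.replace(/\blunch\b/gi, '(lunch|brunch)');
}

// User parentheses are escaped; the lunch/brunch group is not.
const mealPattern = buildSearchPattern('lunch (day 2)');
console.log(mealPattern);
console.log(new RegExp(mealPattern, 'i').test('Brunch (Day 2)'));

// A nested-quantifier input is reduced to a literal string match.
const hostilePattern = buildSearchPattern('(a+)+$');
console.log(new RegExp(hostilePattern).test('aaaa'));
```

The order matters: escaping after the expansion would also escape the `(lunch|brunch)` group and break the meal lookup.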

Comment on lines +167 to +172
const typeFiltered = include_activities
  ? dateFiltered
  : dateFiltered.filter(
      (ev: any) => (ev.type ?? '').toUpperCase() !== 'ACTIVITIES'
    );


Copilot AI Apr 1, 2026

If the model calls get_events with type: "ACTIVITIES" but leaves include_activities as null/false, the query will fetch only ACTIVITIES from Mongo and then typeFiltered will immediately filter them all out, returning empty results. Consider treating type === "ACTIVITIES" as implicitly include_activities: true (or changing the post-filter to only exclude activities when type is not ACTIVITIES).

Suggested change
- const typeFiltered = include_activities
-   ? dateFiltered
-   : dateFiltered.filter(
-       (ev: any) => (ev.type ?? '').toUpperCase() !== 'ACTIVITIES'
-     );
+ // Treat an explicit request for ACTIVITIES as implicitly including activities,
+ // even if include_activities was not set to true.
+ const activitiesRequested = (type ?? '').toUpperCase() === 'ACTIVITIES';
+ const typeFiltered =
+   include_activities || activitiesRequested
+     ? dateFiltered
+     : dateFiltered.filter(
+         (ev: any) => (ev.type ?? '').toUpperCase() !== 'ACTIVITIES'
+       );
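
The edge case can be reproduced in isolation. This sketch assumes a simplified `EventDoc` shape and a hypothetical `filterByType` helper that mirrors the suggested logic; the event names are invented:

```typescript
type EventDoc = { name: string; type?: string };

function filterByType(
  events: EventDoc[],
  type: string | null,
  includeActivities: boolean | null
): EventDoc[] {
  // An explicit ACTIVITIES request implies activities should be kept.
  const activitiesRequested = (type ?? '').toUpperCase() === 'ACTIVITIES';
  if (includeActivities || activitiesRequested) return events;
  return events.filter((ev) => (ev.type ?? '').toUpperCase() !== 'ACTIVITIES');
}

// Pretend Mongo already returned only ACTIVITIES because type was set.
const fetched: EventDoc[] = [
  { name: 'Cup Stacking', type: 'ACTIVITIES' },
  { name: 'Trivia Night', type: 'ACTIVITIES' },
];

// With the fix, the results survive even though include_activities is null.
console.log(filterByType(fetched, 'ACTIVITIES', null).length);
// Without a type or the flag, activities are still filtered out.
console.log(filterByType(fetched, null, null).length);
```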

])}\n`
);
} else if (part?.type === 'tool-result') {
if (textHasBeenOutput) suppressText = true;

Copilot AI Apr 1, 2026

Text suppression only activates after a tool-result if some text was already emitted (textHasBeenOutput). If the model ever does a bare tool call first (or emits tool results before its intro sentence), any subsequent text deltas will still stream and can violate the “no text after tool results” UI contract. Consider suppressing text after any tool result (or at least after get_events / provide_links results) regardless of prior output.

Suggested change
- if (textHasBeenOutput) suppressText = true;
+ // Suppress any subsequent text after a tool result to honor the
+ // "no text after tool results" UI contract.
+ suppressText = true;
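
A rough sketch of the unconditional variant, using a simplified `StreamPart` union and a hypothetical `collectVisibleText` helper. These shapes are assumptions for illustration and are not the AI SDK's actual stream part types or the PR's `createResponseStream` implementation:

```typescript
// Simplified, hypothetical stream part shapes.
type StreamPart =
  | { type: 'text-delta'; text: string }
  | { type: 'tool-result'; toolName: string };

function collectVisibleText(parts: StreamPart[]): string {
  let suppressText = false;
  let out = '';
  for (const part of parts) {
    if (part.type === 'tool-result') {
      // Suppress regardless of whether any text was emitted earlier.
      suppressText = true;
    } else if (part.type === 'text-delta' && !suppressText) {
      out += part.text;
    }
  }
  return out;
}

// Bare tool call first: trailing text is dropped.
console.log(
  collectVisibleText([
    { type: 'tool-result', toolName: 'get_events' },
    { type: 'text-delta', text: 'Here is a recap...' },
  ])
);
```

With the original `textHasBeenOutput` guard, the second part in this example would still have streamed to the client.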

@@ -0,0 +1,271 @@
import type { HackerProfile } from '@typeDefs/hackbot';

// TODO: StarterKit id's need to be updated

Copilot AI Apr 1, 2026

Spelling/grammar: "id's" should be "IDs" (no apostrophe).

Suggested change
- // TODO: StarterKit id's need to be updated
+ // TODO: StarterKit IDs need to be updated

Comment on lines +66 to +93
const { model, maxOutputTokens } = getModelConfig();

// eslint-disable-next-line @typescript-eslint/no-explicit-any
const result = streamText({
  model: openai(model) as any,
  messages: chatMessages.map((m: any) => ({
    role: m.role as 'system' | 'user' | 'assistant',
    content: m.content,
  })),
  maxOutputTokens,
  stopWhen: shouldStopStreaming,
  tools: {
    get_events: tool({
      description:
        'Fetch the live HackDavis event schedule from the database. Use this for ANY question about event times, locations, schedule, or what is happening when.',
      inputSchema: GET_EVENTS_INPUT_SCHEMA,
      execute: (input) =>
        executeGetEvents(input, profile, lastMessage.content),
    }),
    provide_links: tool({
      description: PROVIDE_LINKS_DESCRIPTION,
      inputSchema: PROVIDE_LINKS_INPUT_SCHEMA,
      execute: executeProvideLinks,
    }),
  },
});

const stream = createResponseStream(result, model);

Copilot AI Apr 1, 2026

streamText is typically async (returns a Promise). Here it’s called without await, so result may be a Promise and createResponseStream will later try to iterate result.fullStream, causing a runtime failure. If streamText is async in the ai@6 version used here, change this to const result = await streamText(...) before passing it to createResponseStream.

Development

Successfully merging this pull request may close these issues.

HackBot Server Core

3 participants