๐ AWS H0 Hackathon โ Database Showcase
How ChatScroll uses
AWS Databases
A technical overview of every AWS Database feature used โ from pgvector semantic search to DynamoDB TTL.
๐pgvector โ Semantic search๐ฒltree โ Folder hierarchy๐tsvector โ Full-text searchโกDynamoDB TTL โ Auto-expiry๐Composite Keys โ Message indexing๐คgemini-embedding-001
Aurora PostgreSQL (Serverless v2)
Relational core โ notes, folders, users, conversations, semantic search
Notes / Scrolls Table
notes- pgvector extension stores 3072-dim embeddings per scroll
gemini-embedding-001generates embeddings on save viaTrySaveEmbeddingAsync- Cosine similarity search surfaces semantically related scrolls
- tsvector full-text index for exact keyword matching
- Hybrid search: semantic nearest-neighbour combined with keyword results
-- Cosine distance (pgvector)
embedding <=> queryVector
-- Semantic search with threshold
WHERE 1 - (embedding <=> $vec) > 0.5
ORDER BY embedding <=> $vec
LIMIT 3
-- Full-text search
WHERE search_vector @@ plainto_tsquery('english', $q)
ORDER BY ts_rank(search_vector, ...) DESC
-- Helper
SELECT vector_dims(embedding) -- โ 3072Folders Table
folders- ltree extension stores folder paths as dot-separated label trees
- Enables hierarchical queries without recursive CTEs
- e.g.
programming.containersโ Programming โบ Containers - Related-scrolls scoping: find root parent, then include all descendant folder IDs
parent_idFK used for 2-level hierarchy traversal in API
-- ltree path matching WHERE path ~ 'programming.*' -- All folders in a subtree WHERE path = $root OR path ~ ($root || '.*') -- Foreign-key traversal (used in GetRelated) WHERE folder_id = ANY($allowedIds::uuid[])
Conversations Table
conversations- Conversation metadata (title, created_at, user_id) lives in Aurora
- Message content lives in DynamoDB โ dual-database pattern
- Aurora handles referential integrity; DynamoDB handles message volume
- Anonymous session IDs bridge the two stores before login
- On login,
migrate-anonymousre-parents both Aurora rows and DynamoDB items
Dual-database pattern: Aurora owns structure + search; DynamoDB owns the high-volume chat message stream.
Users Table
users- Cognito JWT
subclaim decoded server-side โ no middleware or JWT library needed - Anonymous users get a browser-generated UUID via
X-User-Idheader EnsureUserExistsAsyncauto-creates the row on first API request โ no registration step needed- User profile (display name, email) synced from Cognito JWT claims on each call
-- Identity resolution order (ApiControllerBase) 1. Authorization: Bearer <cognitoJwt> โ decode payload โ parse sub as UUID 2. X-User-Id header โ parse as UUID 3. Hard-coded dev fallback (local only)
Amazon DynamoDB
High-volume chat messages โ composite keys, TTL, and pay-per-request
Chat Messages Table
chatscroll-messages- Partition key groups all messages in a conversation together
- Sort key
timestamp#messageIdenables chronological reads and time-range queries - TTL attribute auto-expires messages after 90 days โ no cron jobs, no manual deletes
PAY_PER_REQUESTbilling โ scales to millions of messages with zero capacity planning- Each item stores role (user/assistant), content, folderSuggestion metadata
Access Patterns
PKconversationId = :convId
SKbegins_with SK, ''
Fetch all messages in a conversation, ordered chronologically
PKconversationId = :convId
SKbetween :t1 and :t2
Fetch messages within a time range (e.g. last 100 messages)
PK conversationId (String)
SK {timestamp}#{messageId} (String)
TTL expiresAt (Unix epoch, 90d)
System Architecture
Request flow from browser to databases
Browser
Next.js / Vercel
โ
AWS ECS Fargate
ASP.NET Core API
โ
โ
Aurora PostgreSQL
๐ scrolls + embeddings
๐ฒ folders (ltree)
๐ค users (Cognito sub)
๐ฌ conversation metadata
DynamoDB
๐ฌ chat messages
โก TTL (90-day expiry)
๐ composite sort keys
๐ pay-per-request scale
Next.js
App Router ยท Vercel
App Router ยท Vercel
ASP.NET Core
ECS Fargate ยท ECR
ECS Fargate ยท ECR
Aurora PG
Serverless v2 ยท pgvector
Serverless v2 ยท pgvector
DynamoDB
On-demand ยท TTL
On-demand ยท TTL
CI/CD Deployment
GitHub Actions โ ECR โ ECS Fargate
Pipeline
- Push to
mastertriggers GitHub Actions workflow - Docker image built and pushed to Amazon ECR
- ECS task definition registered automatically per deploy
- ECS service updated; polling loop waits for stabilization
Infrastructure
- ECS Fargate โ serverless containers, no EC2 management
- ALB with HTTPS termination in front of ECS tasks
- Secrets (DB password, Gemini key) via ECS task environment
- Aurora Serverless v2 โ auto-pauses when idle
Observability
- Health check endpoint:
GET /api/health - Live stats endpoint:
GET /api/stats - Embedding backfill:
POST /api/notes/admin/backfill-embeddings - Storage type exposed in stats response:
aurora_pgvector
git pushโGitHub Actionsโdocker buildโECR pushโECS deployโโ
live
See it in action โ save an AI answer as a Scroll and watch the embedding get generated.
Try ChatScroll โ