
Cursor for Notes — AI-Powered Note Synthesis

Next.js B2C application that brings semantic intelligence to personal knowledge management through real-time vector embeddings, achieving 67% weekly retention and 58 NPS with 100+ early users.

Client

100+ Active Users


Cursor for Notes applies the philosophy of context-aware AI assistance (inspired by Cursor IDE) to personal knowledge management. Rather than treating notes as static documents, the application understands an entire knowledge base semantically, automatically organizes content through vector embeddings, and provides real-time suggestions as users type. Built on PostgreSQL with pgvector, Next.js, and similarity-based clustering, the product showed strong product-market fit signals, validated through systematic user research across Reddit, LinkedIn, and in-person events.

Skills

Vector Embeddings & Semantic Search
PostgreSQL with pgvector
Real-Time Indexing
Next.js Full-Stack
Machine Learning Engineering
User Research & Product Development

Key Deliverables

Implement real-time vector embedding generation with sub-500ms latency
Build incremental HNSW index system for live note clustering
Create cosine similarity-based semantic clustering algorithm
Conduct multi-channel user research across Reddit, LinkedIn, and events (100+ users)
Optimize pgvector queries for 10,000+ notes with sub-second performance
Validate product-market fit with 67% weekly retention and 58 NPS

The Challenge: Intelligence in Note Organization

Modern knowledge workers face an increasingly acute challenge: managing vast amounts of information scattered across notes, documents, and digital fragments. Traditional note-taking applications rely on manual tagging, folder hierarchies, or keyword search—approaches that break down as note collections grow. Users spend precious time on organizational overhead that detracts from actual thinking and insight generation.

The fundamental problems were clear: manual categorization requires constant upfront decisions, keyword search misses semantically related content, static organization doesn't adapt as connections evolve, and discovery relies on remembering exact terminology. A user searching for 'budget planning' would never find notes about 'financial strategy' or 'cost allocation' because the tools only matched exact keywords.

The vision emerged from a simple insight: if AI can understand an entire codebase and assist developers contextually (as Cursor IDE does), why can't it understand a personal knowledge base and assist with note organization and discovery? This became the founding hypothesis for Cursor for Notes.

The Knowledge Management Problem

Manual categorization overhead, keyword search limitations, and static organization preventing users from discovering semantic connections in their knowledge bases.

The Solution: Semantic Understanding Through Vector Embeddings

Rather than forcing users into manual organizational frameworks, Cursor for Notes analyzes the semantic meaning of every note through vector embeddings—high-dimensional representations capturing conceptual content. Unlike keyword-based approaches, these embeddings represent the underlying concepts of a note, enabling true semantic understanding.

The product positioning was explicit: just as Cursor IDE provides context-aware assistance by understanding an entire codebase, Cursor for Notes understands the full semantic landscape of a knowledge base to provide real-time categorization suggestions, semantic discovery, and adaptive clustering that evolves as the knowledge base grows. Every note is automatically related to every other note through semantic similarity.

The technical insight that differentiated this solution was delivering value in real time as users type. Traditional vector search pipelines rely on batch processing (collect documents, generate embeddings, build indexes), which is incompatible with interactive note-taking. Cursor for Notes implemented an incremental re-indexing strategy in which embeddings update shortly after content changes, and the HNSW index adjusts its graph structure without full rebuilds.

Semantic Architecture

Vector embedding generation, incremental re-indexing, and real-time clustering enabling intelligent note organization as users type.

Technical Innovation: Real-Time Vector Indexing

Vector Embedding Generation

Every note is transformed into a high-dimensional vector embedding (1536 dimensions) that captures its semantic meaning. Notes with similar meanings produce similar vectors, even when using different vocabulary. A note about "marketing campaigns" and another about "advertising strategies" would have similar embeddings because the underlying concepts are semantically close. This enables genuine semantic search: searching "budget planning" surfaces "financial strategy" and "cost allocation" because these concepts share semantic space.
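As a minimal sketch of the ranking idea described above, the following ranks notes by cosine similarity to a query embedding. The note IDs and the tiny 3-dimensional vectors are hypothetical stand-ins for real 1536-dimensional embeddings, and the in-memory ranking is for illustration only; it is not the application's actual search path, which runs in PostgreSQL.

```typescript
// Illustrative only: toy 3-dimensional vectors stand in for 1536-dimensional embeddings.
type EmbeddedNote = { id: string; embedding: number[] };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every note against the query embedding and keep the k best matches.
function topK(
  query: number[],
  candidates: EmbeddedNote[],
  k: number
): { id: string; score: number }[] {
  return candidates
    .map((n) => ({ id: n.id, score: cosineSimilarity(query, n.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical data: two finance-related notes and one unrelated note.
const sampleNotes: EmbeddedNote[] = [
  { id: "financial-strategy", embedding: [0.9, 0.1, 0.0] },
  { id: "cost-allocation", embedding: [0.8, 0.3, 0.1] },
  { id: "holiday-photos", embedding: [0.0, 0.2, 0.9] },
];
const budgetPlanningQuery = [1.0, 0.2, 0.0];

const searchResults = topK(budgetPlanningQuery, sampleNotes, 2);
// ids in order: "financial-strategy", "cost-allocation"
```

The unrelated note scores close to 0 and drops out of the top two, even though no keyword matching is involved.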

Incremental Re-Indexing Strategy

The central technical achievement was enabling real-time updates without blocking the UI. As users type:
(1) Debounced embedding generation waits for a 500ms pause in typing, balancing responsiveness with API efficiency.
(2) The updated embedding is written directly to PostgreSQL, replacing the previous version.
(3) The HNSW index incrementally adjusts its graph structure to accommodate the new vector without a full rebuild.
(4) Background clustering recalculates note groupings using the updated similarity scores.
Once a new embedding is available, categorization feedback is near-instant: index updates typically take 50-100ms, and no expensive full re-indexing is needed.
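Step (1) can be sketched as a small debounce helper. This is an assumed shape, not the application's actual code: the `Timer` abstraction and the `debounce` signature are hypothetical, with the timer made injectable so the behavior can be checked without real delays.

```typescript
// Hypothetical timer abstraction so the debouncer can be driven by a fake clock in tests.
type Timer = {
  set: (fn: () => void, ms: number) => unknown;
  clear: (handle: unknown) => void;
};

// Default implementation backed by the platform's setTimeout/clearTimeout.
const realTimer: Timer = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle as ReturnType<typeof setTimeout>),
};

// Returns a wrapper that runs fn only after delayMs have passed without another call.
// Each new call cancels the pending one, so only the latest argument is processed.
function debounce<T>(
  fn: (arg: T) => void,
  delayMs: number,
  timer: Timer = realTimer
): (arg: T) => void {
  let handle: unknown = undefined;
  return (arg: T) => {
    if (handle !== undefined) timer.clear(handle);
    handle = timer.set(() => {
      handle = undefined;
      fn(arg);
    }, delayMs);
  };
}

// Usage sketch: regenerate the embedding 500ms after the user stops typing.
// The callback body is a placeholder for the embedding API call and database write.
const scheduleEmbedding = debounce((noteText: string) => {
  void noteText; // placeholder: send noteText to the embedding provider here
}, 500);
```

Calling `scheduleEmbedding` on every keystroke would result in one embedding request per pause in typing rather than one per character.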

Cosine Similarity-Based Clustering

The application clusters notes by cosine similarity, which measures the angle between two vectors. The score ranges from -1 (opposite direction) to 1 (same direction), with values above 0.7 indicating strong semantic similarity. The clustering algorithm:
(1) Computes a similarity matrix comparing each note against all others using PostgreSQL's optimized vector operations.
(2) Identifies clusters by grouping notes whose similarity exceeds a threshold in the 0.72-0.75 range (tuned empirically through user testing).
(3) Generates cluster labels by extracting common themes from central notes.
(4) Updates assignments in real time as content changes, so organization reflects current semantic relationships.
Because assignments are recomputed as content changes, clusters evolve with the user's knowledge base and continually surface new connections and themes.
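Step (2) can be illustrated with a threshold-based grouping over a precomputed similarity matrix. The source does not say how thresholded pairs are merged into clusters, so this sketch assumes the simplest option: connected components via union-find (notes join a cluster if they are linked to any member). The function name and the sample matrix are hypothetical.

```typescript
// Group note indices into clusters: two notes share a cluster if they are connected
// through a chain of pairs whose similarity is at or above the threshold.
// Assumption: single-linkage grouping; the real system may merge clusters differently.
function clusterByThreshold(similarity: number[][], threshold: number): number[][] {
  const n = similarity.length;
  const parent = Array.from({ length: n }, (_, i) => i);

  // Find the representative of x's set, shortening the path as we go.
  const find = (x: number): number => {
    while (parent[x] !== x) {
      parent[x] = parent[parent[x]];
      x = parent[x];
    }
    return x;
  };

  // Union every pair of notes that clears the threshold.
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      if (similarity[i][j] >= threshold) parent[find(i)] = find(j);
    }
  }

  // Collect members by representative, in order of first appearance.
  const groups = new Map<number, number[]>();
  for (let i = 0; i < n; i++) {
    const root = find(i);
    const members = groups.get(root) ?? [];
    members.push(i);
    groups.set(root, members);
  }
  return Array.from(groups.values());
}

// Hypothetical similarity matrix for four notes.
const sampleSimilarity = [
  [1.0, 0.8, 0.1, 0.05],
  [0.8, 1.0, 0.2, 0.1],
  [0.1, 0.2, 1.0, 0.74],
  [0.05, 0.1, 0.74, 1.0],
];

const looseClusters = clusterByThreshold(sampleSimilarity, 0.73); // [[0, 1], [2, 3]]
const strictClusters = clusterByThreshold(sampleSimilarity, 0.75); // [[0, 1], [2], [3]]
```

The two calls show why the threshold matters: moving it from 0.73 to 0.75 splits the second cluster, trading discovery for precision.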

PostgreSQL pgvector Optimization

To achieve sub-second query performance across thousands of notes, the application relies on several pgvector optimizations:
HNSW indexing provides approximate nearest neighbor search with a tunable speed/recall tradeoff; tuning the hnsw.ef_search parameter keeps queries at around 100 milliseconds.
Half-precision vector storage reduces the vector memory footprint by roughly 50% with minimal accuracy loss.
SET LOCAL applies tuning parameters to individual transactions, so settings can match specific query patterns.
Partitioning by user ID helps keep per-user query performance stable as the user base grows.
Together, these optimizations keep the application responsive as knowledge bases grow.
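The SQL behind these optimizations might look roughly like the following, written as TypeScript query builders that can be passed to any PostgreSQL client. The table name `notes`, its columns, the index name, and the helper functions are assumptions made for illustration. The pgvector pieces used (the `halfvec` type, the `halfvec_cosine_ops` operator class, the `<=>` cosine distance operator, and the `hnsw.ef_search` setting) are real, with `halfvec` requiring pgvector 0.7.0 or later.

```typescript
type SqlQuery = { text: string; values: unknown[] };

// pgvector accepts vectors as text literals such as '[0.1,0.2,0.3]'.
function toVectorLiteral(v: number[]): string {
  if (v.some((x) => !Number.isFinite(x))) throw new Error("non-finite vector component");
  return `[${v.join(",")}]`;
}

// Hypothetical schema: half-precision embeddings with an HNSW index on cosine distance.
const schemaSql = `
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS notes (
  id bigserial PRIMARY KEY,
  user_id uuid NOT NULL,
  body text NOT NULL,
  embedding halfvec(1536)
);
CREATE INDEX IF NOT EXISTS notes_embedding_hnsw
  ON notes USING hnsw (embedding halfvec_cosine_ops);
`;

// SET does not accept bind parameters, so the value is validated and inlined.
// SET LOCAL only takes effect inside a transaction (BEGIN ... COMMIT).
function efSearchSql(efSearch: number): string {
  if (!Number.isInteger(efSearch) || efSearch < 1 || efSearch > 1000) {
    throw new Error("hnsw.ef_search must be an integer between 1 and 1000");
  }
  return `SET LOCAL hnsw.ef_search = ${efSearch}`;
}

// Nearest notes for one user. <=> is cosine distance, so similarity = 1 - distance.
function relatedNotesQuery(userId: string, embedding: number[], limit: number): SqlQuery {
  return {
    text: `SELECT id, 1 - (embedding <=> $2::halfvec) AS similarity
           FROM notes
           WHERE user_id = $1
           ORDER BY embedding <=> $2::halfvec
           LIMIT $3`,
    values: [userId, toVectorLiteral(embedding), limit],
  };
}
```

A caller would open a transaction, run `efSearchSql(100)`, then run the query from `relatedNotesQuery`, so the higher recall setting applies only to that request.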

Technical Implementation

Vector embedding generation, incremental HNSW indexing, and PostgreSQL pgvector optimization enabling sub-500ms latency at scale.

User Experience: Real-Time Intelligence

The technical architecture translates directly into a frictionless user experience. Users create notes naturally without manual categorization—the application handles intelligence invisibly. As they type, the sidebar displays semantically related notes and suggested categories, enabling serendipitous discovery. Notes automatically group into themes and topics based on content, with cluster assignments evolving as the knowledge base grows.

A knowledge worker researching market expansion might write a note about 'European growth opportunities.' The system instantly identifies related notes about 'regional market analysis,' 'competitive landscape studies,' and 'pricing strategies'—all connected through semantic understanding rather than explicit tags. A researcher accumulating literature notes sees their collection automatically organize into meta-themes: methodology discussions, theoretical frameworks, empirical findings, and citations.

The system provides immediate value from the first note through incremental clustering. Early users with just 3-5 notes see initial semantic connections. As the collection grows, the organization becomes richer. By note 50, users see sophisticated thematic clustering. Unlike systems with minimum thresholds ('requires 50 notes before clustering becomes useful'), Cursor for Notes provides value from day one through the incremental indexing approach.

User Research & Product Development

Multi-Channel Research Methodology

The product was developed through systematic user research across three channels. Reddit community engagement in r/productivity, r/Notion, and r/ObsidianMD provided qualitative feedback on pain points and feature validation. LinkedIn professional outreach targeted knowledge-intensive roles—consultants, researchers, writers—for structured interviews about current workflows and value proposition validation. In-person events at tech meetups and productivity workshops enabled usability testing with live demos and observation of real interaction friction. This multi-channel approach collected feedback from 100+ users representing diverse use cases, technical sophistication, and note-taking workflows.

Key Insights Shaping the Product

User research revealed critical insights that shaped the roadmap:
(1) Manual categorization is a dealbreaker: users overwhelmingly rejected systems requiring explicit tagging or folder organization. The promise of automatic categorization was the primary driver of interest.
(2) Search precision over recall: users preferred small, highly relevant result sets over comprehensive but noisy matches, which informed cosine similarity threshold tuning.
(3) Real-time feedback is essential: batch processing or delayed categorization was perceived as 'not intelligent enough.' Users expected AI assistance to feel as responsive as a live assistant.
(4) Trust through transparency: users wanted to understand why notes grouped together, leading to similarity score badges and 'show related concepts' features that explain connections.
(5) Incremental value from day one: early versions with minimum thresholds were rejected. The incremental indexing approach, which enables value from the first note, was critical for adoption.

Product-Market Fit Validation

The development followed a lean product cycle:
(1) Problem validation confirmed note organization and discovery were genuine pain points.
(2) Solution validation tested the semantic search concept with clickable prototypes before building the full vector infrastructure.
(3) MVP launch released core features (embedding, search, basic clustering) to early adopters.
(4) Iterative enhancement used feature analytics and qualitative feedback to prioritize the backlog.
(5) Cohort analysis tracked retention and engagement across user segments.
The metrics validated strong product-market fit: 67% of users returned within 7 days (vs 20-30% industry benchmark), 84% of active users used semantic search at least once per session, and average users created 47 notes in the first month.

Research-Driven Development

Multi-channel user research across Reddit, LinkedIn, and in-person events with 100+ users validating product-market fit.

Technical Challenges & Solutions

Performance Optimization

Real-time embedding generation initially took 2-3 seconds, too slow for responsive typing. Solution: API call optimization (batching requests), aggressive caching (reusing embeddings for unchanged sections), and connection pooling reduced latency to under 500ms. HNSW index updates were brought down to 50-100ms through parameter tuning. PostgreSQL query optimization, with attention to index usage and query plans, achieved consistent sub-second search performance across 10,000+ notes.
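The caching idea (reusing embeddings for unchanged sections) could be sketched as follows. This is an assumption about the shape of such a cache rather than the application's code: the `Embedder` type, the blank-line section splitting, and the unbounded in-memory `Map` keyed by section text are all hypothetical simplifications. A production version would likely key by a content hash and bound the cache size.

```typescript
// Hypothetical embedder signature: text in, vector out (for example, an API call).
type Embedder = (text: string) => Promise<number[]>;

// Wrap an embedder so identical text is embedded at most once.
// Promises are cached, which also deduplicates requests that are still in flight.
function withEmbeddingCache(embed: Embedder): Embedder {
  const cache = new Map<string, Promise<number[]>>();
  return (text: string) => {
    const hit = cache.get(text);
    if (hit) return hit;
    const pending = embed(text).catch((err) => {
      cache.delete(text); // do not keep failed requests in the cache
      throw err;
    });
    cache.set(text, pending);
    return pending;
  };
}

// Embed a note section by section (sections assumed to be separated by blank lines),
// so editing one section does not trigger re-embedding of the others.
function embedSections(note: string, embed: Embedder): Promise<number[][]> {
  const sections = note.split(/\n{2,}/).filter((s) => s.trim().length > 0);
  return Promise.all(sections.map((s) => embed(s)));
}
```

With this wrapper, editing the last paragraph of a long note costs one embedding request instead of one per section.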

Threshold Tuning & Semantic Precision

Cosine similarity thresholds required empirical calibration through extensive testing. Too low (< 0.6) produced noisy clusters mixing unrelated concepts. Too high (> 0.85) missed valuable semantic connections. User testing revealed 0.72-0.75 as the optimal range, balancing precision with discovery. The threshold became configurable per user, allowing personalization as different users have different clustering preferences.

Cold Start Problem

New users with only 1-3 notes have too little content for meaningful clustering. Solution: Curated onboarding content suggesting note types and topics, combined with 'seed clusters' based on common topic tags. Suggested prompt templates help users create more semantically rich notes from the start. This enables early value while the knowledge base grows.

Model Selection & Cost

Balancing embedding quality, cost, and latency required evaluating multiple models (OpenAI, Cohere, open-source alternatives). Final selection prioritized a 1536-dimensional model offering optimal semantic capture for short-form text while managing API costs through aggressive caching and batching. Different embedding models have different characteristics—larger models provide better semantic understanding but higher latency; smaller models are faster but less precise.

Embedding Drift Over Time

As users' vocabularies and writing styles evolve, older embeddings can become stale. Solution: Periodic re-embedding of older notes (triggered monthly) maintains consistency. When new embedding models become available, strategic re-embedding enables model upgrades without breaking clustering. This ensures organization remains semantically accurate as users' knowledge bases evolve.
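A re-embedding policy along these lines can be expressed as a small predicate. The `NoteEmbeddingMeta` fields, the model identifiers, and the 30-day default (standing in for "monthly") are assumptions for illustration; the source does not describe how staleness is tracked.

```typescript
// Hypothetical metadata stored alongside each note's embedding.
type NoteEmbeddingMeta = { id: string; embeddedAt: Date; model: string };

const DAY_MS = 24 * 60 * 60 * 1000;

// A note is re-embedded if it was embedded with a different model than the current one,
// or if its embedding is older than maxAgeDays (30 is an assumed stand-in for "monthly").
function needsReembedding(
  note: NoteEmbeddingMeta,
  now: Date,
  currentModel: string,
  maxAgeDays: number = 30
): boolean {
  const ageMs = now.getTime() - note.embeddedAt.getTime();
  return note.model !== currentModel || ageMs > maxAgeDays * DAY_MS;
}

// Select the notes a periodic background job would refresh.
function selectStaleNotes(
  notes: NoteEmbeddingMeta[],
  now: Date,
  currentModel: string
): string[] {
  return notes.filter((n) => needsReembedding(n, now, currentModel)).map((n) => n.id);
}
```

Checking the model identifier as well as the age means a model upgrade can be rolled out gradually by the same job, without a separate migration path.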

Engineering Solutions

Performance optimization, threshold tuning, cold start handling, and long-term embedding consistency enabling production reliability.

Market Validation & Results

User Acquisition: Organic growth through word-of-mouth and community engagement achieving 40% month-over-month growth during beta phase

User Retention: 67% weekly active user rate significantly exceeding industry benchmarks for productivity apps (typically 20-30%)

User Satisfaction: Net Promoter Score of 58, indicating strong satisfaction and willingness to recommend to others

Feature Adoption: 84% of active users used semantic search at least once per session, indicating strong engagement with core value proposition

Usage Depth: Average user created 47 notes in first month, demonstrating sustained engagement and meaningful value delivery

Market Positioning: Successfully differentiated from incumbent tools (Notion, Evernote, Obsidian) through AI-first positioning

Product-Market Fit: Early user feedback validated core hypotheses about pain points and willingness to pay for semantic intelligence
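For readers unfamiliar with how a Net Promoter Score is derived, the standard calculation is sketched below: the percentage of promoters (scores of 9-10) minus the percentage of detractors (scores of 0-6) on a 0-10 survey. The sample responses are hypothetical and are not the product's survey data.

```typescript
// Standard NPS: % promoters (9-10) minus % detractors (0-6), rounded to an integer.
function netPromoterScore(scores: number[]): number {
  if (scores.length === 0) throw new Error("no survey responses");
  const promoters = scores.filter((s) => s >= 9).length;
  const detractors = scores.filter((s) => s <= 6).length;
  return Math.round(((promoters - detractors) / scores.length) * 100);
}

// Hypothetical survey: 7 promoters, 2 passives, 1 detractor.
const sampleResponses = [10, 10, 9, 9, 9, 10, 9, 8, 8, 3];
const sampleNps = netPromoterScore(sampleResponses); // 60
```

Passive responses (7-8) count toward the total but toward neither group, which is why the score can range from -100 to 100.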

Market Impact

100+ active users, 67% weekly retention, 58 NPS, and organic growth validating strong product-market fit.

Technology Stack

Frontend & Application

Next.js 15 with React 19 for server-side rendering and optimistic UI updates
TypeScript for type-safe, maintainable frontend code
TailwindCSS 4 for responsive UI design
Framer Motion for smooth animations and transitions

Backend & Data Layer

Supabase with managed PostgreSQL hosting and real-time subscriptions
PostgreSQL with pgvector extension for vector storage and semantic search
HNSW indexing for efficient approximate nearest neighbor search
Node.js API routes for embedding generation and business logic

AI/ML Components

Vector embedding models (1536-dimensional) for semantic representation
Cosine similarity metrics for semantic matching and clustering
K-means and hierarchical clustering for note organization
Incremental re-indexing algorithm for real-time updates

Infrastructure & Deployment

Vercel for Next.js hosting, edge functions, and CDN distribution
Supabase for database hosting, authentication, and real-time APIs
GitHub Actions for CI/CD pipeline and automated testing
Monitoring with analytics and error tracking for production reliability

Strategic Significance

Cursor for Notes demonstrates how AI transforms traditional productivity tools through semantic understanding and real-time intelligence. By applying principles from developer tools (context-awareness, real-time assistance, semantic understanding) to note-taking, the application created a differentiated user experience that resonated with over 100 early users and achieved strong product-market fit signals.

The project showcases several critical competencies: practical application of vector embeddings and semantic search in production B2C environments; user-centric product development driven by multi-channel research and quantitative engagement metrics; scalable technical architecture combining Next.js, PostgreSQL with pgvector, and incremental indexing; systematic feedback collection and iteration based on actual user behavior.

The success validates a broader trend: AI assistance is expanding beyond coding into every domain where context-aware intelligence can reduce cognitive overhead and surface hidden connections. Just as Cursor IDE helps developers navigate codebases, Cursor for Notes helps knowledge workers navigate their own minds, turning scattered notes into a unified semantic knowledge space.

Transforming Knowledge Work

From manual organization to semantic intelligence enabling discovery and insight across personal knowledge bases.

Intelligence That Understands Context

Cursor for Notes proves that applying semantic AI to personal knowledge management fundamentally changes how people think, organize, and discover. By providing intelligent categorization, real-time suggestions, and semantic search, the application transforms note-taking from a task of manual organization into a process of interactive thinking with an intelligent assistant.

100+ Active Users

Strong early adoption validating product-market fit

67% Weekly Retention

Roughly 2-3x the industry benchmark for productivity applications (20-30%)

58 Net Promoter Score

Strong user satisfaction and willingness to recommend


Contact Information

Schedule a consultation to discuss your software development needs. I'm here to help bring your ideas to life.

Location

San Francisco, California, USA

Email

silas@rhyneerconsulting.com

Phone

+1 (907) 406-8543
