⚗ ◈ ⚗

Alkembic

Product Concept Document  ·  Version 0.1  ·  Internal Use

Distilling raw knowledge into structured insight — offline, private, yours.

The Fragmented Knowledge Problem

Scrum Masters and delivery leads operate across a fragmented information landscape. Critical knowledge — decisions made in ceremonies, blockers raised in standups, action items from retrospectives, planning notes from PI sessions — ends up scattered across Confluence pages, Slack threads, Teams transcripts, Excel trackers, and SharePoint folders.

There is no single place to ask: "What did we decide about this last sprint?" or "What recurring blockers has this team raised?"

The result is knowledge loss, repeated discussions, and heavy reliance on individual memory rather than team institutional knowledge.

Alkembic solves this by acting as a personal data distillation layer — a lightweight tool that ingests raw inputs from these sources, structures them into typed artifacts, and makes them queryable through a fast offline search interface. No cloud AI dependency. No external API calls. Fully functional inside locked client environments.

How Alkembic is Built

Alkembic is a fully client-side JavaScript application with no backend server requirement. All processing, storage, and search happens in the browser or local Node environment.

Core Layers

Ingestion Layer — Accepts raw input via paste, file upload (CSV, JSON, TXT), or manual form entry. The user adds minimal context: a title, a data type, and a short description. The content is parsed and normalized before storage.
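
The normalization step above can be sketched as one small pure function. This is a minimal sketch under assumptions: `normalizeInput` and its field names are illustrative, not a fixed Alkembic schema.

```javascript
// Illustrative ingestion helper: wrap raw pasted content plus the user's
// context fields into one normalized record. Field names are assumptions.
function normalizeInput(rawContent, { title, dataType, description = "" }) {
  if (!title || !dataType) {
    throw new Error("title and dataType are required before saving");
  }
  return {
    title: title.trim(),
    dataType,                          // e.g. "standup-note", "retro-action"
    description: description.trim(),
    rawContent: rawContent.trim(),     // further parsing/cleanup would happen here
  };
}
```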

Artifact Storage — Each entry is saved as a structured artifact object containing Title, Description, Data Type, Raw Content, Tags, Labels, and Timestamp. Stored in IndexedDB for persistence across sessions without a backend.
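
The artifact object described above might look like the sketch below. The id scheme and the in-memory Map standing in for an IndexedDB object store are assumptions for illustration only.

```javascript
// Illustrative artifact factory matching the fields listed above.
// In the real app the object would be put into an IndexedDB object store;
// a Map stands in here so the shape can be exercised anywhere.
function makeArtifact({ title, description = "", dataType, rawContent, tags = [], labels = [] }) {
  return {
    id: `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`, // assumed id scheme
    title,
    description,
    dataType,
    rawContent,
    tags,
    labels,
    timestamp: new Date().toISOString(),
  };
}

const store = new Map(); // stand-in for an IndexedDB object store

function saveArtifact(artifact) {
  store.set(artifact.id, artifact);
  return artifact.id;
}
```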

Search & Indexing — Two-tier search approach. Lunr.js handles keyword and boolean search across artifact fields. Transformers.js (all-MiniLM-L6-v2 via ONNX) handles semantic similarity search when the environment permits the one-time model download.
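
The two-tier ranking could blend both signals roughly as below. This is a sketch under assumptions: the keyword score is taken as precomputed (as Lunr.js would produce), the vectors as precomputed embeddings (as Transformers.js would produce), and the 0.5/0.5 weights are an illustrative choice, not a tuned value.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Blend a keyword relevance score with semantic similarity.
// When no embeddings are available (model download blocked),
// fall back to the keyword score alone.
function blendScores(keywordScore, queryVec, docVec, { wKeyword = 0.5, wSemantic = 0.5 } = {}) {
  if (!queryVec || !docVec) return keywordScore;
  return wKeyword * keywordScore + wSemantic * cosine(queryVec, docVec);
}
```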

Query Output Layer — Search results are rendered as ranked bullet points or structured tables. There is no text generation; output consists of retrieved, ranked artifacts presented in a readable format.
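
The output layer can be sketched as a plain formatter over ranked results. The result shape (`score` plus `artifact`) and the bullet format are assumptions for illustration; nothing is generated, only retrieved artifacts are rendered.

```javascript
// Illustrative renderer: sort ranked results and emit readable bullet lines.
// No text generation, just retrieved artifacts in a fixed format.
function renderBullets(results) {
  return results
    .slice()
    .sort((a, b) => b.score - a.score)
    .map(r => `• [${r.artifact.dataType}] ${r.artifact.title} (score ${r.score.toFixed(2)})`)
    .join("\n");
}
```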

Tech Stack

Layer              Library / Tool                        Purpose
UI Framework       Vanilla JS or lightweight React       Interface rendering, state management
Full-Text Search   Lunr.js / MiniSearch                  Keyword + boolean search across artifacts
Semantic Search    Transformers.js (HuggingFace ONNX)    Sentence embeddings for similarity search
Local Storage      IndexedDB (via idb wrapper)           Persistent artifact + vector storage
File Parsing       Papa Parse, native JSON, plain text   CSV, JSON, TXT ingestion
Fuzzy Matching     Fuse.js                               Typo-tolerant search fallback

Three Phases of Development

Phase 01

Data Capture — Alkembic Core

Reliable ingestion and structured storage of raw Scrum Master artifacts.

  • Paste interface
  • Title + Tags + Data Type form
  • TXT / JSON / CSV / TABLE support
  • Preview before save
  • IndexedDB persistence
  • Keyword search via Lunr.js
  • Export all as JSON
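
The "Export all as JSON" item above could be as simple as the sketch below. The envelope fields (`exportedAt`, `count`) are assumptions; in the browser the resulting string would be wrapped in a Blob and offered as a download, which is omitted here.

```javascript
// Illustrative export: serialize every stored artifact into one portable
// JSON document so data is never trapped in IndexedDB. The plain array
// stands in for a real IndexedDB read.
function exportAllAsJSON(artifacts) {
  return JSON.stringify(
    { exportedAt: new Date().toISOString(), count: artifacts.length, artifacts },
    null,
    2
  );
}
```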

Phase 02

Search & Retrieval — Query Layer

Make stored artifacts queryable in a useful way.

  • Field-boosted full-text search
  • Tag-based filtering
  • Date range filtering
  • Fuzzy search for typos
  • Bullet / table result views
  • Semantic search (optional)
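
The tag and date-range filters listed above might look like this sketch. Any-tag matching and an inclusive date range are assumptions about the filter semantics, and the artifact shape follows the storage section above.

```javascript
// Illustrative Phase 2 filters: match on any listed tag, then check the
// artifact timestamp against an inclusive from/to range.
function filterArtifacts(artifacts, { tags = [], from, to } = {}) {
  return artifacts.filter(a => {
    if (tags.length && !tags.some(t => a.tags.includes(t))) return false;
    const ts = new Date(a.timestamp).getTime();
    if (from && ts < new Date(from).getTime()) return false;
    if (to && ts > new Date(to).getTime()) return false;
    return true;
  });
}
```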

Phase 03

Insight Layer — Patterns & Trends

Surface patterns and summaries from accumulated artifacts.

  • Keyword frequency analysis
  • Sprint / week timeline view
  • Tag cloud + trend view
  • Rule-based summarization
  • Export filtered results (CSV / report)
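
The keyword frequency analysis above can be sketched as a simple count over tokenized artifact text. The tokenizer, the minimum word length, and the stop-word list are illustrative assumptions, not a fixed design.

```javascript
// Illustrative stop list; a real one would be longer.
const STOP = new Set(["the", "and", "for", "was", "with", "that"]);

// Count keyword occurrences across artifact titles and raw content,
// returning the topN [word, count] pairs by frequency.
function keywordFrequency(artifacts, topN = 10) {
  const counts = new Map();
  for (const a of artifacts) {
    const words = `${a.title} ${a.rawContent}`.toLowerCase().match(/[a-z]{3,}/g) || [];
    for (const w of words) {
      if (!STOP.has(w)) counts.set(w, (counts.get(w) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, topN);
}
```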

How Data Moves Through Alkembic

Ingestion Flow

User Input
    │
    ▼
[ Paste / Upload / Type Raw Content ]
    │
    ▼
[ Add Context: Title → Data Type → Tags → Description ]
    │
    ▼
[ Preview Parsed Structure ]
    │
    ├── Needs edit ──▶ [ Back to Form ]
    │
    └── Looks good ──▶ [ Save Artifact ]
                            │
                            ▼
                   [ IndexedDB Storage ]
                            │
                ┌───────────┴────────────┐
                ▼                        ▼
       [ Lunr.js Index ]     [ Transformers.js Vector ]

Query Flow

User Query
    │
    ▼
[ Search Box — keyword or natural phrase ]
    │
    ▼
[ Lunr.js keyword match + optional vector similarity score ]
    │
    ▼
[ Ranked Artifact Results ]
    │
    ▼
[ Display: Bullet Points / Table / Expanded Card View ]

Design Decisions Under Constraints

Constraint: No external AI APIs in the client environment
Decision: All intelligence is local. Generative responses — summaries, rewrites — are out of scope. The tool provides structured retrieval, not generation. This is sufficient for the primary Scrum Master use case: finding what was said, not paraphrasing it.

Constraint: No backend server permitted
Decision: IndexedDB is the persistence layer. All artifacts live in the browser. An export-to-JSON feature ensures data is never trapped. If a Node or Electron wrapper is permitted in a future phase, SQLite can replace IndexedDB with minimal code change.

Constraint: Transformers.js model requires a one-time download (~23 MB)
Decision: Semantic search is treated as an optional enhancement. Phases 1 and 2 work entirely without it, using Lunr.js alone. If the environment permits the model fetch, the model is cached locally via the browser Cache API and never re-downloaded. If not, typo-tolerant keyword search via Fuse.js covers most retrieval needs adequately.
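
The degradation path in this decision can be made explicit in one small dispatcher. The capability flags and tier names below are assumptions for illustration; detecting whether the model was actually cached would use the browser Cache API and is not shown.

```javascript
// Illustrative tier selection: prefer blended semantic + keyword search
// when the embedding model is cached, otherwise fall back to typo-tolerant
// keyword search, then to plain keyword search.
function pickSearchTier({ modelCached = false, fuzzyAvailable = true } = {}) {
  if (modelCached) return "semantic+keyword";
  if (fuzzyAvailable) return "fuzzy-keyword";
  return "keyword-only";
}
```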

Constraint: No live access to Confluence, Slack, or SharePoint APIs
Decision: Alkembic is copy-paste and file-upload first. Users export content from those tools manually and paste it into Alkembic. This is intentional: it keeps the tool dependency-free and working in any environment. Automated connectors can be added in a future phase if API access is ever granted.