⚗ ◈ ⚗

Alkembic

Product Concept Document  ·  Version 0.1  ·  Internal Use

Distilling raw knowledge into structured insight — offline, private, yours.

The Fragmented Knowledge Problem

Scrum Masters and delivery leads operate across a fragmented information landscape. Critical knowledge — decisions made in ceremonies, blockers raised in standups, action items from retrospectives, planning notes from PI sessions — ends up scattered across Confluence pages, Slack threads, Teams transcripts, Excel trackers, and SharePoint folders.

There is no single place to ask: "What did we decide about this last sprint?" or "What recurring blockers has this team raised?"

The result is knowledge loss, repeated discussions, and heavy reliance on individual memory rather than team institutional knowledge.

Alkembic solves this by acting as a personal data distillation layer — a lightweight tool that ingests raw inputs from these sources, structures them into typed artifacts, and makes them queryable through a fast offline search interface. No cloud AI dependency. No external API calls. Fully functional inside locked client environments.

How Alkembic is Built

Alkembic is a fully client-side JavaScript application with no backend server requirement. All processing, storage, and search happens in the browser or local Node environment.

Core Layers

Ingestion Layer — Accepts raw input via paste, file upload (CSV, JSON, TXT), or manual form entry. The user adds minimal context: a title, a data type, and a short description. The content is parsed and normalized before storage.
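
The normalization step above can be sketched as one small pure function. This is a minimal sketch under assumptions: `normalizeInput` and its field names are illustrative, not a fixed Alkembic schema.

```javascript
// Illustrative ingestion helper: wrap raw pasted content plus the user's
// context fields into one normalized record. Field names are assumptions.
function normalizeInput(rawContent, { title, dataType, description = "" }) {
  if (!title || !dataType) {
    throw new Error("title and dataType are required before saving");
  }
  return {
    title: title.trim(),
    dataType,                          // e.g. "standup-note", "retro-action"
    description: description.trim(),
    rawContent: rawContent.trim(),     // further parsing/cleanup would happen here
  };
}
```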

Artifact Storage — Each entry is saved as a structured artifact object containing Title, Description, Data Type, Raw Content, Tags, Labels, and Timestamp. Stored in IndexedDB for persistence across sessions without a backend.
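
The artifact object described above might look like the sketch below. The id scheme and the in-memory Map standing in for an IndexedDB object store are assumptions for illustration only.

```javascript
// Illustrative artifact factory matching the fields listed above.
// In the real app the object would be put into an IndexedDB object store;
// a Map stands in here so the shape can be exercised anywhere.
function makeArtifact({ title, description = "", dataType, rawContent, tags = [], labels = [] }) {
  return {
    id: `${Date.now()}-${Math.random().toString(36).slice(2, 8)}`, // assumed id scheme
    title,
    description,
    dataType,
    rawContent,
    tags,
    labels,
    timestamp: new Date().toISOString(),
  };
}

const store = new Map(); // stand-in for an IndexedDB object store

function saveArtifact(artifact) {
  store.set(artifact.id, artifact);
  return artifact.id;
}
```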

Search & Indexing — Two-tier search approach. Lunr.js handles keyword and boolean search across artifact fields. Transformers.js (all-MiniLM-L6-v2 via ONNX) handles semantic similarity search when the environment permits the one-time model download.
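
The two-tier ranking could blend both signals roughly as below. This is a sketch under assumptions: the keyword score is taken as precomputed (as Lunr.js would produce), the vectors as precomputed embeddings (as Transformers.js would produce), and the 0.5/0.5 weights are an illustrative choice, not a tuned value.

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Blend a keyword relevance score with semantic similarity.
// When no embeddings are available (model download blocked),
// fall back to the keyword score alone.
function blendScores(keywordScore, queryVec, docVec, { wKeyword = 0.5, wSemantic = 0.5 } = {}) {
  if (!queryVec || !docVec) return keywordScore;
  return wKeyword * keywordScore + wSemantic * cosine(queryVec, docVec);
}
```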

Query Output Layer — Search results are rendered as ranked bullet points or structured tables. There is no text generation; output consists of retrieved, ranked artifacts presented in a readable format.
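
The output layer can be sketched as a plain formatter over ranked results. The result shape (`score` plus `artifact`) and the bullet format are assumptions for illustration; nothing is generated, only retrieved artifacts are rendered.

```javascript
// Illustrative renderer: sort ranked results and emit readable bullet lines.
// No text generation, just retrieved artifacts in a fixed format.
function renderBullets(results) {
  return results
    .slice()
    .sort((a, b) => b.score - a.score)
    .map(r => `• [${r.artifact.dataType}] ${r.artifact.title} (score ${r.score.toFixed(2)})`)
    .join("\n");
}
```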

Tech Stack

Layer              Library / Tool                        Purpose
UI Framework       Vanilla JS or lightweight React       Interface rendering, state management
Full-Text Search   Lunr.js / MiniSearch                  Keyword + boolean search across artifacts
Semantic Search    Transformers.js (HuggingFace ONNX)    Sentence embeddings for similarity search
Local Storage      IndexedDB (via idb wrapper)           Persistent artifact + vector storage
File Parsing       Papa Parse, native JSON, plain text   CSV, JSON, TXT ingestion
Fuzzy Matching     Fuse.js                               Typo-tolerant search fallback

Three Phases of Development

Phase 01

Data Capture — Alkembic Core

Reliable ingestion and structured storage of raw Scrum Master artifacts.

  • Paste interface
  • Title + Tags + Data Type form
  • TXT / JSON / CSV / TABLE support
  • Preview before save
  • IndexedDB persistence
  • Keyword search via Lunr.js
  • Export all as JSON
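
The "Export all as JSON" item above could be as simple as the sketch below. The envelope fields (`exportedAt`, `count`) are assumptions; in the browser the resulting string would be wrapped in a Blob and offered as a download, which is omitted here.

```javascript
// Illustrative export: serialize every stored artifact into one portable
// JSON document so data is never trapped in IndexedDB. The plain array
// stands in for a real IndexedDB read.
function exportAllAsJSON(artifacts) {
  return JSON.stringify(
    { exportedAt: new Date().toISOString(), count: artifacts.length, artifacts },
    null,
    2
  );
}
```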

Phase 02

Search & Retrieval — Query Layer

Make stored artifacts queryable in a useful way.

  • Field-boosted full-text search
  • Tag-based filtering
  • Date range filtering
  • Fuzzy search for typos
  • Bullet / table result views
  • Semantic search (optional)
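
The tag and date-range filters listed above might look like this sketch. Any-tag matching and an inclusive date range are assumptions about the filter semantics, and the artifact shape follows the storage section above.

```javascript
// Illustrative Phase 2 filters: match on any listed tag, then check the
// artifact timestamp against an inclusive from/to range.
function filterArtifacts(artifacts, { tags = [], from, to } = {}) {
  return artifacts.filter(a => {
    if (tags.length && !tags.some(t => a.tags.includes(t))) return false;
    const ts = new Date(a.timestamp).getTime();
    if (from && ts < new Date(from).getTime()) return false;
    if (to && ts > new Date(to).getTime()) return false;
    return true;
  });
}
```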

Phase 03

Insight Layer — Patterns & Trends

Surface patterns and summaries from accumulated artifacts.

  • Keyword frequency analysis
  • Sprint / week timeline view
  • Tag cloud + trend view
  • Rule-based summarization
  • Export filtered results (CSV / report)
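
The keyword frequency analysis above can be sketched as a simple count over tokenized artifact text. The tokenizer, the minimum word length, and the stop-word list are illustrative assumptions, not a fixed design.

```javascript
// Illustrative stop list; a real one would be longer.
const STOP = new Set(["the", "and", "for", "was", "with", "that"]);

// Count keyword occurrences across artifact titles and raw content,
// returning the topN [word, count] pairs by frequency.
function keywordFrequency(artifacts, topN = 10) {
  const counts = new Map();
  for (const a of artifacts) {
    const words = `${a.title} ${a.rawContent}`.toLowerCase().match(/[a-z]{3,}/g) || [];
    for (const w of words) {
      if (!STOP.has(w)) counts.set(w, (counts.get(w) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort((x, y) => y[1] - x[1])
    .slice(0, topN);
}
```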

How Data Moves Through Alkembic

Ingestion Flow

User Input
    │
    ▼
[ Paste / Upload / Type Raw Content ]
    │
    ▼
[ Add Context: Title → Data Type → Tags → Description ]
    │
    ▼
[ Preview Parsed Structure ]
    │
    ├── Needs edit ──▶ [ Back to Form ]
    │
    └── Looks good ──▶ [ Save Artifact ]
                            │
                            ▼
                   [ IndexedDB Storage ]
                            │
                ┌───────────┴────────────┐
                ▼                        ▼
       [ Lunr.js Index ]     [ Transformers.js Vector ]

Query Flow

User Query
    │
    ▼
[ Search Box — keyword or natural phrase ]
    │
    ▼
[ Lunr.js keyword match + optional vector similarity score ]
    │
    ▼
[ Ranked Artifact Results ]
    │
    ▼
[ Display: Bullet Points / Table / Expanded Card View ]

Design Decisions Under Constraints

Constraint: No external AI APIs in the client environment
Decision: All intelligence is local. Generative responses — summaries, rewrites — are out of scope. The tool provides structured retrieval, not generation. This is sufficient for the primary Scrum Master use case: finding what was said, not paraphrasing it.

Constraint: No backend server permitted
Decision: IndexedDB is the persistence layer. All artifacts live in the browser. An export-to-JSON feature ensures data is never trapped. If a Node or Electron wrapper is permitted in a future phase, SQLite can replace IndexedDB with minimal code change.

Constraint: Transformers.js model requires a one-time download (~23 MB)
Decision: Semantic search is treated as an optional enhancement. Phases 1 and 2 work entirely without it, using Lunr.js alone. If the environment permits the model fetch, the model is cached locally via the browser Cache API and never re-downloaded. If not, typo-tolerant keyword search via Fuse.js covers most retrieval needs adequately.
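
The degradation path in this decision can be made explicit in one small dispatcher. The capability flags and tier names below are assumptions for illustration; detecting whether the model was actually cached would use the browser Cache API and is not shown.

```javascript
// Illustrative tier selection: prefer blended semantic + keyword search
// when the embedding model is cached, otherwise fall back to typo-tolerant
// keyword search, then to plain keyword search.
function pickSearchTier({ modelCached = false, fuzzyAvailable = true } = {}) {
  if (modelCached) return "semantic+keyword";
  if (fuzzyAvailable) return "fuzzy-keyword";
  return "keyword-only";
}
```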

Constraint: No live access to Confluence, Slack, or SharePoint APIs
Decision: Alkembic is copy-paste and file-upload first. Users export content from those tools manually and paste it into Alkembic. This is intentional: it keeps the tool dependency-free and working in any environment. Automated connectors can be added in a future phase if API access is ever granted.