Pipeline Management UI
New Client Setup
Basics
Client Name
Folder: clients/<name>/
KB Backend
Elasticsearch
RAGFlow
Embedding Model
text-embedding-3-large
text-embedding-3-small
Domain
Primary Domain
Auto-detect from documents
🏋️ Recreation — classes, programs, memberships
🏕 Outdoor Recreation — parks, camping, trails, passes
🏥 Healthcare — medications, interactions, clinical
🛒 E-commerce — products, pricing, reviews
⚖️ Legal — regulations, compliance, requirements
🎓 Education — courses, curriculum, academic
💼 HR/Benefits — employee benefits, policies
📞 Customer Support — KB articles, troubleshooting
⚽ Sports & Fitness — leagues, teams, schedules
🌳 Parks & Recreation — municipal parks departments
✏️ Other — custom domain
Secondary Domains (optional)
Recreation
Outdoor Recreation
E-commerce
Sports & Fitness
Hold Ctrl/Cmd to select multiple
Entity Types
Relationships
Enrichment
Domain Settings
Elasticsearch Settings
ES KB Index Name
Created automatically when the pipeline runs with --kb
LLM Model
GPT-4o-mini (OpenAI)
Claude Haiku (Anthropic)
RAGFlow Settings
RAGFlow URL
RAGFlow API Key
Dataset Prefix
Chat Name
Knowledge Base Sources (optional)
Website URL(s) to crawl
Crawl Depth
1 — seed pages only
2 — follow links one level
3 — two levels deep
Follow Pattern
Only follow links matching this path
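The crawl depth and follow pattern options above can be illustrated with a minimal Python sketch. It is not the pipeline's actual crawler: the function name `plan_crawl` is hypothetical, page fetching is replaced by an in-memory link map, and "matching this path" is assumed to mean "the URL path starts with this prefix".

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(seeds, links, depth, follow_pattern=None):
    """Return the URLs a depth-limited crawl would visit, breadth-first.

    seeds: starting URLs (depth level 1).
    links: dict mapping a URL to the URLs found on that page
           (a stand-in for actually fetching and parsing pages).
    depth: 1 = seed pages only, 2 = follow links one level, 3 = two levels deep.
    follow_pattern: if set, only follow links whose URL path starts with
                    this prefix (an assumed interpretation of the option).
    """
    seeds = list(dict.fromkeys(seeds))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque((url, 1) for url in seeds)
    visited = []
    while queue:
        url, level = queue.popleft()
        visited.append(url)
        if level >= depth:
            continue  # depth limit reached: do not follow links from this page
        for target in links.get(url, []):
            if target in seen:
                continue
            if follow_pattern and not urlparse(target).path.startswith(follow_pattern):
                continue
            seen.add(target)
            queue.append((target, level + 1))
    return visited
```

Under these assumptions, a depth of 1 returns only the seeds, and each extra level adds the pages linked from the previous level, filtered by the follow pattern when one is given.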
— OR — Local docs folder
PDF, HTML, or text files to extract from
— OR — Pre-chunked KB file
JSON file with pre-chunked content + metadata
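The form does not specify the layout of a pre-chunked KB file. As a rough Python sketch, assuming a JSON array of objects with hypothetical `content` and `metadata` fields, a file could be checked before upload like this. The field names, the sample text, and the `load_chunks` helper are all illustrative assumptions, not the pipeline's documented schema.

```python
import json

# Made-up sample in an ASSUMED layout: a list of chunks, each with
# a "content" string and an optional "metadata" object.
SAMPLE = """
[
  {
    "content": "Day-use passes are sold at the park office.",
    "metadata": {"source": "https://example.org/passes", "title": "Passes"}
  }
]
"""


def load_chunks(text):
    """Parse pre-chunked JSON and check the assumed shape of each chunk."""
    chunks = json.loads(text)
    if not isinstance(chunks, list):
        raise ValueError("expected a JSON array of chunks")
    for i, chunk in enumerate(chunks):
        if not isinstance(chunk, dict):
            raise ValueError(f"chunk {i}: expected an object")
        content = chunk.get("content")
        if not isinstance(content, str) or not content.strip():
            raise ValueError(f"chunk {i}: missing or empty 'content'")
        if not isinstance(chunk.get("metadata", {}), dict):
            raise ValueError(f"chunk {i}: 'metadata' must be an object")
    return chunks
```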
Structured Data Scraper (optional)
Use this if the organization has a class registration system (CivicRec, Innosoft, etc.)
Registration URL
Platform Adapter
None — no scraper
activenet
amilia
civicrec
exploreminnesota
innosoft
perfectmind
recdesk
usedirect
vermont
yodelpass
Chat Proxy (advanced)
Enable
Scheduled Operations (optional)
Enable scheduled ops
Scrape interval (hours)
Test interval (hours)
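How the scheduler interprets the two intervals is not stated on the form. One plausible reading, sketched below in Python, is that an operation becomes due once the configured number of hours has passed since its last run. The helpers `next_run` and `is_due` are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta


def next_run(last_run, interval_hours):
    """Earliest time the operation should run again (assumed semantics)."""
    return last_run + timedelta(hours=interval_hours)


def is_due(last_run, interval_hours, now):
    """True if the operation has never run or its interval has elapsed."""
    if last_run is None:
        return True
    return now >= next_run(last_run, interval_hours)
```

For example, with a scrape interval of 6 hours and a last run at midnight, `is_due` would return False at 05:59 and True from 06:00 onward.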