Transforming raw institutional knowledge into accurate, scalable conversational and search experiences. I combine product ownership (citizen & civil servant chat platforms) with deep engineering across retrieval, performance, security, and infrastructure.
High-precision hybrid retrieval (graph + vector + structured filters) • Production chatbot ownership • Performance + security baked into lifecycle • Resilient active-active infrastructure.
| Area | Repo(s) | Summary |
|---|---|---|
| Graph & Retrieval | neo4j-document-pipeline | Graph ingestion + retrieval API for LLM workflows |
| Vector Benchmarks | tidb-vector-llm-testbed | Hybrid scoring, indexing & relevance experiments |
| Embedding Pipelines | mysql-to-pgvector-embeddings | MySQL → embeddings → pgVector semantic layer |
| FAQ Base | faq-retrieval-system | Structured query layer powering GPT-style retrieval |
| Performance | playwright-dayang • k6-for-custom-dify | Chatbot UX & API load test suites |
| Security Automation | zap-security-api | OWASP ZAP scan API (baseline/quick/full) |
| Experimentation | playwright-study • besu-ibft2.0 | Testing paradigms & consensus exploration |
Each project illustrates a stage of the lifecycle: ingestion → enrichment → retrieval → validation.
- Dayang chatbot (citizen portal): Product ownership, retrieval tuning, performance modeling.
- Civil servant assistant: Document-grounded Q&A with authoritative source controls.
- Load strategy: Concurrency thresholds, ramp profiles (k6, Locust).
- Active-active infra: Alibaba Cloud ECS, Nginx routing, SSL/TLS hardening.
- Security automation: OWASP ZAP scans exposed via API for CI/CD or ad hoc use.
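
As a rough illustration of the load-strategy item above, the sketch below builds a stepped ramp profile and derives a concurrency threshold from per-level latency results. It is a minimal sketch under stated assumptions, not the actual k6 or Locust configuration: `Stage`, `linear_ramp`, `concurrency_threshold`, and the p95-latency SLO criterion are all hypothetical names and choices introduced here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of a load ramp: hold `target_users` for `duration_s` seconds."""
    target_users: int
    duration_s: int


def linear_ramp(start_users: int, peak_users: int, steps: int, step_duration_s: int) -> list[Stage]:
    """Build evenly spaced stages from start_users up to peak_users (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    increment = (peak_users - start_users) / (steps - 1)
    return [Stage(round(start_users + i * increment), step_duration_s) for i in range(steps)]


def concurrency_threshold(p95_by_users: dict[int, float], slo_ms: float) -> int | None:
    """Return the highest tested user count for which it and every lower level met the SLO.

    Assumption: the threshold is defined by p95 latency against a fixed SLO.
    Returns None when even the lowest tested level breaches the SLO.
    """
    threshold = None
    for users in sorted(p95_by_users):
        if p95_by_users[users] > slo_ms:
            break  # first breach ends the search; later recoveries are ignored
        threshold = users
    return threshold
```

The stages map naturally onto a staged ramp in a load tool, and the threshold function is deliberately conservative: once one level breaches the SLO, higher levels are not counted even if they happen to pass.
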
| Strength | Why It Matters |
|---|---|
| Hybrid Retrieval Engineering | Precision lift vs pure vector recall |
| Product + Engineering Fusion | Faster iteration, fewer handoff losses |
| Embedded Performance & Security | Prevents late-stage surprises |
| Government Domain Exposure | Designs for trust & high factual accuracy |
| Benchmark-Driven Choices | Tech decisions backed by measurable outcomes |
| Area | Example Outcome |
|---|---|
| Retrieval Accuracy | ~30–40% fewer irrelevant answers (hybrid approach) |
| Performance Readiness | Concurrency limits defined pre-launch (no collapse) |
| Security Cycle Time | Hours → minutes via automated ZAP API |
| Resilience | Active-active reduces failover disruption window |
- Multi-pass hybrid ranking (graph traversal + semantic rerank)
- Domain-adaptive embedding strategies
- Unified Playwright + k6 harness (UX + load synergy)
- Retrieval explainability overlays & confidence shaping
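
To make the first item concrete, here is one way a two-pass hybrid ranking could look: a bounded graph traversal collects candidates, then a semantic pass reranks them. This is a minimal sketch, not the pipeline used in the repos above. The function names, the in-memory adjacency dict standing in for a graph store, the `1 / (1 + hops)` proximity term, and the `graph_weight` default are all assumptions made for illustration.

```python
from __future__ import annotations

import math
from collections import deque


def graph_candidates(graph: dict[str, list[str]], seeds: list[str], max_hops: int) -> dict[str, int]:
    """Pass 1: breadth-first traversal returning each reachable node and its hop distance."""
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops:
            continue  # do not expand beyond the hop budget
        for neighbor in graph.get(node, []):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_rank(
    query_vec: list[float],
    embeddings: dict[str, list[float]],
    hops: dict[str, int],
    graph_weight: float = 0.3,  # assumed blend factor, to be tuned per corpus
) -> list[tuple[str, float]]:
    """Pass 2: rerank graph candidates by blending semantic similarity with graph proximity."""
    scored = []
    for doc_id, distance in hops.items():
        if doc_id not in embeddings:
            continue  # skip candidates that have no embedding
        proximity = 1.0 / (1 + distance)
        similarity = cosine(query_vec, embeddings[doc_id])
        scored.append((doc_id, (1 - graph_weight) * similarity + graph_weight * proximity))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Restricting the semantic pass to graph-reachable candidates is what separates this from pure vector recall: a document that is semantically close but unrelated in the graph never enters the ranking.
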
Show Retrieval Flow Diagram
flowchart LR
A[Data Sources<br/>MySQL • Docs • FAQs] --> B[Normalize & Clean]
B --> C[Embeddings Generation]
B --> D["Graph Modeling (Neo4j)"]
C --> E[(Vector Store<br/>pgVector / TiDB)]
D --> F[Graph Relations]
E --> G[Hybrid Retrieval Layer]
F --> G
G --> H[LLM Orchestrator]
H --> I[Post-Processing & Ranking]
I --> J[Chatbot / API Consumers]
subgraph QUALITY_GATES[Quality Gates]
K[Performance Tests]
L["Security Scans (ZAP)"]
end
J --> K
J --> L
If Mermaid doesn't render, click "Raw" or use a Mermaid viewer.
Ideal matches: AI Infrastructure Engineer • Retrieval Engineer • Platform Engineer (LLM enablement) • Technical Product Owner (Knowledge Systems).
Traits brought: architecture clarity • benchmark discipline • product empathy • automation-first mindset.
- LinkedIn: https://www.linkedin.com/in/nurhajjariahk/
- Email: nurhajjariahk@gmail.com
Build systems that are observable, evolvable, and grounded in measurable user impact, not novelty.
Thanks for visiting. Let's build something meaningful.

