Building a Chatbot: Machine Learning

Hook: Apps in the LLM Era

These days a chatbot can be built in a few lines of code. But a *good* chatbot needs state, memory, tool use, and guardrails, which together make it a mini-system.

Define the Scope

Domain: general, customer support, tutor, or code helper?
Persona: how should it speak, and in which language?
Memory: session only, or persistent per user?
Tools: search, calculator, database query?
Safety: which topics should it refuse?

Architecture

high-level

User ↔ FastAPI ↔ Orchestrator
              │
   ┌──────────┼──────────┐
 System    Memory     Tool/RAG
 Prompt    Store      Layer
              │
            LLM API

System Prompt Pattern

system.py

SYSTEM = """You are Koro, a friendly Bangla-English tutor.
- Always reply in the user's language.
- Be concise (under 4 sentences).
- If unsure, say so and ask a clarifying question.
- Refuse personalized medical, legal, or financial advice.
"""

Memory: Short + Long

Short-term: a buffer of the last N turns.
Summary memory: automatically summarize long conversations.
Long-term: user facts stored in a vector DB (pgvector).
Retrieval: on each turn, pull the relevant facts and insert them into the context.
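
The short-term buffer and the per-turn retrieval step can be sketched roughly as follows. This is a minimal illustration, not a definitive design: the function names (`recent_turns`, `retrieve_facts`, `build_messages`) are hypothetical, and the word-overlap ranking is only a stand-in for the similarity search a vector DB such as pgvector would perform.

```python
import re

def recent_turns(history: list[dict], n: int) -> list[dict]:
    """Short-term memory: keep system messages plus the last n other messages."""
    system = [m for m in history if m["role"] == "system"]
    rest = [m for m in history if m["role"] != "system"]
    return system + (rest[-n:] if n > 0 else [])

def _words(text: str) -> set[str]:
    return set(re.findall(r"\w+", text.lower()))

def retrieve_facts(facts: list[str], query: str, k: int = 2) -> list[str]:
    """Long-term memory stand-in: rank stored facts by word overlap with the query.
    A real system would use embedding similarity instead."""
    q = _words(query)
    scored = [(len(q & _words(f)), f) for f in facts]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [f for _, f in scored[:k]]

def build_messages(history: list[dict], facts: list[str], user_text: str, n: int = 6) -> list[dict]:
    """Assemble one turn's prompt: trimmed history, retrieved facts, new user message."""
    messages = recent_turns(history, n)
    relevant = retrieve_facts(facts, user_text)
    if relevant:
        note = "Known facts about the user:\n" + "\n".join(f"- {f}" for f in relevant)
        messages.append({"role": "system", "content": note})
    messages.append({"role": "user", "content": user_text})
    return messages
```

Keeping the trimmed list separate from the stored history means the full conversation is still available later, for example for summarization.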

Code: Streaming Chat Endpoint

chat.py

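from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel
from openai import OpenAI

from system import SYSTEM  # system prompt defined in system.py

client = OpenAI()  # reads OPENAI_API_KEY from the environment
app = FastAPI()
# In-memory session store: fine for a demo, lost on restart.
SESSIONS: dict[str, list[dict]] = {}

class Msg(BaseModel):
    session_id: str
    text: str

@app.post("/chat")
def chat(m: Msg):
    # A new session starts with the system prompt as its first message.
    history = SESSIONS.setdefault(m.session_id, [{"role":"system","content":SYSTEM}])
    history.append({"role":"user","content":m.text})

    def stream():
        full = ""
        for chunk in client.chat.completions.create(
            model="gpt-4o-mini", messages=history, stream=True
        ):
            if not chunk.choices:  # defensive: some chunks carry no choices
                continue
            delta = chunk.choices[0].delta.content or ""
            full += delta
            yield delta
        # Save the complete reply once streaming has finished.
        history.append({"role":"assistant","content":full})

    return StreamingResponse(stream(), media_type="text/plain")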

Tool Use: Function Calling

tools.py

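# `client` and `history` are the objects from chat.py; `run_tool(name, arguments) -> str`
# is your own dispatcher and is not defined in this file.
tools = [{
  "type":"function",
  "function":{
    "name":"get_weather",
    "description":"Get weather for a city",
    "parameters":{"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}
  }
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini", messages=history, tools=tools, tool_choice="auto"
)
msg = resp.choices[0].message
if msg.tool_calls:
    # The assistant message that requested the tools must precede the tool results.
    history.append(msg)
    for call in msg.tool_calls:  # the model may request more than one call
        # call.function.arguments is a JSON string; run_tool is expected to parse it.
        result = run_tool(call.function.name, call.function.arguments)
        history.append({"role":"tool","tool_call_id":call.id,"content":result})
    # A second request lets the model turn the tool output into a reply.
    resp = client.chat.completions.create(
        model="gpt-4o-mini", messages=history, tools=tools
    )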
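
The snippet above calls `run_tool` without defining it. One possible shape for such a dispatcher is sketched below, assuming a simple name-to-function registry. `TOOL_REGISTRY` and the `get_weather` stub are hypothetical placeholders; a real tool would call an actual weather service.

```python
import json

def get_weather(city: str) -> dict:
    # Hypothetical stub: a real version would query a weather service.
    return {"city": city, "status": "not implemented"}

# Maps the tool names declared in `tools` to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}

def run_tool(name: str, arguments) -> str:
    """Run a registered tool and return its result as a JSON string.

    `arguments` may be the raw JSON string from the model or an already-parsed dict.
    Errors are returned as JSON so the model can see them and recover.
    """
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments) if isinstance(arguments, str) else dict(arguments)
        return json.dumps(func(**args))
    except (ValueError, TypeError) as exc:
        # ValueError covers malformed JSON; TypeError covers wrong or missing parameters.
        return json.dumps({"error": f"bad arguments: {exc}"})
```

Returning errors as content instead of raising keeps the conversation going: the tool message still answers the `tool_call_id`, and the model can apologize or retry.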

Guardrails & Safety

Input filter: check for PII and prompt injection.
Output filter: toxicity check, regex scan for secrets.
Refusal templates: soft, helpful redirects.
Rate limit and cost cap per user.
Log every conversation (with consent) for QA.
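
A few of these guardrails can be sketched with the standard library alone. The patterns below are illustrative assumptions, far from a complete filter, and `looks_like_injection`, `redact`, and `RateLimiter` are hypothetical names. Toxicity checks usually need a separate classifier and are not shown.

```python
import re
import time
from collections import defaultdict, deque

# Illustrative patterns only; a production filter needs a much broader set.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|any|the)? ?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
]
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),       # API-key-like token
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email address
]

def looks_like_injection(text: str) -> bool:
    """Input filter: flag text matching a known prompt-injection phrase."""
    return any(p.search(text) for p in INJECTION_PATTERNS)

def redact(text: str) -> str:
    """Output filter: replace secret-like or PII-like spans before sending or logging."""
    for p in SECRET_PATTERNS:
        text = p.sub("[REDACTED]", text)
    return text

class RateLimiter:
    """Sliding-window limit: at most max_requests per user within window_seconds."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window = window_seconds
        self._hits = defaultdict(deque)

    def allow(self, user_id: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

In the `/chat` endpoint, `allow` and `looks_like_injection` would run before the LLM call, and `redact` on the reply before it is logged.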

Evaluation

Offline: golden Q/A set, scored by exact match or LLM-as-judge.
Online: thumbs-up rate, conversation length, retention.
Red-teaming: test jailbreak prompts regularly.
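
The offline exact-match check can be as small as the sketch below. It assumes the bot is exposed as a plain `bot(question) -> str` callable, and `exact_match_score` is a hypothetical helper name. Exact match only suits short, closed-form answers; open-ended replies need LLM-as-judge or human review.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as failures."""
    return " ".join(text.lower().split())

def exact_match_score(bot, golden: list[tuple[str, str]]) -> tuple[float, list[str]]:
    """Run the bot over a golden Q/A set.

    Returns the fraction of exact matches and the list of failed questions.
    """
    if not golden:
        raise ValueError("golden set is empty")
    failed = []
    for question, expected in golden:
        if normalize(bot(question)) != normalize(expected):
            failed.append(question)
    return 1 - len(failed) / len(golden), failed
```

Running this in CI on every prompt or model change makes regressions visible before users see them.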

Summary

At a Glance

Chatbot = Prompt + Memory + Tool + Guardrail. A streaming UX is essential. If you skip evaluation, hallucinations grow unnoticed.