Genesis

What /comprendre does, how to prompt it, what goes in, what comes out.
Two modes: full with Claude Haiku · degraded regex-only

1. What /comprendre does

POST /comprendre takes raw French text — a customer complaint, question, request — and returns a full classification with routing, urgency, churn risk, suggested actions, and an explainable audit trail.

It is the casual endpoint. Minimal input, maximum output. The client just vents. The system figures out where to send it and how fast.

Pipeline

Raw text (French, any length)
    |
    v
Extraction (Claude Haiku or regex fallback)
    intention, entites, emotion, mots-cles
    |
    v
Snake SAT classification (4 models, <4ms total)
    routing_model      -> service cible
    urgency_model      -> urgence
sub_routing_model  -> sous-categorie
    churn_model        -> risque churn
    |
    v
JSON response
    routage + priorite + risque + actions + xai audit
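
The pipeline above can be sketched end-to-end in plain Python. Every function here is an illustrative stand-in; the real system calls Claude Haiku for extraction and four trained Snake SAT models for classification:

```python
# Illustrative stand-ins for the /comprendre pipeline stages.
def extract_keywords(text: str) -> dict:
    """Stage 1 stand-in: the real step is Claude Haiku, or regex in degraded mode."""
    return {"mots_cles": [w for w in text.lower().split() if len(w) > 4]}

def classify(extraction: dict) -> dict:
    """Stage 2 stand-in: the real step runs 4 Snake SAT models in <4ms."""
    urgent = any(k in extraction["mots_cles"] for k in ("bloque", "urgent"))
    return {"urgence": "haute" if urgent else "normale"}

def comprendre(text: str, factory_id: int = 3) -> dict:
    """Stage 3: assemble the JSON response from the intermediate steps."""
    extraction = extract_keywords(text)
    return {"factory_id": factory_id, "extraction": extraction, "routage": classify(extraction)}
```

The real response carries far more (priorite, churn, actions, xai), but the shape of the flow is the same: extract, classify, assemble.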

2. How to prompt it

Minimal request

{
  "text": "J'attends ma commande depuis 10 jours."
}

With factory

{
  "text": "J'attends ma commande CMD-2026-1847 de feuillete 44.2 depuis le 3 avril. Chantier bloque. Troisieme appel. Je passe chez le concurrent.",
  "factory_id": 3
}

That's it. Two fields. text is required. factory_id defaults to 3 (Monce).

What makes a good prompt

Signal           | Example in text                | What it triggers
Order reference  | CMD-2026-1847                  | Extracted as ref_commande, included in actions
Product name     | feuillete 44.2                 | Extracted, matched against catalog (future)
Date/delay       | depuis 10 jours, le 3 avril    | retard_jours, date_prevue, urgency signal
Emotion          | inadmissible, bloque, !!       | frustration score, urgency bump
Churn signal     | concurrent, je pars            | menace_churn=true, churn risk elevated
Escalade         | troisieme fois                 | escalade field, urgency bump
Domain keywords  | facture, livraison, technique  | Service routing signal

You don't need all of these. A single sentence works. The more signals, the sharper the classification.

3. Data structure in

{
  "text": string,       // Required. The raw customer message.
  "factory_id": int     // Optional. Default: 3 (Monce). Values: 1 (VIT), 3, 4 (VIP), 9 (Euro), 10 (TGVI).
}

Content-Type: application/json. That's the full schema.
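
Client-side, this schema is small enough to validate by hand. A sketch (the factory labels come from the schema comment above; the service performs its own validation):

```python
VALID_FACTORIES = {1, 3, 4, 9, 10}  # VIT, Monce (default), VIP, Euro, TGVI

def validate_request(payload: dict) -> dict:
    """Normalize a /comprendre request body, applying the factory_id default."""
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' is required and must be a non-empty string")
    factory_id = payload.get("factory_id", 3)  # default: Monce
    if factory_id not in VALID_FACTORIES:
        raise ValueError(f"unknown factory_id: {factory_id}")
    return {"text": text, "factory_id": factory_id}
```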

4. Data structure out

{
  "factory_id": 3,
  "version": "v0.1.0",
  "mode": "full" | "degraded",            // full = Haiku extraction, degraded = regex
  "haiku_enabled": true | false,           // Is ANTHROPIC_API_KEY configured?
  "extraction_method": "claude_haiku" | "regex_fallback",
  "mock": false,                           // true only if Snake models missing (fallback heuristics)
  "extraction": {                          // Raw extraction output (intermediate step)
    "intention_primaire": "reclamation_livraison",
    "intention_secondaire": "menace_depart" | null,
    "entites": {
      "ref_commande": "CMD-2026-1847" | null,
      "produit_mentionne": "feuillete 44.2" | null,
      "date_livraison_prevue": "2026-04-03" | null,
      "retard_jours": 7 | null,
      "lieu": null
    },
    "contexte_emotionnel": {
      "frustration": 0.92,                 // float 0.0-1.0
      "urgence_percue": "critique",        // critique/haute/normale/basse
      "menace_churn": true,
      "escalade": "troisieme contact" | null
    },
    "mots_cles": ["attends", "bloque", ...],
    "action_attendue": "reponse immediate"
  },
  "routage": {
    "service": {
      "Prediction": "Logistique",
      "Probability": {"Logistique": 0.985, "SAV": 0.015, ...},
      "method": "snake_sat",              // snake_sat | fallback
      "tier": 1
    },
    "sous_categorie": { ... },             // Same structure
    "urgence": { ... },
    "priorite": 1                          // 1 (P1 critical) to 4 (low)
  },
  "risque_client": {
    "churn": {
      "Prediction": "Risque eleve",
      "Probability": {"Risque eleve": 0.926, "Risque modere": 0.074, "Pas de risque": 0.0}
    },
    "facteurs": [
      "Menace explicite de depart",
      "Escalade : troisieme contact",
      "Retard 7 jours sur date ferme"
    ]
  },
  "refs_commande": { ... },               // Extracted order/product refs
  "actions_suggerees": [
    {"priorite": 1, "action": "Rappel immediat par responsable logistique", "delai": "< 2h"},
    {"priorite": 2, "action": "Verifier statut CMD-2026-1847 dans ERP", "delai": "immediat"}
  ],
  "quality_score": 0.90,                  // 0.90 with Haiku, 0.55 with regex
  "latency_ms": 680,
  "xai": {
    "routing_audit": "Route Logistique: intention 'reclamation_livraison', ...",
    "urgence_audit": "Critique: frustration 0.92, ...",
    "churn_audit": "Risque eleve: menace_churn=True, ..."
  }
}

The response is always valid JSON with the same structure, regardless of mode. What changes between modes is the quality, not the shape.
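
Because the shape is invariant, a client can assert the same top-level keys on every response. A minimal sketch (the key list is transcribed from the schema above; a few optional keys are omitted for brevity):

```python
# Top-level keys transcribed from the response schema above
# (mock, refs_commande and trust_score omitted for brevity).
REQUIRED_KEYS = {
    "factory_id", "version", "mode", "haiku_enabled", "extraction_method",
    "extraction", "routage", "risque_client", "actions_suggerees",
    "quality_score", "latency_ms", "xai",
}

def check_shape(response: dict) -> bool:
    """True when every invariant top-level key is present, whatever the mode."""
    return REQUIRED_KEYS <= response.keys()
```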

5. Two modes: full vs degraded

full — Haiku enabled

Extraction: Claude Haiku (claude-haiku-4-5-20251001) reads the message, understands context, extracts structured fields with semantic understanding.

Quality: 0.90

Latency: ~680ms (dominated by LLM call)

Strengths: understands nuance, sarcasm, implicit references, complex sentences. Extracts dates, product names, and emotional tone with high accuracy.

Weaknesses: ~680ms of latency dominated by the LLM call, a small per-request cost, and a dependency on ANTHROPIC_API_KEY being configured.

degraded — Regex fallback

Extraction: pattern matching on the raw text. Regex for CMD references, product names, dates, frustration keywords, churn signals.

Quality: 0.55

Latency: <5ms

Strengths: zero external dependency, instant, always works. Good enough for clear signals (CMD refs, explicit "concurrent", product names).

Weaknesses: misses nuance, can't parse complex sentences, doesn't understand implicit frustration or indirect churn signals.
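
A minimal sketch of what the regex path looks like. These patterns are illustrative, not the service's actual expressions:

```python
import re

# Illustrative patterns for the degraded path, not the service's actual expressions.
CMD_RE = re.compile(r"\bCMD-\d{4}-\d{3,5}\b")
CHURN_WORDS = ("concurrent", "je pars", "resilier")
FRUSTRATION_WORDS = ("inadmissible", "bloque", "scandaleux")

def regex_extraction(text: str) -> dict:
    """Degraded-mode extraction: explicit signals only, no semantics."""
    low = text.lower()
    m = CMD_RE.search(text)
    return {
        "ref_commande": m.group(0) if m else None,
        "menace_churn": any(w in low for w in CHURN_WORDS),
        "frustration": min(1.0, 0.4 * sum(w in low for w in FRUSTRATION_WORDS)),
    }
```

The weaknesses listed above fall straight out of this shape: a sarcastic complaint containing none of the listed keywords scores zero frustration.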

The mode and extraction_method fields in the response tell you which path was taken. The quality_score reflects extraction confidence — Snake classification quality is the same in both modes (same models, same AUROC).

error_fallback — Everything failed

If both Haiku AND regex fail (shouldn't happen — regex can't crash on valid text), the endpoint still returns a valid JSON payload with uniform probabilities and a manual-review action. Never a 500.
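
The uniform-probability fragment described above could be built like this (a sketch: the field names follow the response schema, but the helper and class list are hypothetical):

```python
def uniform_fallback(labels: list) -> dict:
    """Model-output fragment with uniform probabilities, per the response schema."""
    p = round(1.0 / len(labels), 4)
    return {"Prediction": labels[0], "Probability": {label: p for label in labels}, "method": "fallback"}
```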

6. Setting up the API key on EC2

From your .zshrc

Your .zshrc has the key as ANTHROPIC_API_KEY (currently commented out). To push it to the EC2:

# 1. Uncomment in .zshrc or export directly
export ANTHROPIC_API_KEY="your-key-here"

# 2. SSH in and set it in the systemd service
ssh -i ~/.ssh/vlm-extraction-key.pem ubuntu@13.36.166.132

# 3. Add the Environment line to the service file
sudo systemctl edit requestclassifier
# Add under [Service]:
#   Environment="ANTHROPIC_API_KEY=your-key-here"

# 4. Restart
sudo systemctl daemon-reload
sudo systemctl restart requestclassifier

Using AWS_BEARER_TOKEN from .zshrc

If you would rather reuse your AWS_BEARER_TOKEN_BEDROCK than a direct Anthropic key, note that the extraction module only reads ANTHROPIC_API_KEY, so whatever credential you use must end up in that variable. To bridge:

# Option A: Set ANTHROPIC_API_KEY directly in a systemd drop-in
sudo mkdir -p /etc/systemd/system/requestclassifier.service.d
sudo tee /etc/systemd/system/requestclassifier.service.d/override.conf <<EOF
[Service]
Environment="ANTHROPIC_API_KEY=your-key"
EOF

# Option B: Use a .env file
echo 'ANTHROPIC_API_KEY=your-key' | sudo tee /opt/requestclassifier/.env
# Then add to service: EnvironmentFile=/opt/requestclassifier/.env

One-liner deploy with key

# From your Mac, push the key from your current shell env:
ssh -i ~/.ssh/vlm-extraction-key.pem ubuntu@13.36.166.132 \
  "sudo mkdir -p /etc/systemd/system/requestclassifier.service.d && \
   echo '[Service]' | sudo tee /etc/systemd/system/requestclassifier.service.d/override.conf && \
   echo 'Environment="ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}"' | sudo tee -a /etc/systemd/system/requestclassifier.service.d/override.conf && \
   sudo systemctl daemon-reload && \
   sudo systemctl restart requestclassifier"

Verify

# Check which mode is active
curl -s https://requestclassifier.aws.monce.ai/comprendre \
  -H 'Content-Type: application/json' \
  -d '{"text": "test"}' | python3 -m json.tool | grep -E '"mode"|"haiku_enabled"|"extraction_method"'

# Expected with key:
#   "mode": "full",
#   "haiku_enabled": true,
#   "extraction_method": "claude_haiku"

# Expected without key:
#   "mode": "degraded",
#   "haiku_enabled": false,
#   "extraction_method": "regex_fallback"

7. curl examples

Basic

curl -s https://requestclassifier.aws.monce.ai/comprendre \
  -H 'Content-Type: application/json' \
  -d '{"text": "J attends ma commande depuis 10 jours"}'

Full example (README scenario)

curl -s https://requestclassifier.aws.monce.ai/comprendre \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Ca fait 10 jours que j attends ma commande CMD-2026-1847 de feuillete 44.2 qui devait etre livree le 3 avril. J ai un chantier qui est bloque a cause de vous, mes poseurs sont sur place et ne peuvent rien faire. C est la troisieme fois que j appelle. Si je n ai pas de reponse aujourd hui je passe chez le concurrent.",
    "factory_id": 3
  }' | python3 -m json.tool

Python

import httpx

r = httpx.post("https://requestclassifier.aws.monce.ai/comprendre", json={
    "text": "Ma facture F-2026-3421 ne correspond pas au bon de commande. Ecart de 2400 euros.",
    "factory_id": 3,
})
data = r.json()
print(data["routage"]["service"]["Prediction"])    # "Comptabilite"
print(data["routage"]["urgence"]["Prediction"])     # "Haute"
print(data["mode"])                                  # "full" or "degraded"

8. Trust score (0-100)

Every response includes a trust_score object that assesses how confident you should be in the result. It's a composite of four independent signals:

Component                 | Weight | What it measures                                                                                                                                    | Range
extraction_quality        | 25%    | How good is the extraction method? Haiku=90, regex=45, error=0.                                                                                     | 0-90
signal_density            | 25%    | How many extraction fields are populated? CMD ref, product, retard, frustration, escalade, menace, keywords, date, intention, secondary intention.  | 0-100
classification_confidence | 35%    | Average top-class probability across the 4 Snake models. High = models are decisive. Low = models are unsure.                                       | 0-100
model_agreement           | 15%    | Do urgence and churn tell the same story? Both high or both low = 100. Contradictory (one says fire, other says calm) = 40.                         | 40-100

The composite score drives an interpretation label:

Score  | Label    | Meaning
75-100 | high     | Strong signals, confident models, consistent story. Act on it.
50-74  | medium   | Decent but incomplete. Some signals missing or models less certain. Review before escalating.
25-49  | low      | Weak extraction, vague message, or conflicting signals. Treat as hint, not decision.
0-24   | very_low | Almost no usable signal. Manual review required.
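
The weights and thresholds above reduce to a weighted sum plus a label lookup. A sketch (the exact rounding behavior of the service is an assumption):

```python
# Component weights from the table above.
WEIGHTS = {
    "extraction_quality": 0.25,
    "signal_density": 0.25,
    "classification_confidence": 0.35,
    "model_agreement": 0.15,
}

def trust_score(components: dict) -> tuple:
    """Composite 0-100 score and its interpretation label."""
    score = round(sum(components[k] * w for k, w in WEIGHTS.items()))
    if score >= 75:
        label = "high"
    elif score >= 50:
        label = "medium"
    elif score >= 25:
        label = "low"
    else:
        label = "very_low"
    return score, label
```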

Example trust scores

{
  "trust_score": {
    "score": 78,
    "components": {
      "extraction_quality": 45,    // regex mode
      "signal_density": 90,        // 9/10 fields populated
      "classification_confidence": 84,  // models are decisive
      "model_agreement": 100       // urgence=Critique + churn=Risque eleve = aligned
    },
    "interpretation": "high"
  }
}

How to use it: A trust score of 78 means "the regex found plenty of signals, the models are confident, and everything points the same direction — this is a real P1 churn risk." A score of 52 means "vague message, few signals, classification is a best guess — maybe route it but don't auto-escalate."
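
That guidance can be mechanized as a simple gate. A sketch of client-side logic, not part of the API:

```python
def dispatch(result: dict) -> str:
    """Illustrative gating on the trust score: escalate, route, or hold for review."""
    score = result["trust_score"]["score"]
    urgence = result["routage"]["urgence"]["Prediction"]
    if score >= 75 and urgence == "Critique":
        return "auto_escalate_p1"
    if score >= 50:
        return "route_normally"
    return "manual_review"
```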

9. What the outcome will be

With Haiku enabled

Metric        | Value
Extraction    | Semantic: understands context, nuance, implicit signals
Quality score | 0.90
Latency       | ~680ms
Routing AUROC | 0.990
Urgency AUROC | 0.979
Churn AUROC   | 0.901
Cost          | ~$0.0003/request (Haiku pricing)

Without Haiku (regex only)

Metric        | Value
Extraction    | Pattern matching: catches explicit signals only
Quality score | 0.55
Latency       | <5ms
Routing AUROC | 0.990 (same models)
Urgency AUROC | 0.979 (same models)
Churn AUROC   | 0.901 (same models)
Cost          | $0.00/request

Key insight: Snake classification quality is identical in both modes — same models, same AUROC. The difference is in extraction quality. Regex gives Snake noisier features. The quality_score reflects this: 0.90 vs 0.55 is about extraction confidence, not classification accuracy.

In practice, degraded mode handles 70-80% of clear-cut cases correctly (explicit references, obvious domain keywords, clear frustration). It struggles with: ambiguous messages, indirect complaints, sarcasm, complex multi-topic messages, and implicit churn signals.

10. Endpoint summary

Endpoint         | Method | Auth | Description
/comprendre      | POST   | None | Casual classification: text in, full analysis out
/classify        | POST   | None | Structured classification: full schema with client_id, historique
/health          | GET    | None | Health check
/                | GET    | None | Landing page with live demo
/genesis         | GET    | None | This page: /comprendre documentation
/paper           | GET    | None | Technical paper: Dana Theorem, models, learning curve
/businesssummary | GET    | None | Non-technical pitch
/docs            | GET    | None | Auto-generated Swagger