Genesis

What /comprendre does, how to prompt it, what goes in, what comes out.
Two modes: full with Claude Haiku · degraded regex-only

1. What /comprendre does

POST /comprendre takes raw French text — a customer complaint, question, request — and returns a full classification with routing, urgency, churn risk, suggested actions, and an explainable audit trail.

It is the casual endpoint. Minimal input, maximum output. The client just vents. The system figures out where to send it and how fast.

Pipeline

Raw text (French, any length)
    |
    v
Extraction (Claude Haiku or regex fallback)
    intention, entites, emotion, mots-cles
    |
    v
Snake SAT classification (4 models, <4ms total)
    routing_model      -> service cible
    urgency_model      -> urgence
sub_routing_model  -> sous-categorie
    churn_model        -> risque churn
    |
    v
JSON response
    routage + priorite + risque + actions + xai audit
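
The pipeline above can be sketched end-to-end in plain Python. Every function here is an illustrative stand-in; the real system calls Claude Haiku for extraction and four trained Snake SAT models for classification:

```python
# Illustrative stand-ins for the /comprendre pipeline stages.
def extract_keywords(text: str) -> dict:
    """Stage 1 stand-in: the real step is Claude Haiku, or regex in degraded mode."""
    return {"mots_cles": [w for w in text.lower().split() if len(w) > 4]}

def classify(extraction: dict) -> dict:
    """Stage 2 stand-in: the real step runs 4 Snake SAT models in <4ms."""
    urgent = any(k in extraction["mots_cles"] for k in ("bloque", "urgent"))
    return {"urgence": "haute" if urgent else "normale"}

def comprendre(text: str, factory_id: int = 3) -> dict:
    """Stage 3: assemble the JSON response from the intermediate steps."""
    extraction = extract_keywords(text)
    return {"factory_id": factory_id, "extraction": extraction, "routage": classify(extraction)}
```

The real response carries far more (priorite, churn, actions, xai), but the shape of the flow is the same: extract, classify, assemble.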

2. How to prompt it

Minimal request

{
  "text": "J'attends ma commande depuis 10 jours."
}

With factory

{
  "text": "J'attends ma commande CMD-2026-1847 de feuillete 44.2 depuis le 3 avril. Chantier bloque. Troisieme appel. Je passe chez le concurrent.",
  "factory_id": 3
}

That's it. Two fields. text is required. factory_id defaults to 3 (Monce).

What makes a good prompt

Signal           | Example in text                | What it triggers
Order reference  | CMD-2026-1847                  | Extracted as ref_commande, included in actions
Product name     | feuillete 44.2                 | Extracted, matched against catalog (future)
Date/delay       | depuis 10 jours, le 3 avril    | retard_jours, date_prevue, urgency signal
Emotion          | inadmissible, bloque, !!       | frustration score, urgency bump
Churn signal     | concurrent, je pars            | menace_churn=true, churn risk elevated
Escalade         | troisieme fois                 | escalade field, urgency bump
Domain keywords  | facture, livraison, technique  | Service routing signal

You don't need all of these. A single sentence works. The more signals, the sharper the classification.

3. Data structure in

{
  "text": string,       // Required. The raw customer message.
  "factory_id": int     // Optional. Default: 3 (Monce). Values: 1 (VIT), 3, 4 (VIP), 9 (Euro), 10 (TGVI).
}

Content-Type: application/json. That's the full schema.
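
Client-side, this schema is small enough to validate by hand. A sketch (the factory labels come from the schema comment above; the service performs its own validation):

```python
VALID_FACTORIES = {1, 3, 4, 9, 10}  # VIT, Monce (default), VIP, Euro, TGVI

def validate_request(payload: dict) -> dict:
    """Normalize a /comprendre request body, applying the factory_id default."""
    text = payload.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' is required and must be a non-empty string")
    factory_id = payload.get("factory_id", 3)  # default: Monce
    if factory_id not in VALID_FACTORIES:
        raise ValueError(f"unknown factory_id: {factory_id}")
    return {"text": text, "factory_id": factory_id}
```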

4. Data structure out

{
  "factory_id": 3,
  "version": "v0.1.0",
  "mode": "full" | "degraded",            // full = Haiku extraction, degraded = regex
  "haiku_enabled": true | false,           // Is ANTHROPIC_API_KEY configured?
  "extraction_method": "claude_haiku" | "regex_fallback",
  "mock": false,                           // true only if Snake models missing (fallback heuristics)
  "extraction": {                          // Raw extraction output (intermediate step)
    "intention_primaire": "reclamation_livraison",
    "intention_secondaire": "menace_depart" | null,
    "entites": {
      "ref_commande": "CMD-2026-1847" | null,
      "produit_mentionne": "feuillete 44.2" | null,
      "date_livraison_prevue": "2026-04-03" | null,
      "retard_jours": 7 | null,
      "lieu": null
    },
    "contexte_emotionnel": {
      "frustration": 0.92,                 // float 0.0-1.0
      "urgence_percue": "critique",        // critique/haute/normale/basse
      "menace_churn": true,
      "escalade": "troisieme contact" | null
    },
    "mots_cles": ["attends", "bloque", ...],
    "action_attendue": "reponse immediate"
  },
  "routage": {
    "service": {
      "Prediction": "Logistique",
      "Probability": {"Logistique": 0.985, "SAV": 0.015, ...},
      "method": "snake_sat",              // snake_sat | fallback
      "tier": 1
    },
    "sous_categorie": { ... },             // Same structure
    "urgence": { ... },
    "priorite": 1                          // 1 (P1 critical) to 4 (low)
  },
  "risque_client": {
    "churn": {
      "Prediction": "Risque eleve",
      "Probability": {"Risque eleve": 0.926, "Risque modere": 0.074, "Pas de risque": 0.0}
    },
    "facteurs": [
      "Menace explicite de depart",
      "Escalade : troisieme contact",
      "Retard 7 jours sur date ferme"
    ]
  },
  "refs_commande": { ... },               // Extracted order/product refs
  "actions_suggerees": [
    {"priorite": 1, "action": "Rappel immediat par responsable logistique", "delai": "< 2h"},
    {"priorite": 2, "action": "Verifier statut CMD-2026-1847 dans ERP", "delai": "immediat"}
  ],
  "quality_score": 0.90,                  // 0.90 with Haiku, 0.55 with regex
  "latency_ms": 680,
  "xai": {
    "routing_audit": "Route Logistique: intention 'reclamation_livraison', ...",
    "urgence_audit": "Critique: frustration 0.92, ...",
    "churn_audit": "Risque eleve: menace_churn=True, ..."
  }
}

The response is always valid JSON with the same structure, regardless of mode. What changes between modes is the quality, not the shape.
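
Because the shape is invariant, a client can assert the same top-level keys on every response. A minimal sketch (the key list is transcribed from the schema above; a few optional keys are omitted for brevity):

```python
# Top-level keys transcribed from the response schema above
# (mock, refs_commande and trust_score omitted for brevity).
REQUIRED_KEYS = {
    "factory_id", "version", "mode", "haiku_enabled", "extraction_method",
    "extraction", "routage", "risque_client", "actions_suggerees",
    "quality_score", "latency_ms", "xai",
}

def check_shape(response: dict) -> bool:
    """True when every invariant top-level key is present, whatever the mode."""
    return REQUIRED_KEYS <= response.keys()
```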

5. Two modes: full vs degraded

full — Haiku enabled

Extraction: Claude Haiku (claude-haiku-4-5-20251001) reads the message, understands context, extracts structured fields with semantic understanding.

Quality: 0.90

Latency: ~680ms (dominated by LLM call)

Strengths: understands nuance, sarcasm, implicit references, complex sentences. Extracts dates, product names, and emotional tone with high accuracy.

Weaknesses: ~680ms of latency dominated by the LLM call, a small per-request cost, and a dependency on ANTHROPIC_API_KEY being configured.

degraded — Regex fallback

Extraction: pattern matching on the raw text. Regex for CMD references, product names, dates, frustration keywords, churn signals.

Quality: 0.55

Latency: <5ms

Strengths: zero external dependency, instant, always works. Good enough for clear signals (CMD refs, explicit "concurrent", product names).

Weaknesses: misses nuance, can't parse complex sentences, doesn't understand implicit frustration or indirect churn signals.
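
A minimal sketch of what the regex path looks like. These patterns are illustrative, not the service's actual expressions:

```python
import re

# Illustrative patterns for the degraded path, not the service's actual expressions.
CMD_RE = re.compile(r"\bCMD-\d{4}-\d{3,5}\b")
CHURN_WORDS = ("concurrent", "je pars", "resilier")
FRUSTRATION_WORDS = ("inadmissible", "bloque", "scandaleux")

def regex_extraction(text: str) -> dict:
    """Degraded-mode extraction: explicit signals only, no semantics."""
    low = text.lower()
    m = CMD_RE.search(text)
    return {
        "ref_commande": m.group(0) if m else None,
        "menace_churn": any(w in low for w in CHURN_WORDS),
        "frustration": min(1.0, 0.4 * sum(w in low for w in FRUSTRATION_WORDS)),
    }
```

The weaknesses listed above fall straight out of this shape: a sarcastic complaint containing none of the listed keywords scores zero frustration.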

The mode and extraction_method fields in the response tell you which path was taken. The quality_score reflects extraction confidence — Snake classification quality is the same in both modes (same models, same AUROC).

error_fallback — Everything failed

If both Haiku AND regex fail (shouldn't happen — regex can't crash on valid text), the endpoint still returns a valid JSON payload with uniform probabilities and a manual-review action. Never a 500.
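
The uniform-probability fragment described above could be built like this (a sketch: the field names follow the response schema, but the helper and class list are hypothetical):

```python
def uniform_fallback(labels: list) -> dict:
    """Model-output fragment with uniform probabilities, per the response schema."""
    p = round(1.0 / len(labels), 4)
    return {"Prediction": labels[0], "Probability": {label: p for label in labels}, "method": "fallback"}
```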

6. Setting up the API key on EC2

From your .zshrc

Your .zshrc has the key as ANTHROPIC_API_KEY (currently commented out). To push it to the EC2:

# 1. Uncomment in .zshrc or export directly
export ANTHROPIC_API_KEY="your-key-here"

# 2. SSH in and set it in the systemd service
ssh -i ~/.ssh/vlm-extraction-key.pem ubuntu@13.36.166.132

# 3. Add the Environment line to the service file
sudo systemctl edit requestclassifier
# Add under [Service]:
#   Environment="ANTHROPIC_API_KEY=your-key-here"

# 4. Restart
sudo systemctl daemon-reload
sudo systemctl restart requestclassifier

Using AWS_BEARER_TOKEN from .zshrc

If you would rather reuse your AWS_BEARER_TOKEN_BEDROCK than a direct Anthropic key, note that the extraction module only reads ANTHROPIC_API_KEY, so whatever credential you use must end up in that variable. To bridge:

# Option A: Set ANTHROPIC_API_KEY directly in a systemd drop-in
sudo mkdir -p /etc/systemd/system/requestclassifier.service.d
sudo tee /etc/systemd/system/requestclassifier.service.d/override.conf <<EOF
[Service]
Environment="ANTHROPIC_API_KEY=your-key"
EOF

# Option B: Use a .env file
echo 'ANTHROPIC_API_KEY=your-key' | sudo tee /opt/requestclassifier/.env
# Then add to service: EnvironmentFile=/opt/requestclassifier/.env

One-liner deploy with key

# From your Mac, push the key from your current shell env:
ssh -i ~/.ssh/vlm-extraction-key.pem ubuntu@13.36.166.132 \
  "sudo mkdir -p /etc/systemd/system/requestclassifier.service.d && \
   echo '[Service]' | sudo tee /etc/systemd/system/requestclassifier.service.d/override.conf && \
   echo 'Environment="ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}"' | sudo tee -a /etc/systemd/system/requestclassifier.service.d/override.conf && \
   sudo systemctl daemon-reload && \
   sudo systemctl restart requestclassifier"

Verify

# Check which mode is active
curl -s https://requestclassifier.aws.monce.ai/comprendre \
  -H 'Content-Type: application/json' \
  -d '{"text": "test"}' | python3 -m json.tool | grep -E '"mode"|"haiku_enabled"|"extraction_method"'

# Expected with key:
#   "mode": "full",
#   "haiku_enabled": true,
#   "extraction_method": "claude_haiku"

# Expected without key:
#   "mode": "degraded",
#   "haiku_enabled": false,
#   "extraction_method": "regex_fallback"

7. curl examples

Basic

curl -s https://requestclassifier.aws.monce.ai/comprendre \
  -H 'Content-Type: application/json' \
  -d '{"text": "J attends ma commande depuis 10 jours"}'

Full example (README scenario)

curl -s https://requestclassifier.aws.monce.ai/comprendre \
  -H 'Content-Type: application/json' \
  -d '{
    "text": "Ca fait 10 jours que j attends ma commande CMD-2026-1847 de feuillete 44.2 qui devait etre livree le 3 avril. J ai un chantier qui est bloque a cause de vous, mes poseurs sont sur place et ne peuvent rien faire. C est la troisieme fois que j appelle. Si je n ai pas de reponse aujourd hui je passe chez le concurrent.",
    "factory_id": 3
  }' | python3 -m json.tool

Python

import httpx

r = httpx.post("https://requestclassifier.aws.monce.ai/comprendre", json={
    "text": "Ma facture F-2026-3421 ne correspond pas au bon de commande. Ecart de 2400 euros.",
    "factory_id": 3,
})
data = r.json()
print(data["routage"]["service"]["Prediction"])    # "Comptabilite"
print(data["routage"]["urgence"]["Prediction"])     # "Haute"
print(data["mode"])                                  # "full" or "degraded"

8. Trust score (0-100)

Every response includes a trust_score object that assesses how confident you should be in the result. It's a composite of four independent signals:

Component                 | Weight | What it measures                                                                                                                                    | Range
extraction_quality        | 25%    | How good is the extraction method? Haiku=90, regex=45, error=0.                                                                                     | 0-90
signal_density            | 25%    | How many extraction fields are populated? CMD ref, product, retard, frustration, escalade, menace, keywords, date, intention, secondary intention.  | 0-100
classification_confidence | 35%    | Average top-class probability across the 4 Snake models. High = models are decisive. Low = models are unsure.                                       | 0-100
model_agreement           | 15%    | Do urgence and churn tell the same story? Both high or both low = 100. Contradictory (one says fire, other says calm) = 40.                         | 40-100

The composite score drives an interpretation label:

Score  | Label    | Meaning
75-100 | high     | Strong signals, confident models, consistent story. Act on it.
50-74  | medium   | Decent but incomplete. Some signals missing or models less certain. Review before escalating.
25-49  | low      | Weak extraction, vague message, or conflicting signals. Treat as hint, not decision.
0-24   | very_low | Almost no usable signal. Manual review required.
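
The weights and thresholds above reduce to a weighted sum plus a label lookup. A sketch (the exact rounding behavior of the service is an assumption):

```python
# Component weights from the table above.
WEIGHTS = {
    "extraction_quality": 0.25,
    "signal_density": 0.25,
    "classification_confidence": 0.35,
    "model_agreement": 0.15,
}

def trust_score(components: dict) -> tuple:
    """Composite 0-100 score and its interpretation label."""
    score = round(sum(components[k] * w for k, w in WEIGHTS.items()))
    if score >= 75:
        label = "high"
    elif score >= 50:
        label = "medium"
    elif score >= 25:
        label = "low"
    else:
        label = "very_low"
    return score, label
```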

Example trust scores

{
  "trust_score": {
    "score": 78,
    "components": {
      "extraction_quality": 45,    // regex mode
      "signal_density": 90,        // 9/10 fields populated
      "classification_confidence": 84,  // models are decisive
      "model_agreement": 100       // urgence=Critique + churn=Risque eleve = aligned
    },
    "interpretation": "high"
  }
}

How to use it: A trust score of 78 means "the regex found plenty of signals, the models are confident, and everything points the same direction — this is a real P1 churn risk." A score of 52 means "vague message, few signals, classification is a best guess — maybe route it but don't auto-escalate."
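
That guidance can be mechanized as a simple gate. A sketch of client-side logic, not part of the API:

```python
def dispatch(result: dict) -> str:
    """Illustrative gating on the trust score: escalate, route, or hold for review."""
    score = result["trust_score"]["score"]
    urgence = result["routage"]["urgence"]["Prediction"]
    if score >= 75 and urgence == "Critique":
        return "auto_escalate_p1"
    if score >= 50:
        return "route_normally"
    return "manual_review"
```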

9. What the outcome will be

With Haiku enabled

Metric        | Value
Extraction    | Semantic: understands context, nuance, implicit signals
Quality score | 0.90
Latency       | ~680ms
Routing AUROC | 0.990
Urgency AUROC | 0.979
Churn AUROC   | 0.901
Cost          | ~$0.0003/request (Haiku pricing)

Without Haiku (regex only)

Metric        | Value
Extraction    | Pattern matching: catches explicit signals only
Quality score | 0.55
Latency       | <5ms
Routing AUROC | 0.990 (same models)
Urgency AUROC | 0.979 (same models)
Churn AUROC   | 0.901 (same models)
Cost          | $0.00/request

Key insight: Snake classification quality is identical in both modes — same models, same AUROC. The difference is in extraction quality. Regex gives Snake noisier features. The quality_score reflects this: 0.90 vs 0.55 is about extraction confidence, not classification accuracy.

In practice, degraded mode handles 70-80% of clear-cut cases correctly (explicit references, obvious domain keywords, clear frustration). It struggles with: ambiguous messages, indirect complaints, sarcasm, complex multi-topic messages, and implicit churn signals.

10. Endpoint summary

Endpoint         | Method | Auth | Description
/comprendre      | POST   | None | Casual classification: text in, full analysis out
/classify        | POST   | None | Structured classification: full schema with client_id, historique
/health          | GET    | None | Health check
/                | GET    | None | Landing page with live demo
/genesis         | GET    | None | This page: /comprendre documentation
/paper           | GET    | None | Technical paper: Dana Theorem, models, learning curve
/businesssummary | GET    | None | Non-technical pitch
/docs            | GET    | None | Auto-generated Swagger