MCIP uses NestJS exception handling with graceful degradation built into the search and ingestion pipelines. The agentic LangGraph workflow handles failures at each stage, BullMQ retries failed ingestion jobs, and the system falls back to simpler search modes when external services are unavailable. Structured error codes are planned for multi-platform releases.
All MCIP HTTP errors return a consistent JSON structure, powered by NestJS's built-in exception filters:

```json
{
  "statusCode": 400,
  "message": "Error description here",
  "error": "Bad Request"
}
```

| Code | Meaning | When It Happens |
|---|---|---|
| 200 | Success | Request completed successfully |
| 400 | Bad Request | Invalid input, missing parameters, Zod validation failure |
| 401 | Unauthorized | Invalid or missing API key for admin endpoints |
| 404 | Not Found | Resource doesn't exist |
| 500 | Internal Server Error | Unexpected server-side error (OpenAI failure, Qdrant timeout, etc.) |
Missing search query:

```bash
curl "http://localhost:8080/search"
```

```json
{
  "statusCode": 400,
  "message": "Query parameter 'q' is required",
  "error": "Bad Request"
}
```

Invalid admin API key:

```bash
curl -X POST http://localhost:8080/admin/sync \
  -H "x-admin-api-key: wrong-key"
```

```json
{
  "statusCode": 401,
  "message": "Invalid Admin API Key",
  "error": "Unauthorized"
}
```

MCIP provides two search modes, each with its own error behavior. Understanding this is key to building robust integrations.
### Simple Search (GET /search)

The simple search path — embedding generation followed by Qdrant vector similarity — has a straightforward failure model:
| Failure Point | What Happens | Response |
|---|---|---|
| OpenAI embedding API down | Search cannot proceed | 500 Internal Server Error |
| Qdrant unreachable | Vector search fails | 500 Internal Server Error |
| No matching results | Empty items array returned | 200 with items: [] |
| Invalid query params | Zod validation rejects input | 400 Bad Request |
### Agentic Search (GET /hard-filtering/search)

The agentic search runs a 4-stage LangGraph workflow. Each stage can fail independently, and MCIP handles failures at each step:
**Stage 1 — Parallel Filter Extraction (GPT-4o-mini)**
Three LLM calls run in parallel to extract categories, brands, and price constraints. If any extraction call fails, that filter dimension is skipped — the search continues with whatever filters were successfully extracted.
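This skip-on-failure pattern can be sketched with `Promise.allSettled`; the extractor functions below are hypothetical stand-ins for the real LLM calls, not MCIP's actual implementation:

```javascript
// Hypothetical stand-ins for the three parallel LLM extraction calls.
const extractCategories = async (q) => ({ categories: ['laptops'] });
const extractBrands = async (q) => { throw new Error('LLM call failed'); };
const extractPrice = async (q) => ({ maxPrice: 1500 });

// Run all three in parallel; a rejected call drops its filter
// dimension instead of failing the whole search.
async function extractFilters(query) {
  const results = await Promise.allSettled([
    extractCategories(query),
    extractBrands(query),
    extractPrice(query),
  ]);
  return results
    .filter((r) => r.status === 'fulfilled')
    .reduce((acc, r) => ({ ...acc, ...r.value }), {});
}
```

In this sketch the brand extraction fails, so the search proceeds with only the category and price filters.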
**Stage 2 — Brand Validation (Qdrant Facet Search)**
Extracted brands are validated against the actual store catalog via Qdrant's getFacetValues("brand"). If the requested brand doesn't exist in the store, MCIP returns an empty result set immediately rather than wasting time on a search that can't succeed.
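A sketch of this short-circuit, assuming the catalog's brand facet values have already been fetched from Qdrant (the function name and return shape are illustrative, not MCIP's API):

```javascript
// catalogBrands: brand facet values fetched from Qdrant (assumed input).
// Returns the validated brand list, or signals an immediate empty
// result when the user asked only for brands this store doesn't carry.
function validateBrands(extractedBrands, catalogBrands) {
  const known = new Set(catalogBrands.map((b) => b.toLowerCase()));
  const valid = extractedBrands.filter((b) => known.has(b.toLowerCase()));
  if (extractedBrands.length > 0 && valid.length === 0) {
    return { shortCircuit: true, brands: [] };
  }
  return { shortCircuit: false, brands: valid };
}
```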
**Stage 3 — Hybrid Search (Embedding + Payload Filtering)**
If embedding generation fails at this stage, the search cannot proceed and returns a 500 error. If Qdrant's payload filtering encounters an issue with a specific filter field, the system may fall back to pure vector search without that filter.
**Stage 4 — LLM Verification (GPT-4o-mini)**
If the verification LLM call fails, MCIP returns the unverified results from Stage 3 rather than failing the entire request. You'll still get relevant products — they just won't have the extra semantic verification pass.
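The fallback amounts to a try/catch around the verification call. A minimal sketch, where `verifyWithLlm` is a hypothetical stand-in for the verification step:

```javascript
// Returns Stage 3 candidates unverified if the verification call fails,
// and records the degraded mode in filteringStatus.
async function verifyResults(candidates, verifyWithLlm) {
  try {
    const verified = await verifyWithLlm(candidates);
    return { items: verified, filteringStatus: 'AI_FILTERED' };
  } catch (err) {
    // Verification LLM failed; degrade gracefully instead of erroring.
    return { items: candidates, filteringStatus: 'FALLBACK' };
  }
}
```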
### The filteringStatus Indicator

The search response `meta.filteringStatus` tells you how the search was actually processed:

| Status | Meaning |
|---|---|
| AI_FILTERED | Full agentic workflow succeeded — filters extracted and applied |
| RAG_ONLY | Pure vector similarity search, no filter extraction applied |
| FALLBACK | Degraded mode — something failed in the pipeline, results may be less precise |
Always check this field to understand the quality of results you're receiving:

```javascript
const data = await response.json();

if (data.meta.filteringStatus === 'FALLBACK') {
  // Results are available but may be less precise
  console.warn('Search ran in degraded mode');
}
```

### Ingestion Error Handling

Product ingestion uses BullMQ with Redis for async job processing. This means ingestion errors don't surface as HTTP responses — they're handled within the queue. The exception is misconfiguration: triggering a sync without a source configured fails immediately:

```json
{
  "statusCode": 400,
  "message": "SOURCE_URL environment variable is not set",
  "error": "Bad Request"
}
```
Each product ingestion job is configured with automatic retries:

```javascript
// Job configuration
{
  name: "process-product",
  data: rawProduct,
  opts: {
    removeOnComplete: true,
    attempts: 3 // Retry up to 3 times
  }
}
```

| Stage | Failure | Recovery |
|---|---|---|
|---|---|---|
| Fetch from source | SOURCE_URL unreachable or returns error | Job fails, retries up to 3 times |
| Product mapping | Adapter throws (invalid data, missing fields) | Job fails, product skipped after retries |
| Zod validation | Mapped product doesn't match UnifiedProduct schema | Job fails, product skipped |
| Embedding generation | OpenAI API error or rate limit | Job retries with BullMQ backoff |
| Qdrant storage | Vector DB unreachable | Job retries, Qdrant auto-reconnects |
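The retry behavior in the table maps onto BullMQ job options. A sketch with an explicit backoff policy; the 1-second base delay is an assumption, not a documented MCIP value:

```javascript
// BullMQ job options: up to 3 attempts, with delays that grow
// exponentially from the base delay between retries.
const ingestionJobOpts = {
  removeOnComplete: true,
  attempts: 3,
  backoff: { type: 'exponential', delay: 1000 },
};
```

These options are passed as the third argument to BullMQ's `queue.add(name, data, opts)`.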
Check ingestion status through Docker logs:

```bash
# Watch ingestion processing
docker-compose logs -f mcip

# Look for mapper errors
docker-compose logs mcip | grep "ERROR"
```

The sync endpoint returns the number of products queued:

```json
{
  "status": "success",
  "message": "Queued 150 products from URL",
  "count": 150
}
```

A successful sync response means products were queued, not necessarily processed. Individual products may still fail during mapping or embedding.
MCIP is designed to return the best results it can, even when parts of the system are struggling. Think of it like a restaurant that still serves food when one burner is broken — you might not get the full menu, but you won't go hungry.
| Error Type | Handling Strategy | User Impact | Recovery |
|---|---|---|---|
| Embedding API failure | Falls back to simpler search or returns error | Degraded relevance or no results | Automatic when API recovers |
| Vector DB timeout | Returns cached or partial results if available | Possibly stale data | Automatic with retry |
| LLM filter extraction failure | Skips failed filter, continues with others | Some filters not applied | Automatic on next request |
| LLM verification failure | Returns unverified results from hybrid search | Results not semantically verified | Automatic on next request |
| Brand not in catalog | Returns empty results immediately | No results (by design) | N/A — correct behavior |
| Rate limiting (OpenAI) | Queue and retry with backoff | Delayed response | Automatic with exponential backoff |
MCIP retries the Qdrant connection up to 10 times on startup. If Qdrant is slow to start (common in Docker Compose), MCIP will wait:

```text
# If you see this in logs, it's normal — MCIP is waiting for Qdrant
[Nest] WARN - Qdrant connection attempt 3/10...
```

The Docker Compose health checks ensure Qdrant is ready before MCIP starts accepting requests.
A robust client checks both the HTTP status and the response body:

```javascript
try {
  const response = await fetch('http://localhost:8080/search?q=laptop');
  const data = await response.json();

  if (!response.ok) {
    console.error(`Error ${data.statusCode}: ${data.message}`);
    switch (data.statusCode) {
      case 400:
        // Invalid input — check your query parameters
        break;
      case 401:
        // Auth error — check your admin API key
        break;
      case 500:
        // Server error — retry with backoff
        break;
    }
    return;
  }

  // Check search quality
  if (data.meta.filteringStatus === 'FALLBACK') {
    console.warn('Results may be less precise (degraded mode)');
  }

  console.log(`Found ${data.meta.count} products`);
} catch (error) {
  // Network error — server unreachable
  console.error('Network error:', error.message);
}
```

For transient server errors, retry with exponential backoff; client errors (400, 401) should not be retried:

```javascript
async function searchWithRetry(query, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      const response = await fetch(
        `http://localhost:8080/search?q=${encodeURIComponent(query)}`
      );
      if (response.ok) return response.json();

      if (response.status >= 500) {
        // Server error — worth retrying
        const delay = Math.pow(2, i) * 1000;
        console.log(`Retry ${i + 1}/${maxRetries} in ${delay}ms...`);
        await new Promise(r => setTimeout(r, delay));
        continue;
      }

      // Client error (400, 401) — don't retry, fix the request
      const errorData = await response.json();
      throw new Error(`Client error ${response.status}: ${errorData.message}`);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
    }
  }
}
```

When using the agentic search endpoint, you may want to fall back to simple search if the LangGraph pipeline fails:
```javascript
async function smartSearch(query) {
  try {
    // Try agentic search first (best results)
    const response = await fetch(
      `http://localhost:8080/hard-filtering/search?q=${encodeURIComponent(query)}`
    );
    if (response.ok) return response.json();
  } catch (error) {
    console.warn('Agentic search failed, falling back to simple search');
  }

  // Fallback to simple vector search
  const response = await fetch(
    `http://localhost:8080/search?q=${encodeURIComponent(query)}`
  );
  return response.json();
}
```

Always verify server health before operations:

```bash
curl http://localhost:8080/health
```

Expected response:

```json
{"status":"ok"}
```

| Symptom | Possible Cause | Solution |
|---|---|---|
| 500 on all searches | OpenAI API key invalid or expired | Verify OPENAI_API_KEY environment variable |
| 500 on sync | Qdrant unreachable | Check QDRANT_URL and run docker-compose logs qdrant |
| 401 on admin endpoints | Wrong API key | Verify ADMIN_API_KEY matches the x-admin-api-key header |
| Empty search results | Products not synced yet | Run POST /admin/sync and wait for ingestion to complete |
| Slow agentic search | OpenAI rate limits | Check OpenAI dashboard for rate limit status, add delays if on Tier 1 |
| Sync returns count but no searchable products | Mapper errors during ingestion | Check docker-compose logs mcip for mapping/validation errors |
| MCIP won't start | Qdrant not ready yet | Wait for health checks — MCIP retries Qdrant connection 10 times |
| Queue jobs stuck | Redis connection lost | Run docker-compose restart redis |
Check each component individually:

```bash
# MCIP server
curl http://localhost:8080/health

# Qdrant vector database
curl http://localhost:6333/collections

# Redis (via docker)
docker-compose exec redis redis-cli ping
# Expected: PONG
```

When searching across multiple stores, return results from responsive stores even if some fail:

```json
{
  "items": [...],
  "meta": {
    "partial": true,
    "failedStores": ["shopify-store-1"],
    "successfulStores": ["vendure-main", "woocommerce-shop"]
  }
}
```

Automatic failure detection and recovery per store adapter: