AI API Errors: A Practical Debugging Guide for Developers

API failures in AI work differently. Here's how to debug them properly.

A 200 status code doesn't always mean your AI generation succeeded. A null content field isn't necessarily an error. And a prompt that worked perfectly yesterday might fail today — because a provider quietly updated their content policy.

This guide walks you through reading AI API errors, understanding what each failure mode actually means, and building error handling that tells you what broke — not just that something broke.

Note: Model names like gpt-5.4 and gpt-5.4-mini used here are CometAPI platform identifiers. They work through https://api.cometapi.com/v1 only — not directly through OpenAI or Anthropic APIs.

Why AI API Debugging Is Different

With a standard REST API, 200 means success and 4xx means you made a mistake. AI APIs introduce a third category: soft failures — responses that return 200 but contain nothing usable.

AI failures fall into three types:

| Failure Type | What Happens | Example |
| --- | --- | --- |
| Hard failure | HTTP error (4xx, 5xx). Request didn't complete. | 401 Unauthorized |
| Soft failure | HTTP 200, but `finish_reason` is `content_filter` or `length` | Blocked prompt |
| Silent failure | HTTP 200, everything looks fine, but the output is wrong | Wrong classification |

Most error handling only covers the first type. The second and third types are where production bugs hide.
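As a rough illustration of the three categories, the sketch below sorts a single call into one of them. The function name `classify_failure` and its `passed_validation` argument are hypothetical; they stand in for whatever status code, `finish_reason`, and application-level check your own code has at hand.

```python
from typing import Optional


def classify_failure(
    status_code: int,
    finish_reason: Optional[str],
    passed_validation: bool,
) -> str:
    """Return 'hard', 'soft', 'silent', or 'ok' for one API call."""
    if status_code >= 400:
        return "hard"    # HTTP error: the request did not complete
    if finish_reason in ("content_filter", "length"):
        return "soft"    # HTTP 200, but the output was blocked or truncated
    if not passed_validation:
        return "silent"  # HTTP 200 and a normal finish, but the output is wrong
    return "ok"
```

Logging this label next to each request makes it easier to see which category dominates in production.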

Understanding Error Responses

The text completions endpoint returns a consistent error structure:

{
  "error": {
    "message": "Human-readable description (includes request ID)",
    "type": "comet_api_error",
    "param": "the_problematic_parameter",
    "code": "error_code"
  }
}

What to log: Always log `message` and `param`. The `message` field tells you what went wrong. The `param` field tells you which parameter caused it.

Image & video endpoints return different error formats — always parse the raw response body.
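Because the body shape can vary by endpoint, it may help to parse errors defensively. The helper below is a minimal sketch, assuming the text-endpoint structure shown above and falling back to the raw body for anything else; `parse_error_body` is a hypothetical name, not part of any SDK.

```python
import json


def parse_error_body(raw: str) -> dict:
    """Pull message/param/code out of an error body, keeping the raw text."""
    parsed = {"message": None, "param": None, "code": None, "raw": raw}
    try:
        body = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return parsed  # Not JSON (for example an HTML gateway page)

    error = body.get("error") if isinstance(body, dict) else None
    if isinstance(error, dict):
        # The structure documented for the text completions endpoint
        parsed["message"] = error.get("message")
        parsed["param"] = error.get("param")
        parsed["code"] = error.get("code")
    elif isinstance(error, str):
        # Assumption: some endpoints may return the error as a plain string
        parsed["message"] = error
    return parsed
```

Keeping `raw` in the result means nothing is lost when an image or video endpoint answers in a format this sketch does not recognize.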

HTTP Status Codes: What They Mean

| Status | Meaning | Common Cause | Fix |
| --- | --- | --- | --- |
| 400 | Bad request | Missing model or wrong parameter | Check `error.param` |
| 401 | Unauthorized | Invalid or missing API key | Verify `Bearer <key>` format |
| 429 | Rate limited | Too many requests | Exponential backoff |
| 500 | Server error | Provider-side issue | Retry with backoff |
| 504 | Gateway timeout | Provider took too long | Retry or use faster model |

Rule of thumb: Retry on 429, 500, and 504. Don't retry on 400 or 401 — the same request will fail again.
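The rule of thumb fits in a one-line predicate. This sketch treats 429 and every 5xx status as retryable, which is slightly broader than the three codes listed in the table (an assumption, in line with the FAQ at the end of this guide); `is_retryable` is a hypothetical helper name.

```python
def is_retryable(status_code: int) -> bool:
    """True for rate limits (429) and server-side errors (5xx)."""
    return status_code == 429 or 500 <= status_code < 600
```

Centralizing the decision in one function keeps retry loops from drifting apart as more endpoints are added.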

The Most Overlooked Field: `finish_reason`

A 200 response with finish_reason: "content_filter" means your generation was blocked. The content field will be null or empty. If you don't check this, your app will silently return nothing.

| `finish_reason` | Meaning | Action |
| --- | --- | --- |
| `stop` | Normal completion | Success |
| `length` | Hit token limit | Increase `max_tokens` or shorten prompt |
| `content_filter` | Blocked by safety policy | Rephrase the prompt |
| `tool_calls` | Model called a tool | Handle the tool call (content will be `null`) |
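For the `length` case, one possible approach is to retry with a larger token budget until the output is no longer truncated. The sketch below is illustrative only: `complete_with_growing_budget` and the injected `call` function are hypothetical, and `call(max_tokens)` is assumed to return a dict with a `finish_reason` key.

```python
from typing import Callable


def complete_with_growing_budget(
    call: Callable[[int], dict],
    start: int = 256,
    cap: int = 2048,
) -> dict:
    """Call `call(max_tokens)`, doubling max_tokens while output is truncated."""
    max_tokens = start
    while True:
        result = call(max_tokens)
        # Stop on any finish_reason other than "length", or once the cap is hit
        if result["finish_reason"] != "length" or max_tokens >= cap:
            return result
        max_tokens = min(max_tokens * 2, cap)
```

The cap matters: without it, a prompt that always produces long output would keep doubling the budget and the cost.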

A Robust Text Completion Example (Python)

Here's a function that handles hard and soft failures. Silent failures need application-level validation, which is covered in the next section:

import os
import logging
from openai import OpenAI, APIStatusError, APIConnectionError

client = OpenAI(
    base_url="https://api.cometapi.com/v1",
    api_key=os.environ.get("COMETAPI_KEY"),
)

def safe_complete(messages, model="gpt-5.4-mini", **kwargs):
    try:
        response = client.chat.completions.create(
            model=model, messages=messages, **kwargs
        )
    except APIStatusError as e:
        # Hard failure: the server answered with a 4xx or 5xx status
        try:
            error_body = e.response.json().get("error", {})
        except ValueError:
            error_body = {}  # Body was not JSON
        logging.error(
            f"API error {e.status_code}: {error_body.get('message')} "
            f"(param={error_body.get('param')})"
        )
        raise
    except APIConnectionError as e:
        # Hard failure: the request never reached the server
        logging.error(f"Connection error: {e}")
        raise

    choice = response.choices[0]
    finish_reason = choice.finish_reason

    # Soft failure: HTTP 200, but the generation was blocked
    if finish_reason == "content_filter":
        raise ValueError(f"Generation blocked on model {model}. Rephrase prompt.")

    # Soft failure: HTTP 200, but the output was cut off
    if finish_reason == "length":
        logging.warning("Output truncated at token limit.")

    return {
        "content": choice.message.content or "",  # None when tool_calls is set
        "finish_reason": finish_reason,
        "tool_calls": choice.message.tool_calls,
    }

Key takeaway: Always check finish_reason. Don't assume 200 means success.

Detecting Silent Failures

Silent failures are the hardest to catch. The API returns 200, finish_reason is stop, but the output is semantically wrong. You can only catch these at the application level.

Example: Validation for classification tasks

import json
import logging

def validate_completion(result, task):
    content = result["content"].strip()

    # Empty output check (empty content is expected for tool calls)
    if not content and result["finish_reason"] != "tool_calls":
        raise ValueError(f"Empty output for task '{task}'")

    # Task-specific validation
    if task == "classify":
        valid_labels = {"positive", "negative", "neutral"}
        if content.lower() not in valid_labels:
            logging.warning(f"Unexpected output: '{content}'")
            # May need to re-prompt with stricter instructions

    if task == "json_extract":
        try:
            json.loads(content)
        except json.JSONDecodeError:
            raise ValueError("Expected JSON but got plain text")

    return content

Common causes of silent failures:

Ambiguous prompts
Model ignored format instructions
Input was too short or too long for the task
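When the model ignores format instructions, one option is to normalize the output and, if it still doesn't match, ask again with a stricter prompt. The sketch below assumes a hypothetical `ask(prompt)` callable that returns the model's text; the prompt wording and the helper names are illustrative, not a recommended template.

```python
from typing import Callable, Optional

VALID_LABELS = {"positive", "negative", "neutral"}


def normalize_label(text: str) -> Optional[str]:
    """Map raw model output to a known label, or None if it does not match."""
    cleaned = text.strip().strip(".!\"'").lower()
    return cleaned if cleaned in VALID_LABELS else None


def classify_with_reprompt(
    ask: Callable[[str], str],
    text: str,
    max_attempts: int = 2,
) -> Optional[str]:
    """Ask for a label; on an invalid answer, retry with stricter wording."""
    prompt = f"Classify the sentiment of this text: {text}"
    for _ in range(max_attempts):
        label = normalize_label(ask(prompt))
        if label is not None:
            return label
        # Stricter instructions for the next attempt
        prompt = (
            "Answer with exactly one word: positive, negative, or neutral.\n"
            f"Text: {text}"
        )
    return None  # Caller decides what to do with an unclassifiable input
```

Returning `None` instead of raising leaves room for a fallback, such as routing the input to manual review.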

Exponential Backoff for Rate Limits

Rate limit errors (429) are temporary. Use exponential backoff with jitter:

import time
import random

def complete_with_retry(messages, model="gpt-5.4-mini", max_retries=3):
    last_error = None
    for attempt in range(max_retries):
        try:
            return safe_complete(messages, model=model)
        except APIStatusError as e:
            # RateLimitError (429) is a subclass of APIStatusError, so this
            # single handler covers both rate limits and server errors
            if e.status_code != 429 and e.status_code < 500:
                raise  # Don't retry other 4xx errors
            last_error = e

        if attempt < max_retries - 1:
            wait = (2 ** attempt) + random.random()
            logging.warning(f"Retry in {wait:.1f}s")
            time.sleep(wait)

    raise RuntimeError(f"Failed after {max_retries} attempts") from last_error

Why jitter matters: Random delay prevents multiple clients from retrying in sync (thundering herd problem).
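With more retries, the uncapped delay grows quickly (32 seconds by the sixth attempt), so a cap is often added. The sketch below computes a capped exponential delay with additive jitter; `backoff_delay`, the 30-second cap, and the injectable `rng` are assumptions made for illustration and testing.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    base: float = 1.0,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Delay in seconds for a zero-based attempt: capped 2^n plus jitter."""
    source = rng if rng is not None else random
    exponential = min(cap, base * (2 ** attempt))
    # Jitter of up to `base` seconds spreads out clients that failed together
    return exponential + source.uniform(0, base)
```

Passing a seeded `random.Random` makes the schedule reproducible in tests while production code keeps the default randomness.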

Image Generation Errors

Image generation has its own failure patterns:

| Symptom | Cause | Fix |
| --- | --- | --- |
| Empty `data` array | Prompt filtered | Check `revised_prompt`; rephrase |
| `response_format` error | Wrong parameter for GPT Image 2 | Use `output_format` instead |
| `n > 1` error | Qwen Image doesn't support multiple images | Loop single requests |
| URL returns 403 later | URL expired | Download immediately |

Simplified image generation check:

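import os
import logging
import requests

def generate_image_safe(prompt, model="dall-e-3"):
    api_key = os.environ.get("COMETAPI_KEY")
    response = requests.post(
        "https://api.cometapi.com/v1/images/generations",
        json={"model": model, "prompt": prompt},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=120,
    )

    # Hard failure: error formats differ here, so log the raw body
    if not response.ok:
        logging.error(f"Image API error {response.status_code}: {response.text}")
        response.raise_for_status()

    data = response.json().get("data", [])
    if not data:
        return {"blocked": True}  # Empty data: content filter likely triggered

    return {"url": data[0].get("url"), "blocked": False}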
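For models that reject `n > 1`, the table's "loop single requests" fix can be sketched as a small wrapper. Here `generate_one` is a hypothetical callable that takes a prompt and returns a dict in the same shape as the function above (`{"blocked": True}` or `{"url": ..., "blocked": False}`).

```python
from typing import Callable, Dict, List, Union


def generate_many(
    generate_one: Callable[[str], dict],
    prompt: str,
    n: int,
) -> Dict[str, Union[List[str], int]]:
    """Request n images one at a time and separate results from blocks."""
    urls: List[str] = []
    blocked = 0
    for _ in range(n):
        result = generate_one(prompt)
        if result.get("blocked"):
            blocked += 1  # Empty data array on this request
        else:
            urls.append(result["url"])
    return {"urls": urls, "blocked": blocked}
```

Since generated URLs can expire, the caller would normally download each entry in `urls` right after this returns.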

Video Generation Errors

Video generation is asynchronous. Key patterns to watch:

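| Symptom | Cause | Fix |
| --- | --- | --- |
| Stuck in `queued` 10+ min | Server load | Try a different model |
| `failed` with no detail | Prompt filtered | Rephrase prompt |
| URL returns 403 | URL expired | Download immediately |
| `task_not_exist` on first poll | Task still initializing | Wait 5s and retry |
| Kling returns `"succeed"` | Non-standard status | Handle both `"succeed"` and `"succeeded"` |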
Kling returns `"succeed"`	Non-standard status	Handle both `"succeed"` and `"succeeded"`

Minimal polling pattern:

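import os
import time
import requests

def poll_video(task_id, max_wait=600):
    headers = {"Authorization": f"Bearer {os.environ.get('COMETAPI_KEY')}"}
    elapsed = 0
    while elapsed < max_wait:
        result = requests.get(
            f"https://api.cometapi.com/v1/videos/{task_id}",
            headers=headers,
            timeout=30,
        ).json()
        status = result.get("status")

        # Kling reports "succeed"; other models report "succeeded"
        if status in ("succeed", "succeeded"):
            return result["output"][0]
        if status in ("failed", "cancelled"):
            raise RuntimeError(f"Video failed: {result.get('error')}")

        # Any other status (queued, still initializing): keep polling
        time.sleep(10)
        elapsed += 10

    raise TimeoutError("Video generation timed out")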
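Since providers don't agree on status strings, it can help to normalize them before branching. The sketch below maps raw values to three states; `normalize_status` is a hypothetical helper, and treating every unrecognized value as still pending is an assumption that also covers a task that has not finished initializing.

```python
from typing import Optional

SUCCESS_STATUSES = {"succeed", "succeeded"}  # Kling uses "succeed"
FAILURE_STATUSES = {"failed", "cancelled"}


def normalize_status(raw_status: Optional[str]) -> str:
    """Collapse provider-specific statuses into succeeded/failed/pending."""
    status = (raw_status or "").strip().lower()
    if status in SUCCESS_STATUSES:
        return "succeeded"
    if status in FAILURE_STATUSES:
        return "failed"
    return "pending"  # queued, in progress, missing, or unknown
```

A polling loop can then branch on three values instead of tracking each provider's spelling.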

Debugging Checklist

For text generation:

API key is correctly formatted (Bearer <key>)
finish_reason is stop (not content_filter or length)
content is not null (or null is expected due to tool_calls)
Error is 4xx (fix the request; 429 is the exception and should be retried with backoff) or 5xx (retry)
Output passes application-layer validation (no silent failure)

For image generation:

data array is not empty (content filter not triggered)
Correct parameters used (output_format for GPT Image 2, not response_format)
Downloaded image before URL expired

For video generation:

Task progresses beyond queued within reasonable time
Error field checked in failed task response
Video downloaded before URL expired
Handles both "succeed" (Kling) and "succeeded" (others)

FAQ

Q: My request returns 200 but no content. What happened? Check finish_reason. content_filter means the generation was blocked. tool_calls means the model wants to call a tool (content is null by design). If finish_reason is stop but content is still empty, that's a silent failure — log the full response and check your prompt.

Q: How do I know if my prompt was filtered? Text: `finish_reason == "content_filter"`. Images: the `data` array is empty. Video: the task reaches `failed` status quickly with no error detail. Fix: Rephrase the prompt to be more neutral.

Q: When should I retry a failed request? Retry on 429 and 5xx with exponential backoff. Don't retry other 4xx errors such as 400 or 401: a bad request won't fix itself.

Q: What's exponential backoff? Instead of retrying immediately, wait progressively longer: 1s, 2s, 4s. Add random jitter to prevent multiple clients from retrying in sync. This is standard practice for any rate-limited API.

Q: How do I catch silent failures? Silent failures require application-layer validation. The API won't tell you the output is semantically wrong. Check that the output matches the expected format (valid JSON, expected label, minimum length). Log the full output when validation fails.

📚 Source: Viblo
