Đang tải...

AI API Errors: A Practical Debugging Guide for Developers

11/06/2026
7 phút đọc
AI API Errors: A Practical Debugging Guide for Developers
**API failures in AI work differently. Here's how to debug them properly.** A `200` status code doesn't always mean your AI generation succeeded. A `null` content field isn't necessarily an error. And...

API failures in AI work differently. Here's how to debug them properly.

A 200 status code doesn't always mean your AI generation succeeded. A null content field isn't necessarily an error. And a prompt that worked perfectly yesterday might fail today — because a provider quietly updated their content policy.

This guide walks you through reading AI API errors, understanding what each failure mode actually means, and building error handling that tells you what broke — not just that something broke.

Note: Model names like gpt-5.4 and gpt-5.4-mini used here are CometAPI platform identifiers. They work through https://api.cometapi.com/v1 only — not directly through OpenAI or Anthropic APIs.


Why AI API Debugging Is Different

With a standard REST API, 200 means success and 4xx means you made a mistake. AI APIs introduce a third category: soft failures — responses that return 200 but contain nothing usable.

AI failures fall into three types:

Failure Type What Happens Example
Hard failure HTTP error (4xx, 5xx). Request didn't complete. 401 Unauthorized
Soft failure HTTP 200, but finish_reason is content_filter or length Blocked prompt
Silent failure HTTP 200, everything looks fine — but output is wrong Wrong classification

Most error handling only covers the first type. The second and third types are where production bugs hide.


Understanding Error Responses

The text completions endpoint returns a consistent error structure:

{
 "error": {
 "message": "Human-readable description (includes request ID)",
 "type": "comet_api_error",
 "param": "the_problematic_parameter",
 "code": "error_code"
 }
}

What to log: Always log message and param. The message tells you what went wrong. The param tells you which parameter caused it.

Image & video endpoints return different error formats — always parse the raw response body.


HTTP Status Codes: What They Mean

Status Meaning Common Cause Fix
400 Bad request Missing model or wrong parameter Check error.param
401 Unauthorized Invalid or missing API key Verify Bearer <key> format
429 Rate limited Too many requests Exponential backoff
500 Server error Provider-side issue Retry with backoff
504 Gateway timeout Provider took too long Retry or use faster model

Rule of thumb: Retry on 429, 500, and 504. Don't retry on 400 or 401 — the same request will fail again.


The Most Overlooked Field: finish_reason

A 200 response with finish_reason: "content_filter" means your generation was blocked. The content field will be null or empty. If you don't check this, your app will silently return nothing.

finish_reason Meaning Action
stop Normal completion Success
length Hit token limit Increase max_tokens or shorten prompt
content_filter Blocked by safety policy Rephrase the prompt
tool_calls Model called a tool Handle the tool call (content will be null)

A Robust Text Completion Example (Python)

Here's a production-ready function that handles all three failure types:

import os
import logging
from openai import OpenAI, APIStatusError, APIConnectionError

client = OpenAI(
 base_url="https://api.cometapi.com/v1",
 api_key=os.environ.get("COMETAPI_KEY"),
)

def safe_complete(messages, model="gpt-5.4-mini", **kwargs):
 try:
 response = client.chat.completions.create(
 model=model, messages=messages, **kwargs
 )
 except APIStatusError as e:
 error_body = e.response.json().get("error", {})
 logging.error(f"API error {e.status_code}: {error_body.get('message')}")
 raise

 choice = response.choices[0]
 finish_reason = choice.finish_reason

 if finish_reason == "content_filter":
 raise ValueError(f"Generation blocked on model {model}. Rephrase prompt.")

 if finish_reason == "length":
 logging.warning("Output truncated at token limit.")

 return {
 "content": choice.message.content or "",
 "finish_reason": finish_reason,
 "tool_calls": choice.message.tool_calls,
 }

Key takeaway: Always check finish_reason. Don't assume 200 means success.


Detecting Silent Failures

Silent failures are the hardest to catch. The API returns 200, finish_reason is stop, but the output is semantically wrong. You can only catch these at the application level.

Example: Validation for classification tasks

def validate_completion(result, task):
 content = result["content"].strip()

 # Empty output check
 if not content and result["finish_reason"] != "tool_calls":
 raise ValueError(f"Empty output for task '{task}'")

 # Task-specific validation
 if task == "classify":
 valid_labels = {"positive", "negative", "neutral"}
 if content.lower() not in valid_labels:
 logging.warning(f"Unexpected output: '{content}'")
 # May need to re-prompt with stricter instructions

 if task == "json_extract":
 import json
 try:
 json.loads(content)
 except json.JSONDecodeError:
 raise ValueError("Expected JSON but got plain text")

 return content

Common causes of silent failures:

  • Ambiguous prompts
  • Model ignored format instructions
  • Input was too short or too long for the task

Exponential Backoff for Rate Limits

Rate limit errors (429) are temporary. Use exponential backoff with jitter:

import time
import random

def complete_with_retry(messages, model="gpt-5.4-mini", max_retries=3):
 for attempt in range(max_retries):
 try:
 return safe_complete(messages, model=model)
 except APIStatusError as e:
 if e.status_code < 500:
 raise # Don't retry 4xx errors
 except RateLimitError:
 pass # Retry

 if attempt < max_retries - 1:
 wait = (2 ** attempt) + random.random()
 logging.warning(f"Retry in {wait:.1f}s")
 time.sleep(wait)

 raise RuntimeError(f"Failed after {max_retries} attempts")

Why jitter matters: Random delay prevents multiple clients from retrying in sync (thundering herd problem).


Image Generation Errors

Image generation has its own failure patterns:

Symptom Cause Fix
Empty data array Prompt filtered Check revised_prompt; rephrase
response_format error Wrong parameter for GPT Image 2 Use output_format instead
n > 1 error Qwen Image doesn't support multiple images Loop single requests
URL returns 403 later URL expired Download immediately

Simplified image generation check:

def generate_image_safe(prompt, model="dall-e-3"):
 response = requests.post(
 "https://api.cometapi.com/v1/images/generations",
 json={"model": model, "prompt": prompt},
 headers={"Authorization": f"Bearer {api_key}"}
 )

 data = response.json().get("data", [])
 if not data:
 return {"blocked": True} # Content filter triggered

 return {"url": data[0].get("url"), "blocked": False}

Video Generation Errors

Video generation is asynchronous. Key patterns to watch:

Symptom Cause Fix
Stuck in queued 10+ min Server load Try a different model
failed with no detail Prompt filtered Rephrase prompt
URL returns 403 URL expired Download immediately
task_not_exist on first poll Task still initializing Wait 5s and retry
Kling returns "succeed" Non-standard status Handle both "succeed" and "succeeded"

Minimal polling pattern:

def poll_video(task_id, max_wait=600):
 elapsed = 0
 while elapsed < max_wait:
 result = requests.get(f"https://api.cometapi.com/v1/videos/{task_id}").json()
 status = result.get("status")

 if status == "succeeded":
 return result["output"][0]
 if status in ("failed", "cancelled"):
 raise RuntimeError(f"Video failed: {result.get('error')}")

 time.sleep(10)
 elapsed += 10

 raise TimeoutError("Video generation timed out")

Debugging Checklist

For text generation:

  • API key is correctly formatted (Bearer <key>)
  • finish_reason is stop (not content_filter or length)
  • content is not null (or null is expected due to tool_calls)
  • Error is 4xx (fix request) or 5xx (retry)
  • Output passes application-layer validation (no silent failure)

For image generation:

  • data array is not empty (content filter not triggered)
  • Correct parameters used (output_format for GPT Image 2, not response_format)
  • Downloaded image before URL expired

For video generation:

  • Task progresses beyond queued within reasonable time
  • Error field checked in failed task response
  • Video downloaded before URL expired
  • Handles both "succeed" (Kling) and "succeeded" (others)

FAQ

Q: My request returns 200 but no content. What happened? Check finish_reason. content_filter means the generation was blocked. tool_calls means the model wants to call a tool (content is null by design). If finish_reason is stop but content is still empty, that's a silent failure — log the full response and check your prompt.

Q: How do I know if my prompt was filtered? Text: finish_reason === "content_filter". Images: data array is empty. Video: Task reaches failed status quickly with no error detail. Fix: Rephrase the prompt to be more neutral.

Q: When should I retry a failed request? Retry on 429 and 5xx with exponential backoff. Don't retry on 4xx — a bad request won't fix itself.

Q: What's exponential backoff? Instead of retrying immediately, wait progressively longer: 1s, 2s, 4s. Add random jitter to prevent multiple clients from retrying in sync. This is standard practice for any rate-limited API.

Q: How do I catch silent failures? Silent failures require application-layer validation. The API won't tell you the output is semantically wrong. Check that the output matches the expected format (valid JSON, expected label, minimum length). Log the full output when validation fails.

📚 Nguồn: Viblo

Bình luận

0 bình luận

Email không hiển thị công khai.

Chưa có bình luận nào. Hãy là người đầu tiên bình luận.

Chia sẻ bài viết

Cần tư vấn?

Liên hệ với chúng tôi để được hỗ trợ

Liên hệ ngay

Bài viết liên quan

Session 1 - Securing Accounts: Bảo vệ tài khoản trong thế giới số
19/06/2026

Session 1 - Securing Accounts: Bảo vệ tài khoản trong thế giới số

## Mục tiêu của Session Session đầu tiên tập trung vào một chủ đề rất quen thuộc nhưng cũng là mục tiêu tấn công phổ biến nhất hiện nay: **tài khoản người...

Đọc thêm
Tổng hợp kênh hỗ trợ FPT dành cho khách hàng cá nhân
19/06/2026

Tổng hợp kênh hỗ trợ FPT dành cho khách hàng cá nhân

Khi sử dụng Internet, truyền hình hoặc camera, khách hàng cá nhân đôi khi cần hỗ trợ về lắp đặt, báo lỗi, thanh toán, hợp đồng hoặc nâng cấp dịch vụ. Thay...

Đọc thêm
Lắp mạng cho sinh viên: Cách chọn gói rẻ nhưng vẫn khỏe
19/06/2026

Lắp mạng cho sinh viên: Cách chọn gói rẻ nhưng vẫn khỏe

Với sinh viên, Internet không chỉ để giải trí. Một đường truyền ổn định giúp học online, nộp bài, gọi video nhóm, xem tài liệu, làm thêm từ xa và thư giãn s...

Đọc thêm

Bắt đầu dự án của bạn

Hãy để Flash Dev đồng hành cùng bạn

Liên hệ ngay