A common practice with Chat Completions is to instruct the model to consistently return a JSON object tailored to a specific use case, by detailing this requirement in the system message. While this works in many cases, the model occasionally produces output that is not valid JSON, leading to parsing failures.

To address this challenge, we have introduced support for a JSON mode that ensures the response is always a valid JSON object. This feature is similar to the OpenAI JSON mode. Going a step further, we also offer the capability to specify a dedicated JSON schema, providing you with finer control over the output format.

How to Use

Structured output is supported by ALL models on the platform. It can be activated by configuring the response_format parameter. Below is the schema of the parameter:

  • type: This can be set to either "text" or "json_object":
    • "text": provided for compatibility with the OpenAI API; it has no effect and is equivalent to leaving the field unset.
    • "json_object": enables JSON mode, which guarantees that the model's response is a valid JSON object.
  • schema: Optional. This is an Empower extension that goes beyond the OpenAI API: the value should be a valid JSON schema. We introduced it to give users more explicit control over the output format.
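
Putting the two fields together, a complete response_format value could look like the following; the schema shown here is a minimal illustration, not a required shape:

python
# Illustrative response_format payload; the schema is an example only.
response_format = {
    "type": "json_object",  # enable JSON mode
    "schema": {             # optional Empower extension
        "type": "object",
        "properties": {
            "winner": {"type": "string"}
        }
    }
}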

To ensure the best results, keep the following points in mind:

  • When using JSON mode, it is imperative to explicitly instruct the model to generate JSON-formatted output, either through a system or user message. Failing to do so can lead the model to produce a continuous stream of whitespace until the generation reaches the token limit, causing requests to run for an extended period and appear "stuck." Also be aware that a finish_reason of "length" indicates the generation exceeded the maximum token limit or the conversation exceeded the maximum context length, in which case the message content may be partially truncated.
  • Currently, using the schema field can significantly impact the performance of the first request that employs a given schema, due to the overhead of building the index; we cache the schema to speed up subsequent requests. In practice, we've found that specifying the schema directly in the prompt typically yields satisfactory results, so we recommend trying that before relying on the schema field (see the sketch after this list). Please contact us if you need the schema field with better performance.
  • Set the temperature as low as possible for the best results.
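
As a sketch of the recommendations above, the example below embeds an illustrative schema directly in the prompt rather than the schema field, and guards against truncated output by checking finish_reason:

python
import json

from openai import OpenAI

client = OpenAI(
    base_url="https://app.empower.dev/api/v1",
    api_key="YOUR_API_KEY"
)

# Describe the desired output shape directly in the prompt; this schema
# is an illustrative example, not a required format.
schema_hint = json.dumps({
    "type": "object",
    "properties": {"winner": {"type": "string"}}
})
chat_completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    response_format={"type": "json_object"},
    temperature=0.0,
    messages=[
        {
            "role": "user",
            "content": f"Who's the winner of the world series in 2020? Reply in JSON matching this schema: {schema_hint}",
        },
    ],
)

choice = chat_completion.choices[0]
# finish_reason == "length" means the output hit the token limit and the
# JSON may be cut off, so check before parsing.
if choice.finish_reason == "length":
    raise RuntimeError("Response truncated; JSON may be incomplete.")
print(json.loads(choice.message.content))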

Code Example

python
from openai import OpenAI

client = OpenAI(
    base_url="https://app.empower.dev/api/v1",
    api_key="YOUR_API_KEY"
)

# General JSON mode
chat_completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    response_format={"type": "json_object"},
    temperature=0.0,
    messages=[
        {
            "role": "user",
            "content": "Who's the winner of the world series in 2020? Reply in json with all the details.",
        },
    ]
)
print(chat_completion.choices[0].message.content)

# Specify the schema
chat_completion = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    response_format={"type": "json_object", "schema": {
        "type": "object", "properties": {"winner": {"type": "string"}}}},
    temperature=0.0,
    messages=[
        {
            "role": "user",
            "content": "Who's the winner of the world series in 2020? Reply in json with all the details.",
        },
    ]
)
print(chat_completion.choices[0].message.content)

Output

{
  "series": "World Series",
  "year": 2020,
  "winner": "Los Angeles Dodgers",
  "opponent": "Tampa Bay Rays",
  "result": "Dodgers won the series 4-2",
  "mvp": "Corey Seager (LAD)"
}
{
  "winner": "Los Angeles Dodgers"
}
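
Because JSON mode guarantees valid JSON, the content can be parsed directly. For example, with the schema-constrained response above:

python
import json

result = json.loads(chat_completion.choices[0].message.content)
print(result["winner"])  # Los Angeles Dodgers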