Chat Messages
Roles
The roles of chat messages follow the OpenAI API definition:
- SYSTEM: Indicates the associated content is part of the system prompt. It is generally the first message and provides guidance on how the model should behave.
- USER: Indicates the associated content is user input.
- ASSISTANT: Indicates the associated content is model-generated output.
- TOOL: Used when appending function-call results back into the conversation.
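As a rough illustration (the SDK's real enum is ChatMessage.Role; this standalone sketch only mirrors the list above), each role serializes to the lowercase role string used by the OpenAI API:

```kotlin
// Stand-in for ChatMessage.Role; the SDK's own enum may differ in detail.
enum class Role {
    SYSTEM, USER, ASSISTANT, TOOL;

    // OpenAI-compatible role string, e.g. "system", "user"
    val jsonName: String get() = name.lowercase()
}

// A minimal conversation in OpenAI message form:
val conversation = listOf(
    mapOf("role" to Role.SYSTEM.jsonName, "content" to "You are a helpful assistant."),
    mapOf("role" to Role.USER.jsonName, "content" to "Hello!"),
)
```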
Message Structure
Fields
- role: The role of this message (see ChatMessage.Role).
- content: A list of message contents. Each element is an instance of ChatMessageContent.
- reasoningContent: The reasoning content generated by reasoning models. Only messages generated by reasoning models have this field; for other models or other roles, it should be null.
- functionCalls: Function call requests generated by the model. See the Function Calling guide for more details.
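A standalone Kotlin sketch of the shape described above; the field names come from this list, but ChatMessageContent and the function-call type here are simplified stand-ins, not the SDK's actual definitions:

```kotlin
// Stand-in types; the real SDK defines ChatMessageContent as a sealed
// interface with Text, Image, and Audio variants.
sealed interface ChatMessageContent {
    data class Text(val text: String) : ChatMessageContent
}
data class FunctionCall(val name: String, val arguments: String)

data class ChatMessage(
    val role: String,                                   // see ChatMessage.Role
    val content: List<ChatMessageContent>,              // one or more content parts
    val reasoningContent: String? = null,               // reasoning models only; null otherwise
    val functionCalls: List<FunctionCall> = emptyList() // model-issued function-call requests
)

val msg = ChatMessage(role = "user", content = listOf(ChatMessageContent.Text("Hi")))
```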
toJSONObject
Returns a JSONObject that represents the chat message. The returned object is compatible with ChatCompletionRequestMessage from the OpenAI API. It contains 2 fields: role and content.

fromJSONObject
Constructs a ChatMessage instance from a JSONObject. Not all JSON object variants in ChatCompletionRequestMessage of the OpenAI API are acceptable. As of now, role supports user, system and assistant; content can be a string or an array. LeapSerializationException will be thrown if the provided JSONObject cannot be recognized as a message.

Message Content
ChatMessageContent is a sealed interface compatible with the content object in the OpenAI chat completion API. It has three variants:
- Text: Pure text content.
- Image: JPEG-encoded image content.
- Audio: WAV-encoded audio content.

toJSONObject returns an OpenAI API compatible content object (with a type field and the real content fields). fromJSONObject receives an OpenAI API compatible content object to build a message content. Not all OpenAI content objects are accepted; LeapSerializationException will be thrown if the provided JSONObject cannot be recognized as a message content.

ChatMessageContent.Text
Pure text content. The content is available in the text field.

ChatMessageContent.Image
Image content. Only JPEG-encoded data is supported. The fromBitmap helper function creates a ChatMessageContent.Image from an Android Bitmap object (the image will be compressed).

ChatMessageContent.Audio
Audio content for speech recognition and audio understanding. The inference engine requires WAV-encoded audio with specific format requirements (see Audio Format Requirements below).
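The three content variants serialize to OpenAI-style tagged objects. As a rough sketch built with plain Kotlin maps (the shapes follow the OpenAI chat completion API; the SDK's exact output may differ in detail):

```kotlin
// Tagged content objects in the OpenAI chat completion style,
// built as plain maps to keep the sketch dependency-free.
fun textPart(text: String): Map<String, Any> =
    mapOf("type" to "text", "text" to text)

fun imagePart(base64Jpeg: String): Map<String, Any> =
    mapOf(
        "type" to "image_url",
        "image_url" to mapOf("url" to "data:image/jpeg;base64,$base64Jpeg")
    )

fun audioPart(base64Wav: String): Map<String, Any> =
    mapOf(
        "type" to "input_audio",
        "input_audio" to mapOf("data" to base64Wav, "format" to "wav")
    )
```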
Audio Format Requirements
The LEAP inference engine requires WAV-encoded audio with specific format requirements:

| Property | Required Value | Notes |
|---|---|---|
| Format | WAV (RIFF) | Only WAV format is supported |
| Sample Rate | 16000 Hz (16 kHz) recommended | Other sample rates are automatically resampled to 16 kHz |
| Encoding | PCM (various bit depths) | Supports Float32, Int16, Int24, Int32 |
| Channels | Mono (1 channel) | Required - stereo audio will be rejected |
| Byte Order | Little-endian | Standard WAV format |
Supported PCM sample encodings:
- Float32: 32-bit floating point, normalized to [-1.0, 1.0]
- Int16: 16-bit signed integer, range [-32768, 32767] (recommended)
- Int24: 24-bit signed integer, range [-8388608, 8388607]
- Int32: 32-bit signed integer, range [-2147483648, 2147483647]
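The encodings above differ only in how samples are quantized. For example, Int16 samples relate to normalized Float32 samples as follows (a standalone sketch, not SDK code):

```kotlin
// Convert a 16-bit signed sample to a normalized float in [-1.0, 1.0].
fun int16ToFloat(s: Short): Float = s / 32768.0f

// Convert back, clamping and scaling to the Int16 range [-32768, 32767].
fun floatToInt16(f: Float): Short =
    (f.coerceIn(-1.0f, 1.0f) * 32767.0f).toInt().toShort()
```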
Automatic Resampling: The inference engine automatically resamples audio to 16 kHz if provided at a different sample rate. However, for best performance and quality, provide audio at 16 kHz to avoid resampling overhead.
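If you want to resample up front, a minimal linear-interpolation sketch looks like this (the engine's built-in resampler is likely higher quality; production code should also low-pass filter before downsampling):

```kotlin
// Naive linear-interpolation resampler for mono float PCM.
fun resample(input: FloatArray, srcRate: Int, dstRate: Int): FloatArray {
    if (srcRate == dstRate || input.isEmpty()) return input.copyOf()
    val outLen = (input.size.toLong() * dstRate / srcRate).toInt()
    return FloatArray(outLen) { i ->
        // Fractional source position for output sample i.
        val pos = i.toDouble() * srcRate / dstRate
        val j = pos.toInt().coerceAtMost(input.size - 1)
        val k = (j + 1).coerceAtMost(input.size - 1)
        val frac = (pos - j).toFloat()
        input[j] * (1 - frac) + input[k] * frac
    }
}
```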
Creating Audio Content from WAV Files
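Since only WAV input is accepted, it is worth validating the header before handing bytes to the SDK. A standalone Kotlin sketch using only the JDK (the final SDK call is shown as a comment because its exact signature is not reproduced here):

```kotlin
import java.io.File

// Quick sanity check on WAV (RIFF) bytes before building audio content.
fun isWav(bytes: ByteArray): Boolean =
    bytes.size >= 12 &&
    String(bytes, 0, 4, Charsets.US_ASCII) == "RIFF" &&
    String(bytes, 8, 4, Charsets.US_ASCII) == "WAVE"

fun loadWav(path: String): ByteArray {
    val bytes = File(path).readBytes()
    require(isWav(bytes)) { "Not a WAV (RIFF) file: $path" }
    // val audio = ChatMessageContent.Audio(bytes)  // hypothetical call; see the SDK reference
    return bytes
}
```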
Creating Audio Content from Raw PCM Samples
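As a standalone illustration of what this section attributes to FloatAudioBuffer, here is the 32-bit float PCM WAV container built with only the JDK (the SDK utility does this for you; this sketch is not its actual implementation):

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Wrap mono Float32 PCM samples in a minimal WAV (RIFF) container.
// Format code 3 = IEEE float; all fields little-endian per the WAV spec.
fun floatsToWav(samples: FloatArray, sampleRate: Int = 16000): ByteArray {
    val dataSize = samples.size * 4
    val buf = ByteBuffer.allocate(44 + dataSize).order(ByteOrder.LITTLE_ENDIAN)
    buf.put("RIFF".toByteArray()); buf.putInt(36 + dataSize); buf.put("WAVE".toByteArray())
    buf.put("fmt ".toByteArray()); buf.putInt(16)
    buf.putShort(3)              // audio format: IEEE float
    buf.putShort(1)              // channels: mono
    buf.putInt(sampleRate)
    buf.putInt(sampleRate * 4)   // byte rate = rate * channels * 4 bytes
    buf.putShort(4)              // block align
    buf.putShort(32)             // bits per sample
    buf.put("data".toByteArray()); buf.putInt(dataSize)
    for (s in samples) buf.putFloat(s)
    return buf.array()
}
```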
If you're recording audio or have raw PCM data, use the FloatAudioBuffer utility to create properly formatted WAV files. FloatAudioBuffer automatically creates a valid WAV header and encodes the samples as 32-bit float PCM in a WAV container, which is compatible with the inference engine.

Recording Audio
When recording audio from the device microphone, configure AudioRecord or use a library like WaveRecorder with the correct settings: 16 kHz sample rate, mono channel, PCM encoding.

Audio Duration Considerations
- Minimum duration: At least 1 second of audio is recommended for reliable speech recognition
- Maximum duration: Limited by the modelβs context window (typically several minutes)
- Silence: Trim excessive silence from the beginning and end for better results
Audio Output from Models
When generating audio responses (e.g., with LFM2.5-Audio-1.5B), the model outputs audio at a 24 kHz sample rate.
Note: Audio input should be 16 kHz, but audio output from generation models is typically 24 kHz. Make sure your audio playback code supports the correct sample rate.
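On the JVM, for example, the generated audio would be described to the audio system at 24 kHz rather than 16 kHz. A sketch with javax.sound.sampled (the bit depth and channel count here are illustrative assumptions; Android playback via AudioTrack follows the same idea):

```kotlin
import javax.sound.sampled.AudioFormat

// Playback format for model-generated audio: 24 kHz output,
// distinct from the 16 kHz expected on the input side.
val outputFormat = AudioFormat(
    24000f, // sample rate in Hz: model output is 24 kHz
    16,     // bits per sample (assumed for this sketch)
    1,      // mono (assumed for this sketch)
    true,   // signed PCM
    false   // little-endian
)
```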