Vision usage examples

Find code patterns for common Vision setups, including safety training, webcam selection, overhead cameras, look-at activation, and WebGL deployment.

These examples cover the most common Vision integration patterns. Each example is self-contained — copy the relevant script, attach it to the appropriate GameObject, and configure the serialized fields in the Inspector.

Monitor object placement in safety training

A safety training application where a Convai character monitors whether the user places equipment in the correct zone and gives spoken feedback. The character uses the live scene camera feed to observe placement in real time.

Expected outcome: When the player moves an object, the character describes its position and confirms whether placement is correct or flags a safety issue.

using Convai.Modules.Vision;
using Convai.Runtime.Vision.Publishing;
using UnityEngine;

/// <summary>
/// Enables vision when the training sequence starts and disables it when complete.
/// Attach to the same GameObject as ConvaiVisionPublisher.
/// </summary>
public class SafetyTrainingVisionController : MonoBehaviour
{
    [SerializeField] private ConvaiVisionPublisher _publisher;

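    private void Awake()
    {
        // Fall back to the publisher on this GameObject if none was assigned in the Inspector
        if (_publisher == null) _publisher = GetComponent<ConvaiVisionPublisher>();

        // Start in Manual mode so vision only captures during active training
        _publisher.SetPublishPolicy(VisionPublishPolicy.Manual);
    }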

    public void BeginTrainingSequence()
    {
        _publisher.SetPublishPolicy(VisionPublishPolicy.HighResponsiveness);
        _publisher.EnablePublishing(true);
    }

    public void EndTrainingSequence()
    {
        _publisher.EnablePublishing(false);
        _publisher.SetPublishPolicy(VisionPublishPolicy.Manual);
    }
}

Select webcam device at runtime

A desktop onboarding application where the user selects which physical camera to use before a session starts. Useful when the user's workstation has multiple cameras (built-in webcam, USB camera, etc.).

Expected outcome: The dropdown populates on Start with all detected camera names. Selecting a camera name and clicking Switch swaps the capture device without stopping the session.
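
The following is a minimal sketch of the dropdown half of this pattern, not the SDK's own sample. It uses Unity's built-in `WebCamTexture.devices` to list cameras and `TMP_Dropdown.AddOptions` to populate the UI. The class name `WebcamDeviceSelector` and the `_onDeviceSelected` event are hypothetical: this page does not show the Convai call that swaps the capture device, so the sketch only hands the chosen device name to whatever component you wire up in the Inspector. A serialized `UnityEvent<string>` field assumes Unity 2020.1 or newer.

```csharp
using System.Collections.Generic;
using TMPro;
using UnityEngine;
using UnityEngine.Events;

/// <summary>
/// Lists the detected cameras in a dropdown and reports the selected device name.
/// Hypothetical helper; the actual device switch is delegated to a listener.
/// </summary>
public class WebcamDeviceSelector : MonoBehaviour
{
    [SerializeField] private TMP_Dropdown _deviceDropdown;

    // Hypothetical hook: wire this in the Inspector to the component
    // that accepts a camera device name (not shown on this page).
    [SerializeField] private UnityEvent<string> _onDeviceSelected;

    private readonly List<string> _deviceNames = new List<string>();

    private void Start()
    {
        // WebCamTexture.devices returns every camera Unity can see
        foreach (WebCamDevice device in WebCamTexture.devices)
        {
            _deviceNames.Add(device.name);
        }

        _deviceDropdown.ClearOptions();
        _deviceDropdown.AddOptions(_deviceNames);
    }

    // Call this from the Switch button's OnClick event
    public void SwitchToSelectedDevice()
    {
        if (_deviceNames.Count == 0)
        {
            Debug.LogWarning("No camera devices detected.");
            return;
        }

        string deviceName = _deviceNames[_deviceDropdown.value];
        _onDeviceSelected.Invoke(deviceName);
    }
}
```

Routing the selection through an event keeps the UI script independent of the capture component, so the same dropdown works with any frame source that can take a device name.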

TMP_Dropdown requires the TextMeshPro package. If your project uses the legacy UnityEngine.UI.Dropdown, replace TMP_Dropdown with Dropdown; AddOptions(List<string>) works identically.

Stream an overhead security camera

An architectural walkthrough where an overhead security camera monitors the entire floor plan. The publisher uses LowOverhead policy because the scene changes slowly and bandwidth must be reserved for audio.

Expected outcome: The character describes what is visible in the top-down view — furniture layout, occupancy, or hazards — when asked.
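
As a rough sketch of this setup, the script below applies the low-overhead policy and starts publishing when the scene loads. It reuses `SetPublishPolicy` and `EnablePublishing` from the safety training example. The enum member `VisionPublishPolicy.LowOverhead` is assumed from the policy name in the description above, and the class name `OverheadCameraVisionController` is hypothetical. Assigning the overhead camera as the publisher's frame source is assumed to happen in the Inspector and is not shown.

```csharp
using Convai.Modules.Vision;
using Convai.Runtime.Vision.Publishing;
using UnityEngine;

/// <summary>
/// Publishes the overhead camera feed with a low-overhead policy.
/// Hypothetical helper; the frame source is configured in the Inspector.
/// </summary>
public class OverheadCameraVisionController : MonoBehaviour
{
    [SerializeField] private ConvaiVisionPublisher _publisher;

    private void Start()
    {
        // Assumed enum member: the scene changes slowly, so favor bandwidth over latency
        _publisher.SetPublishPolicy(VisionPublishPolicy.LowOverhead);
        _publisher.EnablePublishing(true);
    }

    private void OnDisable()
    {
        // Stop streaming when the walkthrough object is disabled or destroyed
        _publisher.EnablePublishing(false);
    }
}
```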

Activate publishing on player look-at

Vision is expensive to stream continuously. This pattern activates publishing only while the player is looking at a specific object (e.g., a piece of machinery), and pauses it otherwise.

Expected outcome: The character responds to the object's state only when the player is looking at it. Vision publishing adds no network or GPU overhead while the player looks away.
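
One way to sketch this pattern is an angle-and-distance check against the player camera's forward direction. The only Convai call used is `EnablePublishing`, taken from the safety training example. The class name `LookAtVisionActivator`, its field names, and the default thresholds are hypothetical, and the sketch assumes the publish policy has already been configured elsewhere.

```csharp
using Convai.Runtime.Vision.Publishing;
using UnityEngine;

/// <summary>
/// Enables vision publishing only while the player camera faces the target.
/// Hypothetical helper; thresholds are illustrative defaults.
/// </summary>
public class LookAtVisionActivator : MonoBehaviour
{
    [SerializeField] private ConvaiVisionPublisher _publisher;
    [SerializeField] private Camera _playerCamera;
    [SerializeField] private Transform _target;
    [SerializeField, Range(1f, 90f)] private float _maxAngleDegrees = 15f;
    [SerializeField] private float _maxDistance = 10f;

    private bool _isPublishing;

    private void Start()
    {
        // Begin paused; publishing starts once the player looks at the target
        _publisher.EnablePublishing(false);
    }

    private void Update()
    {
        bool looking = IsLookingAtTarget();
        if (looking == _isPublishing) return; // Only act when the state changes

        _isPublishing = looking;
        _publisher.EnablePublishing(looking);
    }

    private bool IsLookingAtTarget()
    {
        Transform cam = _playerCamera.transform;
        Vector3 toTarget = _target.position - cam.position;

        if (toTarget.magnitude > _maxDistance) return false;

        // Angle between the view direction and the direction to the target
        return Vector3.Angle(cam.forward, toTarget) <= _maxAngleDegrees;
    }
}
```

The angle check ignores occlusion. If walls or other objects can block the view, a `Physics.Raycast` from the camera toward the target is a reasonable additional test.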

Configure Vision for WebGL

On WebGL, no frame source component is required. ConvaiVisionPublisher captures the browser canvas automatically via canvas.captureStream(). Set Connection Type to Video and the publish policy as needed — everything else is automatic.

Expected outcome: The character receives a live feed of the browser canvas. No frame source component is present in the scene.

Next steps

Troubleshoot vision
Custom frame sources
