MLC LLM + React Native: On-Device AI Without the Pain

Teacher
Artur Morys-Magiera, Software Engineer @ Callstack

On-device AI does not have to be limited to models shipped with the operating system. In this episode of React Native AI Unpacked, we focus on the MLC runtime and show how React Native apps can run third-party large language models directly on the device without giving up performance, portability, or developer experience.

This episode explains how MLC LLM works as a universal inference engine and why it enables a different level of flexibility compared to built-in models. Instead of adapting your app to a single vendor or hardware target, you can choose the models you want, bundle them with your app, and rely on MLC to optimize execution for each device.

Using React Native AI with the MLC runtime, you can build applications that run fully on-device across iOS and Android while keeping the same JavaScript API you already use with the Vercel AI SDK.

Running third-party models on-device

The episode starts by introducing MLC LLM as a general-purpose engine for on-device inference. Rather than shipping a fixed model with the operating system, MLC allows you to decide which models your app uses and compile them into the application in an optimized form.

This approach gives you control over model size, capabilities, and behavior. You are not restricted to system-provided models and can instead use open-source models or your own fine-tuned variants, including models distributed through platforms like Hugging Face.

Performance across hardware and platforms

A core part of this episode is understanding how MLC achieves high performance across a wide range of devices. MLC applies advanced optimization techniques such as memory planning, operator fusion, hardware-specific tuning, and library offloading to ensure inference runs efficiently.

The same generated code can take advantage of GPUs on capable devices through APIs like OpenCL or Vulkan, while still running efficiently on lower-end hardware by relying on optimized CPU instruction sets. This makes it possible to deploy the same AI-powered features across phones, tablets, desktops, and even the web.

A consistent developer experience with the AI SDK

React Native AI’s MLC package integrates directly with the Vercel AI SDK, which means you do not need to learn a new API to work with third-party on-device models. The JavaScript interface stays the same whether you are using Apple’s built-in models or MLC-powered models.

Because the API remains consistent, switching runtimes becomes a configuration choice rather than a rewrite. This makes it easier to adapt your app to different platforms, devices, or future runtimes without changing application logic.
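One way to picture "runtime as configuration" is a small resolver that maps a config entry to a model descriptor, so application code never branches on the runtime. The registry keys and model ids here are invented for illustration; only the idea of selecting a runtime from config reflects the episode.

```typescript
// Illustrative only: runtime names and model ids are made up for this sketch.
type RuntimeId = 'apple' | 'mlc';

interface RuntimeConfig {
  runtime: RuntimeId;
  modelId?: string; // only third-party runtimes need an explicit model id
}

// Resolve a config entry to a descriptor the rest of the app uses unchanged.
export function resolveModel(config: RuntimeConfig): string {
  switch (config.runtime) {
    case 'apple':
      return 'apple:system-default'; // built-in model, no download step
    case 'mlc':
      return `mlc:${config.modelId ?? 'Llama-3.2-3B-Instruct'}`;
  }
}
```

Swapping from the built-in Apple model to an MLC-powered one then means changing the config object, not rewriting the code that calls the model.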

Model selection and lifecycle management

The episode demonstrates how model selection works in practice. Available models are declared ahead of time and can be downloaded, prepared, and instantiated at runtime. Once a model is downloaded and prepared, it can be reused across sessions with minimal startup cost.

You can dynamically choose which model to run, prepare it once, and then interact with it using the same APIs for text generation, streaming output, and structured results.

Text generation, streaming, and structured output

Beyond basic text generation, this episode shows how MLC models can be used with streaming output and structured responses. Streaming allows tokens to appear as they are generated, while structured output enforces a predefined schema for model responses.

Structured output makes it possible to integrate model responses directly into application logic, enabling features like typed data extraction, form generation, and custom workflows driven by AI-generated data.

Building flexible on-device AI features

By combining MLC LLM with React Native AI and the AI SDK, you can build on-device AI features that are portable, configurable, and performant. This episode shows how third-party models fit into the same architecture as built-in ones, allowing you to choose the right runtime for each use case without changing how your app is written.

If you need flexibility in model choice while keeping a stable API and strong performance across platforms, this episode provides the foundation for building those capabilities into your React Native applications.

Learn how to run third-party on-device LLMs in React Native using MLC LLM. Choose your models, run them efficiently across platforms, and keep a unified JavaScript API with React Native AI.
