How to Use LLMs Locally on Your Android Device

AI has been the talk of the town over the past few years, and it keeps getting better. You can access a multitude of AI models that run in the cloud, giving you the same experience no matter how powerful your device is. The major drawback is that you need an internet connection to access these large language models (LLMs).

MLC LLM has developed an Android app called MLC Chat, allowing you to run LLMs directly on your device. While these local LLMs may not match the power of their cloud-based counterparts, they do provide access to LLM functionality when offline.

In this guide, we’ll show you how to set up the MLC Chat app on your Android device and chat with an AI model without an internet connection. So, without further ado, let’s get started.

Install the MLC Chat App 

The MLC Chat app is not available on the Google Play Store. Hence, you have to sideload the app on your device. Follow the steps below.

Note: The MLC Chat app is still a demo and is made specifically for Galaxy S23 devices powered by the Snapdragon 8 Gen 2 chip.

1. Download and install the MLC Chat app on your device. You can download the app from the official website for free (148 MB).

2. After installing the MLC Chat app, download the LLM you want to use for AI chatting. Make sure you are connected to the internet.

3. I recommend downloading the Phi-2 model first since it should run fine on most devices.

4. Tap on the Download button next to the model name to download it to your device. The download size varies by model, and the download speed depends on your internet connection.

5. Once the model is downloaded, you can disconnect from the internet and tap on the Chat icon to start using the model.

6. You can now start chatting with the AI model locally. Note that input and output are limited to text only.

Model List:

  • Llama3-8B-Instruct-q3f16-MLC
  • gemma-2b-q4f16_1 (not working)
  • phi-2-q4f16_1 (recommended)
  • Llama-2-7b-chat-hf-q4f16_1
  • Mistral-7B-Instruct-v0.2-q4f16
  • RedPajama-INCITE-Chat-3B-v1-q4f16_1
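
The model names above encode how each model is compressed. As a rough sketch, assuming the MLC naming convention in which a tag such as q4f16_1 means 4-bit quantized weights with 16-bit floating-point computation, the Python snippet below pulls those numbers out of a name and estimates how much storage the weights alone would need. The helper names and the parameter counts (for example, about 2.7 billion for Phi-2) are assumptions for illustration, and actual download sizes in the app will differ somewhat.

```python
import re


def parse_quantization(model_name):
    """Return (weight_bits, float_bits) parsed from a name such as 'phi-2-q4f16_1'.

    Returns None when the name carries no 'q<bits>f<bits>' tag.
    """
    match = re.search(r"q(\d+)f(\d+)", model_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def estimate_weight_size_gb(params_billions, weight_bits):
    """Rough size of the weights in GB: parameters x bits per weight / 8."""
    return params_billions * weight_bits / 8


# Approximate parameter counts in billions (assumed, based on the model names).
models = {
    "phi-2-q4f16_1": 2.7,
    "Llama3-8B-Instruct-q3f16-MLC": 8.0,
    "Llama-2-7b-chat-hf-q4f16_1": 7.0,
}

for name, params in models.items():
    weight_bits, _ = parse_quantization(name)
    size = estimate_weight_size_gb(params, weight_bits)
    print(f"{name}: about {size:.2f} GB of weights")
```

By this estimate, the Phi-2 weights come to roughly 1.35 GB, against 3 GB or more for the 7B and 8B models, which is likely part of why Phi-2 is the easier pick for most phones.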

Using the MLC Chat App

The MLC Chat app allows you to store LLMs on your device. At the time of writing, the MLC Chat app cannot leverage the NPU on your smartphone chip; instead, it uses the CPU to run these LLMs locally.

Moreover, the app is optimized only for Snapdragon chipsets, particularly the Snapdragon 8 Gen 2 found in the Galaxy S23. So, if you have a MediaTek device like I do, the number of tokens per second might be low.

For perspective, on my POCO X6 Pro, which is powered by the MediaTek Dimensity 8300 Ultra chip, I was getting around 1.5 to 2 tokens per second for prefill (processing the prompt) and about 4 tokens per second for decoding (generating the response). For the uninitiated, a higher number of tokens per second means faster responses.
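
To put those numbers in context, here is a back-of-the-envelope sketch in Python. It assumes the prompt is processed at the prefill rate and the reply is generated at the decoding rate. The 20-token prompt and 100-token reply are hypothetical figures chosen for illustration, not measurements.

```python
def estimate_response_seconds(prompt_tokens, reply_tokens, prefill_rate, decode_rate):
    """Estimate the total wait in seconds for one prompt and its reply.

    Rates are in tokens per second. Prefill covers reading the prompt;
    decoding covers generating the reply.
    """
    prefill_seconds = prompt_tokens / prefill_rate
    decode_seconds = reply_tokens / decode_rate
    return prefill_seconds + decode_seconds


# Rates from the POCO X6 Pro figures above, using the upper end of the
# prefill range: 2 tokens/s prefill, 4 tokens/s decoding.
# The token counts are hypothetical.
seconds = estimate_response_seconds(
    prompt_tokens=20, reply_tokens=100, prefill_rate=2.0, decode_rate=4.0
)
print(f"Estimated wait: {seconds:.0f} seconds")  # prints "Estimated wait: 35 seconds"
```

In other words, at these speeds even a short exchange can take over half a minute, so shorter prompts and shorter answers make a noticeable difference.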

As for using the MLC Chat app, the interface is pretty simple. Open the app and tap on the Chat icon to start using your downloaded LLM. 

Copy Text

Once you’ve got your response, you can tap and hold the text to select and copy it or continue the conversation with more prompts.

Reset the Chat

If you’d like to end a chat or conversation and start a new one, tap on the reset icon at the top right corner of the interface.

Can I use the MLC Chat app on any Android device?

The MLC Chat app is currently optimized for Galaxy S23 devices powered by the Snapdragon 8 Gen 2 chip. It may not work properly on other devices, especially those with a weak processor.

Are the locally run LLM models as powerful as the cloud-based models?

No, the locally run LLM models may not be as powerful as the cloud-based models due to hardware limitations.

Can I use the MLC Chat app offline?

Yes, you can use the MLC Chat app offline once you have downloaded the desired LLM model.

Can I download multiple LLM models on the MLC Chat app?

Yes, you can download multiple LLM models on the MLC Chat app and switch between them as needed.

Mehtab Ansari
Mehtab Ansari is a tech enthusiast with a passion for writing. In his two-year career, he has covered news, features, and evergreen content on multiple platforms. Apart from keeping a close eye on emerging tech developments, he likes spending time at the gym.
