Mon, Oct 06, 2025
The internet doesn’t moderate itself. Every social media platform, from Facebook to YouTube, relies on content moderation systems that decide what stays online and what comes down. While humans are still behind some ultimate decisions, in many cases the first line of defence has been handed to Large Language Models (LLMs) and their invisible algorithms.
An Artificial Intelligence (AI) model trained in California might recognise hate speech when it sees racist slurs on social media, but would it understand a post inflaming religious tensions in India? This is the type of question the next generation of AI companies is asking in India. "These generative systems are often foreign, designed with Western ideas of bias, hate, and safety," a senior consultant at a Big Tech company told The Secretariat.
This is where the phrase "sovereign AI" enters the lexicon. It's shorthand for artificial intelligence developed within a country’s own ecosystem, using its own data, infrastructure, and workforce.
"Generative engines power today’s chatbots, content tools, and translation systems," said the senior consultant, adding that "more than national pride, it's about capability. Think about who controls the algorithms, the data they are trained on, and the values they encode."
India’s Ministry of Electronics and Information Technology (MeitY) has launched eight projects under the IndiaAI Mission to build indigenous LLMs.
Foreign models are built on global datasets that skew English-dominated and Global North-centric.
GenLoop, a start-up backed by MeitY’s IndiaAI Mission, wants to change that. The goal is to train models on India-specific data that reflect local languages, laws, and cultural contexts.
“We don't want ourselves to get too dependent on a fundamental tech and then get the rug taken out from under us,” Ayush Gupta, CEO of GenLoop, told The Secretariat. GenLoop is one of the companies selected for the IndiaAI Mission.
Their project, Kavach, aims to give India its own content moderation model.
Size Isn’t Everything
GenLoop is in fact building three small language models (SLMs) of around two billion parameters each — Yukti (a base model), Varta (a conversational model), and Kavach (a content-moderation model).
For the uninitiated, a “two-billion parameter model” sounds abstract. In practice, parameters are the numerical weights that determine how an LLM generates outputs. The largest global models today run into the hundreds of billions of parameters, but smaller models can be more efficient and affordable, particularly when targeted at a specific set of tasks.
“With smaller models, a single task can be perfected in a way that doesn't require a lot of memory. That can then be run at scale in a very cost-efficient manner," says Gupta. "For example, one LLM call of a bigger model, like a trillion-parameter model, takes around eight iPhone batteries worth of energy, but from an SLM, you can possibly get a thousand calls with the same energy," he says. A call refers to a single inference request, that is, one instance of a user asking the model a question or prompting it to generate a response.
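The scale gap Gupta describes can be made concrete with some back-of-envelope arithmetic. The sketch below is illustrative only: it assumes 2 bytes per parameter (typical for fp16/bf16 weights) and counts only the weights themselves, ignoring the extra serving memory a real deployment needs for activations and caches.

```python
# Rough memory footprint of model weights at different parameter counts.
# Assumption: 2 bytes per parameter (fp16/bf16); serving memory in
# practice is higher (activations, KV-cache), so treat these as floors.

def weight_memory_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate GiB needed just to hold the model weights."""
    return n_params * bytes_per_param / (1024 ** 3)

for name, n in [("2B SLM (Kavach-sized)", 2e9),
                ("70B LLM", 70e9),
                ("1T frontier model", 1e12)]:
    print(f"{name}: ~{weight_memory_gib(n):.0f} GiB of weights")
```

A 2B-parameter model fits in a few GiB, small enough for a single commodity GPU, while a trillion-parameter model needs on the order of thousands of GiB spread across many accelerators — which is why per-call cost and energy diverge so sharply.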
The focus is on practical sovereignty. Not competing on raw scale, but building tools that work for Indian users in real-world conditions.
“We are not trying to beat GPT. What we want is a model that understands Indian languages natively and is affordable to deploy across government systems, classrooms, and start-ups,” says the senior consultant.
India-Specific Challenges
Why not simply adopt an existing Western model? “Most international models don’t even have taxonomies of terrorism, communalism, and cultural biases. And those are very important things for India,” explains Gupta.
In content moderation, especially, Indian realities differ sharply from American or European contexts. Terms like communalism — referring to divisive speech along religious lines — rarely appear in global moderation taxonomies, yet they are critical for India’s information ecosystem.
“The only content moderation tools available to India today, where India is doing a billion calls every day, are from a foreign lens. One of them is Llama Guard from the Meta team, and they don't have taxonomies of terrorism, communalism, or cultural biases, which are important from the Indian context,” explains Gupta.
GenLoop’s Kavach model is designed as a sovereign content-safety layer. It aims to provide moderation in all 22 scheduled Indian languages, a task foreign providers have struggled with. The ambition is context-aware filtering that can distinguish between hate speech and legitimate political or religious expression.
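To make the idea of a taxonomy-aware safety layer concrete, here is a minimal sketch of what such an interface might look like. Kavach's actual API is not public; the label set, function names, and stub logic below are hypothetical placeholders — in a real system the classification would come from model inference, not keyword matching.

```python
# Hypothetical sketch of a taxonomy-aware moderation interface.
# The labels extend typical Western safety categories with
# India-specific ones (communalism, cultural bias), per the article.
# All names and logic here are illustrative, not Kavach's real API.

from dataclasses import dataclass

TAXONOMY = ["hate_speech", "terrorism", "communalism",
            "cultural_bias", "safe"]

@dataclass
class ModerationResult:
    label: str   # one of TAXONOMY
    score: float # placeholder confidence, 0..1

def moderate(text: str, language: str) -> ModerationResult:
    """Classify a post against the taxonomy.

    `language` would be a code for one of the 22 scheduled Indian
    languages. A real implementation would call the SLM here; this
    stub uses trivial keyword matching purely to show the shape.
    """
    flagged = any(w in text.lower() for w in ("riot", "attack"))
    label = "communalism" if flagged else "safe"
    return ModerationResult(label=label, score=0.9 if flagged else 0.99)
```

The design point is that the taxonomy itself is a first-class, locally defined artefact — the distinction Gupta draws between Indian and foreign moderation tools is less about model architecture than about which categories the model is trained to recognise.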
“If we want to keep our digital spaces safe without over-relying on foreign moderation filters, we need models that understand our own linguistic and cultural nuances,” the senior consultant said.
Public–Private Partnerships At Work
Building such models is expensive. Data collection, compute infrastructure, and skilled researchers all come with high costs. That is where the public–private partnership (PPP) approach of the IndiaAI Mission comes in.
The government is providing grants, access to computing resources, and coordination through MeitY. Private firms like GenLoop bring the technical expertise. It is a symbiotic arrangement; public funds lower the barrier to entry, while private innovation speeds up delivery.
This model also recognises that the state alone cannot build sovereign AI at scale, while the private sector cannot do it without state support in data access and infrastructure.
Looking Ahead
India’s sovereign AI effort is still in its infancy. The first wave of models — from GenLoop and the seven other selected projects — will need to prove themselves not just technically, but in adoption by universities, start-ups, and government departments.
Critics may argue that two-billion-parameter models will never rival global giants. But the aim here is not to outgun Silicon Valley. It is to build trustworthy, affordable, and India-ready AI.
As the consultant put it, “The vision is that whenever a government department, or a classroom, or a start-up asks: Is there an Indian model we can use, the answer should be yes. We should be able to say that we have AI at home.”
If the PPP model works, India could carve out a distinctive path that isn't about chasing size for its own sake, but embedding AI in a way that reflects local languages, norms, and needs.
Sovereignty in AI, in other words, is about ensuring that when Indians reach for the tools of the future, they can reach for ones shaped by their own society.