Why Is Elon Musk Hiring Hindi Tutors To Train AI Models?

Billionaire's Hindi-focused hiring for xAI signals a new step toward making AI truly multilingual. But India needs to lead this transformation from within.

Artificial intelligence (AI) company xAI is hiring people who can speak English and other languages fluently to train its future large language models (LLMs). Hindi happens to be one of the 14 languages for which the company is hiring.

It's a six-month-long temporary position that allows one to work remotely from anywhere in the world and be paid between US$ 35 and US$ 65 (Rs 2,950 to Rs 5,477) per hour. 

A person working eight-and-a-half hours a day, as mentioned in the job description, could earn anywhere between Rs 24,990 and Rs 46,410 per day. On a per-month basis, that is significantly more than the average salary in India, which falls somewhere between Rs 30,000 and Rs 44,000 a month.
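The daily figures can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes an exchange rate of Rs 84 to the US dollar, which is not stated in the article but is the rate the daily figures imply, and the function name `daily_pay_inr` is made up for this example.

```python
# Assumption: Rs 84 per US dollar (not stated in the article; implied by the daily figures).
INR_PER_USD = 84
# Working hours per day, as mentioned in the job description.
HOURS_PER_DAY = 8.5


def daily_pay_inr(usd_per_hour, hours=HOURS_PER_DAY, rate=INR_PER_USD):
    """Convert an hourly rate in US dollars into daily pay in rupees."""
    return usd_per_hour * hours * rate


low = daily_pay_inr(35)   # lower end of the advertised hourly range
high = daily_pay_inr(65)  # upper end of the advertised hourly range
print(f"Rs {low:,.0f} to Rs {high:,.0f} per day")
```

At a slightly different exchange rate, such as the one behind the hourly rupee figures quoted earlier, the daily totals shift by a few hundred rupees.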

Importance Of Hindi For xAI

Why is xAI interested in training its models in Hindi? The answer has several layers.

For those who don’t know, xAI is a company started by Elon Musk in 2023, after the AI boom that began with the launch of ChatGPT in 2022. One of xAI's famous (and infamous) products is its chatbot Grok, a competitor to other chatbots like ChatGPT, Google’s Gemini, Anthropic’s Claude, etc.

Grok, often described as an "unhinged distant cousin of other mild-mannered chatbots", is integrated into, and currently only available on, the microblogging website X (formerly known as Twitter), which is also owned by Musk. Grok was developed as a collaboration between X and xAI.

Users on X can interact with Grok for various tasks, such as answering questions and assisting with content generation directly within the platform. Currently, only Premium+ users on X have access to Grok.

As of now, Grok supports multiple languages, with a primary focus on English. While the exact list of languages Grok can speak is not fully disclosed, it's expected to include commonly spoken languages like Spanish, French, German, etc.

As of April 2024, India has X's third-largest user base, after the US and Japan. So, it makes sense that xAI would want its models to be trained in Hindi, the most widely-spoken language in India.

xAI isn’t the first company to train its AI models in Hindi. Other models like ChatGPT, Anthropic's Claude, Gemini, and Microsoft Copilot can easily converse in Hindi and other Indian languages. But these models still work best in languages like English, where text and audio data is abundant online.

For example, when Scale AI put out ads last year to hire 60 contract writers to help train generative AI models across a wide range of languages including Hausa, Punjabi, Thai, Persian, and Xhosa, there were stark pay disparities between these languages. 

Writers working in Western languages like German were offered US$ 21.55 per hour, while Telugu writers were offered as little as US$ 1.43 per hour, as per a report by Rest of World.

Hindi, Bengali and many other widely-spoken languages are underrepresented on the internet. They have made the mistake of not being English, thus they don’t get the spotlight. For example, even though Bengali is the seventh-most spoken language in the world, it has a presence in only 0.013 per cent of websites in the world. In comparison, English is used in 49.4 per cent of total websites.

The role of the 'AI tutor' that Musk is hiring is to help xAI close this gap: to enhance the performance of its models, like Grok, in Hindi. The tutor would create and label good-quality data in both English and Hindi, which would then be fed to the AI model.

India Is The Largest Market For AI Apps

India has emerged as the largest market for AI app downloads in 2024, contributing 21 per cent of total global downloads, according to the 2024 AI Apps Market Insights report by Sensor Tower.

These apps were mostly generative AI chatbots and photo-editing apps, including ChatGPT, the AI photo enhancer app Remini, Google's Gemini, Photoroom AI Photo Editor, Microsoft's Copilot, Character.AI, etc.

In August 2024, India even surpassed the US as ChatGPT’s top market by active users, with 13.4 per cent of total users, according to the report. 

While India has seen skyrocketing interest in generative AI tools, there’s a noticeable shortage of AI products that fully support regional languages. Exhibit A: We don’t yet have a widely-used AI chatbot like ChatGPT that converses fluently in Hindi or other languages spoken here.

Sure, we have AI models like Bhashini — a language technology platform focused on creating tools that support multiple Indian languages — and Jugalbandi, a collaborative effort between Microsoft and the Indian government to provide information on government schemes and services.

While these Indian-language models can involve conversational elements, they are generally focused on tasks or domains specific to an industry rather than engaging in open-ended conversations the way a generative AI chatbot would.

This shortage is significant because it means that a vast, linguistically diverse user base is turning to tools that may not fully meet their needs, having to rely instead on models optimised for Western markets and English-speaking users.

By training its models in Hindi, xAI could begin to bridge this divide. If successful, xAI’s efforts might spark a much-needed shift toward creating AI tools that truly cater to speakers of different languages.

Looking at our neighbour to the north, China has been advancing its own generative AI chatbots that can communicate fluently in Mandarin as well as regional dialects like Cantonese, Hokkien and Dongbei. One such example is Baidu's Ernie Bot, which has over 200 million users.

While globally popular chatbots such as ChatGPT, Claude and Gemini are restricted in China, locally developed AI models cater specifically to Chinese languages and cultural nuances, filling the gap and providing tailored services that resonate with Chinese users.

India is similar to China in the sense that it has many regional languages and dialects, each of which has unique nuances. The answer to bridging this gap must therefore also come from within India, with local companies developing AI that inherently understands the country’s unique linguistic and cultural fabric.
