Technology

MM1, a Family of Multimodal AI Models with up to 30 billion Parameters, is being Developed by Apple Researchers

Published

1 year ago

March 18, 2024

In a pre-print paper, Apple researchers presented their work on developing a multimodal large language model (LLM) for artificial intelligence (AI). The paper describes how it was possible to achieve the advanced capabilities of multimodality and train the foundation model on both text-only data and images, and it was published on an online portal on March 14. The Cupertino-based tech giant has made new advances in AI in response to CEO Tim Cook’s statement during the company’s earnings calls, which stated that AI features might be released later this year.

ArXiv, an open-access online repository for scholarly papers, has published the research paper’s pre-print version. Peer review is not, however, applied to the papers that are posted here. The project is thought to be connected to Apple as well, even though the paper makes no mention of the company; this is because the majority of the researchers mentioned are connected to the machine learning (ML) division of Apple.

A family of multimodal models with up to 30 billion parameters, known as MM1, is the project that the researchers are currently working on. The paper’s authors referred to it as a “performant multimodal LLM (MLLM)” and noted that in order to build an AI model that can comprehend both text and image-based inputs, image encoders, the vision language connector, and other architecture elements and data decisions were made.

The paper provided an example in stating that “We demonstrate that achieving state-of-the-art (SOTA) few-shot results across multiple benchmarks, compared to other published pre-training results, requires a careful mix of image-caption, interleaved image-text, and text-only data for large-scale multimodal pre-training.”

To put it simply, the AI model has not received enough training to produce the intended results and is presently in the pre-training phase. This phase involves designing the model’s workflow and data processing eventually using the algorithm and AI architecture. The researchers at Apple were able to incorporate computer vision into the model by means of a vision language connector and image encoders. Upon conducting tests using a combination of image-only, image-text, and text-only data sets, the team discovered that the outcomes were comparable to those of other models at the same stage.

Although this is a significant breakthrough, there is insufficient evidence in this research paper to conclude that Apple will integrate a multimodal AI chatbot into its operating system. It’s difficult to even say at this point whether the AI model is multimodal in terms of receiving inputs or producing output (i.e., whether it can produce AI images or not). However, it can be said that the tech giant has made significant progress toward developing a native generative AI foundation model if the results are verified to be consistent following peer review.

Up Next

NVIDIA Releases 6G Research Cloud Platform to Use AI to Improve Wireless Communications

Don't Miss

Google Gemini may be used by Apple to Power Certain AI Features on the iPhone

Kajal Chavan

Technology

Microsoft Expands Copilot Voice and Think Deeper

Published

2 months ago

February 25, 2025

Archana Suryawanshi

Microsoft Expands Copilot Voice and Think Deeper

Microsoft is taking a major step forward by offering unlimited access to Copilot Voice and Think Deeper, marking two years since the AI-powered Copilot was first integrated into Bing search. This update comes shortly after the tech giant revamped its Copilot Pro subscription and bundled advanced AI features into Microsoft 365.

What’s Changing?

Microsoft remains committed to its $20 per month Copilot Pro plan, ensuring that subscribers continue to enjoy premium benefits. According to the company, Copilot Pro users will receive:

Preferred access to the latest AI models during peak hours.
Early access to experimental AI features, with more updates expected soon.
Extended use of Copilot within popular Microsoft 365 apps like Word, Excel, and PowerPoint.

The Impact on Users

This move signals Microsoft’s dedication to enhancing AI-driven productivity tools. By expanding access to Copilot’s powerful features, users can expect improved efficiency, smarter assistance, and seamless integration across Microsoft’s ecosystem.

As AI technology continues to evolve, Microsoft is positioning itself at the forefront of innovation, ensuring both casual users and professionals can leverage the best AI tools available.

Stay tuned for further updates as Microsoft rolls out more enhancements to its AI offerings.

Technology

Google Launches Free AI Coding Tool for Individual Developers

Published

2 months ago

February 25, 2025

Archana Suryawanshi

Google Launches Free AI Coding Tool for Individual Developers

Google has introduced a free version of Gemini Code Assistant, its AI-powered coding assistant, for solo developers worldwide. The tool, previously available only to enterprise users, is now in public preview, making advanced AI-assisted coding accessible to students, freelancers, hobbyists, and startups.

More Features, Fewer Limits

Unlike competing tools such as GitHub Copilot, which limits free users to 2,000 code completions per month, Google is offering up to 180,000 code completions—a significantly higher cap designed to accommodate even the most active developers.

“Now anyone can easily learn, generate code snippets, debug, and modify applications without switching between multiple windows,” said Ryan J. Salva, Google’s senior director of product management.

AI-Powered Coding Assistance

Gemini Code Assist for individuals is powered by Google’s Gemini 2.0 AI model and offers:
Auto-completion of code while typing
Generation of entire code blocks based on prompts
Debugging assistance via an interactive chatbot

The tool integrates with popular developer environments like Visual Studio Code, GitHub, and JetBrains, supporting a wide range of programming languages. Developers can use natural language prompts, such as:
“Create an HTML form with fields for name, email, and message, plus a submit button.”

With support for 38 programming languages and a 128,000-token memory for processing complex prompts, Gemini Code Assist provides a robust AI-driven coding experience.

Enterprise Features Still Require a Subscription

While the free tier is generous, advanced features like productivity analytics, Google Cloud integrations, and custom AI tuning remain exclusive to paid Standard and Enterprise plans.

With this move, Google aims to compete more aggressively in the AI coding assistant market, offering developers a powerful and unrestricted alternative to existing tools.

Technology

Elon Musk Unveils Grok-3: A Game-Changing AI Chatbot to Rival ChatGPT

Published

3 months ago

February 19, 2025

Archana Suryawanshi

Elon Musk Unveils Grok-3: A Game-Changing AI Chatbot to Rival ChatGPT

Elon Musk’s artificial intelligence company xAI has unveiled its latest chatbot, Grok-3, which aims to compete with leading AI models such as OpenAI’s ChatGPT and China’s DeepSeek. Grok-3 is now available to Premium+ subscribers on Musk’s social media platform x (formerly Twitter) and is also available through xAI’s mobile app and the new SuperGrok subscription tier on Grok.com.

Advanced capabilities and performance

Grok-3 has ten times the computing power of its predecessor, Grok-2. Initial tests show that Grok-3 outperforms models from OpenAI, Google, and DeepSeek, particularly in areas such as math, science, and coding. The chatbot features advanced reasoning features capable of decomposing complex questions into manageable tasks. Users can interact with Grok-3 in two different ways: “Think,” which performs step-by-step reasoning, and “Big Brain,” which is designed for more difficult tasks.

Strategic Investments and Infrastructure

To support the development of Grok-3, xAI has made major investments in its supercomputer cluster, Colossus, which is currently the largest globally. This infrastructure underscores the company’s commitment to advancing AI technology and maintaining a competitive edge in the industry.

New Offerings and Future Plans

Along with Grok-3, xAI has also introduced a logic-based chatbot called DeepSearch, designed to enhance research, brainstorming, and data analysis tasks. This tool aims to provide users with more insightful and relevant information. Looking to the future, xAI plans to release Grok-2 as an open-source model, encouraging community participation and further development. Additionally, upcoming improvements for Grok-3 include a synthesized voice feature, which aims to improve user interaction and accessibility.

Market position and competition

The launch of Grok-3 positions xAI as a major competitor in the AI chatbot market, directly challenging established models from OpenAI and emerging competitors such as DeepSeek. While Grok-3’s performance claims are yet to be independently verified, early indications suggest it could have a significant impact on the AI landscape. xAI is actively seeking $10 billion in investment from major companies, demonstrating its strong belief in their technological advancements and market potential.

HBO Confirms Hogwarts Staff Casting in Harry Potter TV Series

Entertainment4 weeks ago

HBO Confirms Hogwarts Staff Casting in Harry Potter TV Series

Entertainment3 weeks ago

“Him” Teaser: Jordan Peele’s New Sports Horror Film Tackles Obsession and Sacrifice

Danny Boyle Drops Second Trailer for '28 Years Later

Entertainment3 weeks ago

Danny Boyle Drops Second Trailer for ’28 Years Later

Apsters Media

MM1, a Family of Multimodal AI Models with up to 30 billion Parameters, is being Developed by Apple Researchers

Technology

MM1, a Family of Multimodal AI Models with up to 30 billion Parameters, is being Developed by Apple Researchers

Technology

Microsoft Expands Copilot Voice and Think Deeper

What’s Changing?

The Impact on Users

Technology

Google Launches Free AI Coding Tool for Individual Developers

More Features, Fewer Limits

AI-Powered Coding Assistance

Enterprise Features Still Require a Subscription

Technology

Elon Musk Unveils Grok-3: A Game-Changing AI Chatbot to Rival ChatGPT

Advanced capabilities and performance

Strategic Investments and Infrastructure

New Offerings and Future Plans

Market position and competition

Trending

Apsters Media

MM1, a Family of Multimodal AI Models with up to 30 billion Parameters, is being Developed by Apple Researchers

You may like

Technology

Microsoft Expands Copilot Voice and Think Deeper

What’s Changing?

The Impact on Users

Technology

Google Launches Free AI Coding Tool for Individual Developers

More Features, Fewer Limits

AI-Powered Coding Assistance

Enterprise Features Still Require a Subscription

Technology

Elon Musk Unveils Grok-3: A Game-Changing AI Chatbot to Rival ChatGPT

Advanced capabilities and performance

Strategic Investments and Infrastructure

New Offerings and Future Plans

Market position and competition

Trending