Technology

Gen AI without the dangers

It’s understandable that generative AI tools such as ChatGPT, Stable Diffusion, and DreamStudio are making headlines. The results are striking and improving at a geometric rate. Intelligent assistants are already revolutionizing search and information analysis, as well as code creation, network security, and article writing.

Gen AI will play a critical role in how businesses run and provide IT services, as well as how business users complete their tasks. The opportunities are countless, but so are the dangers. Developing and implementing AI successfully can be a costly and risky process. Furthermore, the workloads associated with Gen AI and the large language models (LLMs) that drive it are extremely computationally demanding and energy-intensive. Dr. Sajjad Moazeni of the University of Washington estimates that training an LLM with 175 billion or more parameters consumes as much energy as 1,000 US households use in a year, though exact figures are unknown. Answering more than 100 million generative AI questions a day takes about one gigawatt-hour of electricity, roughly the daily energy use of 33,000 US households.
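To put those figures in proportion, the short Python sketch below works out what they imply per question and per household. It uses only the numbers quoted above; the function and variable names are illustrative, and the results are rough estimates rather than measurements.

```python
# Figures quoted in the text; everything derived from them is an estimate.
DAILY_QUERIES = 100_000_000        # generative AI questions answered per day
DAILY_ENERGY_WH = 1_000_000_000    # one gigawatt-hour, expressed in watt-hours
HOUSEHOLDS = 33_000                # US households with the same daily energy use


def energy_per_query_wh(daily_energy_wh, daily_queries):
    """Average energy per answered question, in watt-hours."""
    return daily_energy_wh / daily_queries


def energy_per_household_kwh(daily_energy_wh, households):
    """Implied daily energy use of one household, in kilowatt-hours."""
    return daily_energy_wh / households / 1000


per_query = energy_per_query_wh(DAILY_ENERGY_WH, DAILY_QUERIES)
per_household = energy_per_household_kwh(DAILY_ENERGY_WH, HOUSEHOLDS)

print(f"Energy per question: {per_query:.1f} Wh")
print(f"Implied household use: {per_household:.1f} kWh per day")
```

On these numbers, each question costs about 10 watt-hours, and the comparison assumes a household uses roughly 30 kWh per day.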

How even hyperscalers can afford that much electricity is beyond me. It’s too expensive for the typical business. How can CIOs provide reliable, accurate AI without incurring the energy expenses and environmental impact of a small city?

Six pointers for implementing Gen AI economically and with less risk

Retraining generative AI to perform particular tasks is essential to its applicability in business settings. Expert models produced by retraining are smaller, more accurate, and require less processing power. So, in order to train their own AI models, does every business need to establish a specialized AI development team and a supercomputer? Not at all.

Here are six strategies to create and implement AI without spending a lot of money on expensive hardware or highly skilled personnel.

Start with a foundation model rather than reinventing the wheel

A company might spend money creating custom models for its own use cases. But the expenditure on data scientists, HPC specialists, and supercomputing infrastructure is out of reach for all but the biggest government organizations, businesses, and hyperscalers.

Rather, begin with a foundation model that features a robust application portfolio and an active developer ecosystem. You could use an open-source model like Meta’s Llama 2, or a proprietary model like OpenAI’s ChatGPT. Hugging Face and other communities provide a vast array of open-source models and applications.

Align the model with the intended use

Models can be broadly applicable and computationally demanding, such as GPT, or more narrowly focused, like Med-BERT (an open-source LLM for medical literature). The time it takes to create a viable prototype can be shortened and months of training can be avoided by choosing the appropriate model early in the project.

However, exercise caution. Any model may reflect biases in its training data, and generative AI models can fabricate responses and state outright falsehoods. For the most trustworthy results, seek models trained on clean, transparent data with well-defined governance and explainable decision-making.

Retrain to produce more accurate, smaller models

Retraining foundation models on specific datasets offers several advantages. As the model becomes more accurate in a narrower domain, it sheds parameters the application doesn’t need. For example, retraining an LLM on financial information would trade a general skill like songwriting for the ability to help a customer with a mortgage application.

With a more compact design, the new banking assistant would still be able to provide superb, extremely accurate services while operating on standard (current) hardware.

Make use of your current infrastructure

A supercomputer with 10,000 GPUs is too big for most businesses to set up. Fortunately, most practical AI training, retraining, and inference can be done without large GPU arrays.

  • Training up to 10 billion parameters: contemporary CPUs with integrated AI acceleration can manage training loads in this range at competitive price/performance points. For better performance and lower costs, train overnight, when demand on the data center is low.
  • Retraining up to 10 billion parameters: modern CPUs can retrain models of this size in minutes, with no GPU needed.
  • Inferencing from millions to fewer than 20 billion parameters: smaller models can run on standalone edge devices with integrated CPUs. For models with fewer than 20 billion parameters, such as Llama 2, CPUs can respond as quickly and accurately as GPUs.
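One reason parameter count matters for hardware choice is memory. The sketch below is a back-of-the-envelope estimate of the memory needed just to hold a model's weights at different numeric precisions. The function name and the precision table are illustrative assumptions, and the estimate ignores activations, caches, and the extra state that training requires.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params, precision):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


# Model sizes taken from the ranges mentioned in the list above.
for params in (10_000_000_000, 20_000_000_000):
    for precision in ("fp32", "fp16", "int8"):
        gb = weight_memory_gb(params, precision)
        print(f"{params / 1e9:.0f}B parameters at {precision}: {gb:.0f} GB")
```

By this estimate, a 10-billion-parameter model needs about 40 GB for its weights at 32-bit precision and about 10 GB at 8-bit, which is within the memory capacity of many standard servers.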

Execute inference with consideration for hardware

Applications for inference can be fine-tuned and optimized for improved performance on particular hardware configurations and features. Similar to training a model, optimizing one for a given application means striking a balance between processing efficiency, model size, and accuracy.

One way to speed up inference by as much as four times with little loss of accuracy is to quantize a 32-bit floating-point model to 8-bit integers (INT8). Tools such as the Intel® Distribution of OpenVINO™ toolkit manage this optimization and build hardware-aware inference engines that take advantage of host accelerators such as integrated GPUs, Intel® Advanced Matrix Extensions (Intel® AMX), and Intel® Advanced Vector Extensions 512 (Intel® AVX-512).
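The INT8 conversion mentioned above can be illustrated with a minimal sketch of symmetric quantization in plain Python. This is a toy example under simplifying assumptions (one scale for the whole list of weights, round-to-nearest), not how OpenVINO or any particular toolkit implements it; the function names and sample weights are made up for illustration.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs else 1.0
    # Round each scaled value to the nearest integer and clamp to the INT8 range.
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]   # hypothetical 32-bit weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("INT8 values:", q)
print("Largest rounding error:", max_error)
```

Each weight now fits in one byte instead of four, and the rounding error per weight is at most half the scale factor. Small weights such as 0.003 collapse to zero, which is why quantized models are usually checked against an accuracy benchmark before deployment.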

Monitor cloud utilization

Offering AI services through cloud-based AI applications and APIs is a quick, dependable, and scalable route. Customers and business users alike benefit from always-on AI from a service provider, but costs can rise suddenly: if your AI service is popular, everyone will use it.

Many businesses that began their AI journeys entirely in the cloud are moving workloads that can perform well elsewhere back to on-premises and co-located infrastructure. For cloud-native enterprises with little or no on-premises infrastructure, pay-as-you-go infrastructure-as-a-service is becoming a competitive alternative to rising cloud costs.
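A simple way to reason about when a workload is worth moving is a break-even calculation. The sketch below compares a pay-per-query cloud price with a fixed monthly infrastructure cost. All prices and volumes are hypothetical placeholders, not real vendor rates, and the model deliberately ignores staffing, migration, and energy costs.

```python
def monthly_cloud_cost(queries, price_per_query):
    """Cloud spend for a month of usage-based billing."""
    return queries * price_per_query


def break_even_queries(fixed_monthly_cost, cloud_price_per_query,
                       own_cost_per_query=0.0):
    """Monthly query volume above which fixed-cost infrastructure is cheaper."""
    saving_per_query = cloud_price_per_query - own_cost_per_query
    if saving_per_query <= 0:
        raise ValueError("cloud is never more expensive per query")
    return fixed_monthly_cost / saving_per_query


# Hypothetical inputs: 20,000 per month for owned or co-located servers,
# 0.002 per query in the cloud, 0.0005 per query in marginal running costs.
threshold = break_even_queries(20_000, 0.002, 0.0005)
print(f"Break-even volume: {threshold:,.0f} queries per month")
print(f"Cloud cost at that volume: {monthly_cloud_cost(threshold, 0.002):,.0f}")
```

Tracking actual monthly query volume against a threshold like this gives an early signal that a popular service is about to become expensive.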

You have choices when it comes to Gen AI. Generative AI is surrounded by a lot of hype and mystery, giving the impression that it’s a cutting-edge technology accessible only to the wealthiest companies. In fact, hundreds of high-performance models, including LLMs for generative AI, run accurately and efficiently on a typical CPU-based data center or cloud instance. Enterprise-grade tools for generative AI experimentation, prototyping, and deployment are developing rapidly in both open-source and proprietary communities.

By utilizing all of their resources, astute CIOs can leverage AI that transforms businesses without incurring the expenses and hazards associated with in-house development.

Threads uses a more sophisticated search to compete with Bluesky

Instagram Threads, Meta’s rival to X, will get an enhanced search experience, the company said Monday. The app, which is built on Instagram’s social graph and provides a Meta-run alternative to Elon Musk’s X, is introducing a new feature that lets users search for specific posts by date range and user profile.

Compared to X’s advanced search, which now allows users to refine queries by language, keywords, exact phrases, excluded terms, hashtags, and more, this is less thorough. However, it does make it simpler for users of Threads to find particular messages. Additionally, it will make Threads’ search more comparable to Bluesky’s, which also lets users use sophisticated queries to restrict searches by user profiles, date ranges, and other criteria. However, not all of the filtering options are yet visible in the Bluesky app’s user interface.

To counter the threat posed by social networking startup Bluesky, which has quickly gained traction as another X competitor, Meta has been launching new features in quick succession in recent days. Bluesky had more than 9 million users in September, but it grew quickly in the weeks after the U.S. elections as users left X over Elon Musk’s political views and other policy changes, including plans to alter the way blocks work and to let AI companies train on X user data. Bluesky says it now has around 24 million users.

To counter Bluesky’s potential, Meta’s Threads has introduced new features such as an improved algorithm, a design change that makes switching between feeds easier, and the option for users to select their own default feed. It has also been spotted developing its own version of Starter Packs, Bluesky’s user-curated recommendation lists.

Apple’s own 5G modem-equipped iPhone SE 4 is “confirmed” to launch in March

Tom O’Malley, an analyst at Barclays, recently visited Asia with his colleagues to speak with electronics suppliers and manufacturers. In a research note released this week that outlines the main conclusions from the trip, the analysts said they had “confirmed” that a fourth-generation iPhone SE with an Apple-designed 5G modem is scheduled to launch near the end of the first quarter of next year. In keeping with earlier rumors, that timeline implies the next iPhone SE will be unveiled in March, around the same time of year the current model was unveiled in 2022.

The rumored features of the fourth-generation iPhone SE include a 6.1-inch OLED display, Face ID, a newer A-series chip, a USB-C port, a single 48-megapixel rear camera, 8GB of RAM to enable Apple Intelligence support, and the previously mentioned Apple-designed 5G modem. The SE is anticipated to have a similar design to the base iPhone 14.

Apple has reportedly been developing its own 5G modem for iPhones since 2018, a move that will let it reduce and eventually eliminate its reliance on Qualcomm. Qualcomm’s 5G modem supply agreement for iPhone launches was extended through 2026 earlier this year, so Apple still has plenty of time to finish switching to its own modem. Apple analyst Ming-Chi Kuo previously said that, in addition to the fourth-generation iPhone SE, the so-called “iPhone 17 Air” would come with an Apple-designed 5G modem.

Whether Apple’s first 5G modem will offer consumers any advantages over Qualcomm’s modems, such as faster speeds, is uncertain.

Apple sued Qualcomm in 2017 over anticompetitive behavior and $1 billion in unpaid royalties. The two companies settled the dispute in 2019, and Apple then purchased the majority of Intel’s smartphone modem business, acquiring a portfolio of cellular technology patents to support its development. It appears that the results of that effort will finally arrive in four more months.

Apple announced the third-generation iPhone SE online on March 8, 2022. With dated features like a Touch ID button, a Lightning port, and large bezels surrounding the screen, the handset resembles the iPhone 8. The iPhone SE currently retails for $429 in the United States, but the new model may see at least a small price increase.

Google is said to be discontinuing the Pixel Tablet 2 and may be leaving the market once more

Google terminated the development of the Pixel Tablet 3 yesterday, according to Android Headlines, even before a second-generation model was announced. The second-generation Pixel Tablet has actually been canceled, according to the report. This means that the gadget that was released last year will likely be a one-off, and Google is abandoning the tablet market for the second time in just over five years.

If accurate, the report indicates that Google has determined that it is not worth investing more money in a follow-up because of the dismal sales of the Pixel Tablet. Rumors of a keyboard accessory and more functionality for the now-defunct project surfaced as recently as last week.

It’s worth keeping in mind that Google may refocus its large-screen plans on Nest devices like the Nest Hub and Hub Max rather than standalone tablets.

Google has always had difficulty making a significant impact in the tablet market and creating a competitor that can match Apple’s iPad in sales and general performance, and its inconsistent approach has not helped. After a promising start with the Nexus 7 years ago, it never really followed through, even though the hardware was good. Another problem that has hampered Google’s efforts is that Android significantly trails iPadOS in the number of tablet-optimized third-party apps.

The company first declared that it was done making tablets in 2019, after the Pixel Slate received overwhelmingly negative reviews. Two tablets that were still in development at the time were scrapped.

By 2022, however, Google had changed its mind and announced that its Pixel hardware team was developing a tablet. The result was the $499 Pixel Tablet, which came with a speaker dock that the tablet could attach to magnetically. (Google would later sell the tablet alone for $399.)
