Yesterday — 25 June 2024

Researchers upend AI status quo by eliminating matrix multiplication in LLMs

25 June 2024 at 18:27
Illustration of a brain inside of a light bulb. (credit: Getty Images)

Researchers claim to have developed a new way to run AI language models more efficiently by eliminating matrix multiplication from the process. This fundamentally redesigns neural network operations that are currently accelerated by GPU chips. The findings, detailed in a recent preprint paper from researchers at the University of California Santa Cruz, UC Davis, LuxiTech, and Soochow University, could have deep implications for the environmental impact and operational costs of AI systems.

Matrix multiplication (often abbreviated to "MatMul") is at the center of most neural network computational tasks today, and GPUs are particularly good at executing the math quickly because they can perform large numbers of multiplication operations in parallel. That ability momentarily made Nvidia the most valuable company in the world last week; the company currently holds an estimated 98 percent market share for data center GPUs, which are commonly used to power AI systems like ChatGPT and Google Gemini.

In the new paper, titled "Scalable MatMul-free Language Modeling," the researchers describe creating a custom 2.7 billion parameter model without using MatMul that features similar performance to conventional large language models (LLMs). They also demonstrate running a 1.3 billion parameter model at 23.8 tokens per second on a GPU that was accelerated by a custom-programmed FPGA chip that uses about 13 watts of power (not counting the GPU's power draw). The implication is that a more efficient FPGA "paves the way for the development of more efficient and hardware-friendly architectures," they write.
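In broad strokes, MatMul-free approaches of this kind constrain weights to ternary values {-1, 0, +1}, so each "multiplication" reduces to adding, subtracting, or skipping an input. Below is a minimal sketch of that idea, illustrative only: the function name is hypothetical, and the paper's actual implementation is far more involved.

```python
import numpy as np

def ternary_matmul_free(x, w_ternary):
    """Accumulate inputs using ternary weights in {-1, 0, +1}:
    additions and subtractions replace multiplications entirely."""
    out = np.zeros(w_ternary.shape[1])
    for j in range(w_ternary.shape[1]):
        # Each weight selects whether an input is added, subtracted, or skipped
        out[j] = x[w_ternary[:, j] == 1].sum() - x[w_ternary[:, j] == -1].sum()
    return out

x = np.array([2.0, 3.0, 5.0])
w = np.array([[1, -1],
              [0, 1],
              [-1, 0]])
result = ternary_matmul_free(x, w)  # equals x @ w, with no multiplications
```

Because the accumulation needs only adders rather than multiplier units, it maps naturally onto custom hardware like the FPGA the researchers describe.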


Before yesterday

Music industry giants allege mass copyright violation by AI firms

24 June 2024 at 14:44
Michael Jackson in concert, 1986. Sony Music owns a large portion of publishing rights to Jackson's music. (credit: Getty Images)

Universal Music Group, Sony Music, and Warner Records have sued AI music-synthesis companies Udio and Suno for allegedly committing mass copyright infringement by using recordings owned by the labels to train music-generating AI models, reports Reuters. Udio and Suno can generate novel song recordings based on text-based descriptions of music (i.e., "a dubstep song about Linus Torvalds").

The lawsuits, filed in federal courts in New York and Massachusetts, claim that the AI companies' use of copyrighted material to train their systems could lead to AI-generated music that directly competes with and potentially devalues the work of human artists.

Like other generative AI models, both Udio and Suno (which we covered separately in April) rely on a broad selection of existing human-created artworks that teach a neural network the relationship between words in a written prompt and styles of music. The record labels correctly note that these companies have been deliberately vague about the sources of their training data.


Anthropic introduces Claude 3.5 Sonnet, matching GPT-4o on benchmarks

20 June 2024 at 17:04
The Anthropic Claude 3 logo, jazzed up by Benj Edwards. (credit: Anthropic / Benj Edwards)

On Thursday, Anthropic announced Claude 3.5 Sonnet, its latest AI language model and the first in a new series of "3.5" models that build upon Claude 3, launched in March. Claude 3.5 Sonnet can compose text, analyze data, and write code. It features a 200,000-token context window and is available now on the Claude website and through an API. Anthropic also introduced Artifacts, a new feature in the Claude interface that displays generated documents in a dedicated window.

So far, people outside of Anthropic seem impressed. "This model is really, really good," wrote independent AI researcher Simon Willison on X. "I think this is the new best overall model (and both faster and half the price of Opus, similar to the GPT-4 Turbo to GPT-4o jump)."

As we've written before, benchmarks for large language models (LLMs) are troublesome because they can be cherry-picked and often do not capture the feel and nuance of using a machine to generate outputs on almost any conceivable topic. But according to Anthropic, Claude 3.5 Sonnet matches or outperforms competitor models like GPT-4o and Gemini 1.5 Pro on certain benchmarks like MMLU (undergraduate level knowledge), GSM8K (grade school math), and HumanEval (coding).


Ex-OpenAI star Sutskever shoots for superintelligent AI with new company

20 June 2024 at 10:06
Ilya Sutskever physically gestures as OpenAI CEO Sam Altman looks on at Tel Aviv University on June 5, 2023. (credit: Getty Images)

On Wednesday, former OpenAI Chief Scientist Ilya Sutskever announced he is forming a new company called Safe Superintelligence, Inc. (SSI) with the goal of safely building "superintelligence," which is a hypothetical form of artificial intelligence that surpasses human intelligence, possibly in the extreme.

"We will pursue safe superintelligence in a straight shot, with one focus, one goal, and one product," wrote Sutskever on X. "We will do it through revolutionary breakthroughs produced by a small cracked team."

Sutskever was a founding member of OpenAI and formerly served as the company's chief scientist. Two others are joining Sutskever at SSI initially: Daniel Levy, who formerly headed the Optimization Team at OpenAI, and Daniel Gross, an AI investor who worked on machine learning projects at Apple between 2013 and 2017. The trio posted a statement on the company's new website.


Runway’s latest AI video generator brings giant cotton candy monsters to life

18 June 2024 at 17:41
Screen capture of a Runway Gen-3 Alpha video generated with the prompt "A giant humanoid, made of fluffy blue cotton candy, stomping on the ground, and roaring to the sky, clear blue sky behind them." (credit: Runway)

On Sunday, Runway announced a new AI video synthesis model called Gen-3 Alpha that's still under development, but it appears to create video of similar quality to OpenAI's Sora, which debuted earlier this year (and has also not yet been released). It can generate novel, high-definition video from text prompts that range from realistic humans to surrealistic monsters stomping the countryside.

Unlike Runway's previous best model from June 2023, which could only create two-second-long clips, Gen-3 Alpha can reportedly create 10-second-long video segments of people, places, and things that have a consistency and coherency that easily surpasses Gen-2. If 10 seconds sounds short compared to Sora's full minute of video, consider that the company is working with a shoestring budget of compute compared to more lavishly funded OpenAI—and actually has a history of shipping video generation capability to commercial users.

Gen-3 Alpha does not generate audio to accompany the video clips, and it's highly likely that temporally coherent generations (those that keep a character consistent over time) are dependent on similar high-quality training material. But Runway's improvement in visual fidelity over the past year is difficult to ignore.


SoftBank plans to cancel out angry customer voices using AI

18 June 2024 at 13:09
A man is angry and screaming while talking on a smartphone. (credit: Getty Images / Benj Edwards)

Japanese telecommunications giant SoftBank recently announced that it has been developing "emotion-canceling" technology powered by AI that will alter the voices of angry customers to sound calmer during phone calls with customer service representatives. The project, which has been in development for three years, aims to reduce the psychological burden on operators suffering from harassment. SoftBank plans to launch it by March 2026, but the idea is receiving mixed reactions online.

According to a report from the Japanese news site The Asahi Shimbun, SoftBank's project relies on an AI model to alter the tone and pitch of a customer's voice in real-time during a phone call. SoftBank's developers, led by employee Toshiyuki Nakatani, trained the system using a dataset of over 10,000 voice samples, which were performed by 10 Japanese actors expressing more than 100 phrases with various emotions, including yelling and accusatory tones.
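SoftBank has not published implementation details, but the most basic building block involved, shifting the pitch of a voice, can be illustrated crudely by resampling a waveform. This is a naive sketch for intuition only; the system described in the report relies on a trained AI model, not simple resampling.

```python
import numpy as np

def crude_pitch_shift(samples, factor):
    """Resample a waveform so that playback at the original rate
    raises (factor > 1) or lowers (factor < 1) its pitch.
    Real voice-conversion models are far more sophisticated."""
    positions = np.arange(0, len(samples), factor)
    return np.interp(positions, np.arange(len(samples)), samples)

# A 440 Hz sine tone at an 8 kHz sample rate, lowered in pitch by 20 percent
sr = 8000
t = np.arange(sr) / sr
angry = np.sin(2 * np.pi * 440 * t)
calmer = crude_pitch_shift(angry, 0.8)  # stretched waveform, lower pitch
```

Note that naive resampling also changes the duration of the audio; production systems use techniques that alter pitch and timbre while preserving timing.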

Voice cloning and synthesis technology has made massive strides in the past three years. We've previously covered technology from Microsoft that can clone a voice with a three-second audio sample and audio-processing technology from Adobe that cleans up audio by re-synthesizing a person's voice, so SoftBank's technology is well within the realm of plausibility.


Retired engineer discovers 55-year-old bug in Lunar Lander computer game code

14 June 2024 at 14:04
Illustration of the Apollo lunar lander Eagle over the Moon. (credit: Getty Images)

On Friday, a retired software engineer named Martin C. Martin announced that he recently discovered a bug in the original Lunar Lander computer game's physics code while tinkering with the software. Created by a 17-year-old high school student named Jim Storer in 1969, this primordial game rendered the action only as text status updates on a teletype, but it set the stage for future versions to come.

The legendary game—which Storer developed on a PDP-8 minicomputer in a programming language called FOCAL just months after Neil Armstrong and Buzz Aldrin made their historic moonwalks—allows players to control a lunar module's descent onto the Moon's surface. Players must carefully manage their fuel usage to achieve a gentle landing, making critical decisions every ten seconds to burn the right amount of fuel.
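The turn-based mechanic described above can be sketched as a simple physics update. This is a toy model with illustrative constants, not Storer's original FOCAL code:

```python
def lander_step(altitude, velocity, fuel, burn, dt=10.0):
    """Advance one 10-second turn of a toy lunar-descent model.
    burn is the fuel spent this turn; thrust opposes lunar gravity."""
    GRAVITY = 1.62          # lunar surface gravity, m/s^2
    THRUST_PER_UNIT = 0.5   # acceleration per unit of fuel (illustrative)
    burn = min(burn, fuel)  # can't burn fuel you don't have
    accel = burn * THRUST_PER_UNIT - GRAVITY
    # Constant-acceleration kinematics over the turn
    altitude = altitude + velocity * dt + 0.5 * accel * dt * dt
    velocity = velocity + accel * dt
    fuel = fuel - burn
    return altitude, velocity, fuel

# One turn: descending at 50 m/s from 1,000 m, burning 3 units of fuel
alt, vel, fuel = lander_step(1000.0, -50.0, 120.0, 3.0)
```

The bug Martin found lived in exactly this kind of code: subtle errors in how a game integrates motion over each time step can reward physically implausible strategies.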

In 2009, just short of the 40th anniversary of the first Moon landing, I set out to find the author of the original Lunar Lander game, which was then primarily known as a graphical game, thanks to the graphical version from 1974 and a 1979 Atari arcade title. When I discovered that Storer created the oldest known version as a teletype game, I interviewed him and wrote up a history of the game. Storer later released the source code to the original game, written in FOCAL, on his website.


“Simulation of keyboard activity” leads to firing of Wells Fargo employees

13 June 2024 at 16:51
Signage with logo at headquarters of Wells Fargo Capital Finance, the commercial banking division of Wells Fargo Bank, in the Financial District neighborhood of San Francisco, California, September 26, 2016. (credit: Getty Images)

Last month, Wells Fargo terminated over a dozen bank employees following an investigation into claims of faking work activity on their computers, according to a Bloomberg report.

A Financial Industry Regulatory Authority (FINRA) search conducted by Ars confirmed that the fired members of the firm's wealth and investment management division were "discharged after review of allegations involving simulation of keyboard activity creating impression of active work."

A rise in remote work during the COVID-19 pandemic accelerated the adoption of remote worker surveillance techniques, especially those using software installed on machines that keeps track of activity and reports back to corporate management. It's worth noting that the Bloomberg report says the FINRA filing does not specify whether the fired Wells Fargo employees were simulating activity at home or in an office.


Report: Apple isn’t paying OpenAI for ChatGPT integration into OSes

13 June 2024 at 13:20
The OpenAI and Apple logos together. (credit: OpenAI / Apple / Benj Edwards)

On Monday, Apple announced it would be integrating OpenAI's ChatGPT AI assistant into upcoming versions of its iPhone, iPad, and Mac operating systems. The move paves the way for future third-party AI model integrations, but given Google's multi-billion-dollar deal with Apple for preferential web search, the OpenAI announcement inspired speculation about who is paying whom. According to a Bloomberg report published Wednesday, Apple considers ChatGPT's placement on its devices to be compensation enough.

"Apple isn’t paying OpenAI as part of the partnership," writes Bloomberg reporter Mark Gurman, citing people familiar with the matter who wish to remain anonymous. "Instead, Apple believes pushing OpenAI’s brand and technology to hundreds of millions of its devices is of equal or greater value than monetary payments."

The Bloomberg report states that neither company expects the agreement to generate meaningful revenue in the short term, and in fact, the partnership could burn extra money for OpenAI, because it pays Microsoft to host ChatGPT's capabilities on its Azure cloud. However, OpenAI could benefit by converting free users to paid subscriptions, and Apple potentially benefits by providing easy, built-in access to ChatGPT during a time when its own in-house LLMs are still catching up.


Turkish student creates custom AI device for cheating university exam, gets arrested

12 June 2024 at 16:52
A photo illustration of what a shirt-button camera could look like. (credit: Aurich Lawson | Getty Images)

On Saturday, Turkish police arrested and detained a prospective university student accused of devising an elaborate scheme that used AI and hidden devices to help him cheat on an important entrance exam, Reuters and The Daily Mail report.

The unnamed student is reportedly jailed pending trial after the incident, which took place in the southwestern province of Isparta, where the student was caught behaving suspiciously during the TYT, a nationally held university aptitude exam that determines a person's eligibility to attend a university in Turkey. Cheating on the high-stakes exam is a serious offense.

According to police reports, the student used a camera disguised as a shirt button, connected to AI software via a "router" (possibly a mistranslation of a cellular modem) hidden in the sole of their shoe. The system worked by scanning the exam questions using the button camera, which then relayed the information to an unnamed AI model. The software generated the correct answers and recited them to the student through an earpiece.


New Stable Diffusion 3 release excels at AI-generated body horror

12 June 2024 at 15:26
An AI-generated image created using Stable Diffusion 3 of a girl lying in the grass. (credit: HorneyMetalBeing)

On Wednesday, Stability AI released weights for Stable Diffusion 3 Medium, an AI image-synthesis model that turns text prompts into AI-generated images. Its arrival has been ridiculed online, however, because it generates images of humans in a way that seems like a step backward from other state-of-the-art image-synthesis models like Midjourney or DALL-E 3. As a result, it can churn out wild anatomically incorrect visual abominations with ease.

A thread on Reddit, titled, "Is this release supposed to be a joke? [SD3-2B]," details the spectacular failures of SD3 Medium at rendering humans, especially human limbs like hands and feet. Another thread, titled, "Why is SD3 so bad at generating girls lying on the grass?" shows similar issues, but for entire human bodies.

Hands have traditionally been a challenge for AI image generators due to a lack of good examples in early training data sets, but more recently, several image-synthesis models seem to have overcome the issue. In that sense, SD3 appears to be a huge step backward for the image-synthesis enthusiasts who gather on Reddit—especially compared to recent Stability releases like SDXL Turbo in November.


Apple and OpenAI currently have the most misunderstood partnership in tech

11 June 2024 at 13:29
A man talks into a smartphone. He isn't using an iPhone, but some people talk to Siri like this.

On Monday, Apple premiered "Apple Intelligence" during a wide-ranging presentation at its annual Worldwide Developers Conference in Cupertino, California. However, the heart of its new tech, an array of Apple-developed AI models, was overshadowed by the announcement of ChatGPT integration into its device operating systems.

Since rumors of the partnership first emerged, we've seen confusion on social media about why Apple didn't develop a cutting-edge GPT-4-like chatbot internally. Despite Apple's year-long development of its own large language models (LLMs), many perceived the integration of ChatGPT (and opening the door for others, like Google Gemini) as a sign of Apple's lack of innovation.

"This is really strange. Surely Apple could train a very good competing LLM if they wanted? They've had a year," wrote AI developer Benjamin De Kraker on X. Elon Musk has also been grumbling about the OpenAI deal—and spreading misconceptions about it—saying things like, "It’s patently absurd that Apple isn’t smart enough to make their own AI, yet is somehow capable of ensuring that OpenAI will protect your security & privacy!"


Apple unveils “Apple Intelligence” AI features for iOS, iPadOS, and macOS

10 June 2024 at 15:15
(credit: Apple)

On Monday, Apple debuted "Apple Intelligence," a new suite of free AI-powered features for iOS 18, iPadOS 18, and macOS Sequoia that includes creating email summaries, generating images and emoji, and allowing Siri to take actions on your behalf. The features run through a combination of on-device and cloud processing, with a strong emphasis on privacy. Apple says Apple Intelligence features will be widely available later this year, with a beta test for developers arriving this summer.

The announcements came during a livestream WWDC keynote and a simultaneous event attended by the press on Apple's campus in Cupertino, California. In an introduction, Apple CEO Tim Cook said the company has been using machine learning for years, but the introduction of large language models (LLMs) presents new opportunities to elevate the capabilities of Apple products. He emphasized the need for both personalization and privacy in Apple's approach.

At last year's WWDC, Apple avoided using the term "AI" completely, instead preferring terms like "machine learning" as Apple's way of avoiding buzzy hype while integrating applications of AI into apps in useful ways. This year, Apple figured out a new way to largely avoid the abbreviation "AI" by coining "Apple Intelligence," a catchall branding term that refers to a broad group of machine learning, LLM, and image generation technologies. By our count, the term "AI" was used sparingly in the keynote—most notably near the end of the presentation when Apple executive Craig Federighi said, "It's AI for the rest of us."


DuckDuckGo offers “anonymous” access to AI chatbots through new service

6 June 2024 at 12:39
DuckDuckGo's AI Chat promotional image. (credit: DuckDuckGo)

On Thursday, DuckDuckGo unveiled a new "AI Chat" service that allows users to converse with four mid-range large language models (LLMs) from OpenAI, Anthropic, Meta, and Mistral in an interface similar to ChatGPT while attempting to preserve privacy and anonymity. While the AI models involved can readily output inaccurate information, the site lets users test different mid-range LLMs without installing anything or signing up for an account.

DuckDuckGo's AI Chat currently features access to OpenAI's GPT-3.5 Turbo, Anthropic's Claude 3 Haiku, and two open source models, Meta's Llama 3 and Mistral's Mixtral 8x7B. The service is currently free to use within daily limits. Users can access AI Chat through the DuckDuckGo search engine, direct links to the site, or by using "!ai" or "!chat" shortcuts in the search field. AI Chat can also be disabled in the site's settings for users with accounts.

According to DuckDuckGo, chats on the service are anonymized, with metadata and IP address removed to prevent tracing back to individuals. The company states that chats are not used for AI model training, citing its privacy policy and terms of use.
