โŒ

Normal view

There are new articles available, click to refresh the page.
Before yesterdayMain stream

Apple Intelligence and Privacy @ WWDC '24

By: Ryvar
11 June 2024 at 04:33
Yesterday at WWDC 2024, Apple announced its long-anticipated machine learning effort, a Siri overhaul dubbed "Apple Intelligence." The new system employs LLMs and diffusion-model image generation while attempting to maintain a uniquely high level of privacy by splitting queries across three tiers of increasing anonymity and capability: on device, Private Cloud Compute servers, and anonymized, opt-in-only ChatGPT calls. See Ars coverage of Apple Intelligence, and of the ChatGPT integration.

The system will debut in the pending iOS 18, iPadOS 18, and macOS Sequoia releases and is composed of three separate layers (a rough sketch of the hand-off between them follows this list):

1) On device: the primary mode, which draws on personal information across apps, contacts, conversations, etc. to build a highly detailed, user-specific context Apple calls a "semantic index." In addition to parsing whatever is currently displayed on screen when asked, there is a new developer API so that third-party applications can specify what kind of information Siri can draw from them and request appropriate generated text and images. The gathered information, any derived data, and any personalized fine-tuning remain on your device, with the limited exception of difficult queries, which are handed off to...

2) Private Cloud Compute: a semi-anonymous cloud-based neural network inference service hosted by Apple, with exposure of personal data limited specifically to the query at hand, running a cryptographically signed software stack under a no-data-retention policy. The segment on Private Cloud Compute featured an unusually candid critique of the data harvesting common to competing tech giants' machine learning systems, without specifically naming...

3) OpenAI's ChatGPT: available later this year, and only with explicit user opt-in on each individual query, for queries the new Siri judges likely to benefit from scale beyond both on-device hardware and Private Cloud Compute. Data sent to OpenAI is heavily anonymized, and queries can be multi-modal (combined text and images), for example asking questions about an image. Apple mentioned that other models may become available later, but did not specify whether that means Google's Gemini, Meta's Llama 3, or potentially even self-hosted endpoints based on open models like Mistral's Mixtral 8x7B.
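Apple has not published the routing logic, so purely as an illustration of the escalation described above, here is a minimal conceptual sketch in Python. The tier names, the complexity score, the thresholds, and the route() function are all invented for illustration; none of this is Apple's actual API or decision procedure.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Tier(Enum):
    ON_DEVICE = auto()               # personal "semantic index"; data never leaves the device
    PRIVATE_CLOUD_COMPUTE = auto()   # Apple-hosted, query-scoped data, no retention
    CHATGPT = auto()                 # third-party model, explicit per-query opt-in required


@dataclass
class Query:
    text: str
    estimated_complexity: float      # hypothetical score: 0.0 (trivial) .. 1.0 (needs frontier-scale model)
    user_opted_in_to_chatgpt: bool = False


def route(query: Query) -> Tier:
    """Pick the least-exposed tier that can plausibly handle the query (illustrative thresholds)."""
    if query.estimated_complexity < 0.5:
        return Tier.ON_DEVICE
    if query.estimated_complexity < 0.9 or not query.user_opted_in_to_chatgpt:
        # Without an explicit per-query opt-in, escalation stops at Apple's own servers.
        return Tier.PRIVATE_CLOUD_COMPUTE
    return Tier.CHATGPT


if __name__ == "__main__":
    print(route(Query("Summarize my unread messages", 0.2)))
    print(route(Query("Plan a two-week trip from these photos", 0.95)))
    print(route(Query("Plan a two-week trip from these photos", 0.95, user_opted_in_to_chatgpt=True)))
```

The point of the sketch is simply that each query escalates only as far as it must, and that the third tier is gated behind a per-query consent flag rather than a global setting.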

I Built the World's Largest Translated Cuneiform Corpus using AI

By: bq
9 June 2024 at 21:10
TL;DR I used a custom-trained Large Language Model (T5) to create the world's largest online corpus of translated cuneiform texts. It's called the AICC (AI Cuneiform Corpus) and contains 130,000 AI translated texts from the CDLI and ORACC projects.
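The post itself doesn't include code, but inference with a fine-tuned T5 translation model typically looks something like the sketch below (using Hugging Face Transformers). The checkpoint name, the task prefix, and the example line are stand-ins; the AICC's actual model, prompt format, and preprocessing are not described in the TL;DR.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Stand-in base checkpoint; the AICC used a custom fine-tuned T5 whose name isn't given in the post.
MODEL_NAME = "t5-small"

tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)
model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)


def translate(transliteration: str) -> str:
    """Translate one transliterated line. T5 is a text-to-text model, so the
    source text is wrapped in a task prefix (this prefix is a placeholder)."""
    inputs = tokenizer(
        "translate Akkadian to English: " + transliteration,
        return_tensors="pt",
        truncation=True,
    )
    output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# A common Old Babylonian letter-opening formula, used here only as sample input.
print(translate("a-na be-li2-ia qi2-bi2-ma"))
```

Scaled over the CDLI and ORACC transliterations, a loop like this is roughly how a corpus of 130,000 machine-translated texts could be produced, assuming the fine-tuned checkpoint is swapped in for the stand-in above.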

Also of interest:

Cuneiform Digital Library Initiative (CDLI) - By making the form and content of cuneiform texts available online, the CDLI is opening pathways to the rich historical tradition of the ancient Middle East. In close collaboration with researchers, museums, and an engaged public, the project seeks to harness the extraordinary content of these earliest witnesses to our shared world heritage.

Open Richly Annotated Cuneiform Corpus (Oracc) - Oracc is a collaborative effort to develop a complete corpus of cuneiform whose rich annotation and open licensing support the next generation of scholarly research.
โŒ
โŒ