Apple wants AI to run directly on its hardware instead of in the cloud::iPhone maker wants to catch up to its rivals when it comes to AI.
As an Android user, I’m really jealous that Apple usually has these things running on its own devices with more privacy in mind. But I just can’t accept the closed garden Apple pushes.
Google is already doing this.
https://store.google.com/intl/en/ideas/articles/pixel-feature-drop-december-2023/
They do say that they have privacy in mind, and yet they are collecting the same data on their users as Google. Don’t be too jealous; they suck just as much as the next Android phone company, but with a higher price tag and a walled garden.
They’re both bad, but Apple is clearly less bad about privacy than most big hardware or software companies by far.
How do you know? Because they promise?
Funny that you mention it. A few months ago, when updating stuff, I got a new feature on my Android phone: offline subtitle generation based on audio, generated in real time from anything outputting sound on my phone.
A Google search suggests this might be an older feature - not sure if my phone didn’t support it, or if I maybe just missed it, or if they added a more obvious button.
Google has a separate app for that stuff, called Private Compute Services. Right now it’s nothing like an offline Google assistant replacement, but I thought it’s really nice to have that stuff available without relying on internet access.
Did you try turning off the internet to see if it actually works? That’s pretty amazing, thank you for sharing!
Remember, this probably isn’t an either-or thing. Both Apple and Google have been offloading certain AI tasks to devices to speed up response times and process certain requests offline.
deleted by creator
Just because certain requests don’t work offline, that doesn’t mean that Google isn’t actually running models locally for many requests.
My Pixel isn’t new enough to run Nano. What are some examples of offline processing not working?
I wouldn’t be surprised if the handshake between Pro and Nano was intermingled for certain requests. Some stuff done in the cloud, and some stuff done locally for speed - but if the internet is off, they kill the processing of the request entirely because half of the required platform isn’t available.
deleted by creator
What a thought-provoking reply.
deleted by creator
It’s already possible. A 4-bit quant of Phi-1.5 (1.3B, as smart as a 7B model) takes 1 GB of RAM. Phi-2 (2.7B, as smart as a 13B model) was recently released and would likely take 2 GB of RAM with a 4-bit quant (not tried yet). The research-only license on these makes people not want to port them to Android; instead they focus on weak 3B models or bigger models (7B+), which heavily limits any potential usability.
- Apple could mimic and improve on the Phi models’ training to make its own powerful but small model, and then leverage the fact that it has full knowledge of and control over the hardware architecture to squeeze out every drop of performance. Kind of like how some people used their deep knowledge of a console’s architecture to make it do things that seemed impossible.
Or
- The Apple engineers will choose, either due to time constraints or laziness, to simply use llama.cpp, which will certainly implement this flash attention, then use an already available model that allows commercial use, like Mistral, add some secret-sauce optimizations based on the hardware, and voilà.
I bet on 2.
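For anyone who wants to sanity-check RAM figures like the ones above, here is a rough back-of-the-envelope sketch. It only counts the bytes needed to store the weights at a given precision. The function names are made up for illustration, and the estimate ignores the KV cache, activations, and the extra scale data that real quantization formats store, so actual usage will be somewhat higher.

```python
def quantized_weight_bytes(n_params: float, bits_per_weight: float = 4.0) -> float:
    """Bytes needed to store the weights alone at the given precision.

    Hypothetical helper: a lower bound, not a measurement of real RAM usage.
    """
    return n_params * bits_per_weight / 8


def to_gb(n_bytes: float) -> float:
    """Convert bytes to decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_bytes / 1e9


if __name__ == "__main__":
    # Model sizes mentioned in the thread: 3B, 7B and 13B parameters.
    for label, n_params in [("3B", 3e9), ("7B", 7e9), ("13B", 13e9)]:
        fp16 = to_gb(quantized_weight_bytes(n_params, 16))
        q4 = to_gb(quantized_weight_bytes(n_params, 4))
        print(f"{label}: fp16 ~{fp16:.1f} GB, 4-bit ~{q4:.1f} GB")
```

By this estimate a 7B model needs about 3.5 GB for 4-bit weights alone, which is why models that size are a tight fit on most phones while the 1B to 3B range is much more comfortable.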
AKA “we completely missed the boat on this thing and are going to pretend it was intentional by focusing on an inevitable inflection point a few years out from today instead.”
Also Apple sucks at cloud services.
Google is doing this exact same thing with Gemini, the platform behind Bard / Assistant.
Gemini has large-scale models that live in data centers and handle complex queries. They also have a “Nano” version of the model that can live on a phone and handle simpler on-device tasks.
The smaller models are great for things like natural language UI and smart home controls. They’re also way faster and capable of working offline. A big use case for offline AI has been hiking with the Apple Watch in areas with no reception.
Also battery management, background task power distribution, and hardware energy efficiency. I mean, it would be great to have AI that adapted hardware energy consumption settings to my use case. Yes, I know algorithms already exist to do that, but it would be great to have a much more flexible AI-based energy manager that accommodates and adapts to my use cases.
This is the best summary I could come up with:
Apple’s latest research about running large language models on smartphones offers the clearest signal yet that the iPhone maker plans to catch up with its Silicon Valley rivals in generative artificial intelligence.
The paper was published on December 12 but caught wider attention after Hugging Face, a popular site for AI researchers to showcase their work, highlighted it late on Wednesday.
Device manufacturers and chipmakers are hoping that new AI features will help revive the smartphone market, which has had its worst year in a decade, with shipments falling an estimated 5 percent, according to Counterpoint Research.
Running the kind of large AI model that powers ChatGPT or Google’s Bard on a personal device brings formidable technical challenges, because smartphones lack the huge computing resources and energy available in a data center.
Apple tested its approach on models including Falcon 7B, a smaller version of an open source LLM originally developed by the Technology Innovation Institute in Abu Dhabi.
Academic papers are not a direct indicator of how Apple intends to add new features to its products, but they offer a rare glimpse into its secretive research labs and the company’s latest technical breakthroughs.
The original article contains 741 words, the summary contains 194 words. Saved 74%. I’m a bot and I’m open source!