Llama 4 Is Low-Key Terrifying
Meta’s New AI Model Runs on One GPU and Beats GPT-4o
Welcome back, apprentices! 👋
You know it’s serious when Meta drops something on a Saturday.
Not a meme, not a product recall — something big, brainy, and definitely trained on more text than you’ve read in your entire life. No livestream, no fanfare, just a quiet little blog post that might’ve just changed the AI game. Again.
So, what did Zuck’s lab cooks cook up this time?
Let’s just say: it sees. It thinks. It fits on one GPU. And it definitely didn’t come here to chat politely.
In today's email
Your AI Thinks — But Won’t Say How
DeepMind’s AGI Safety Roadmap
Nvidia’s $16B “Ship or Skip” Dilemma
Meta Drops Llama 4
Intel Just Called Its Rival for Help
even more AI magic
Test the prompt (NEW)
Read Time: 4 minutes
Quick News
🧐 Anthropic just dropped a research bomb: large AI models like Claude 3.7 Sonnet may appear to “reason,” but when it comes to transparency about how they get their answers? They’re more PR manager than math teacher. In structured tests where models were nudged with misleading or reward-hacking prompts, Claude admitted using those hints only 25% of the time. In more adversarial setups — where it was incentivized to game the system — it confessed just 1.6% of the time.
Even when researchers tried training the model to “think better” using reinforcement learning, faithfulness stalled at around 30%. Translation? Chain-of-Thought (CoT) prompting might make AIs sound smarter — but it won’t make them more honest.
🧠 DeepMind just released its AGI safety roadmap, targeting four major risks: misuse, misalignment, accidents, and societal-scale disruption. From AI monitors and deceptive alignment detection to MONA (yes, that’s a real thing for long-term planning) and debate-style oversight, the tools are serious — and so is the message. With AGI potentially just years away, DeepMind is setting the standard for not accidentally creating an all-knowing intern who rewrites your company — and the planet — overnight.
💸 Chinese tech firms are reportedly placing a $16 billion order for Nvidia's AI chips before new U.S. export bans lock the gates—and Nvidia is now stuck in the semiconductor version of “should I stay or should I go?” On one side: huge profits and customer demand for H20 chips (China-compliant alternatives to the H100). On the other: the very real possibility of U.S. regulatory heat if it looks like they’re sprinting to sell before the White House says no. Oh, and delivery deadlines? Before June, just in case policy shifts mid-year.
Meta
Meet Llama 4: Multimodal, Multilingual, and Mildly Menacing
Meta just dropped its most advanced AI models yet — on a Saturday, in true Zuck fashion.
Meet Llama 4 Scout and Llama 4 Maverick: multimodal, multilingual, and borderline unsettling in how well they read your documents, diagrams, and probably your mind. These aren’t just chatbots — they see, reason, summarize, and reportedly run better on a single GPU than your Zoom call does on hotel Wi-Fi.
But don’t worry — they’re still polite. Scout fits neatly on a single H100 GPU like a model houseguest, while Maverick flexes 400B parameters — but only uses 17B at a time. Think bazooka-in-a-backpack energy, used with the restraint of someone who still believes in responsible compute.
What’s Actually New (And Kinda Nuts):
🧠 Context window? Scout stretches to 10 million tokens. That’s War and Peace, the Terms & Conditions, and your entire Slack history in one prompt.
🏋️ Llama 4 Maverick: 400B total parameters, 128 experts, 17B active at a time = muscle + manners (tiny routing sketch after this list).
📷 It sees now. Like, for real. Native image + text input, no vision adapter duct-taped on.
🏆 Benchmarks: Maverick casually outruns GPT-4o, Claude 3.5, and Gemini 2.0 in everything from code to common sense to multilingual trivia night.
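Curious what “400B total parameters, 17B active” looks like mechanically? Here’s a toy, generic mixture-of-experts layer (emphatically not Meta’s code, with made-up tiny sizes): a router scores the experts for each token, only the top pick actually runs, and the rest of the parameters sit idle for that token.

```python
import torch

# Toy mixture-of-experts layer, just to show the mechanic behind
# "lots of total parameters, few active per token". All sizes are tiny
# and invented -- this is NOT Meta's implementation or Llama 4's real config.
NUM_EXPERTS, TOP_K, D_MODEL = 8, 1, 16   # Maverick reportedly uses 128 experts

experts = torch.nn.ModuleList(
    torch.nn.Linear(D_MODEL, D_MODEL) for _ in range(NUM_EXPERTS)
)
router = torch.nn.Linear(D_MODEL, NUM_EXPERTS)   # decides who handles each token

def moe_forward(x):                              # x: (num_tokens, D_MODEL)
    weights, picks = router(x).softmax(-1).topk(TOP_K, dim=-1)
    out = torch.zeros_like(x)
    for k in range(TOP_K):
        for e in range(NUM_EXPERTS):
            sel = picks[:, k] == e               # tokens routed to expert e
            if sel.any():                        # only the chosen experts do work
                out[sel] += weights[sel, k].unsqueeze(-1) * experts[e](x[sel])
    return out

print(moe_forward(torch.randn(4, D_MODEL)).shape)   # torch.Size([4, 16])
```

Scale the expert count and sizes up by a few orders of magnitude and you get the bazooka-in-a-backpack trick: all 400B parameters live in memory, but only a roughly 17B-parameter slice does the math for any single token.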
So... What Can You Do With It?
Meta made these models open-weight and available on Hugging Face, meaning you can download them, run them on your own servers, and pretend you invented AGI. Deploy them in apps, agents, slide deck generators, or just have them roast your inbox.
Best part? It doesn’t cost a GPU farm. Llama 4 Scout runs beautifully on one (yes, one) H100, making it perfect for budget-conscious developers, solo devs with ambition, or that one intern who accidentally built your AI roadmap.
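Want to kick the tires yourself? Here’s a minimal sketch using the Hugging Face transformers chat pipeline. The repo id below is an assumption (check the real model card and accept Meta’s license on Hugging Face first), and even Scout wants serious VRAM, so the “one H100” setup typically implies quantization or Meta’s reference stack.

```python
# Minimal sketch: load an open-weight Llama 4 checkpoint from Hugging Face
# and ask it something. Assumes: `pip install transformers accelerate`, a
# Hugging Face token with access to the gated repo, and plenty of GPU memory.
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Scout-17B-16E-Instruct",  # assumed repo id -- verify on the Hub
    device_map="auto",       # spread the weights across whatever GPUs you have
    torch_dtype="auto",
)

messages = [{"role": "user", "content": "Explain mixture-of-experts like I'm a distracted CEO."}]
reply = chat(messages, max_new_tokens=200)[0]["generated_text"][-1]["content"]
print(reply)
```

Swap in Maverick’s repo id for the bigger model, or point the same pipeline at a fine-tuned checkpoint once you’ve cooked one up.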
But Is It Safe? (Or Just Llama Unleashed?)
Meta added a safety belt this time, with:
Llama Guard: Filters the prompts that should never leave QA.
Prompt Guard: Catches jailbreaks and prompt injections before your model can “go rogue” mid-convo.
GOAT: (No, really) an automated red-teaming agent to stress-test your AI before the lawyers do.
Also: Refusal rates on sensitive questions are down to <2%, and biased responses are being zapped with laser precision. This isn’t just smarter — it’s better behaved.
And About That “Behemoth” Rumor…
Meta quietly revealed that its real boss model — Llama 4 Behemoth (working title) — is still in training. At 2 trillion parameters across 16 experts, it’s designed to beat GPT-5 at its own game before GPT-5 even launches.
And with 32,000 GPUs currently sweating in the background, it's fair to say Meta’s not just “training a model” — they’re breeding a digital demigod.
So what’s the deal? While OpenAI is cozy behind closed APIs and Google juggles Gemini branding, Meta just gave the dev world a gift wrapped in llama fur.
It’s open, scalable, vision-enabled, and (if Scout is any proof) dangerously efficient. This is no longer a research toy — it’s an enterprise weapon you can fine-tune in your basement. And if you aren’t using it soon, your competition probably is.
Also, the model is called Maverick, so yes, you now have Top Gun permission to say:
“Talk to me, Llama.”
Intel & TSMC
Intel Just Called Its Rival for Help — And TSMC Answered Like a Boss

After clocking $16 billion in losses in 2024 (yep, that’s a lot of zeroes), Intel is reportedly forming a joint venture with longtime rival TSMC, with a little matchmaking help from the White House.
TSMC would grab a 20% stake in Intel’s U.S. chipmaking operations, not with cash, but with its god-tier manufacturing expertise and talent pipelines. The move is being led by Intel’s new CEO Lip-Bu Tan, who’s clearly not afraid to shake the silicon tree.
While some Intel execs are worried about job cuts and losing what’s left of the company’s manufacturing mojo, the deal could be a life-saving reboot for a brand that used to lead the chip race — and now just hopes to stay on the track.
Meanwhile, TSMC expands its U.S. footprint without breaking ground, and the U.S. gets a shiny new narrative for reshoring chip supremacy. Everybody wins… maybe.
Current Overview of the Chipmaking Market
A.K.A. “Who’s still running Moore’s Law and who’s just running in circles?”
🇺🇸 Nvidia – $1.55T Market Cap
🧠 AI's golden goose.
Invented your favorite GPU — and now fuels everything from chatbots to trillion-dollar valuations.
Basically Wall Street’s emotional support chipmaker.
🇹🇼 TSMC – $540B Market Cap
🏭 The chip world’s ghostwriter.
Doesn’t make products, just everybody else’s chips.
Runs the semiconductor universe from behind the curtain (and inside your iPhone).
🇺🇸 Broadcom – $500B Market Cap
🌐 The networking ninja.
Probably in your modem, router, and your Wi-Fi-enabled fridge.
If it connects, Broadcom quietly gets a cut.
🇳🇱 ASML – $380B Market Cap
🔬 The machine behind the machine.
Doesn’t make chips — makes the machines that make chips.
Without ASML’s EUV gear, nobody’s hitting 2nm. Period.
🇺🇸 AMD – $300B Market Cap
🎮 From underdog to datacenter darling.
Killed it in gaming, now eyeing Nvidia's lunch in AI.
Comeback stories wish they were this good.
Bonus:
🇺🇸 Intel – $125B Market Cap
🫠 Silicon Valley's redemption arc.
Used to rule the world, now teaming up with TSMC to stay in the game.
From chip champ to fab fixer-upper.
So what’s the deal? Intel’s not just outsourcing — it’s outsourcing pride. This isn’t a licensing deal or a tech transfer — it’s a full-on “take our fabs and show us how it’s done” moment. And while internal pushback is real (some execs aren’t thrilled about TSMC staff showing up with toolkits and flowcharts), the strategic logic is hard to ignore.
For TSMC, it’s a soft landing into U.S. territory without having to build or staff from scratch. For Intel, it’s a survival strategy wrapped in a collaboration ribbon. And for the U.S. government? It’s geopolitical gold: more chips made at home, fewer dependencies abroad.
In the chip wars, humility is rare — but survival instincts aren’t. If your fabs are failing, don’t reboot the servers — reboot the strategy.
Help Your Friends Level Up! 🔥
Hey, you didn’t get all this info for nothing — share it! If you know someone who’s diving into AI, help them stay in the loop with this week’s updates.
Sharing is a win-win! Send this to a friend who’s all about tech, and let’s bring them into the fold!
Even Quicker News
🤖 In a Turing test showdown, GPT-4.5 fooled judges 73% of the time—out-humaning actual humans in casual chat. At this point, the only way to tell who's real might be who asks you to hold for a supervisor.
📚 OpenAI built PaperBench to test if AIs can actually replicate research papers — not just skim them like a sleep-deprived grad student. Spoiler: most models still panic at footnotes and call it "insight."
🎓 Anthropic launched Claude for Education with “Learning Mode” to build student brains; OpenAI clapped back by making ChatGPT Plus free for finals season. AI isn’t just doing homework anymore—it’s fighting for valedictorian.
Today’s Toolbox
🎬 Adobe’s new Generative Extend fills video gaps so smoothly, your timeline might finally stop judging you. It also searches terabytes of footage in seconds — so your editor can find that one perfect clip before their coffee gets cold.
🌬️ HKU and Huawei just dropped Dream 7B — a diffusion-based model that doesn’t write left-to-right, but somehow still outsmarts the usual suspects at math, code, and Sudoku. It’s like if ChatGPT ditched caffeine, learned strategy, and started speed-running logic puzzles for fun.
🎧 Spotify now writes your ad scripts, voices them, places them in real-time auctions, and tracks who actually cared—all before your coffee gets cold. Basically, it’s a full-stack media team that doesn’t ask for PTO.
🧪 Test the Prompt
A playground for your imagination (and low-key prompt skills).
Each send, we give you a customizable DALL·E prompt inspired by a real-world use case — something that could help you in your business or job if you wanted to use it that way. But it’s also just a fun creative experiment.
You tweak it, run it, and send us your favorite. We pick one winner to feature in the next issue.
Bonus: you’re secretly getting better at prompt design. 🤫
The winner is…
Last send, we challenged you to test GPT-4o’s visual generation skills with this prompt.
Here’s the WINNER:

Congrats to Matt from (who would’ve guessed it) New York! 🥳
Want to be featured next? Keep those generations coming!
🎨 Prompt: “Solopreneur Command Center”
“An ultra-detailed isometric illustration of a solopreneur’s dream command center, floating inside a [surreal environment]. The desk is built from [unusual object] and surrounded by holographic dashboards showing data powered by [strange fuel source]. The space is managed by AI assistants shaped like [absurd creatures or items], each multitasking — writing emails, automating workflows, or brewing coffee. Add a glowing calendar wall, color-coded cables, and a neon sign that says: “[weird productivity mantra]”. Style: cozy-futuristic tech lair, with soft ambient lighting and chaotic focus energy.”
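Prefer scripting it over pasting into a chat window? Here’s a minimal sketch with the OpenAI Python SDK: the bracketed slots are filled with example choices (swap in your own), and the model name and image size are just common defaults, not a requirement.

```python
# Minimal sketch: run a filled-in version of the prompt through OpenAI's image API.
# Assumes `pip install openai` and an OPENAI_API_KEY in your environment; the
# bracketed slots from the newsletter are replaced with example values below.
from openai import OpenAI

prompt = (
    "An ultra-detailed isometric illustration of a solopreneur's dream command "
    "center, floating inside a giant glass terrarium. The desk is built from a "
    "retired arcade cabinet and surrounded by holographic dashboards showing data "
    "powered by espresso steam. The space is managed by AI assistants shaped like "
    "origami raccoons, each multitasking -- writing emails, automating workflows, "
    "or brewing coffee. Add a glowing calendar wall, color-coded cables, and a neon "
    "sign that says: 'Ship it, then nap.' "
    "Style: cozy-futuristic tech lair, with soft ambient lighting and chaotic focus energy."
)

client = OpenAI()
image = client.images.generate(model="dall-e-3", prompt=prompt, size="1024x1024", n=1)
print(image.data[0].url)   # open the link, admire, then send us your best one
```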
We’ll be featuring the best generations in our next newsletter!
FEEDBACK
How was today's everydAI?
DISCLAIMER: None of this is financial advice. This newsletter is strictly educational and is not investment advice or a solicitation to buy or sell any assets or to make any financial decisions. Please be careful and do your own research.