#76
Stage 2.7, OLMoE, Whale 2.5, AlphaProteo, Glowtime, ElementRef, Games v0, Luhn's Algo, staticrypt, Is My Blue, Screen apnea, Vagus nerve, PMs in India, Art of Finishing & more
👋🏻 Welcome to the 76th!
📰 Read #76 on Substack for the best formatting
What’s happening 📰
🍎 Apple’s annual iPhone launch event taglined “It’s Glowtime” is happening today (in ~12 hours). They’ll be announcing the iPhone 16 and 16 Pro lineups and AI is going to be one of the central themes around it. You can also watch the live stream on YouTube.
✨ AGI Digest
⚓️ Model and Benchmark Drops:
📂 AllenAI released OLMoE, a completely OSS MoE model with 1.3B active and 6.9B total parameters, with 64 experts per layer and 8 active experts in any forward pass. This model shines in the <2B active parameter range, but what's even more impressive is how meticulously the team worked to make everything (model, data, code, logs) transparent and traceable.
🪬 Mini-Omni is a small audio LLM that can natively take in both text and speech inputs and stream both text and audio at the same time. It uses a 0.5B Qwen2 as its LM backbone, with Whisper for audio inputs and CosyVoice for generating the output voice.
👩💻 01.AI released Yi-Coder, a series of 1.5B and 9B parameter models with a 128K context length, with the 9B performing the best among sub-10B models on LiveCodeBench (23.4% pass rate). On the Aider code editing benchmark, however, it lags behind Llama-3.1-70B, GPT-3.5, and GPT-4o-mini. Hence, it's probably the best pick if you need a small locally-running coding model; otherwise you're better off using one of the stronger models via the API.
🤏 The folks from OpenBMB at Tsinghua University released MiniCPM3-4B, a 4B parameter model with impressive performance on English, Chinese, Maths, and Coding benchmarks, beating Phi-3.5-mini-instruct & GPT-3.5-Turbo and competing with Llama3.1-8B-Instruct & Qwen2-7B-Instruct. However, always run it on your own evals and vibe checks before adopting it as a drop-in replacement for your current local model.
🔠 Word Game Bench is a fun benchmark to evaluate LLMs on some of the popular word puzzle games like Wordle and Connections. Along with a leaderboard, you can also see the performance of popular LLMs on the new puzzles published daily. Findings? Claude-3-opus, GPT-4o, and Claude-3.5-sonnet lead on Wordle by a large margin, while on Connections GPT-4o takes the lead, followed by Claude-3.5-sonnet and Claude-3-opus.
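For the curious, here is a rough sketch of what "64 experts per layer with 8 active" (from the OLMoE item above) means in practice. It shows generic top-k routing for a single token, not OLMoE's actual code: the function name route_token is made up for illustration, and leaving the selected weights un-renormalized is an assumption that varies between MoE implementations.

```python
import math

NUM_EXPERTS = 64  # experts per layer, as quoted for OLMoE
TOP_K = 8         # experts active for each token


def route_token(router_logits, top_k=TOP_K):
    """Pick the top_k experts for one token.

    Returns (expert_index, weight) pairs, highest weight first.
    Hypothetical helper: a generic top-k router, not OLMoE's implementation.
    """
    # Softmax over all experts (subtract the max for numerical stability).
    peak = max(router_logits)
    exps = [math.exp(x - peak) for x in router_logits]
    denom = sum(exps)
    probs = [e / denom for e in exps]

    # Keep only the k experts with the highest routing probability.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]

    # Assumption: weights are used as-is, without renormalizing over the chosen k.
    return [(i, probs[i]) for i in chosen]


# Example: one token whose router strongly prefers expert 5.
logits = [0.0] * NUM_EXPERTS
logits[5] = 10.0
picks = route_token(logits)
```

Only the chosen experts' feed-forward blocks run for that token, which is why the active parameter count (1.3B) is so much smaller than the total (6.9B).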
🗃️ Product Improvements:
📿 Claude introduced an Enterprise offering with SSO and RBAC for finer controls. They also dropped in a massive 500k context window and a native GitHub integration so you can work on entire codebases with Claude.
👀 Groq has added image inputs on its platform using LLaVA V1.5 7B, and boy is it fast! On a 1024x1024 image and 100 tokens of text, it took around 0.99 sec to generate 100 output tokens which is >4X faster than GPT-4o.
🐳 The DeepSeek V2 Chat and DeepSeek Coder V2 models have been merged and upgraded into the new model, DeepSeek-v2.5, and also updated on the API, replacing both deepseek-coder and deepseek-chat. The announcement blog says that the new model significantly surpasses the previous versions in both general capabilities and code abilities and better aligns with human preferences. However, on the Aider Code Editing benchmark, it did not perform much differently than the previous DeepSeek-Coder-V2.
⚡️ vLLM released v0.6.0 with 2.7x throughput improvements and a 5x latency reduction in time per output token for Llama 8B. They achieved this by separating the API server and inference engine into different processes, batch scheduling multiple steps ahead, async processing and several other optimizations.
🧬 DeepMind announced AlphaProteo, an AI for designing novel proteins that bind more successfully to target molecules. Previous AIs like AlphaFold can give insights into how proteins interact and function, but they cannot create new proteins to directly manipulate those interactions. AlphaProteo can generate new protein binders for diverse target proteins, which can help advance drug design, disease understanding, and other practical applications.
🗞️ In headlines:
🔮 OpenAI teased GPT-Next at their recent Japan event saying it has “nearly 100 times improvements on past performance“, but still no other details apart from a new OpenAI Newsroom account on Twitter :/
📕 TIME came up with a rather interesting cover featuring “The 100 Most Influential People in AI” and it left a lot of people wondering why some of them are even there and what real “influence” they had in the AI landscape while many great researchers who should be mentioned did not get a feature.
💰 Ilya Sutskever’s AI Startup, Safe Superintelligence, Raises $1 Billion from NFDG, a16z, Sequoia, DST Global, and SV Angel and is hiring new engineers.
🔐 0x Digest
⚔️ Vitalik talks about “glue and coprocessor architectures” and how they are showing up in most modern innovations. The glue takes care of flexibility and simpler tasks, while the coprocessor is optimized for specific, heavy tasks. This trend enables more efficiency gains while preserving developer friendliness, security, and openness.
New Launches
💰 Pocket Universe launched a Rug Detector, which warns you if a token looks like it was launched by a serial rug-puller. Since serial rug-pullers launched 16,000+ rugs in the past 3 months, it’s a must-have extension right now.
💳 In this episode of “Another Week, another Card”, Mastercard launched a new Mastercard crypto Euro debit card with Mercuryo, allowing users to spend cryptocurrencies from their self-custodial wallets at over 100M European merchants.
✒️ Arbitrum Stylus is live on Arbitrum One & Nova mainnet. It adds a Wasm VM, making it a new MultiVM paradigm. It allows you to write contracts in languages that compile to Wasm, such as Rust, C, C++, etc. They have a Rust SDK for writing Stylus programs and provide stylus-hello-world to get started with the basics.
🏨 Skyscanner Integrates With Travala, making it the first crypto-native travel platform to receive a Skyscanner integration. This will allow hotel booking through crypto.
💼 Matter Labs trims workforce by 16% as demand for ZKsync Era falls.
🎮 Sugartown, the game behind the Ora token and initially marketed as “by Zynga”, took an unexpected turn. Last Saturday, Zynga divested from the game, and now everything is operating under a new org, D20 Labs. Crypto Twitter was not amused by this move.
🛠️ Dev & Design Digest
😔 The slow evaporation of the free/open source surplus: it’s been discussed a lot lately that the state of FOSS is not good, and that it may face a significant decline unless parts of the system prove to be more sustainable than they currently seem.
📋 A neat long article on The web's clipboard, and how it stores data of different types. (Nibbler A has had a draft post on the same topic since 2022, but he never finished it.)
📝 The TC39 committee has tweaked the process to make rolling out new features faster and smoother. How? Well, they added a new stage between Stages 2 and 3, presenting 🥁 “Stage 2.7” for feature proposals. In this stage, the proposal is approved "in principle" but needs to have a full test suite and prototypes developed before moving to Stage 3. This speeds up design iteration between Stages 2 and 2.7, since changes to the design can be made before any tests are implemented.
🚚 In one of the largest technical migrations in history, Tumblr is to move its half a billion blogs to WordPress as decided by Automattic (the company that acquired them in 2019)
What brings us to awe 😳
😮💨 In 2007, former Microsoft executive Linda Stone noticed that she and many others were "holding their breath" while working on screens, a phenomenon she coined "screen apnea". Her research found that 80% of participants showed signs of shallow or suspended breathing while using screens. The linked article also discusses “How can we retrain our breath while typing, tapping, and scrolling?”.
👨🏻💻 Steve Ballmer talks about an interview question he used to ask people, and John Graham-Cumming wrote a blog post about it titled “Steve Ballmer's incorrect binary search interview question”. Nevertheless, the problem is interesting because before even starting the search he asks “Should you play this game?” and “No” is kinda the correct answer.
🎮 Guillermo (CEO of Vercel) posted about a game fully built by v0 (Vercel’s AI tool for creating web apps), i.e. how creating games is easier than pronouncing his name.
The original creator tweeted that the game was created without writing any code. We are really in crazy times; the time from idea to MVP is going down quickly.
🧠 How Our Longest Nerve Orchestrates the Mind-Body Connection, a shallow dive blog, explains how the Vagus nerve allows the mind to influence the body and vice versa. It can both stimulate and dampen bodily responses, and its widespread influence makes it a target for therapies aimed at neurological and psychological disorders.
Today I (we) Learnt 📑
🪒 Hanlon's Razor is the adage: "Never attribute to malice that which is adequately explained by stupidity." Or sometimes, "Never attribute to malice what can be attributed to incompetence." (or as my mom says “bhola reh gaya”)
🪝 The ElementRef is a type helper from React, to easily extract the type from the element you're targeting. Matt Pocock wrote about “Strongly Type useRef with ElementRef”. This solves the issue of devs banging their heads to figure out what to pass in useRef<IDontKnow>.
💳 Credit Card numbers are validated by an algorithm called "Luhn's Algorithm". [found when doom-scrolling through 𝕏]
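If you want to see the Luhn check in action, here is a minimal sketch. The function name luhn_is_valid is ours, and the choice to tolerate spaces and hyphens in the input is an assumption. It only verifies the checksum digit, which catches typos; it says nothing about whether a card actually exists.

```python
def luhn_is_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    # Assumption: spaces and hyphens are allowed as separators and ignored.
    cleaned = number.replace(" ", "").replace("-", "")
    if len(cleaned) < 2 or not (cleaned.isascii() and cleaned.isdigit()):
        return False

    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, ch in enumerate(reversed(cleaned)):
        digit = int(ch)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit

    # Valid numbers have a total that is a multiple of 10.
    return total % 10 == 0


print(luhn_is_valid("4111 1111 1111 1111"))  # widely used Visa test number
print(luhn_is_valid("4111 1111 1111 1112"))  # last digit changed
```

The first call prints True and the second prints False, since changing a single digit breaks the checksum.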
🤝 You have read ~50% of Nibble, the following section brings tools out from the wild.
What we have been trying 🔖
📚 LibGen Raycast Extension: Search books on Library Genesis and directly download them from Raycast.
🔵 Is My Blue: Test your color perception with this interactive test. (it seems like both of us are on the greener side and Turquoise is blue for us, what’s yours?)
🧱 blocks.md: a tool that takes your Markdown files and turns them into forms and web pages that are beautiful, customizable, accessible, and fully localizable.
🔖 The Novice's LLM Training Guide: LLM Training Guide by cracked 4chan devs.
Builders’ Nest 🛠️
🔥 hotscript: A library of composable functions for the type level! Transform your TypeScript types in any way you want using functions you already know.
🔔 tinystatus: Tiny status page generated by a Python script.
🔐 staticrypt: A tool to password protect a static HTML page, decrypted in-browser in JS with no dependency. No server logic is needed. It uses AES-256 with WebCrypto to encrypt your html string with your password. [Try this, password is nibble76].
⏰ style-observer: MutationObserver, but for CSS. Get notified when the computed value of a CSS property changes.
Meme of the week 😌
Off-topic reads/watches 🧗
💪🏻 A Labor of Love by Seth Godin, on why you might be lucky if you have something where you can expend labor and be joyful, but be careful, “When we focus on one, we often decrease the other.”
🔪 Who killed the art of Product Management in India? A short murder mystery by The Ken.
🏁 The Art of Finishing by Tomas Stropus. The article discusses strategies for learning to finish projects, rather than constantly starting new ones. It also covers how to break the “cycle of enthusiasm, struggle, and disappointment”.
🤝 Practical Approaches for More Effective Teamwork by Seth Godin, explains some basic tweaks in your process to make a team work like a team.
Wisdom Bits 👀
“You don't have to be great to start, but you have to start to be great.”
— Zig Ziglar
Wallpaper of the week 🌁
🌌 Grab the week’s wallpaper at wow.nibbles.dev
Weekly Standup 🫠
Nibbler A had a read (docs), write (code), and own (builder’s high) week. He played with “Nibbles” (as in half a byte) while exploring MPT. He ended the week by getting some unsolicited advice from friends, family, mentors, etc.
Nibbler P had a busy week, which got extra painful with a sudden abdominal cramp. Some reading and good weather thankfully ended the weekend on a good note.
If you liked what you just read, recommend us to a friend who’d love this too 👇🏻