
Apart Research

We share newsletters about progress in ML safety, run fun ML safety hackathons, and develop the collaborative research platform AI Safety Ideas.

Featured Post

Governing AI & Evaluating Danger

Governing AI & Evaluating Danger We might need to shut it all down: AI governance seems more important than ever, and technical research is being challenged. Welcome to this week's update! We've renamed our newsletter the AI Safety Digest (AISD) and will make a few changes over the next few weeks, so...
6 months ago • 4 min read

What a Week! GPT-4 & Japanese Alignment

A Self-Replicating GPT-4! What a week. There was already a lot to cover when I came in to work on Monday: I was planning a special feature on the Japan Alignment Conference 2023 and had watched all their recordings. Then GPT-4 came out yesterday and all my group chats began buzzing. So in this...
7 months ago • 5 min read

Perspectives on AI Safety

Interpretability on Reinforcement Learning and Language Models This week, we take a look at interpretability applied to a Go-playing neural network, glitchy tokens, and the opinions and actions of top AI labs and entrepreneurs. Watch this week's MLAISU on YouTube or listen to it as a podcast. Research...
7 months ago • 4 min read

Bing Wants to Kill Humanity W07

Failures of language models Welcome to this week's ML & AI safety update, where we look at Bing going bananas, see that certification mechanisms can be exploited, and find that scalable oversight seems like a solvable problem, judging from our latest hackathon results. Watch this week's MLAISU on YouTube or...
7 months ago • 4 min read

Will Microsoft and Google start an AI arms race? W06

Will Microsoft and Google start an AI arms race? We would not be an AI newsletter without covering the past week's releases from Google and Microsoft, but we will use this chance to introduce the concept of AI race dynamics and explain why researchers are getting more cynical. Watch this week's MLAISU on...
8 months ago • 3 min read

Extreme AI Risk W05

Extreme AI Risk In this week's newsletter, we explore the alignment of modern large models and examine criticisms of extreme AI risk arguments. Of course, don't miss out on the opportunities we've included at the end! Watch this week's MLAISU on YouTube or listen to it on...
8 months ago • 2 min read

Was ChatGPT a good idea? W04

Was ChatGPT a good idea? In this week's ML & AI Safety Update, we hear Paul Christiano's take on one of OpenAI's main alignment strategies, dive into the second-round winners of the Inverse Scaling Prize, and share the many fascinating projects from our mechanistic interpretability hackathon. And...
8 months ago • 4 min read

Compiling code to neural networks? W03

Compiling code to neural networks? Welcome to this week's ML & AI Safety Report, where we dive into overfitting and look at a compiler for Transformer architectures! This week's issue is a bit short because the mechanistic interpretability hackathon starts today; sign up at ais.pub/mechint and join...
8 months ago • 2 min read

Robustness & Evolution W02

Robustness & Evolution Welcome to this week's ML Safety Report, where we talk about robustness in machine learning and the human-AI dichotomy. Stay until the end to check out several amazing competitions you can participate in today. Watch this week's MLAISU on YouTube or listen to it on...
9 months ago • 4 min read

Hundreds of research ideas! W01

AI Improving Itself: over 200 research ideas for mechanistic interpretability, ML improving ML, and the dangers of aligned artificial intelligence. Welcome to 2023 and a happy New Year from us at the ML & AI Safety Updates! Watch this week's MLAISU on YouTube or listen to it on Spotify. Mechanistic...
9 months ago • 5 min read