WHISTLEBLOWER Reveals Complete AGI TIMELINE, 2024 - 2027 (Q*, QSTAR)

TheAIGRID
4 Mar 2024 · 45:05

Summary

TLDR: This video discusses a purported confidential document leaked from inside OpenAI, suggesting that OpenAI plans to achieve artificial general intelligence (AGI) by 2027. The document claims OpenAI has been training a 125-trillion-parameter multimodal model, with training completed in December 2023. Although the document contains a fair amount of speculation, it offers enough evidence and detail to raise the question of whether OpenAI may really be pursuing AGI in secret. The video explores the questions and conjectures the document raises.

Takeaways

  • 😃 The video claims a document reveals OpenAI's secret plan to achieve AGI (artificial general intelligence) by 2027.
  • 🤔 The document says OpenAI began training a 125-trillion-parameter multimodal model in August 2022 and finished training in December 2023.
  • 🔍 It claims OpenAI has long been working to build a human-brain-sized AI model (roughly 100 trillion parameters) as its path to AGI.
  • 📈 Per the Chinchilla scaling laws, even if a 100-trillion-parameter model falls slightly short of human level, training it on vastly more data could push it past human-level performance.
  • 😲 AI leaders such as Hinton and Hassabis have recently warned that AGI may arrive sooner than expected.
  • 🕵️ Microsoft's $10 billion investment in OpenAI could supply the compute needed to train an AGI-scale system.
  • 💰 Sam Altman is reportedly raising $7 trillion, possibly to cover the enormous compute cost of training a large-scale AGI system.
  • ⚠️ There have been calls to pause training of AI systems more powerful than GPT-4, including the GPT-5 said to be in training.
  • 🔐 OpenAI plans to solve the "superalignment" problem before 2027 to ensure AGI can be released safely.
  • 🔜 The video suggests a new GPT model will be released each year, and the model after GPT-7 may be the AGI system.

Q & A

  • What key information does the document about OpenAI's plan to create AGI by 2027 reveal?

    - The document claims OpenAI began training a 125-trillion-parameter multimodal model in August 2022, with the first stage named Q*; training finished in December 2023, but the launch was cancelled due to high inference costs. It also suggests OpenAI's long-term plan is to release a new model each year, reaching AGI by 2027.

  • What are the "Q*", "Arrakis", and "Gobi" models mentioned in the document?

    - Q*, Arrakis, and Gobi are names of models developed by OpenAI. Work on Arrakis was reportedly dropped because it did not run efficiently enough. These names are treated as part of OpenAI's internal roadmap, indicating the company's research direction and progress.

  • Why was the originally planned GPT-5 model cancelled or renamed?

    - The document does not explain the specific reasons in detail, but it implies the change was tied to strategy adjustments and technical challenges during development, possibly problems with inference cost, performance expectations, or technical breakthroughs.

  • What effect has Elon Musk's lawsuit against OpenAI had on its plans?

    - According to the document, Musk's lawsuit caused delays to OpenAI's plans, in particular affecting the development and release of GPT-6 and GPT-7. The suit is based on concerns that OpenAI has strayed from its open-source goals and from its commitment to make advanced technology openly available to the public.

  • What is AGI, and how does it differ from current AI technology?

    - AGI (artificial general intelligence) refers to AI that can perform any intellectual task at a level comparable to human intelligence. The main difference from current AI, which usually targets specific tasks, is its generality and flexibility: AGI can understand and carry out a wide variety of complex tasks without task-specific training.

  • How does the document define human-level AGI, and what potential impact could its arrival have on society?

    - The document defines human-level AGI as an AI that can do any intellectual task a smart human can. Achieving it could transform society, including the economy, employment, education, and technological development, and it raises broad concerns about safety, ethics, and social impact.

  • Why is parameter count considered a key predictor of AI performance?

    - Parameter count is treated as a key predictor because, in general, more parameters mean a greater ability to process and understand complex data. Parameter count is analogized to the number of synapses in a biological brain and used to estimate a model's complexity and potential level of intelligence.

  • How does OpenAI approach human-brain-level performance through parameter count and data volume?

    - By scaling up models' parameter counts and training them on massive amounts of data. By mimicking the way synapses connect neurons in the human brain, and by training on vast datasets, OpenAI aims to create AI models that can emulate human cognition.

  • What are the "Chinchilla scaling laws" mentioned in the document, and how do they affect model training?

    - Based on DeepMind research, the Chinchilla scaling laws show that the prevailing way of training models was data-inefficient: training on far more data can significantly improve a model's performance. This finding prompted OpenAI and other AI labs to re-evaluate their training strategies and use more data in pursuit of better results.

  • Why is AI research described as dynamic and fast-moving?

    - Because a large volume of new research and technical breakthroughs is published every day, constantly pushing the limits of AI and reshaping our understanding of what level of intelligence is achievable. This rapid progress requires researchers, developers, and stakeholders to keep up with the latest developments to adapt to a constantly changing technical landscape.
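The Chinchilla result described above can be made concrete with the parametric loss fit from the DeepMind paper, L(N, D) = E + A/N^α + B/D^β. The constants below are the published fitted values; the parameter and token counts in the example are purely illustrative, not figures from the leaked document:

```python
def chinchilla_loss(n_params, n_tokens,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Predicted final training loss under the Chinchilla parametric fit:
    L(N, D) = E + A / N**alpha + B / D**beta."""
    return E + A / n_params**alpha + B / n_tokens**beta

# Same hypothetical 100-trillion-parameter model, trained on 100x more tokens:
under_trained = chinchilla_loss(100e12, 1e12)    # 1 trillion tokens
well_trained  = chinchilla_loss(100e12, 100e12)  # 100 trillion tokens
```

The data term B/D^β shrinks as the token count grows, which is the document's argument that extra training data can compensate for a slightly suboptimal parameter count.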

Outlines

00:00

📄 Introduction

The video opens by discussing a document that purports to reveal OpenAI's plan to create artificial general intelligence (AGI) by 2027. It stresses that the document contains a lot of speculation and is not entirely factual, but flags key claims, such as OpenAI beginning to train a 125-trillion-parameter multimodal model in August 2022 and cancelling its launch due to high inference cost. The narrator urges viewers to treat the information with skepticism while underscoring his interest in OpenAI's future plans.

05:02

🔍 Digging Into the Document

A closer analysis of the plans described in the document, including the claims that GPT-4.5 was renamed GPT-5 and that the original GPT-5, planned for release in 2025, was cancelled. This section discusses other models mentioned, such as Arrakis and Gobi, and their connection to earlier articles about OpenAI. It also covers the claim that the release of GPT-6 (now called GPT-7) was delayed by Elon Musk's lawsuit, along with discussion of OpenAI's open-source goals.

10:03

🚀 Toward AGI

This segment discusses the potential of a 100-trillion-parameter model and whether AI performance can be predicted from parameter count. It cites research showing that performance on a range of tasks improves as parameter count grows, albeit with diminishing returns. It also covers how training on more data could compensate for a 100-trillion-parameter model's performance shortfall, and the engineering techniques OpenAI is said to use to bridge that gap.

15:04

🧠 Further Insights Into the Models

A deeper look at OpenAI's development plans for GPT-4, including misconceptions and clarifications around a 100-trillion-parameter model. It covers OpenAI's methods for achieving higher performance with fewer parameters, the challenge of training models on web data to handle harder problems, early GPT-4 leaks, and how OpenAI responded to those leaks.

20:05

🔧 Model Testing and Expectations

Discusses reports that GPT-4 was tested in October and November 2022 and how those reports line up with earlier leaks. The document also addresses OpenAI's official position on a 100-trillion-parameter GPT-4 and leaked information about video-processing capabilities, illustrating expectations for AI performance, especially around video generation.

25:06

🤖 Robotics and the Road Ahead

This part discusses the potential of applying AI to robotics, especially the importance of visual data and the possibility of handling complex tasks through large-scale model training. It mentions Tesla's decision to rely entirely on vision data for its self-driving technology, and how large models trained on vision and language data could take on robotic tasks, underscoring the trend of training ever-larger models to reach higher levels of AI performance.

30:06

🌐 The Road to AGI and Superalignment

Analyzes OpenAI's long-term plans and goals for creating AGI, particularly achieving closer-to-human performance by scaling parameter count and training data. This section also covers the direction of AI research, the superalignment problem, and prominent researchers' concerns about the pace and potential risks of AI development.

35:07

💡 Overall Picture and Outlook

The final segment examines OpenAI's overall strategy and plan for creating AGI, including the expectation of a new GPT model every year until 2027. It discusses OpenAI's hopes for superalignment techniques and how releasing progressively newer and more capable models could give society and the technology ecosystem time to adapt to AGI, while reflecting on the uncertainty and possible future trajectories of AGI development.

Keywords

💡Artificial General Intelligence (AGI)

Artificial general intelligence refers to an AI system that can perform the full range of intellectual tasks a human can. According to the video, OpenAI's goal is to develop a human-brain-scale AGI model by 2027, with human-like reasoning and learning ability. The video says OpenAI plans to iterate through the GPT model series until GPT-8, or a similar 125-trillion-parameter model, reaches AGI level.

💡GPT (Generative Pre-trained Transformer)

GPT stands for Generative Pre-trained Transformer, the series of large language models developed by OpenAI. The video focuses on the development of GPT-4, 5, 6, and 7, treating them as key steps toward AGI. According to the video, OpenAI plans to release a new GPT system each year, progressively scaling parameters and capability until an AGI-level GPT-8.

💡Parameter Count

Parameter count is the number of trainable parameters in a neural network. The video stresses that increasing parameter count is one of the key levers for improving AI capability, citing research suggesting that when parameter count reaches human-brain scale (roughly 100 trillion parameters), AI performance approaches human level. OpenAI's goal is therefore described as an AGI system with on the order of 100 trillion parameters.
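The brain-scale figures the video keeps returning to come from a simple rule of thumb, neurons × roughly 1,000 connections each. A minimal sketch of that arithmetic (the neuron counts are the rough figures the video uses, not precise neuroscience):

```python
def synapse_estimate(neurons, connections_per_neuron=1_000):
    """Rough synapse count: assume each neuron connects to ~1,000 others."""
    return neurons * connections_per_neuron

# Figures as quoted in the video:
human = synapse_estimate(100e9)  # 100 billion neurons -> ~100 trillion synapses
cat   = synapse_estimate(250e6)  # -> ~250 billion synapses
dog   = synapse_estimate(530e6)  # -> ~530 billion synapses
```

This is how the document equates a 100-trillion-parameter model with a "human-brain-sized" network, with the caveat (noted in the transcript) that synapse count only loosely tracks intelligence.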

💡Compute

Compute refers to the processing power required to train large AI models, including specialized hardware such as GPUs and TPUs. The video notes that scaling up parameter counts demands enormous compute, which OpenAI obtains partly through partnerships with companies such as Cerebras. It also suggests that OpenAI's reported effort to raise $7 trillion may be tied to securing compute for the AGI project.

💡Data

Data is the indispensable fuel for training AI models. The video cites Sam Altman's view that there is enough data on the internet to train an AGI system. It also invokes the "Chinchilla scaling laws": using vastly more data to close the gap between a model's parameter count and the human brain's, thereby boosting performance.

💡Multimodal

Multimodal refers to training AI models on a combination of text, images, video, audio, and other data types. The video suggests OpenAI's planned AGI system will be multimodal, able to handle the many kinds of data found on the internet, in sharp contrast to the early GPT models trained only on text.

💡Superalignment (Superintelligence Alignment)

Superalignment refers to techniques and methods for ensuring that highly intelligent AI systems act in accordance with human intentions and values. The video indicates OpenAI plans to solve the superalignment problem before 2027 to keep an AGI system safe and controllable, which is viewed as one of the key obstacles to releasing AGI.

💡Leaks

The video repeatedly refers to "leaks" about OpenAI's future AI plans. These come from various sources, including insiders, researchers, and entrepreneurs, and describe details of the high-parameter-count models OpenAI is developing, such as parameter counts, training progress, and expected release dates. The video concedes the leaks may not be entirely accurate but argues they line up with OpenAI's official statements in several places.

💡Scaling Laws

Scaling laws describe how an AI model's performance relates to its parameter count, compute, and training data. The video claims OpenAI's original scaling-law estimates were flawed, underestimating the data and compute needed for AGI. DeepMind's "Chinchilla" research changed that understanding, so OpenAI adjusted its development plan and decided to substantially increase the compute and training data required.
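The revised (Chinchilla) scaling laws are often summarized as training on roughly 20 tokens per parameter, with total training compute approximated by C ≈ 6·N·D FLOPs. A sketch of that rule of thumb; the 100-trillion-parameter input is the document's claimed figure, not a confirmed training run:

```python
def chinchilla_budget(n_params, tokens_per_param=20):
    """Rule-of-thumb compute-optimal budget: D ~ 20*N training tokens,
    total training compute C ~ 6*N*D FLOPs."""
    n_tokens = tokens_per_param * n_params
    flops = 6 * n_params * n_tokens  # standard C ~ 6*N*D estimate
    return n_tokens, flops

tokens, flops = chinchilla_budget(100e12)  # hypothetical 100T-parameter model
# -> 2e15 tokens and 1.2e30 FLOPs
```

Numbers on this scale illustrate why the video connects brain-scale models to extraordinary compute spending.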

💡Risk Awareness

As AGI appears to draw closer, the video shows key figures in AI growing more cautious about its risks. DeepMind's Hassabis, Google's Hinton, and others have issued warnings urging the industry to proceed carefully. The video also mentions the open letter calling for a six-month pause on AGI research, and Elon Musk's lawsuit against OpenAI, reflecting how concern and anticipation about superintelligent AI coexist.

Highlights

There was a recent document that apparently reveals OpenAI's secret plan to create AGI by 2027.

The document states that OpenAI started training a 125 trillion parameter multimodal model called 'qstar' in August 2022, which finished training in December 2023 but the launch was cancelled due to high inference cost.

Multiple AI researchers and entrepreneurs claim to have had inside information about OpenAI training models with over 100 trillion parameters, intended for AGI.

OpenAI's president Greg Brockman stated in 2019 that their plan was to build a human brain sized model within 5 years to achieve AGI.

AI leaders like Demis Hassabis and Geoffrey Hinton have recently expressed growing concerns about the potential risks of advanced AI capabilities.

After the release of GPT-4, the Future of Life Institute released an open letter calling for a 6-month pause on training systems more powerful than GPT-4, including the planned GPT-5.

Sam Altman confidently stated that there is enough data on the internet to create AGI.

OpenAI realized their initial scaling laws were flawed and have adjusted to take into account DeepMind's 'Chinchilla' laws, which show vastly more data can lead to massive performance boosts.

While a 100 trillion parameter model may be slightly suboptimal, OpenAI plans to use the Chinchilla scaling laws and train on vastly more data to exceed human-level performance.

Microsoft invested $10 billion into OpenAI in early 2023, providing funds to train a compute-optimal 100 trillion parameter model.

An OpenAI researcher's note reveals they were working on preventing an AI system called 'qstar' from potentially destructive outcomes.

The document theorizes that OpenAI plans to release a new model each year until 2027, aligning with their 4-year timeline to solve the 'super alignment' problem for safe AGI release.

GPT-7 is speculated to be the last pre-AGI model before GPT-8 achieves full AGI capability in 2027.

Sam Altman mentioned OpenAI can accurately predict model capabilities by training less compute-intensive systems, potentially forecasting the path to AGI.

Altman is reportedly trying to raise $7 trillion, likely to fund the immense compute required for training a brain-scale AGI model using the Chinchilla scaling laws.
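The capability-forecasting claim above (predicting a large model's behavior from smaller, cheaper training runs) is essentially power-law extrapolation: fit loss against compute on small runs, then extend the fit. A minimal sketch; the (compute, loss) pairs below are illustrative made-up values, not OpenAI data:

```python
import numpy as np

# Hypothetical (training FLOPs, final loss) pairs from small training runs.
compute = np.array([1e18, 1e19, 1e20, 1e21])
loss    = np.array([3.10, 2.57, 2.14, 1.78])

# A power law loss ~ c * compute**b is linear in log-log space,
# so fit log(loss) = a + b*log(compute). polyfit returns [slope, intercept].
b, a = np.polyfit(np.log(compute), np.log(loss), 1)

def predict_loss(c):
    """Extrapolate the fitted trend to a much larger compute budget c (FLOPs)."""
    return np.exp(a + b * np.log(c))
```

The slope `b` is negative, so the fit predicts ever-lower loss at larger budgets; whether such extrapolations really hold across many orders of magnitude is exactly what the document speculates about.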

Transcripts

00:00

so there was a recent document that

00:02

actually apparently reveals open ai's

00:05

secret plan to create AGI by 2027 now

00:08

I'm going to go through this document

00:10

with you Page by Page I've read it over

00:12

twice and there are some key things that

00:13

actually did stand out to me so without

00:16

further Ado let's not waste any time and

00:18

of course just before we get into this

00:19

this is of course going to contain a lot

00:23

of speculation remember that this

00:24

document isn't completely 100% factual

00:27

so just take this video with a huge grain of

00:30

salt so you can see here the document

00:32

essentially says revealing open ai's

00:34

plan to create AGI by 2027 and that is a

00:37

rather important date which we will come

00:39

back to if we look at this first thing

00:41

you can see there's an introduction okay

00:42

and of course remember like I said there

00:44

is a lot of speculation in this document

00:46

there are a lot of different facts and

00:47

of course like I said anyone can write

00:49

any document and submit it to um you

00:51

know Twitter or Reddit or anything but I

00:53

think this document does contain a

00:55

little bit more than that so it starts

00:57

out by stating that in this document I

00:59

will be revealing information I have

01:01

gathered regarding OpenAI's delayed

01:03

plans to create human level AGI by 2027

01:07

not all of it will be easily verifiable

01:09

but hopefully there's enough evidence to

01:11

convince you summary is basically that

01:13

openai has started training a 125

01:16

trillion parameter multimodal model in

01:19

August of 2022 and the first stage was

01:22

Arrakis also called qstar and the model

01:24

finished training in December of 2023

01:27

but the launch was cancelled due to the

01:29

high inference cost

01:30

and before you guys think it's just

01:31

document with like just words I'm going

01:33

to show you guys later on like all of

01:35

the crazy kind of stuff that is kind of

01:38

verifiable that does actually um line up

01:40

with some of the stuff that I've seen as

01:42

someone that's been paying attention to

01:43

this stuff so this is literally just the

01:45

introduction um the juicier stuff does

01:47

come later but essentially they actually

01:49

talk about the and this is just like an

01:50

overview so you're going to want to

01:51

continue watching they essentially state

01:53

that you know this is the original GPT 5

01:55

which was planned for release in 2025

01:57

Gobi GPT 4.5 has been renamed to

02:00

GPT 5 because the original GPT 5 has

02:02

been cancelled now I got to be honest

02:03

this paragraph here is a little bit

02:05

confusing um but I do want to say that

02:07

the words Arrakis and the words Gobi are

02:09

definitely models that were referred to

02:12

by several articles that were referring

02:14

to leaks from OpenAI and I think they

02:17

were actually on The Information so this

02:19

is some kind of stuff that I didn't

02:21

really hear that much about but the

02:22

stuff that I did hear was pretty crazy

02:25

so um this Arrakis and this Gobi thing

02:27

although you might not have heard a lot

02:28

about it of course it is like like a

02:30

kind of like half and half leak but like

02:31

I was saying this stuff is kind of true

02:33

so you can see here open AI dropped work

02:35

on a new Arrakis model in rare AI setback

02:38

and this one actually just talks about

02:40

um how by the middle of open AI you know

02:42

scrapping an Arrakis launch after it

02:43

didn't run as efficiently so there's

02:45

actually some references to this but a

02:47

lot of the stuff is a little bit

02:48

confusing but we're going to get on to

02:49

the main part of this story now I just

02:51

wanted to include that just to show you

02:52

that you know these names aren't made up

02:54

because if I was watching this video for

02:55

the first time and I hadn't seen some of

02:56

the prior articles before I'd be

02:58

thinking what on Earth is Arrakis what on

03:00

Earth is Gobi I've only heard about qstar

03:02

so essentially let's just take a look

03:04

and it says the next stage of qstar

03:06

originally GPT 6 but since renamed

03:09

gpt7 originally for release in 2026 has

03:12

been put on hold because of the recent

03:14

lawsuit by Elon Musk if you haven't been

03:16

paying attention to the space

03:17

essentially Elon Musk just filed a lawsuit

03:19

released a video yesterday um stating

03:21

that OpenAI have strayed far too long

03:23

from their goals and if they are

03:25

creating some really advanced technology

03:26

the public do deserve to have it open

03:28

source because that was their goal um

03:30

and essentially you can see here it says

03:32

qstar GPT-8 planned to be released in 2027

03:36

achieving full AGI and one thing that I

03:38

do want to say about this because

03:39

essentially they're stating that you

03:40

know they're doing this up to gpt7 and

03:42

then after gpt7 they do get to AGI one

03:45

thing that I do think okay and I'm going

03:47

to come back to this as well is that the

03:48

dates kind of do line up and I say kind

03:51

of because not like 100% because we

03:53

don't know but presuming let's just

03:55

presume okay because GPT 4 was released

03:58

in 2023 right um let's just say you know

04:01

every year release a new model okay um

04:03

that would mean that you know in 2024 we

04:05

would get gbt 5 in 2025 we get GPT 6 in

04:08

2026 we would get GPT 7 and in 2027 we

04:12

would get GPT 8 which is of course AGI

04:14

now one thing I do think about this that

04:16

is kind of interesting and remember I'm

04:17

going to come back to this so pay

04:19

attention Okay what I'm basically saying

04:21

is that if OpenAI are consistent with

04:23

their year releases so for example if

04:25

they are going to release a new model

04:26

every year and if we continue at the

04:28

same rate like a new GPT every single

04:30

year which is possible him stating that

04:31

GPT 7 being the last release before GPT 8

04:35

which is Agi does actually kind of make

04:37

sense because once again and I know you

04:38

guys are going to hate this but if we

04:40

look at the trademarks okay remember

04:41

that they trademarked this around the

04:43

same time okay around that 2023 time

04:45

when all of this crazy stuff was going

04:46

on and I think it's important to note as

04:49

well is that like there's no GPT 8 you

04:52

might might argue that if they're going

04:53

to use all the GPT names why wouldn't

04:55

they just trademark GPT 8 and I think

04:57

maybe because like the document States

04:59

the model after gpt7 could be AGI and

05:02

I'm going to give you guys another

05:03

reason on top of that um another reason

05:05

is and I'm going to show you guys that

05:06

later on in the video but essentially um

05:08

OpenAI's timeline on super alignment

05:10

actually does coincide with this Theory

05:12

which is a little bit coincidental of

05:14

course like I said pure speculation

05:15

could be completely false OpenAI like I

05:17

said before can go ahead and completely

05:19

change their entire plans you know they

05:21

can go ahead and drop two models in one

05:23

year the point I'm trying to make is

05:24

that um certain timelines do align but

05:26

just remember this because I'm going to

05:27

come back to this because of some

05:28

documents stuff that you're going to see

05:30

in this document at the end of the video

05:31

so anyways um you know it says Elon Musk

05:33

caused a delay because of his lawsuit

05:35

this is why I'm revealing the information

05:36

now because no further harm can be done

05:38

so I guess Elon musk's lawsuit has kind

05:40

of um you know if you wanted bought you

05:43

some time so he says I've seen many

05:45

definitions of AGI artificial general

05:46

intelligence but I will Define AGI

05:48

simply as an artificial general

05:49

intelligence that can do any

05:50

intellectual task a smart human can this

05:52

is how most people Define the term now

05:54

2020 was the first time that I was

05:56

shocked by an AI system so this is just

05:57

some um of course you know talk about

06:00

his experience with you know AI systems

06:02

I'm guessing the person who wrote this

06:03

but you know AGI if you don't know AGI

06:05

is like an AI system that can do any

06:06

task human can but one thing that is

06:08

important to discern is that you know

06:10

AGI there was a recent paper that

06:11

actually talks about the levels of AGI

06:13

and I think it's important to remember

06:15

that AGI isn't just you know one AI that

06:17

can do absolutely everything there are

06:18

going to be levels to this AGI system

06:20

that we've seen so far and in this paper

06:21

levels of AGI they actually talk about

06:23

how you know we're already at emerging

06:25

AGI which is you know emerging which is

06:27

equal or somewhat better than an

06:28

unskilled human so we are at level one

06:30

AGI and then of course we've got um you

06:33

know competent AGI which is going to be

06:34

at least the 50th percentile of

06:37

skilled adults and that's competent AGI

06:38

that's not yet achieved and then of

06:40

course we've got expert AGI which is

06:41

90th percentile of skilled adults which

06:43

is not yet achieved then we've got

06:45

virtuoso AGI which is not yet achieved

06:47

which is 99th percentile of all skilled

06:50

adults and then we've got artificial

06:52

super intelligence which is just 100% so

06:54

I think it's important to understand

06:55

that there are these levels to AGI

06:57

because once someone says AGI I mean is

06:58

it 99th percentile can it do like half you know

07:00

it's like it's just it's just pretty

07:02

confusing but I think this is uh a

07:03

really good framework for actually

07:05

looking at the definition because trust

07:06

me it's an industry standard but it is

07:08

very very confusing so here's where he

07:09

basically says that you know um you know

07:11

GPT 3.5 which powered the famous chat

07:14

GPT and of course gpt3 which was not

07:16

the successor but the predecessor of 3.5

07:19

it says you know these were a massive

07:20

step forward towards AGI but the note is

07:22

you know gpt2 and all chatbots since

07:24

Eliza had no real ability to respond

07:26

coherently which made gpt3 such a massive

07:28

leap and of course this is where we talk

07:29

about parameter count and of course he

07:31

says deep learning is a concept that

07:32

essentially goes back to the beginning

07:33

of AI research in the 1950s the first neural

07:36

network was created in the 50s yada yada yada so

07:38

basically this is where he's giving the

07:39

description of a parameter and he says

07:41

you may already know but to give a brief

07:42

digestible summary it's analogous to a

07:45

synapse in a biological brain which is a

07:47

connection between neurons and each

07:48

neuron in a biological brain has roughly

07:50

a thousand connections to other neurons

07:52

obviously digital networks are dissimilar to

07:54

biological brains basically saying that

07:55

you know of course we're comparing them

07:56

but different but um how many synapses

07:59

or parameters are in a human brain the

08:01

most commonly cited figure for synapse

08:02

count in the brain is roughly 100

08:04

trillion which would mean each of the

08:06

100 billion neurons in the human brain has

08:08

roughly 1,000 connections and remember um

08:10

this number 100 trillion because it's

08:12

going to actually be a very very big

08:15

number that uh you do need to remember

08:17

so of course you can see here the human

08:18

brain consists of 100 billion neurons and

08:20

over 100 trillion synaptic connections

08:22

okay and essentially this is trying to

08:24

you know um just compare the similarities

08:27

between parameters and synapses so

08:29

essentially stating here that you know if

08:30

each neuron in a brain and trust me guys

08:32

this is just all going to come into

08:34

everything like I know you guys might be

08:35

thinking what is the point of talking

08:36

about this I just want to hear about qar

08:38

but just trust me all of this stuff it

08:40

does actually make sense like I've read

08:41

this a lot of times so I'm going to skip

08:43

some pages but the pages I'm talking

08:44

about now just trust me guys you're

08:45

going to want to read them it basically

08:47

says here if each neuron in a brain has

08:48

a thousand connections this means a cat has

08:50

roughly 250 billion synapses and a dog

08:53

has roughly 530 billion synapses synapse

08:55

count generally seems to predict

08:56

intelligence with a few exceptions for

08:58

instance elephants technically have a higher

08:59

synapse count than humans but yet

09:01

display lower intelligence of course

09:03

basically here's where he's actually

09:04

talking about how you know the simplest

09:05

explanation for larger synapse counts

09:07

with lower intelligence is a smaller

09:09

amount of quality data and from an

09:10

evolutionary perspective brains are

09:12

quote unquote trained on billions of

09:14

years of epigenetic data and human

09:16

brains evolve from higher quality

09:17

socialization communication data than

09:19

elephants leading to our Superior

09:20

ability to reason but the point he's

09:22

trying to make here is that you know

09:23

while there are nuances that you know

09:25

don't make sense synapse count is

09:27

definitely important and I think we've

09:28

definitely seen that um with the

09:30

similarities in the parameter size with

09:33

the explosion of llms and what we've

09:35

seen in these multimodal models and

09:36

their capabilities and it says again the

09:38

explosion in a capabilties since the

09:40

early 2010s has been a result of far

09:41

more computing power and far more data

09:43

gpt2 had 1.5 billion connections which

09:46

is less than a mouse's brain and gpt3

09:48

had 175 billion connections which is getting

09:50

somewhat closer to a cat's brain and

09:52

obviously it's intuitively obvious that

09:54

an AI system the size of a cat's brain

09:56

would be superior to a system than the

09:58

size of a mouse's brain so so here's

09:59

where things start to get interesting so

10:00

he says in 2020 after the release of the

10:03

175 billion parameter gpt3 many

10:05

speculated about the potential

10:07

performance of a Model 600 times larger

10:09

at 100 trillion parameters just remember

10:11

this number because this number is about

10:13

to just keep you know repeating in your

10:15

head and of course he says the big

10:16

question is is it possible to predict AI

10:18

performance by parameter count and as it

10:20

turns out the answer is yes as you'll

10:21

see on the next page and this is where

10:22

he actually references this article

10:25

which is called extrapolating GPT and

10:26

performance by Lanrian and it was on

10:29

LessWrong written in 2022 and basically it

10:31

talks about how as you scale up in

10:33

parameter count you approach Optimal

10:34

Performance so essentially this graph

10:36

seems to be illustrating the

10:37

relationship between neural networks

10:39

measured by the number of parameters

10:40

which can be thought of as the strength

10:42

of connections between neurons and their

10:44

performance on various tasks and these

10:46

tasks included language related

10:47

challenges like translation read and

10:49

comprehension and question and answering

10:51

among others and the performance on

10:52

these task is measured in the vertical

10:54

axis higher values indicating better

10:56

performance and the graph shows that as

10:58

the number of parameters in increases

10:59

the performance on these tasks also

11:01

tends to but of course it does have

11:02

diminishing returns As you move right

11:04

because the curves actually do tend to

11:06

Plateau as they reach the higher

11:07

parameter counts of course the various

11:09

colors on this chart just essentially

11:11

represent different tasks and each dot

11:12

on those lines represents a neural

11:14

network model of a certain size and

11:15

certain parameter count being tested on

11:18

that you can see right down here this is

11:20

where the trained G you can see the gbt

11:22

performance so it says flops used to train gpt3

11:25

and then of course you can see right

11:26

here this is apparently the number of

11:28

synapses in the brain and just remember the

11:31

number 100 trillion or 200 trillion

11:32

because it's going to be really

11:33

important so it then says as Lanrian

11:37

illustrated extrapolations show that AI

11:39

performance inexplicably seems to reach

11:41

human level at the same time as a human

11:43

level brain size is matched with the

11:45

parameter count his count for the

11:46

synapse the brain is roughly 200

11:48

trillion parameters as opposed to the

11:49

commonly cited 100 trillion figure but

11:51

the point still stands 100 trillion

11:53

parameters is remarkably close to

11:55

Optimal by the way an important thing to

11:57

note is that although 100 trillion is

11:58

slightly suboptimal in performance there

12:00

is an engineering technique that openi

12:02

is using to bridge this gap and I'll

12:04

explain this towards the very end of

12:05

this document because it is crucial to

12:06

what OpenAI is building and Lanrian's post is

12:09

one of many similar posts online it's an

12:11

extrapolation of Performance Based on

12:12

the jump between previous models and

12:13

open ey certain has much more detailed

12:16

metrics and they've come to the same

12:17

conclusion as Lanrian as I'll show later

12:20

in this document so if AI performance is

12:22

predictable based on parameter count and

12:25

100 trillion parameters is enough for

12:27

human level performance when will 100