Welcome to the GPT-4 Developer Demo Livestream.
欢迎来到 GPT-4 开发者演示直播。
Honestly, it's kind of hard for me to believe that this day is here.
老实说,我有点难以相信这一天会到来。
OpenAI has been building this technology really since we started the company, but for the past two years, we've been really focused on delivering GPT-4. That started with rebuilding our entire training stack, Actually training the model, and then seeing what it was capable of.
OpenAI 自我们成立公司以来就一直在构建这项技术,但在过去的两年里, 我们一直专注于交付 GPT-4。 首先是重建我们的整个训练堆栈,实际训练模型, 然后查看它的能力。
Trying to figure out its capabilities, its risks, working with partners in order to test it in real world scenarios.
试图找出它的能力、它的风险,与合作伙伴合作, 以便在现实世界的场景中对其进行测试。
Really tuning its behavior, optimizing the model, getting it available so that you can use it.
真正调整它的行为,优化模型,让它可用,这样你就可以使用它了。
And so today, our goal is to show you a little bit of how to make GPT-L4 shine.
所以今天,我们的目标是向您展示如何让 GPT-L4 大放异彩。
How to really get the most out of it, where its weaknesses are, where we're still working on it, and Just how to really use it as a good tool, a good partner.
如何真正充分利用它,它的弱点在哪里, 我们仍在努力的地方,以及如何真正将它用作一个好工具,一个好伙伴。
So if you're interested in participating in the stream, if you go to our Discord, so it's discord.
所以如果你有兴趣参与直播,如果你去我们的 Discord,那就是 discord。
gg slash OpenAI, there's comments in there and we'll take a couple of audience suggestions.
gg slash OpenAI,里面有评论,我们会听取一些听众的建议。
So the first thing I want to show you is the first task that GPT-4 could do that we never really got 3.5 to do.
因此,我想向您展示的第一件事是 GPT-4 可以完成的第一项任务,而我们从未真正让 3.5 完成。
The way to think about this is all throughout training, that you're constantly doing all this work.
考虑这一点的方法是贯穿整个训练过程,你一直在做所有这些工作。
It's 2 AM, the pager goes off, You fix the model, and you're always wondering, is it going to work?
现在是凌晨 2 点,传呼机响了,你修好了模型,然后你一直在想,它能用吗?
Is all of this effort actually going to pan out?
所有这些努力真的会成功吗?
So we all had a pet task that we really liked, and that we would all individually be trying to see is the model capable of it now?
所以我们都有一个我们真正喜欢的宠物任务,我们都会单独尝试看看这个模型现在有能力吗?
I'm going to show you the first one that we had a success for 4, but never really got there for 3.5. So I'm just going to copy the top of our blog post from today, going to paste it into our playground.
我将向您展示第一个, 我们在 4 上取得了成功, 但在 3.5 上从未真正实现。 因此,我将从今天开始复制我们博客文章的顶部,并将其粘贴到我们的 playground 中。
Now, this is our new chat completions playground that came out two weeks ago.
现在,这是我们两周前推出的新聊天完成游乐场。
I'm going to show you first with GPT 3.5, 4 has the same API to it, the same playground.
我将首先向您展示 GPT 3.5,4 具有相同的 API,相同的游乐场。
The way that it works is you have a system message where you explain to the model what it's supposed to do, and we've made these models very steerable.
它的工作方式是你有一个系统消息, 你可以在其中向模型解释它应该做什么,我们已经使这些模型非常易于操纵。
So you can provide it with really any instruction you want, whatever you dream up.
所以你可以为它提供任何你想要的指令,无论你想什么。
The model will adhere to it pretty well, and in the future, it will get increasingly, increasingly powerful at steering the model very reliably.
该模型将很好地遵守它,并且在未来,它将在非常可靠地操纵模型方面变得越来越强大。
You can then paste whatever you want as a user, the model will return messages as an assistant.
然后您可以作为用户粘贴任何您想要的内容,该模型将作为助手返回消息。
The way to think of it is that we're moving away from just raw text in, raw text out, where you can't tell where different parts of the conversation come from, but towards this much more structured format that gives the model the opportunity to know, well, this is the user asking me to do something that the developer didn't intend, I should listen to the developer here.
思考它的方式是, 我们正在远离原始文本输入,原始文本输出, 在这种情况下你无法分辨对话的不同部分来自哪里,而是转向这种结构化程度更高的格式, 它为模型提供了有机会知道,嗯,这是用户让我做一些开发者无意的事情,我应该在这里听开发者的。
So now, time to actually show you the task that I'm referring to.
所以现在,是时候实际向您展示我所指的任务了。
So everyone's familiar with, summarize.
所以大家熟悉了,总结一下。
This is an article into a sentence, getting a little more specific, but where every word begins with G. So this is 3.5. Let's see what it does.
这是一篇文章变成一个句子, 更具体一点,但每个单词都以 G 开头。 所以这是 3.5。 让我们看看它做了什么。
Yeah, it didn't even try, just gave up on the task.
是的,它甚至没有尝试,只是放弃了任务。
This is pretty typical for 3.5, trying to do this particular task.
这对于 3.5 来说是非常典型的,试图完成这个特定的任务。
If it's a very stilted article or something like that, maybe it can succeed, but for the most part, 3.5 just gives up.
如果是很矫情的文章之类的,说不定还能成功,但大部分3.5就直接放弃了。
But let's try the exact same prompt, the exact same system message in GPT-4. So borderline, whether you want to count AI or not, but so let's say AI doesn't count.
但是让我们尝试完全相同的提示,GPT-4 中完全相同的系统消息。所以分界线,不管你想不想算人工智能,但假设人工智能不算数。
That's cheating.
那是作弊。
So fair enough.
太公平了。
The model happily accepts my feedback.
该模型愉快地接受了我的反馈。
So now to make sure it's not just good for Gs, I'd like to turn this over to the audience.
所以现在为了确保它不仅对 Gs 有好处,我想把它交给观众。
I'll take a suggestion on what letter to try next.
我会就接下来要尝试的字母提出建议。
In the meanwhile, while I'm waiting for our moderators to pick the lucky letter, I will give a try with A. But in this case, I'll say GPT-4 is fine.
与此同时,在等待我们的版主挑选幸运字母的同时,我会尝试使用 A。 但在这种情况下,我会说 GPT-4 没问题。
Why not?
为什么不?
Also, pretty good summary.
另外,总结得很好。
So I'll hop over to our Discord.
所以我会跳到我们的 Discord。
All right.
好的。
Wow.
哇。
People are being a little ambitious here.
人们在这里有点雄心勃勃。
I'm really trying to put the model through the paces.
我真的在努力让模型走上正轨。
We're going to try Q, which if you think about this for a moment, I want the audience to really think about how would you do a summary of this article that all starts with Q?
我们将尝试使用 Q, 如果您稍微考虑一下,我希望听众真正考虑如何对这篇以 Q 开头的文章进行总结?
It's not easy.
这并不容易。
It's pretty good.
这个很不错。
That's pretty good.
那很好。
All right.
好的。
So I've shown you summarizing an existing article.
所以我已经向您展示了对现有文章的总结。
I want to show you how you can flexibly combine ideas between different articles.
我想向您展示如何在不同文章之间灵活组合想法。
So I'm going to take this article that was on Hacker News yesterday, copy-paste it into the same conversation.
所以我打算把昨天在 Hacker News 上发表的这篇文章复制粘贴到同一个对话中。
So it has all the context of what we were just doing.
所以它具有我们刚刚所做的所有上下文。
I'm going to say, find one common theme between this article and the GPT-4 blog.
我要说的是,在本文和 GPT-4 博客之间找到一个共同主题。
So this is an article about Pinecone, which is a Python web app development framework, and it's making the technology more accessible, user-friendly.
所以这是一篇关于 Pinecone 的文章,它是一个 Python 网络应用程序开发框架,它使该技术更易于访问,对用户更友好。
If you don't think that was insightful enough, you can always give some feedback and say, that was not insightful enough.
如果您认为这不够有见地,您可以随时提供一些反馈并说,这还不够有见地。
Please.
请。
No, I'll just even just leave it there.
不,我什至会把它留在那里。
Leave it up to the model to decide.
留给模型来决定。
So bridging the gap between powerful technology and practical applications seems not bad.
因此,弥合强大技术与实际应用之间的差距似乎还不错。
Of course, you can ask for any other kind of task you want using its flexible language, understanding, and synthesis.
当然,您可以使用其灵活的语言、理解和综合来要求您想要的任何其他类型的任务。
You can ask for something like, now turn the GPT-4 blog post into a rhyming poem.
你可以要求类似的东西,现在把 GPT-4 博客文章变成一首押韵诗。
Picked up on OpenAI evals, open source for all, helping to guide answering the call.
接受 OpenAI 评估,为所有人开源,帮助指导接听电话。
Which by the way, if you'd like to contribute to this model, please give us evals.
顺便说一下,如果您想为这个模型做出贡献,请给我们评估。
We have an open source evaluation framework that will help us guide and all of our users understand what the model is capable of and to take it to the next level.
我们有一个开源评估框架,它将帮助我们指导和我们所有的用户了解模型的功能并将其提升到一个新的水平。
So there we go.
所以我们开始了。
This is consuming existing content using GPT-4 with a little bit of creativity on top.
这是使用 GPT-4 消耗现有内容,再加上一点点创造力。
But next, I want to show you how to build with GPT-4, what it's like to create with it as a partner.
但接下来,我想向您展示如何使用 GPT-4 进行构建,以及作为合作伙伴使用它进行创建的感觉。
So the thing we're going to do is we're going to actually build a Discord bot.
所以我们要做的是实际构建一个 Discord 机器人。
I'll build it live and show you the process, show you debugging, show you what the model can do, where its limitations are, and how to work with them in order to achieve new heights.
我将实时构建它并向您展示过程,向您展示调试, 向您展示模型可以做什么,它的局限性在哪里, 以及如何与它们一起工作以达到新的高度。
So the first thing I'll do is tell the model that this time, it's supposed to be an AI programming assistant.
所以我要做的第一件事就是告诉模型,这次它应该是一个 AI 编程助手。
Its job is to write things out in pseudocode first and then actually write the code.
它的工作是先用伪代码写出内容,然后再实际编写代码。
This approach is very helpful to let the model break down the problem into smaller pieces.
这种方法非常有助于让模型将问题分解成更小的部分。
Then that way, you're not asking it to just come up with a super hard solution to a problem all in one go.
这样一来,您就不会要求它一次性想出一个解决问题的超难解决方案。
It also makes it very interpretable because you can see exactly what the model was thinking and you can even provide corrections if you'd like.
它还使它非常易于解释,因为您可以准确地看到模型在想什么,如果您愿意,您甚至可以提供更正。
So here is the prompt that we're going to ask it.
所以这是我们要问的提示。
This is the thing that 3.5 would totally choke on if you've tried anything like it.
如果你尝试过类似的东西,这就是 3.5 会完全窒息的事情。
But so we're going to ask for a Discord bot that uses the GPT-4 API to read images and text.
但是,我们将要求使用 GPT-4 API 来读取图像和文本的 Discord 机器人。
Now, there's one problem here, which is this model's training cutoff is in 2021, which means it has not seen our new chat completions format.
现在,这里有一个问题,就是这个模型的训练截止日期是 2021 年,这意味着它还没有看到我们新的聊天完成格式。
So I literally just went to the blog post from two weeks ago, copy-pasted from the blog post including the response format.
所以我真的只是去了两周前的博客文章,从博客文章中复制粘贴, 包括响应格式。
It has not seen the new image extension to that, and so I just wrote that up in just very minimal detail about how to include images.
它没有看到新的图像扩展,所以我只是写了关于如何包含图像的非常简单的细节。
Now, the model can actually leverage that documentation that it did not have memorized, but it does not know.
现在, 该模型实际上可以利用它没有记住的文档,但它不知道。
In general, these models are very good at using information that it's been trained on in new ways and synthesizing new content.
一般来说,这些模型非常擅长使用以新方式训练过的信息并合成新内容。
You can see that right here that it actually wrote an entirely new bot.
您可以在这里看到它实际上编写了一个全新的机器人。
Now, let's actually see if this bot is going to work in practice.
现在,让我们实际看看这个机器人是否会在实践中发挥作用。
So you should always look through the code to get a sense of what it does.
因此,您应该始终通读代码以了解它的作用。
Don't run untrusted code from humans or from AIs.
不要运行来自人类或 AI 的不受信任的代码。
One thing to note is that the Discord API has changed a lot over time, and particularly that there's one feature that has changed a lot since this model was trained.
需要注意的一件事是, 随着时间的推移, Discord API 发生了很大变化,特别是自训练该模型以来, 有一项功能发生了很大变化。
Give it a try.
试一试。
In fact, yes, we are missing the intense keyword.
事实上,是的,我们缺少 intense 关键字。
This is something that came out in 2020. So the model does know it exists, but it doesn't know which version of the Discord API we're using.
这是 2020 年出现的东西。 所以模型确实知道它存在,但它不知道我们使用的是哪个版本的 Discord API。
So are we out of luck?
那么我们运气不好吗?
Well, not quite.
好吧,不完全是。
We can just simply paste to the model exactly the error message.
我们可以简单地将错误消息准确地粘贴到模型中。
Not even going to say, hey, this is from running your code, could you please fix it?
甚至不会说,嘿,这是运行你的代码,你能修复它吗?
We'll just let it run.
我们就让它运行吧。
The model says, oh yeah, whoops, the intense argument.
模特说,哦,是的,哎呀,激烈的争论。
Here's the correct code.
这是正确的代码。
Now, let's give this a try.
现在,让我们试一试。
Once again, making sure that we understand what the code is doing.
再次确保我们了解代码的作用。
Now, a second issue that can come up is it doesn't know what environment I'm running in.
现在,可能出现的第二个问题是它不知道我在什么环境中运行。
If you notice, it says, hey, here's this inscrutable error message, which if you've not used Jupyter Notebook a lot with AsyncIO before, you probably have no idea what this means.
如果您注意到, 它会说,嘿,这是一条难以理解的错误消息,如果您之前没有大量使用 Jupyter Notebook 和 AsyncIO,您可能不知道这意味着什么。
But fortunately, once again, you can just say to the model, hey, I'm using Jupyter and would like to make this work, and you fix it.
但幸运的是, 再一次,你可以对模型说,嘿,我正在使用 Jupyter, 我想让它工作,然后你修复它。
The specific problem is that there's already an event loop running, so you need to use this NestAsyncIO library.
具体问题是已经有一个事件循环在运行,所以你需要使用这个 NestAsyncIO 库。
You need to call NestAsyncIO.
您需要调用 NestAsyncIO。
apply.
申请。
The model knows all of this, correctly instantiates all of these pieces into the bot.
该模型知道所有这些,正确地将所有这些部分实例化到机器人中。
It even helpfully tells you, oh, you're running in Jupyter.
它甚至会很有帮助地告诉您,哦,您正在 Jupyter 中运行。
Well, you can do this bang, pip, install in order to install the package if you don't already have it.
好吧,如果您还没有这个包,您可以执行 bang、pip、install 来安装它。
That was very helpful.
这很有帮助。
So now, we'll run and it looks like something happened.
所以现在,我们要跑了,看起来好像发生了什么事。
So the first thing I'll do is go over to our Discord, and I will paste in a screenshot of our Discord itself.
所以我要做的第一件事就是转到我们的 Discord,然后我将粘贴我们 Discord 本身的屏幕截图。
So remember, GPT-4 is not just a language model, it's also a vision model.
所以请记住,GPT-4 不仅仅是一种语言模型,它还是一种视觉模型。
In fact, it can flexibly accept inputs that intersperse images and text arbitrarily like a document.
事实上,它可以像文档一样灵活地接受任意穿插图像和文本的输入。
Now, the image feature is in preview.
现在,图像功能处于预览状态。
So this is going to be a little sneak peek.
所以这将是一个小偷窥。
It's not yet publicly available.
它尚未公开。
It's something we're working with one partner called BeMyEyes, in order to really start to develop it and get it ready for primetime.
这是我们正在与一个名为 BeMyEyes 的合作伙伴合作的东西,以便真正开始开发它并为黄金时段做好准备。
But you can ask anything you like.
但你可以问任何你喜欢的。
For example, I'll say, GPT-4, hello world.
例如,我会说,GPT-4,你好世界。
Can you describe this image in painstaking detail?
你能详细描述这个图像吗?
First of all, think of how you would do this yourself.
首先,想想你自己会怎么做。
There's a lot of different things you could latch onto, a lot of different pieces of the system you could describe.
您可以抓住很多不同的东西,可以描述系统的很多不同部分。
We can go over to the actual code and we can see that, yep, we in fact received the message, have formatted an appropriate request for our API.
我们可以查看实际代码, 我们可以看到,是的,我们实际上收到了消息,已经为我们的 API 格式化了适当的请求。
Now, we wait because one of the things we have to do is we have to make the system faster.
现在,我们等待,因为我们必须做的一件事是我们必须使系统更快。
That's one of the things that we're working on optimizing.
这是我们正在努力优化的事情之一。
In the meanwhile, I just want to say to the audience that's watching, we'll take an audience request next.
与此同时, 我只想对正在观看的观众说,接下来我们将接受观众请求。
So if you have an image and a task you'd like to accomplish, please submit that to the Discord.
因此,如果您有想要完成的图像和任务,请将其提交至 Discord。
Our moderators will pick one that will run.
我们的主持人将选择一个将要运行的。
So we can see that the Discord, it looks like we have a response.
所以我们可以看到 Discord,看起来我们有回应。
Perfect.
完美的。
So it's a screenshot of a Discord application interface.
这是 Discord 应用程序界面的屏幕截图。
Pretty good.
不错。
Did not even describe it.
甚至没有描述它。
It knows that it's Discord.
它知道这是 Discord。
It's probably Discord written there somewhere where it just knows this from prior experience.
它可能是 Discord 写在某个地方,它只是从以前的经验中知道这一点。
Server icon labeled GPT-4 describes the interface in great detail.
标有 GPT-4 的服务器图标非常详细地描述了界面。
Talks about all the people telling me that I'm supposed to do queue, very kind audience, and describes a bunch of the notification messages and the users that are in the channel.
谈论所有告诉我我应该排队的人,非常友好的听众, 并描述了一堆通知消息和频道中的用户。
So there you go.
所以你去吧。
That's some pretty good understanding.
这是一些很好的理解。
Now, this next one, if you notice, first of all, we got a post, but the model did not actually see the message.
现在,下一个,如果你注意到的话,首先,我们收到了一个帖子, 但模型实际上并没有看到消息。
So is this a failure of the model or of the system around the model?
那么这是模型的失败还是模型周围系统的失败?
Well, we can take a look.
好吧,我们可以看看。
If you notice here, content is an empty string.
如果您注意到这里,内容是一个空字符串。
We received a blank message contents.
我们收到了一条空白的消息内容。
The reason for this is a dirty trick that we played on the AI.
这是因为我们在 AI 上玩了一个卑鄙的把戏。
So if you go to the Discord documentation, and you scroll through it all the way down to, it's hard for me to even find honestly, to the message content intent.
因此,如果您转到 Discord 文档,然后一直向下滚动到它,老实说,我什至很难找到消息内容的意图。
You'll see this was added as of September 2022 as a required field.
您会看到这是自 2022 年 9 月起作为必填字段添加的。
So in order to receive a message that does not explicitly tag you, you now have to include this new intent in your code.
因此, 为了接收未明确标记您的消息,您现在必须在代码中包含此新意图。
Remember I said, intents have changed a lot over time.
记住我说过,随着时间的推移,意图发生了很大变化。
This is much newer than the model is possibly able to know.
这比模型可能知道的要新得多。
So maybe we're out of luck, we have to debug this by hand.
所以也许我们运气不好,我们必须手动调试它。
But once again, we can try to use GPT-4's language understanding capabilities to solve this.
但是再一次,我们可以尝试使用 GPT-4 的语言理解能力来解决这个问题。
Now, keep in mind, this is a document of like, I think this is like 10,000, 15,000 words, something like that.
现在,请记住,这是一份类似的文件,我认为这大约有 10,000、15,000 字之类的内容。
It's not formatted very well.
它的格式不是很好。
This is literally a command a copy-paste.
这实际上是一个复制粘贴命令。
This is what it's supposed to parse through to find in the middle of that document that, oh yeah, message contents, that's required now.
这是它应该解析的内容, 以便在该文档的中间找到,哦,是的,现在需要的消息内容。
But let's see if it can do it.
但让我们看看它是否能做到。
So we will ask for, I am receiving blank message contents.
所以我们会要求,我收到空白消息内容。
Can you, why could this be happening?
你能,为什么会这样?
How do I fix it?
我如何解决它?
So one thing that's new about GPT-4 is context length.
因此,关于 GPT-4 的一个新事物是上下文长度。
32,000 tokens is the upper limit that we support right now, and the model is able to flexibly use long documents.
32000个token是我们目前支持的上限,模型可以灵活的使用长文档。
It's something we're still optimizing, so we recommend trying it out, but not necessarily really scaling it up just yet unless you have an application that really benefits from it.
这是我们仍在优化的东西,所以我们建议尝试一下, 但不一定要真正扩大它, 除非你有一个真正从中受益的应用程序。
So if you're really interested in long context, please let us know.
因此,如果您真的对长上下文感兴趣,请告诉我们。
We want to see what applications it unlocks.
我们想看看它解锁了哪些应用程序。
But if you see, it says, oh yeah, message content intent was not enabled, and so you can either ask the model to write some code for you, or you could actually just do it the old-fashioned way.
但是如果你看到, 它会说,哦,是的,消息内容意图没有启用,所以你可以让模型为你写一些代码,或者你实际上可以用老式的方式来做。
Either way is fine.
无论哪种方式都可以。
I think that this is an augmenting tool makes you much more productive, but it's still important that you are in the driver's seat and are the manager and knows what's going on.
我认为这是一个增强工具, 可以让你的工作效率更高,但你仍然很重要, 你是司机,是经理,知道发生了什么。
So now we're connected once again, and Boris, would you like to rerun the message?
现在我们再次连接,鲍里斯,你想重新发送消息吗?
Once again, we can see that we have received it, even though the bot was not explicitly tagged.
再一次,我们可以看到我们已经收到了它,即使没有明确标记 bot。
Seems like a pretty good description.
似乎是一个很好的描述。
Interesting.
有趣的。
This is an interesting image actually.
这实际上是一个有趣的图像。
It looks like it's a dolly generated one.
它看起来像是一个推车生成的。
Let's actually try this one as well.
让我们也试试这个。
What's funny about this image?
这张图有什么好笑的?
Oh, it's already been submitted.
哦,已经提交了。
So once again, we can verify that it's making the right API calls.
因此,我们可以再次验证它是否进行了正确的 API 调用。
Squirrels do typically eat nuts.
松鼠通常吃坚果。
We don't expect them to use a camera or act like a human.
我们不希望他们使用相机或像人一样行事。
So I think that's a pretty good explanation of why that image is funny.
所以我认为这很好地解释了为什么这张图片很有趣。
So I'm going to show you one more example of what you can do with this model.
因此,我将再向您展示一个示例,说明您可以使用此模型做什么。
So I have here a nice hand-drawn mock-up of a joke website.
所以我这里有一个漂亮的笑话网站手绘模型。
Definitely worthy of being put up on my refrigerator.
绝对值得放在我的冰箱上。
So I'm just going to take out my phone, literally take a photo of this mock-up, and then I'm going to send it to our Discord.
所以我只是要拿出我的手机,真的给这个模型拍张照片,然后我要把它发送到我们的 Discord。
Going to send it to our Discord.
打算将它发送到我们的 Discord。
This is of course the rockiest part, making sure that we actually send it to the right channel, which in fact I think maybe I did not.
这当然是最困难的部分,确保我们真的把它发送到正确的渠道,事实上我想也许我没有。
Sent it to the wrong channel.
发错频道了。
It's funny, it's always the non-AI parts of these demos that are the hardest part to do.
有趣的是,这些演示中的非 AI 部分始终是最难完成的部分。
Here we go.
开始了。
Technology is now solved, and now we wait.
技术现在已经解决了,现在我们等待。
So the thing that's amazing in my mind is that what's going on here is we're talking to a neural network, and this neural network was trained to predict what comes next.
所以在我看来令人惊奇的是, 这里发生的事情是我们正在与一个神经网络交谈,这个神经网络被训练来预测接下来会发生什么。
It played this game of being shown a partial document, and then predicted what comes next across an unimaginably large amount of content.
它玩了这个被展示部分文件的游戏,然后预测接下来会发生什么, 内容量大得难以想象。
From there, it learns all of these skills that you can apply in all of these very flexible ways.
从那里,它学习所有这些技能,您可以通过所有这些非常灵活的方式应用这些技能。
So we can actually take now this output.
所以我们现在实际上可以得到这个输出。
So literally we just said to output the HTML from that picture.
所以从字面上看,我们只是说从该图片输出 HTML。
Here we go.
开始了。
Actual working JavaScript, filled in the jokes for comparison.
实际工作的 JavaScript,填充了用于比较的笑话。
This was the original of our mock-up.
这是我们模型的原型。
So there you go, going from hand-drawn beautiful art, if I do say so myself, to working website.
所以你去了,从手绘美丽的艺术,如果我自己这么说,到工作网站。
This is all just potential.
这一切都只是潜力。
You can see lots of different applications.
您可以看到许多不同的应用程序。
We ourselves are still figuring out new ways to use this.
我们自己仍在寻找使用它的新方法。
So we're going to work with our partner, we're going to scale up from there, but please be patient because it's going to take us some time to really make this available for everyone.
所以我们将与我们的合作伙伴合作,我们将从那里扩大规模,但请耐心等待, 因为我们需要一些时间才能真正让每个人都可以使用它。
So I have one last thing to show you.
所以我还有最后一件事要给你看。
I've shown you reading existing content.
我已经向您展示了阅读现有内容。
I've shown you how to build with the system as a partner.
我已经向您展示了如何作为合作伙伴使用系统进行构建。
The last thing I'm going to show is how to work with the system to accomplish a task that none of us like to do, but we all have to.
我要展示的最后一件事是如何使用系统来完成一项我们都不喜欢做但我们都必须做的任务。
So you may have guessed, the thing we're going to do is taxes.
所以你可能已经猜到了,我们要做的是税收。
Now, note that GPT is not a certified tax professional nor am I, so you should always check with your tax advisor.
现在请注意, GPT 不是经过认证的税务专业人士,我也不是,因此您应该始终咨询您的税务顾问。
But it can be helpful to understand some dense content, to just be able to empower yourself to be able to solve problems and get a handle on what's happening when you could not otherwise.
但是,理解一些密集的内容可能会有所帮助,从而能够让您自己能够解决问题并掌握正在发生的事情, 而您无法通过其他方式解决问题。
So once again, I'll do a system message.
所以再一次,我会做一个系统消息。
In this case, I'm going to tell it that it's tax GPT, which is not a specific thing that we've trained into this model.
在这种情况下, 我要告诉它是 GPT 税,这不是我们训练到这个模型中的特定事物。
You can be very creative if you want with the system message to really get the model in the mood of what is your job?
如果你想通过系统消息真正让模特了解你的工作,你可以非常有创意?
What are you supposed to do?
你该怎么办?
So I pasted in the tax code.
所以我粘贴了税码。
This is about 16 pages worth of tax code.
这是大约 16 页的税法。
There's this question about Alice and Bob, they got married at one point and here are their incomes, and they take a standard deduction, they're filing jointly.
有一个关于爱丽丝和鲍勃的问题,他们曾经结过婚, 这是他们的收入,他们采用标准扣除,他们共同申报。
So first question, what is their standard deduction for 2018?
那么第一个问题,他们 2018 年的标准扣除额是多少?
So while the model is chugging, I'm going to solve this problem by hand to show you what's involved.
因此,当模型运行时,我将手动解决此问题以向您展示所涉及的内容。
So the standard deduction is the basic standard deduction plus the additional.
所以标准扣除是基本标准扣除加附加。
The basic one is 200 percent for joint return of sub-paragraph C, which is here.
基本款C款联合返还200%,在这里。
So additional doesn't apply, the limitation doesn't apply.
所以附加不适用,限制不适用。
These apply.
这些适用。
Wait, special rules for taxable year 2018, which is the one we care about through 2025, you have to substitute 12,000 for 3,000. So 200 percent of 12,000, 24,000 is the final answer.
等等,2018 纳税年度的特殊规定,也就是我们关心到 2025 年的纳税年度,您必须用 12,000 代替 3,000。所以 12,000 的 200%,24,000 是最终答案。
If you notice, the model got to the same conclusion, and you can actually read through its explanation.
如果您注意到, 该模型得出了相同的结论,您实际上可以通读它的解释。
To tell you the truth, the first time I tried to approach this problem myself, I could not figure it out.
老实说, 第一次尝试自己解决这个问题时,我想不通。
I spent half an hour reading through the tax code, trying to figure out this back-reference and why there's sub-paragraph.
我花了半个小时通读了税法,试图找出这个反向引用以及为什么有小段。
Just what's even going on?
究竟是怎么回事?
It was only by asking the model to spell out its reasoning, and then I followed along.
只是让模型说出它的推理,然后我就跟着做了。
So I was like, oh, I get it now.
所以我想,哦,我现在明白了。
I understand how this works.
我明白这是怎么回事。
So that, I think, is where the power of this system lies.
所以,我认为,这就是这个系统的力量所在。
It's not perfect, but neither are you.
它不完美,但你也不完美。
Together, it's this amplifying tool that lets you just reach new heights.
总之,正是这种放大工具可以让您达到新的高度。
You can go further.
你可以走得更远。
You can say, now calculate their total viability.
你可以说,现在计算它们的总生存能力。
Here we go.
开始了。
It's doing the calculation.
它正在计算。
Honestly, every time it does it, it's amazing.
老实说,每次它这样做,都很棒。
This model is so good at mental math.
这个模型非常擅长心算。
It's way, way better than I am at mental math.
它的方式,比我在心算方面要好得多。
It's not hooked up to a calculator.
它没有连接到计算器。
That's another way that you could really try to enhance these systems.
这是您可以真正尝试增强这些系统的另一种方式。
But it has these raw capabilities that are so flexible.
但它具有这些非常灵活的原始功能。
It doesn't care if it's code.
它不关心它是否是代码。
It doesn't care if it's language.
它不关心它是否是语言。
It doesn't care if it's tax.
不在乎是不是税。
All of these capabilities in one system that can be applied towards the problem that you care about, towards your application, towards whatever you build.
一个系统中的所有这些功能都可以应用于您关心的问题、您的应用程序、您构建的任何内容。
So to end it, the final thing that I will show is a little other dose of creativity, which is now summarize this problem into a rhyming poem.
所以最后, 我要展示的最后一件事是一点点其他的创造力,现在把这个问题总结成一首押韵的诗。
There we go.
我们开始了。
A beautiful, beautiful poem about doing your taxes.
一首关于纳税的美丽而美丽的诗。
So thank you everyone for tuning in.
所以谢谢大家收看。
I hope you learn something about what the model can do, how to work with it.
我希望您了解模型可以做什么以及如何使用它。
Honestly, we're just really excited to see what you're going to build.
老实说,我们真的很高兴看到您将要构建什么。
I've talked about OpenAI evals.
我已经谈到了 OpenAI 评估。
Please contribute.
请贡献。
We think that this model, improving it, bring it to the next level, is something that everyone can contribute to, and that we think it can really benefit a lot of people, and we want your help to do that.
我们认为这个模型,改进它,把它提升到一个新的水平, 是每个人都可以做出贡献的东西,我们认为它真的可以让很多人受益,我们希望你能帮助做到这一点。
So thank you very much.
非常感谢。
We're so excited to see what you're going to build.
我们很高兴看到您将要构建什么。