
Tesla FSD 12 live debut: only one intervention in 45 minutes, an AI "chauffeur" trained by "feeding" it video

cls.cn ·  Aug 28, 2023 20:26

Source: Cailian Press (cls.cn)

① On roads not predefined in the system, the vehicle can yield to pedestrians, avoid roadblocks, turn at intersections, and choose the less congested of two through lanes.

② Through video training data, AI can learn to drive on its own; high-quality data from excellent drivers is the key to training Tesla's autonomous driving.

③ Tesla is about to bring online a new compute cluster that includes 10,000 Nvidia H100 GPUs.

As previously promised, Musk livestreamed a beta version of Tesla FSD 12 to the public last weekend, driving a Model S equipped with HW3.

In this 45-minute livestream, Musk sat behind the wheel holding his phone and intervened in the vehicle's behavior only once. On roads not predefined in the system, the vehicle yielded to pedestrians, avoided roadblocks, turned at intersections, and chose the less congested of two through lanes.

Musk said that FSD 12 can operate in unfamiliar environments without a network connection; when an intervention occurs, the system records it and sends it back to Tesla for analysis.

The only takeover came about 20 minutes into the broadcast. The Model S needed to go straight and had stopped to wait at a red light, but when the left-turn arrow turned green, the vehicle started moving immediately. Fortunately, Musk and the engineer beside him stopped it in time.

Afterward, Musk said he would "feed" FSD more video of left-turn traffic lights.

Can you "feed" an "AI chauffeur" with video?

During the livestream, as the vehicle slowed on its own for speed bumps and steered around scooter riders, Musk emphasized repeatedly that not a single line of code in FSD 12 explicitly programs the vehicle to perform these actions. The system was never trained to read road signs and has no hand-coded concept of a scooter; FSD 12 accomplishes these behaviors entirely through extensive video training. From video training data, the AI learns to drive on its own and to "do things like humans."

If FSD makes the wrong decision in a specific scenario, Tesla feeds more data (mostly video) covering that scenario into its neural network training.
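The training loop described above amounts to imitation learning: a model learns to map what the camera sees to the controls a human driver issued. Below is a deliberately minimal sketch of that idea, with a linear "policy" fitted by gradient descent on synthetic data; none of the shapes, names, or numbers here are Tesla's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for video data: each frame is reduced to a small feature
# vector, and the label is the steering command the human driver issued.
frames = rng.normal(size=(1000, 16))             # 1000 frames, 16 features
true_w = rng.normal(size=16)                     # hidden "human policy"
steering = frames @ true_w + rng.normal(scale=0.01, size=1000)

# Linear policy trained by gradient descent on mean squared error:
# the model learns to imitate the recorded human controls.
w = np.zeros(16)
lr = 0.01
for _ in range(500):
    pred = frames @ w
    grad = frames.T @ (pred - steering) / len(frames)
    w -= lr * grad

mse = float(np.mean((frames @ w - steering) ** 2))
print(f"imitation MSE after training: {mse:.4f}")
```

Adding more video of a failure case (Musk's left-turn-arrow example) corresponds to appending more `(frame, control)` pairs for that scenario before retraining.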

Of course, mediocre and random data is not enough; the data supplied to the neural network must be carefully curated. Musk placed special emphasis on high-quality data from excellent drivers as the key to training Tesla's autonomous driving.

"A large amount of mediocre data doesn't improve driving, and data curation is quite difficult. We have a lot of software to control what data the system selects and what data it trains on."
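The curation Musk describes can be pictured as a selection pass over fleet clips before training. The sketch below is entirely hypothetical: the field names, scoring scheme, and threshold are invented for illustration, not taken from Tesla's pipeline.

```python
# Hypothetical clip-selection pass: keep only clips from drivers whose
# recorded behavior scores highly and where no human intervention occurred,
# mirroring the idea that mediocre data is filtered out before training.

def select_training_clips(clips, min_driver_score=0.8):
    """Keep clips with a high driver-quality score and no intervention."""
    return [c for c in clips
            if c["driver_score"] >= min_driver_score and not c["intervened"]]

fleet_clips = [
    {"id": 1, "driver_score": 0.95, "intervened": False},  # good driver
    {"id": 2, "driver_score": 0.60, "intervened": False},  # mediocre data
    {"id": 3, "driver_score": 0.90, "intervened": True},   # human took over
]
print([c["id"] for c in select_training_clips(fleet_clips)])  # -> [1]
```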

For Tesla, a major source of data is its fleet of cars around the world. Musk also revealed that Tesla has many FSD test drivers worldwide, including in New Zealand, Thailand, Norway, and Japan.

Since 2020, Tesla has been shifting Autopilot decision-making from hand-written programming logic to neural networks and AI. After three years of development, as Musk's FSD 12 livestream showed, almost all decision-making and scenario handling has been transferred to Tesla's neural networks.

FSD 11's control stack alone contains more than 300,000 lines of C++ code, while version 12 has very little. Musk has previously said that vehicle control is the final piece of the "Tesla FSD AI puzzle," and that it will shrink those 300,000-plus lines of C++ by roughly two orders of magnitude.

Full AI end-to-end driving control

Tesla FSD 12's most important upgrade is full AI end-to-end driving control.

Why choose an end-to-end approach? Musk gave more details in a conversation with WholeMars before the livestream.

"This is how humans do it," he said. "Photons in, hand and foot movements (controls) out." Humans drive using their eyes and a biological neural network; for autonomous driving, cameras plus a neural-network AI are the right general-purpose decision-making solution.

Although it is hard for an AI neural network to explain its specific decisions, the same is true of humans: a taxi passenger cannot know exactly what the driver is thinking; they can only observe the driver's actions.

Brokerage analysts pointed out that a key difference from the previous approach is architectural: a traditional modular stack divides intelligent driving into separate tasks handled by dedicated AI models or modules, such as perception, prediction, and planning, while end-to-end AI is "perception-decision integration," folding perception and decision-making into a single model.
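The architectural contrast can be sketched in a few lines. The function names and stub logic below are invented for illustration; the point is only the shape of the two designs: a chain of hand-separated stages versus one learned function from sensor input to controls.

```python
# --- Modular stack: separate perception / prediction / planning stages ---

def perceive(frame):
    # Stub perception: treat values above a threshold as detected objects.
    return [x for x in frame if x > 0.5]

def predict(objects):
    # Stub prediction: assume each object drifts 0.1 closer.
    return [x + 0.1 for x in objects]

def plan(futures):
    # Stub planner: brake harder the closer the nearest predicted object.
    return {"brake": max(futures, default=0.0)}

def modular_drive(frame):
    # Each stage is a separate, individually engineered module.
    return plan(predict(perceive(frame)))

# --- End-to-end: one learned model maps sensor input straight to controls ---

def end_to_end_drive(frame, model):
    # Perception and decision-making are fused inside a single model.
    return model(frame)

frame = [0.2, 0.7, 0.9]
print(modular_drive(frame))
print(end_to_end_drive(frame, lambda f: {"brake": max(f)}))
```

In the modular design, each stage's output format is a hand-defined interface; in the end-to-end design, those intermediate representations are learned implicitly inside the model, which is what makes its decisions harder to inspect.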

Currently, the vast majority of Tesla's training still runs on Nvidia GPUs, with Tesla's own Dojo supercomputer as a supplement. Tesla has spent $2 billion on training so far this year.

Tesla is still pushing hard: a new compute cluster that includes 10,000 Nvidia H100s is being prepared and is expected to go live this Monday (August 28). Notably, the cluster uses InfiniBand for its interconnect, and Musk added that InfiniBand is currently even harder to obtain than GPUs.

Editor: lambor


