Explain HW3 v HW4 Implications for CT?

scottf200

Well-known member
Joined
Jul 31, 2021
Threads
53
Messages
2,221
Reaction score
3,241
Location
Western NC
Vehicles
X; immed family 3 & Y
Tesla does not use any kind of image post-processing; its sensor data goes directly into the ML model. So when the sensor changes, it is an even bigger difference for the machine learning models to deal with.

FSD is trained on video captured from cars using HW3. Since HW4 has better cameras that look totally different, the neural nets will need to be re-trained on these cameras once enough data has been accumulated.
Your comments don't make any sense, because HW4 cars with better cameras are getting FSD ... but you don't think the images are post-processed. How'd they get HW4 vehicles on FSD, then?
 

charliemagpie

Well-known member
First Name
Charlie
Joined
Jul 6, 2021
Threads
48
Messages
2,982
Reaction score
5,369
Location
Australia
Vehicles
CybrBEAST
Occupation
retired
V12 is End to End AI.

It is hard to fathom that the car will learn hundreds of millions of permutations and will have more experience and be a better driver, forward or in reverse, than any human.

And it will continue to improve exponentially.

Doing a pirouette while it's hitching up will not be beyond its capability, lol.

Change is coming so fast. If we are right now, still pondering whether it can simply hitch a trailer, I think we are going to be shocked.
 

Baldey

Well-known member
First Name
Jenia
Joined
Feb 24, 2023
Threads
10
Messages
387
Reaction score
666
Location
Colorado
Vehicles
tesla M3, 2025 CT
Occupation
QA automation
Your comments don't make any sense, because HW4 cars with better cameras are getting FSD ... but you don't think the images are post-processed. How'd they get HW4 vehicles on FSD, then?
Sorry, I'm not sure I understand your comment either :p There must be some confusion. I did say "any kind of post-processing", so I think the confusion may have been my bad. What I meant is that they disable any post-processing done by the manufacturer and do their own processing of the RAW CMOS sensor data.


I think I remember Musk saying something about having a lot more sensor data available than you see in a typical image, when asked about low-light visibility. He said the FSD network is trained on raw sensor data, not on images that humans would see. You could construct an image from that data, but the FSD network "sees" on a level below that.

Here is a good explanation of what happens to the light hitting a digital sensor, in 10 steps before the JPEG. I am not sure which step the FSD network is trained on, but I am pretty sure it is 4, or maybe even 3.
https://photo.stackexchange.com/questions/1455/what-is-raw-technically
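
To make the raw-versus-image distinction a bit more concrete: a typical color sensor records a single intensity value per photosite behind a color filter mosaic, and demosaicing is one of the later steps that turns those values into RGB pixels. Below is a minimal Python sketch of that one step. The RGGB layout, the function name `demosaic_rggb_half`, and the sample values are all assumptions made for illustration; this is not a description of Tesla's pipeline or of the camera hardware in any of these cars.

```python
from typing import List, Tuple

Raw = List[List[int]]                    # one intensity value per photosite
RGB = List[List[Tuple[int, int, int]]]   # one (r, g, b) triple per pixel


def demosaic_rggb_half(raw: Raw) -> RGB:
    """Collapse each 2x2 RGGB cell into one RGB pixel (half resolution).

    Assumed cell layout (hypothetical, for illustration only):
        R G
        G B
    """
    if len(raw) % 2 or any(len(row) % 2 for row in raw):
        raise ValueError("expected even width and height")
    out: RGB = []
    for y in range(0, len(raw), 2):
        row = []
        for x in range(0, len(raw[y]), 2):
            r = raw[y][x]
            # Two green photosites per cell: average them.
            g = (raw[y][x + 1] + raw[y + 1][x]) // 2
            b = raw[y + 1][x + 1]
            row.append((r, g, b))
        out.append(row)
    return out


# Made-up raw readout: 2 rows x 4 columns of photosites.
raw_frame = [
    [100, 60, 120, 80],
    [40, 20, 90, 30],
]
rgb_frame = demosaic_rggb_half(raw_frame)
print(rgb_frame)
```

The point of the sketch is only that the raw grid and the RGB image are different representations of the same capture, so a network could in principle be trained on either one.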
 

TyPope

Well-known member
First Name
Ty
Joined
Mar 31, 2020
Threads
33
Messages
3,210
Reaction score
4,922
Location
Chesapeake Beach, MD
Vehicles
'23 MYLR, FS Cyberbeast 280xx
Occupation
Current Operations for... an organization
So... My Model Y with HW4 now has FSD. There's no reason the Cybertruck won't have FSD on release. We can stop beating this dead horse now.
 

Baldey

Well-known member
First Name
Jenia
Joined
Feb 24, 2023
Threads
10
Messages
387
Reaction score
666
Location
Colorado
Vehicles
tesla M3, 2025 CT
Occupation
QA automation
Here is a 3-hour video from Andrej Karpathy himself (we definitely live in a simulation, with a name like that) on some of the details of how they process camera inputs:
 


CYBRSMTH

Well-known member
Joined
Jan 17, 2022
Threads
2
Messages
454
Reaction score
496
Location
Ohio
Vehicles
Honda Fit
Can folks more agile with the Tesla HW3/4 and FSD space help to fill in the gaps on this xwitter discussion going around, in terms of its purported effects on the CT?

Interested in the best for/against implications?

It seems to suggest that because CT will come with HW4, it won’t be shipping with FSD anytime soon?

The quantity and quality of the cameras on HW4 would be worth waiting for FSD. You’ve got a front-facing camera for the first time and two extra cameras on the side. Plus, the quality upgrade of the HW4 cameras, as seen in YouTube videos like Wham Bam Teslacam, is measurable. Plus, you still have Autopilot.
 

charliemagpie

Well-known member
First Name
Charlie
Joined
Jul 6, 2021
Threads
48
Messages
2,982
Reaction score
5,369
Location
Australia
Vehicles
CybrBEAST
Occupation
retired
The cameras have better resolution and can see further.

Elon did say the AI can measure the number of photons.

I guess AI can determine an object by the way the photons bounce off it.

(Seems as sci-fi as how Wi-Fi can be used to see through walls.)

At the least, I figure HW4 can process light better to make the image clearer in the dark.
 

firsttruck

Well-known member
Joined
Sep 25, 2020
Threads
205
Messages
2,761
Reaction score
4,441
Location
mx
Vehicles
none
The quantity and quality of the cameras on HW4 would be worth waiting for FSD. You’ve got a front-facing camera for the first time and two extra cameras on the side. Plus, the quality upgrade of the HW4 cameras, as seen in YouTube videos like Wham Bam Teslacam, is measurable. Plus, you still have Autopilot.
There is evidence that in the Cybertruck front bumper area there now might be a front camera.

But I do not remember anything about Cybertruck having two extra side cameras.

Do you have a link to info about two extra side cameras?
 

flowerlandfilms

Well-known member
First Name
Eryk
Joined
Dec 6, 2020
Threads
6
Messages
811
Reaction score
1,707
Location
Australia
Vehicles
Yamaha SRV-250, Honda Odyssey RB1
Occupation
Film Maker
I think it's a reasonable analogy (for those of a certain vintage) to compare it to when PCs went colour-capable. You could run the old black-and-white software on the new hardware in some cases, and it got the job done, but it took some time before those same programs became available in versions that supported more than monochrome and had additional capability that took advantage of the new hardware.
 

CYBRSMTH

Well-known member
Joined
Jan 17, 2022
Threads
2
Messages
454
Reaction score
496
Location
Ohio
Vehicles
Honda Fit
There is evidence that in the Cybertruck front bumper area there now might be a front camera.

But I do not remember anything about Cybertruck having two extra side cameras.

Do you have a link to info about two extra side cameras?
Ah, I got this wrong. There are fewer cameras on HW4 for the Model 3 and Y. Not sure about CT. I was basing this off of the latest HW4 camera comparison on videos like Dirty Tesla. Maybe there was talk about two cameras on the side scuttle to create more of a 360 view, but it was just wishful thinking.
 


Deleted member 17810

Guest
Here is a 3-hour video from Andrej Karpathy himself (we definitely live in a simulation, with a name like that) on some of the details of how they process camera inputs:
*ML = Machine Learning | FSD= Full Self Driving |

This talk goes against your premise that the raw data is fed into the FSD ML.

Their patents and their explanations all say that the RAW video feed is cropped, downsampled, and processed before it reaches the FSD ML.

The primary reason is that RAW camera data is too much data to process. It's also the reason they are building DOJO: the transmission bandwidth can't keep up with the sensor capture yet.

What they are doing is using ML to crop/downsample RAW data into visually coherent data.

The entire premise of the new FSD is that "it's what humans see".

As camera sensors go up in resolution, they need more processing power. Old camera data in old Teslas was downsampled so as not to bottleneck the systems. -> dojo ->
redacted
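
For anyone unsure what "cropped and downsampled" means in practice, here is a minimal Python sketch of the two generic operations on a single-channel frame. The function names `crop` and `downsample_2x`, the frame size, and the pixel values are hypothetical and chosen only for illustration; nothing here is taken from Tesla's patents or code.

```python
from typing import List

Frame = List[List[int]]  # single-channel frame, one value per pixel


def crop(frame: Frame, top: int, left: int, height: int, width: int) -> Frame:
    """Keep only a rectangular region of interest."""
    return [row[left:left + width] for row in frame[top:top + height]]


def downsample_2x(frame: Frame) -> Frame:
    """Halve the resolution by averaging each 2x2 block of pixels."""
    if len(frame) % 2 or any(len(row) % 2 for row in frame):
        raise ValueError("expected even width and height")
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, len(frame[y]), 2)
        ]
        for y in range(0, len(frame), 2)
    ]


# Made-up 4x6 frame: the pixel value encodes its position (10 * row + column).
frame = [[x + 10 * y for x in range(6)] for y in range(4)]

region = crop(frame, top=0, left=1, height=4, width=4)  # 4x4 region
small = downsample_2x(region)                           # 2x2 result

pixels_before = len(frame) * len(frame[0])
pixels_after = len(small) * len(small[0])
print(small, pixels_before, pixels_after)
```

In this toy case 24 input values shrink to 4, which is the kind of data reduction the argument above is about.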
 
OP
cvalue13

Well-known member
Joined
Aug 17, 2022
Threads
74
Messages
7,153
Reaction score
13,769
Location
Austin, TX
Vehicles
F150L
Occupation
Fun-employed
As camera sensors go up in resolution, they need more processing power. Old camera data in old Teslas was downsampled so as not to bottleneck the systems. -> dojo
It’s also important to make a distinction between what the DOJO/NVIDIA clusters are doing (training the model) and what the vehicle’s onboard hardware is doing (executing the results of that training).

The vehicle’s onboard hardware is computing decisions at the edge, locally. This allows its compute to use a lot of the sensor data (because transmission fidelity, latency, and cost are all fine).

Separately, I expect that at some point that large amount of local data is selected and prioritized, by some suite of rules, for feeding via the cloud back to the DOJO/NVIDIA clusters as training instances.

That transmission is costly, low fidelity, and high latency (likely hours or days later, when the vehicle is next connected to stable and sufficient Wi-Fi).

I’d be interested to hear from anyone with further expertise on how Tesla handles the edge vs cloud relationship between sensors.

my company is in this space
 

Deleted member 17810

Guest
It’s also important to make a distinction between what the DOJO/NVIDIA clusters are doing (training the model) and what the vehicle’s onboard hardware is doing (executing the results of that training).

The vehicle’s onboard hardware is computing decisions at the edge, locally. This allows its compute to use a lot of the sensor data (because transmission fidelity, latency, and cost are all fine).

Separately, I expect that at some point that large amount of local data is selected and prioritized, by some suite of rules, for feeding via the cloud back to the DOJO/NVIDIA clusters as training instances.

That transmission is costly, low fidelity, and high latency (likely hours or days later, when the vehicle is next connected to stable and sufficient Wi-Fi).

I’d be interested to hear from anyone with further expertise on how Tesla handles the edge vs cloud relationship between sensors.
I just typed a bunch of stuff that says the high-def stuff gets uploaded and the selectively downsampled stuff gets used in real time.

but yes the words describe the actions of the motions of the electronics and visual capture devices to later transmit through electronic means.

the entire point of the visual only training is to remove the ahem fuck ups between depth perception, just like people's eyes.

my company is in this space
I hope it's not in the blindspot.
 

Baldey

Well-known member
First Name
Jenia
Joined
Feb 24, 2023
Threads
10
Messages
387
Reaction score
666
Location
Colorado
Vehicles
tesla M3, 2025 CT
Occupation
QA automation
*ML = Machine Learning | FSD= Full Self Driving |

This talk goes against your premise that the raw data is fed into the FSD ML.

Their patents and their explanations all say that the RAW video feed is cropped, downsampled, and processed before it reaches the FSD ML.

The primary reason is that RAW camera data is too much data to process. It's also the reason they are building DOJO: the transmission bandwidth can't keep up with the sensor capture yet.

What they are doing is using ML to crop/downsample RAW data into visually coherent data.

The entire premise of the new FSD is that "it's what humans see".

As camera sensors go up in resolution, they need more processing power. Old camera data in old Teslas was downsampled so as not to bottleneck the systems. -> dojo ->
redacted
Not sure what you mean, or why you are defining basic acronyms.

Karpathy starts off the talk by describing the particular type of RegNet they've termed Hydra, which is a collection of neural nets that extract features from an image at the bottom layer and try to make sense of those features in a spatial and temporal way at the top. I only have a basic ML certification from Udemy, and this PDF on RegNets broke my brain. Feel free to attempt: https://arxiv.org/pdf/2101.00590.pdf

He clearly shows that raw data is going directly into the RegNet:
[Attached image]


So I fail to see where in this presentation he contradicts my point. Do you have any evidence to back up your claim that they pre-process raw data before sending it to the ML models?

I think we might be on the same page, but with a misunderstanding. You yourself said "What they are doing is using ML to crop/down sample RAW data into visual coherent data." That is basically what I am saying, though this has nothing to do with cropping, downsampling, or compressing. I don't know the specifics, but I am pretty sure they have plenty of bandwidth to process around 8 raw streams in real time. But yes, they use multiple neural nets to translate the raw camera feeds into a vector space, from which they can make driving decisions.
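
The "Hydra" layout described above (shared feature extraction at the bottom, several task-specific nets on top) can be sketched in a few lines of Python. This is a toy under stated assumptions: the hand-picked features, the head names `brightness_score` and `edge_score`, and the weights are all hypothetical stand-ins, and real networks learn these rather than hard-coding them. It only shows the structure, where the backbone runs once per frame and every head reuses its output.

```python
from typing import Callable, Dict, List

Image = List[List[float]]
Features = List[float]


def backbone(image: Image) -> Features:
    """Shared feature extractor, run once per frame (toy stand-in)."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    contrast = max(pixels) - min(pixels)
    return [mean, contrast]


def make_linear_head(weights: List[float], bias: float) -> Callable[[Features], float]:
    """Build a task head: a weighted sum of the shared features plus a bias."""
    def head(features: Features) -> float:
        return sum(w * f for w, f in zip(weights, features)) + bias
    return head


# Hypothetical task heads; names and weights are made up for illustration.
heads: Dict[str, Callable[[Features], float]] = {
    "brightness_score": make_linear_head([1.0, 0.0], 0.0),
    "edge_score": make_linear_head([0.0, 0.5], 1.0),
}


def run_hydra(image: Image) -> Dict[str, float]:
    features = backbone(image)  # computed once
    return {name: head(features) for name, head in heads.items()}  # reused by every head


outputs = run_hydra([[0.0, 2.0], [4.0, 6.0]])
print(outputs)
```

Sharing the backbone means the expensive part of the computation is not repeated for each task, which is the usual motivation for this kind of multi-head design.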
 

Deleted member 17810

Guest
It's in the weeds but this is post processing from the sensors:

you said:
Karpathy starts off the talk by describing the particular type of RegNet they've termed Hydra, which is a collection of neural nets that extract features from an image at the bottom layer
*etiquette = conforming to the audience


It's also etiquette* to explain acronyms before delving in.