Overview | Polyline, the key to building data for computer vision autonomous driving
The use of Lidar for autonomous driving solutions has been commonplace, but the advent of Tesla Vision, a camera-based perception solution, is shaking up the industry. At the heart of computer vision autonomous driving solutions is the task of data annotation. This involves identifying objects such as other cars, pedestrians, traffic signs, and lane markings, as well as assigning polyline labels to the dataset, so that the machine can understand what it is seeing.
Developing computer vision autonomous driving solutions is an essential gateway to increasing driver safety and comfort. Camera-based autonomous driving solutions are also gaining traction because they cost less than traditional methods. Beyond that, Tesla has repeatedly explained why it considers computer vision more innovative than Lidar technology.
Our client is a company that provides software for cameras in a wide range of vehicles. In addition to that, they are taking on various technical challenges such as bringing their solution to SoC platforms. With the dawn of the camera-based autonomous driving era, they were preparing to develop a solution for autonomous vehicles.
Problem | Building autonomous driving data with polyline labeling
To develop a computer vision autonomous driving solution, it is necessary to build a large volume of AI training data. In particular, it was necessary to build data for lanes and roads that covers various edge cases, such as night environments. Furthermore, in addition to domestic road data, it was necessary to build datasets of overseas roads, such as those in Japan and the United States.
The final goal of the project was to enable autonomous driving with high accuracy within the customer's platform through software that utilizes the built data. If autonomous recognition of traffic direction and other obstacles along the driving route is possible, we believe that full autonomous driving is not far away.
Here's how DataHunt and the client agreed on data labeling and processing.
Data labeling requested by the client
- Data collected domestically and internationally (Japan, US, Europe)
- Data collected in special environments such as at night
- Data collected via multi cam
- Further processing of pre-labeled data
Data processing methods
- Lane: Polyline labeling of road lanes in image data
- Road edges: Boundary labeling of roads
- Other markings: Labeling of parts of the road that could be confused with lanes
The project involved building a large amount of polyline labeling data, so we needed to be clear and concise in what we communicated to the labelers working on the project. We also needed to disseminate the information to the workers quickly due to frequent policy changes on processing.
As one of the most challenging projects in computer vision data construction, it was important to pay attention to worker input and skills.
Solution | Building computer vision data with data annotation
The more complex the data processing project, the more important it is to process the data with high accuracy.
- Label lanes on the road for line detection, classify the shape and type of lanes, and work according to the width of the lanes.
  - Annotate data with the Polyline method
  - Determine vanishing point > Check work scope > Draw lanes > Adjust width > Classify lanes > Work on overlapping parts
  - Classification: Select road shape (5 types), number of lane lines (3 types), location (4 types), special cases (2 types), color (4 types), etc.
- Label the boundaries of roads where vehicles can drive, and classify the types of boundaries.
  - Data annotation with Polyline workflow
  - Determine vanishing point > Confirm work scope > Mark road boundaries > Classify > Work with occluded parts
  - Classification: Location (4 types), Type (19 types such as beacon, wall, etc.)
- Exclude elements that can be confused with lanes, such as arrows, letters, and shapes on the road.
  - Data annotation with Polygon workflow
  - Check work scope > Select other markings as polygons > Fix overlapping parts with lanes
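As a rough illustration of what a single lane annotation from the workflow above might look like as data, the sketch below stores a polyline as an ordered list of image points together with its classification attributes. The `LaneAnnotation` class, the field names, and the use of integer category indices are all hypothetical. Only the number of categories per attribute comes from the list above; the category names and the actual export format used in the project are not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Number of categories per attribute, taken from the classification step above.
# The category names are not listed in the article, so plain indices are used,
# and the "etc." in the source means further attributes may exist.
LANE_CATEGORY_COUNTS = {
    "road_shape": 5,
    "line_count": 3,
    "location": 4,
    "special_case": 2,
    "color": 4,
}


@dataclass
class LaneAnnotation:
    """Hypothetical record for one lane polyline in one image."""

    points: List[Tuple[float, float]]  # ordered (x, y) pixel coordinates
    width_px: float                    # value set in the "Adjust width" step
    attributes: Dict[str, int] = field(default_factory=dict)

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the record is well-formed."""
        problems = []
        if len(self.points) < 2:
            problems.append("a polyline needs at least two points")
        if self.width_px <= 0:
            problems.append("width must be positive")
        for name, count in LANE_CATEGORY_COUNTS.items():
            value = self.attributes.get(name)
            if value is None:
                problems.append(f"missing attribute: {name}")
            elif not 0 <= value < count:
                problems.append(f"{name} index {value} outside 0..{count - 1}")
        return problems
```

A check of this kind could run before a task is handed to a reviewer, so that structural mistakes are caught automatically and human review can focus on whether the line follows the lane.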
Task order and inspection criteria
(Worker) Primary work > (Reviewer) Repeated review, with feedback to the worker and rework requests > (PL) Adjustment of detailed work contents for the reviewed work > (PM) Final review before delivery
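The task order above can be read as a small state machine. The following is a minimal sketch under that reading, not a description of DataHunt's actual tooling: the `Stage` enum, the `advance` function, and the `DELIVERED` end state are hypothetical names. Because only the reviewer's rework loop is described, rejection is modeled at that stage alone.

```python
from enum import Enum


class Stage(Enum):
    WORKER = "worker"        # primary labeling work
    REVIEWER = "reviewer"    # review, feedback, and rework requests
    PL = "pl"                # adjusts detailed work contents
    PM = "pm"                # final review before delivery
    DELIVERED = "delivered"  # hypothetical end state after the PM review


# Forward path through the stages in the order given above.
NEXT_STAGE = {
    Stage.WORKER: Stage.REVIEWER,
    Stage.REVIEWER: Stage.PL,
    Stage.PL: Stage.PM,
    Stage.PM: Stage.DELIVERED,
}


def advance(stage: Stage, approved: bool = True) -> Stage:
    """Return the stage a task moves to next.

    `approved` only matters at the reviewer stage, where a rejection sends
    the task back to the worker; at every other stage it is ignored.
    """
    if stage is Stage.DELIVERED:
        raise ValueError("task is already delivered")
    if stage is Stage.REVIEWER and not approved:
        return Stage.WORKER  # rework requested
    return NEXT_STAGE[stage]
```

For example, `advance(Stage.REVIEWER, approved=False)` returns `Stage.WORKER`, which corresponds to the repeated review and rework loop between reviewer and worker.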
In lane work, lanes vary greatly in appearance and type. Road-edge work was even more complex than lane work, especially for overseas roads, and required careful polyline labeling.
DataHunt typically works with in-country labelers to facilitate communication and task management. Because overseas data can contain cases that are unfamiliar to them, we paid particular attention to it. In addition to training at the time of job entry, we conducted sample tests. In particular, we conducted intensive training on the more difficult data annotation tasks, such as polyline labeling of lanes and boundaries.
Result | 300,000 polylines, error rate within 5 percent
In the end, DataHunt completed the following data annotation tasks within the project, including those requested by the client.
Data labeling requested by the client
- Data collected domestically and internationally (Japan, US, Europe)
- Data collected in special environments such as at night
- Data collected via multi cam
- Built about 300,000 photos of autonomous driving data, covering European, US, Japanese, and domestic roads
- Before delivery, the data went through more than four stages of work and review and passed the customer's own quality inspection standard (error rate within 5%).
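To make the acceptance criterion concrete, the sketch below computes an error rate over an inspected sample and compares it with the 5% threshold. How the customer counts an error (per image, per polyline, or per attribute) is not stated here, so the unit of inspection is an assumption, "within 5%" is read as inclusive, and the function names are hypothetical.

```python
def error_rate(inspected: int, errors: int) -> float:
    """Fraction of inspected items that were found to contain an error."""
    if inspected <= 0:
        raise ValueError("at least one item must be inspected")
    if not 0 <= errors <= inspected:
        raise ValueError("errors must be between 0 and the number inspected")
    return errors / inspected


def passes_inspection(inspected: int, errors: int, max_error_rate: float = 0.05) -> bool:
    """True when the error rate is within the threshold (5% by default, inclusive)."""
    return error_rate(inspected, errors) <= max_error_rate
```

With made-up numbers, a sample of 1,000 inspected items passes with 50 errors and fails with 51.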
Data Processing Method
- Lane: Polyline labeling of the lane of the road in the data (image)
- Road edge: Polyline labeling of the boundary of the road
- Other markers: Polygon labeling of parts of the road that can be confused with lanes.
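The two label types in this list differ geometrically: a polyline is an open, ordered path (suited to lanes and road edges), while a polygon is a closed region (suited to arrows, letters, and other markings). The sketch below illustrates that difference with two standard measures, path length for a polyline and enclosed area for a polygon, using the shoelace formula. The function names are hypothetical and the coordinates are assumed to be image pixels.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polyline_length(points: List[Point]) -> float:
    """Total length of an open path through the points, in order."""
    if len(points) < 2:
        raise ValueError("a polyline needs at least two points")
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:])
    )


def polygon_area(vertices: List[Point]) -> float:
    """Area enclosed by a simple polygon (shoelace formula).

    The last vertex is implicitly connected back to the first.
    """
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least three vertices")
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0
```

A lane drawn through `(0, 0)`, `(3, 4)`, and `(3, 10)` has a length of 11 pixels, while a marking outlined by the square `(0, 0)`, `(2, 0)`, `(2, 2)`, `(0, 2)` covers an area of 4 square pixels.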
AI and humans working together
This was a project that required a vast array of policies and guidelines and real-time responses to various policy changes. At the same time, the large number of people involved made it challenging in terms of labelers and quality control. In order to work efficiently, we first established processes and guidelines.
In particular, we took this opportunity to introduce AI pre-labeling and automated training tests internally at DataHunt to standardize the worker training process. This will allow us to secure workers who meet clear standards in the future.
I would like to take this opportunity to thank the workers and our internal staff for getting up to speed with the difficult tasks.
Through this project, DataHunt has secured 600 workers who have completed training and project execution for a highly demanding autonomous driving project. We are confident that we will be able to quickly deploy them in similar projects and build faster and more accurate polyline and polygon data.