Today, we're going to take a look at how we use AI job validation at DataHunt, what information it provides, and how much it has improved our validation efficiency. We're not going to get too theoretical, so let's get straight to the action.
At DataHunt, we go through a meticulous process of work and review during data processing. It's a human-in-the-loop process where humans and AI work together to complement each other.
In this process, AI job validation runs primarily on data that has already been processed, and its role is to identify work products that are likely to be wrong and to suggest corrections. In short, it assesses the reliability of the work, which can lead to several questions.
A: We'll share the specifics when we're ready.
Yes, but by alerting you to work that is likely to be incorrect, it lets you review those items more carefully.
We're applying a number of techniques to make it work quickly while still being highly accurate.
The bottom line is that by increasing the accuracy of the review, you can create more accurate results and save time.
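The idea of surfacing low-confidence work for closer review can be sketched as follows. This is an illustrative sketch only: the `flag_for_review` function, the item fields, and the 0.7 threshold are hypothetical, not DataHunt's actual implementation.

```python
# Hypothetical sketch: flag annotations whose model confidence is low,
# so reviewers can spend their attention where errors are most likely.

def flag_for_review(items, threshold=0.7):
    """Return the items whose model confidence falls below the threshold."""
    return [item for item in items if item["confidence"] < threshold]

annotations = [
    {"id": 1, "label": "car",    "confidence": 0.95},
    {"id": 2, "label": "person", "confidence": 0.42},
    {"id": 3, "label": "dog",    "confidence": 0.88},
]

# Only the uncertain item (id 2) is flagged for careful review.
print(flag_for_review(annotations))
```

Everything above the threshold passes through with a lighter check, which is where the time savings come from.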
At DataHunt, we currently have validation features for classification, detection, and segmentation of objects in images. Let's take a look at how AI can help us with the detection task.
The detection task is to give a bounding box (a rectangle representing the area of an object) and the attributes of that object. For the sake of simplicity, we'll define it as "Bounding box = area + attributes". Within a single image, there may be no objects at all, or conversely, there may be a lot of them. In either case, the worker's output needs to be inspected, and the inspector is faced with a situation where the bounding box of every object is one of the following cases.
In all of these cases, except for number 1, corrections should be made during the review process.
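The "area + attributes" definition above can be expressed as a small data structure. This is a minimal sketch, assuming a simple axis-aligned box in pixel coordinates; the class and field names are illustrative, not DataHunt's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class BoundingBox:
    # Area: an axis-aligned rectangle in pixel coordinates.
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    # Attributes: the object class plus any extra labels.
    label: str = ""
    attributes: dict = field(default_factory=dict)

    def area(self) -> float:
        """Pixel area of the box (used later when comparing boxes)."""
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

# One worker-drawn box: a red car occupying a 100x200 region.
box = BoundingBox(10, 20, 110, 220, label="car", attributes={"color": "red"})
```

A review then has to check both halves independently: the area (is the rectangle in the right place?) and the attributes (is the label correct?).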
The above illustration shows a few examples of validation. When the AI proactively alerts you to areas or attributes of the work that are unreliable, the reviewer can accept or reject the AI's suggestions and continue the review. To accomplish this, the DataHunt platform trains its model on the data worked so far, as requested by the customer. It's a familiar story, but this training has the following characteristics:
As a customer, you'll want to understand the above and train at the right time. There's a trade-off: if you insist on high accuracy from the very start, you'll slow the work down, and if you wait to train until a significant amount of data has accumulated, you'll reduce how much of the work the model can actually help check.
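The timing trade-off can be sketched as a simple trigger. This is illustrative only: `MIN_LABELS`, `RETRAIN_STEP`, and the function itself are hypothetical tuning knobs, not actual DataHunt parameters.

```python
# Illustrative sketch of when to (re)train the checking model.
MIN_LABELS = 1000    # don't train before this much data exists: too early,
                     # and the model is too unreliable to help reviewers
RETRAIN_STEP = 500   # retrain after this many new labels accumulate

def should_train(total_labels, labels_at_last_training=None):
    """Decide whether it's worth (re)training right now."""
    if total_labels < MIN_LABELS:
        return False  # too early: an inaccurate model would slow reviewers down
    if labels_at_last_training is None:
        return True   # enough data for the first training run
    # Waiting too long means the model misses the chance to help with
    # the labels produced in the meantime, so retrain incrementally.
    return total_labels - labels_at_last_training >= RETRAIN_STEP
```

Pushing `MIN_LABELS` up buys accuracy at the cost of unassisted early review; pushing it down does the opposite, which is exactly the trade-off described above.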
Of course, there are a few other factors that can affect the performance of your model: the distribution of attributes in the training data may be too uneven, and the choice of model and training techniques also matter. Currently, we don't let customers tweak these details, because while tuning them yourself can certainly produce a better model, in many cases the improvement isn't worth the increased monetary and time costs of training.
Once your model is trained, you can use the validation features. We'll show how in our platform user guide, which is currently in progress; in the meantime, let's analyze how much AI job validation improved the quality of our work, using some quick test results.
To evaluate the performance of this feature, we conducted a hands-on experiment with data from one of our customers that had already completed both tasks and inspections. To evaluate inspection performance, we modified some of the task results to inject errors. There were four types of errors:
The distribution of errors is based on the frequency of errors in real-world tasks. For each error, we determined how many were found by the AI and summed them to calculate the overall inspection performance.
Overall, the AI found 57 of the 70 injected errors, for a detection rate of about 81%. The detection rates for each error type were as follows:
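The overall figure is just the found-to-injected ratio, computed from the numbers above:

```python
# Overall inspection performance from the experiment:
# 57 of the 70 injected errors were caught by the AI check.
errors_found = 57
errors_total = 70

detection_rate = errors_found / errors_total
print(f"{detection_rate:.1%}")  # 81.4%
```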
We found that the detection rate for region errors was relatively low, which we believe is due to the ambiguity of the criteria for region errors: if a correct answer is one that fits the region of an object, there is no absolute standard for how far a box must deviate from the correct answer before it counts as an error.
In this experiment, we injected errors so that the IoU (Intersection over Union) with the correct answer was over 0.8, which is quite close to the correct box; that is probably why they were difficult to catch. (In short: we made it an error, but it was still so close to the correct answer that it was hard to tell whether it was an error at all.)
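For readers unfamiliar with the metric, IoU is the standard way to measure how much two boxes overlap: the area of their intersection divided by the area of their union, so identical boxes score 1.0 and disjoint boxes score 0.0. A minimal implementation makes it easy to see why IoU > 0.8 boxes are hard to flag:

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlapping region (clamped to zero if the boxes don't intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

# A box whose top edge is shifted by just 10% still overlaps its
# ground truth heavily -- visually it looks almost correct.
print(iou((0, 0, 100, 100), (0, 10, 100, 100)))  # 0.9
```

An IoU above 0.8 means the injected box is nearly indistinguishable from the correct one by eye, which matches the experience described above.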
As you can see, we are implementing AI-automated checking for several tasks at DataHunt, and through our experiments with real data, we have found that it can be a great help in the review stage. You might think that if the work is done painstakingly and accurately from the task stage, then checking becomes a little less important, but no one can guarantee error-free work.
AI plays a big role in catching the human mistakes that will always happen, and as a result we can reduce review time and therefore cost. I think one of DataHunt's strengths is that this significantly improves the quality of the final data labels, so we can deliver quality data.