A common line of thinking goes: "I need three years to build up my IT team, because we collect a large amount of data and the collection process alone can take months or even years. After three years, I will have a perfect dataset, and then we'll do AI."
What's wrong with this approach? It turns out that's a really bad strategy. When the data is finally presented to the AI team, they might find flaws, gaps, or missing pieces that make much of the hard work of those months and years useless.
Why This Approach Is Wrong
Waiting until data collection is complete before involving your AI team means you lose the chance to correct course along the way. Instead, start showing your collected data to the AI team as soon as possible. Early feedback lets the AI team tell you which types of data to collect and what IT infrastructure to build.
Example Scenario: An AI team might look at your factory data and say, "Hey, you know what? If you can collect data from this manufacturing machine not just once every 10 minutes, but once every minute, then we could do a better job building a preventative maintenance system for you."
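To make the sampling-rate point concrete, here is a minimal sketch of how a short-lived event can disappear at a coarser collection interval. Everything in it is an assumption made for illustration: the temperature values, the three-minute spike, the alert threshold, and the helper names `sample` and `spike_detected` are all hypothetical and do not come from a real factory system.

```python
# Hypothetical per-minute temperature readings for one machine over an hour.
# A short overheating spike occurs at minutes 23-25 (made-up values).
readings = [70.0] * 60
for minute in (23, 24, 25):
    readings[minute] = 95.0

THRESHOLD = 90.0  # assumed alert threshold, purely illustrative


def sample(values, every_n_minutes):
    """Keep one reading every `every_n_minutes`, starting at minute 0."""
    return values[::every_n_minutes]


def spike_detected(values, threshold=THRESHOLD):
    """Return True if any sampled reading exceeds the threshold."""
    return any(v > threshold for v in values)


every_minute = sample(readings, 1)
every_ten_minutes = sample(readings, 10)

print("1-minute sampling sees spike: ", spike_detected(every_minute))
print("10-minute sampling sees spike:", spike_detected(every_ten_minutes))
```

In this toy setup the 10-minute samples land on minutes 0, 10, 20, 30, 40, and 50, so the spike falls between two samples and is never recorded. This is the kind of gap an AI team can point out early, before years of data have been collected at the wrong rate.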
Common Misconceptions
Another misconception is: "I have so much data. Surely the AI team can make it valuable." But as discussed above, if the data has not yet been shared with the AI team, nobody has checked that assumption. The AI team is in the best position to judge whether the data you want to use for a project is accurate and suitable.
The Importance of Data Quality
Data can be extremely valuable, but mistakes can occur during data collection. Even when a survey is conducted carefully and the answers are collected accurately, errors can still creep in when the data is entered into the database.
If you have a large amount of data but the data is bad, the AI will learn inaccurate things. This is a real problem because data is often messy.
Data Problems
- Inaccurate labels
- Missing values
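Both problems in the list above can be surfaced with a quick audit before any model is trained. The following is a minimal sketch, not a complete data-validation tool: the record fields (`machine_id`, `temperature_c`, `label`), the allowed label set, and the helper names are all hypothetical and chosen only for illustration.

```python
# Hypothetical records; field names and values are made up for illustration.
records = [
    {"machine_id": "A1", "temperature_c": 71.5, "label": "ok"},
    {"machine_id": "A2", "temperature_c": None, "label": "ok"},      # missing value
    {"machine_id": None, "temperature_c": 88.0, "label": "faulty"},  # missing value
    {"machine_id": "A4", "temperature_c": 69.9, "label": "fualty"},  # mistyped label
]

ALLOWED_LABELS = {"ok", "faulty"}  # assumed label vocabulary


def missing_value_counts(rows):
    """Count, per field, how many rows have no value (None)."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            if value is None:
                counts[field] = counts.get(field, 0) + 1
    return counts


def invalid_label_rows(rows, allowed=ALLOWED_LABELS):
    """Return the indices of rows whose label is outside the allowed set."""
    return [i for i, row in enumerate(rows) if row["label"] not in allowed]


missing = missing_value_counts(records)
bad_labels = invalid_label_rows(records)

print("Missing values per field:", missing)
print("Rows with unexpected labels:", bad_labels)
```

A check like this only catches labels that fall outside the expected vocabulary. A label that is valid but wrong (a faulty machine recorded as "ok") still needs review by someone who knows the process, which is another reason to share data with the AI team early.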
Multiple Types of Data
- Unstructured data: images, audio, text
Conclusion
Involving your AI team early in the data collection process can save time and resources. By getting their feedback on data types and collection methods, you can build a more robust and effective AI system.
Remember, the quality of your data is just as important as the quantity.