Generative AI Redefines Data Analysts’ Jobs; Full Replacement Unlikely, Says Expert
Generative AI is unlikely to replace data analysts entirely, but it will certainly redefine their role, according to Galen Okazaki, a data analysis professional. In a recent article, Okazaki explores the capabilities and limitations of generative AI in the field of data analysis.
Generative AI tools such as OpenAI’s ChatGPT, Google’s Bard, and Microsoft’s Bing Chat can write SQL, Python, and R code. They can even perform regression analysis, descriptive analysis, and pattern recognition, and create visualizations, without the need for manual coding. However, there are significant limitations to consider.
One major limitation is that users can upload only a single table or two-dimensional CSV file, with a size limit of 100 MB. This restriction makes the approach impractical for data analysts who work with large and complex datasets. Additionally, pushing sensitive data outside a company’s firewall to an external AI model raises challenges for data security and control.
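As a rough illustration of that constraint, the sketch below checks whether a CSV file would fit within such an upload limit before it is sent anywhere. It rests on assumptions not stated in the article: the 100 MB cap is read as 100 * 1024 * 1024 bytes, "two-dimensional" is taken to mean that every row has the same number of fields as the header, and check_csv_for_upload is a hypothetical helper, not part of any tool's API.

```python
import csv
import os
import tempfile

# Assumption: the 100 MB limit is interpreted as 100 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024


def check_csv_for_upload(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return a list of problems that would block uploading this CSV."""
    problems = []

    size = os.path.getsize(path)
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte cap")

    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, None)
        if header is None:
            problems.append("file is empty")
            return problems
        # Record 1 is the header, so data records are numbered from 2.
        for record_number, row in enumerate(reader, start=2):
            if len(row) != len(header):
                problems.append(
                    f"record {record_number} has {len(row)} fields, "
                    f"expected {len(header)}"
                )
                break  # one ragged row is enough to reject the file

    return problems


# Usage: write a small, well-formed sample file and check it.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.csv")
    with open(sample, "w", newline="") as handle:
        csv.writer(handle).writerows(
            [["region", "sales"], ["east", "10"], ["west", "12"]]
        )
    print(check_csv_for_upload(sample))  # an empty list means no problems found
```

A check like this covers only file size and shape. It does nothing about the security concern of sending sensitive data to an external service.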
While it is theoretically possible for companies to build their own generative AI models, the process is complex, costly, and requires specialized expertise. It is only feasible for a select few organizations. Furthermore, even with access to generative AI, domain knowledge remains essential for effective data analysis. The ability to ask the right questions and interpret the findings requires human expertise.
Okazaki emphasizes that the Achilles’ heel of generative AI is its inability to answer complex, situational questions that have never been asked before. It relies on existing training data and cannot generate probability-driven answers for novel scenarios. The value of data analysis lies in its ability to provide immediate answers to unforeseen, mission-critical questions that require domain knowledge.
However, generative AI already has practical applications in data analysis. Its ability to write code and explain the code it generates can help analysts learn and can speed up coding. GitHub Copilot, for instance, suggests coding solutions in real time. Open-source generative AI models like Databricks’ Dolly are also emerging, providing alternatives for companies interested in leveraging AI without the need for extensive resources.
In conclusion, Okazaki believes that generative AI will reshape data analysis workflows. Repetitive tasks and certain types of analyses may be performed by AI in the future, but domain knowledge will remain crucial. Data analysts of the future will need to combine business-level expertise with generative AI tools to enhance their efficiency and productivity.
Okazaki encourages individuals to embrace and learn about generative AI, as its capabilities continue to expand with the development of new APIs and plugins. While it won’t replace data analysts entirely, generative AI will undoubtedly play a significant role in the field.