Title | : | Can Parallel Database Systems Help Big Data Analytics? |
Speaker | : | Prof. Carlos Ordonez (University of Houston) |
Details | : | Thu, 21 Aug, 2025 11:00 AM @ SSB 334 |
Abstract | : | Parallel relational DBMSs remain an important data management technology, despite the big data analytics and NoSQL waves. On the other hand, for AI data analytics in a broad sense, there are plenty of non-DBMS tools, including data science languages, linear algebra libraries, generic data mining programs, and large-scale parallel systems. Hence it would seem that a parallel DBMS is not a good technology for analyzing big data beyond SQL queries, and that it acts merely as a reliable and fast repository for tabular data. In this talk, we argue that this is not the case, explaining important research that has enabled AI analytics on big data. However, we also argue that DBMSs cannot compete with parallel systems like Spark for analyzing web-scale text data, or with neural network libraries like PyTorch for computing neural networks, when large RAM is available. Therefore, these technologies will keep influencing one another. We conclude with a proposal of long-term research issues, considering the AI trend. |