Apache Hadoop for large-scale data processing using machine learning techniques

by admin, May 14, 2026

Paper Title: Apache Hadoop for large-scale data processing using machine learning techniques

Authors: Nidaa Ghalib Ali, Mohanaed Ajmi Falih, Ali Ajmi Falih

Corresponding Author: Nidaa Ghalib Ali (inb.nedaa10@atu.edu.iq), Iraq

Abstract

As Big Data grows in volume and variety, more advanced processing technology is needed. This paper discusses Volume, Variety, and Velocity, known as the 3Vs of Big Data, along with Valence and Veracity. As organizations struggle with these complexities, Apache Spark emerges as a technology that can overcome the limitations of Hadoop MapReduce and enable real-time analytics. The study evaluates the effectiveness of the K-Nearest Neighbors (KNN) algorithm on structured data, Decision Tree regression on unstructured data, and logistic regression on semi-structured data. The algorithms performed well on structured data; however, all the models failed to predict unstructured data. An examination of the frameworks’ performance also demonstrates the computational efficiency of Apache Hadoop and Apache Spark; in terms of processing speed, Spark outperformed Hadoop across all data types and algorithms. These results indicate that Big Data requires advanced analytical tools. Apache Spark is a modern, high-performance data processing framework that enables organizations to manage Big Data in real time.
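For readers unfamiliar with KNN, the idea behind the structured-data experiment can be illustrated with a minimal sketch. This is not the authors' implementation, and it does not use Hadoop or Spark; the function name `knn_predict`, the toy rows, and the labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def knn_predict(train_rows, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # Euclidean distance from the query to every training row
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    ]
    # Keep the k closest rows and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical structured data: two numeric columns per row
rows = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

prediction = knn_predict(rows, labels, (1.1, 1.0), k=3)
print(prediction)
```

Because every prediction compares the query against all training rows, the cost grows with the size of the dataset, which is one reason distributed frameworks are of interest for this kind of workload.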

Keywords

Big Data, Hadoop, Spark, Machine learning

Cite:

Ghalib Ali, N., Ajmi Falih, M., & Ajmi Falih, A. (2026). Apache Hadoop for large-scale data processing using machine learning techniques. Future Technology, 5(3), 128–138. Retrieved from https://fupubco.com/futech/article/view/762

Future Publishing LLC
