Search Results (18)

Search Parameters:
Keywords = Hadoop ecosystem

11 pages, 4553 KiB  
Article
Safety Autonomous Platform for Data-Driven Risk Management Based on an On-Site AI Engine in the Electric Power Industry
by Dongyeop Lee, Daesik Lim and Joonwon Lee
Appl. Sci. 2025, 15(2), 630; https://doi.org/10.3390/app15020630 - 10 Jan 2025
Viewed by 518
Abstract
The electric power industry exposes workers to a wide range of hazards, such as electrocution, electric shock, burns, and falls. Regardless of the types and characteristics of these hazards, electric power companies should protect their workers and provide a safe and healthy working environment, yet it is difficult to identify the potential health and safety risks present in a workplace and take appropriate action to keep workers free from harm. Therefore, this paper proposes a novel safety autonomous platform (SAP) for data-driven risk management in the electric power industry. It can automatically and precisely provide a safe and healthy working environment in cooperation with safety mobility gateways (SMGs), according to safety rules and risk index data derived from the risk level of the current task, the worker's profile, and the output of an on-site artificial intelligence (AI) engine in the SMGs. We implemented the proposed SAP architecture using the Hadoop ecosystem and verified its feasibility through a performance evaluation of the on-site AI engine and real-time operation of risk assessment and alarm notification for data-driven risk management. Full article
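
As a rough sketch of the risk-index idea described above (the weights, profile fields, and alarm threshold below are invented for illustration and are not the paper's formula):

```python
# Hypothetical risk-index combination of the three inputs the abstract names:
# task risk level, worker profile, and the on-site AI engine's hazard score.
from dataclasses import dataclass

@dataclass
class WorkerProfile:
    experience_years: int
    prior_incidents: int

def risk_index(task_risk_level: int, worker: WorkerProfile, ai_hazard_score: float) -> float:
    """Return a 0-1 risk index; higher means more hazardous (illustrative weights)."""
    experience_factor = max(0.0, 1.0 - worker.experience_years / 20.0)
    incident_factor = min(1.0, worker.prior_incidents / 5.0)
    return min(1.0, 0.2 * (task_risk_level / 5.0)
                    + 0.15 * experience_factor
                    + 0.15 * incident_factor
                    + 0.5 * ai_hazard_score)

# An SMG could raise an alarm notification when the index crosses a threshold.
if risk_index(4, WorkerProfile(experience_years=2, prior_incidents=1), 0.8) > 0.6:
    print("raise alarm: apply safety rule for high-risk task")
```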

21 pages, 10483 KiB  
Article
Evading Cyber-Attacks on Hadoop Ecosystem: A Novel Machine Learning-Based Security-Centric Approach towards Big Data Cloud
by Neeraj A. Sharma, Kunal Kumar, Tanzim Khorshed, A B M Shawkat Ali, Haris M. Khalid, S. M. Muyeen and Linju Jose
Information 2024, 15(9), 558; https://doi.org/10.3390/info15090558 - 10 Sep 2024
Viewed by 1029
Abstract
The growing industry and its complex and large information sets require Big Data (BD) technology and its open-source frameworks (Apache Hadoop) to (1) collect, (2) analyze, and (3) process the information. This information usually ranges in size from gigabytes to petabytes of data. However, processing this data involves web consoles and communication channels which are prone to intrusion from hackers. To resolve this issue, a novel machine learning (ML)-based security-centric approach has been proposed to evade cyber-attacks on the Hadoop ecosystem while considering the complexity of Big Data in Cloud (BDC). An Apache Hadoop-based management interface, “Ambari”, was implemented to address the variation and distinguish between attacks and activities. The analyzed experimental results show that the proposed scheme effectively (1) blocked the interface communication and retrieved the measured performance data from (2) the Ambari-based virtual machine (VM) and (3) the BDC hypervisor. Moreover, the proposed architecture reduced false alarms while still detecting cyber-attacks. Full article
(This article belongs to the Special Issue Cybersecurity, Cybercrimes, and Smart Emerging Technologies)
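
To make the detection idea concrete, here is a minimal, hypothetical scikit-learn sketch of classifying attack versus normal activity from performance counters; the synthetic features and model below are stand-ins, not the paper's feature set or classifier:

```python
# Sketch: train a classifier on (synthetic) VM/hypervisor performance metrics
# such as CPU, memory, and network I/O, then score held-out samples.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((500, 3))                   # [cpu, mem, net_io], synthetic
y = (X[:, 0] + X[:, 2] > 1.2).astype(int)  # pretend high CPU + net I/O marks an attack

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.2f}")
```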

16 pages, 3541 KiB  
Article
Development of a Low-Cost Distributed Computing Pipeline for High-Throughput Cotton Phenotyping
by Vaishnavi Thesma, Glen C. Rains and Javad Mohammadpour Velni
Sensors 2024, 24(3), 970; https://doi.org/10.3390/s24030970 - 2 Feb 2024
Cited by 3 | Viewed by 1496
Abstract
In this paper, we present the development of a low-cost distributed computing pipeline for cotton plant phenotyping using Raspberry Pi, Hadoop, and deep learning. Specifically, we use a cluster of several Raspberry Pis in a primary-replica distributed architecture using the Apache Hadoop ecosystem and a pre-trained Tiny-YOLOv4 model for cotton bloom detection from our past work. We feed cotton image data collected from a research field in Tifton, GA, into our cluster’s distributed file system for robust file access and distributed, parallel processing. We then submit job requests to our cluster from our client to process cotton image data in a distributed and parallel fashion, from pre-processing to bloom detection and spatio-temporal map creation. Additionally, we present a comparison of our four-node cluster performance with centralized, one-, two-, and three-node clusters. This work is the first to develop a distributed computing pipeline for high-throughput cotton phenotyping in field-based agriculture. Full article
(This article belongs to the Special Issue Sensor and AI Technologies in Intelligent Agriculture)
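
A minimal sketch of the client-side workflow the abstract describes (loading field images into HDFS, then submitting a distributed job); the paths, directory names, and the detect_blooms.py mapper are assumptions for illustration:

```python
import subprocess

HDFS_DIR = "/cotton/images"  # hypothetical HDFS target directory
STREAMING_JAR = "/opt/hadoop/share/hadoop/tools/lib/hadoop-streaming.jar"

# 1. Copy locally collected cotton images into the distributed file system.
subprocess.run(["hdfs", "dfs", "-mkdir", "-p", HDFS_DIR], check=True)
subprocess.run(["hdfs", "dfs", "-put", "-f", "field_images", HDFS_DIR], check=True)

# 2. Submit a map-only Hadoop Streaming job whose mapper runs bloom detection
#    (detect_blooms.py is a hypothetical wrapper around the Tiny-YOLOv4 model).
subprocess.run([
    "hadoop", "jar", STREAMING_JAR,
    "-files", "detect_blooms.py",
    "-mapper", "python3 detect_blooms.py",
    "-numReduceTasks", "0",
    "-input", HDFS_DIR,
    "-output", "/cotton/detections",
], check=True)
```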

34 pages, 10875 KiB  
Article
EverAnalyzer: A Self-Adjustable Big Data Management Platform Exploiting the Hadoop Ecosystem
by Panagiotis Karamolegkos, Argyro Mavrogiorgou, Athanasios Kiourtis and Dimosthenis Kyriazis
Information 2023, 14(2), 93; https://doi.org/10.3390/info14020093 - 3 Feb 2023
Cited by 6 | Viewed by 2375
Abstract
Big Data is a phenomenon that affects today’s world, with new data being generated every second. Today’s enterprises face major challenges from the increasingly diverse data, as well as from indexing, searching, and analyzing such enormous amounts of data. In this context, several frameworks and libraries for processing and analyzing Big Data exist. Among those frameworks, Hadoop MapReduce, Mahout, Spark, and MLlib appear to be the most popular, although it is unclear which of them is best suited to, and performs best in, the various data processing and analysis scenarios. This paper proposes EverAnalyzer, a self-adjustable Big Data management platform built to fill this gap by exploiting all of these frameworks. The platform is able to collect data both in a streaming and in a batch manner, utilizing the metadata obtained from its users’ processing and analytical processes applied to the collected data. Based on this metadata, the platform recommends the optimum framework for the data processing/analytical activities that the users aim to execute. To verify the platform’s efficiency, numerous experiments were carried out using 30 diverse datasets related to various diseases. The results revealed that EverAnalyzer correctly suggested the optimum framework in 80% of the cases, indicating that the platform made the best selections in the majority of the experiments. Full article
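
The recommendation step might look something like the following toy sketch; the decision rules and thresholds are invented here to illustrate metadata-driven framework selection, not EverAnalyzer's actual logic:

```python
def recommend_framework(streaming: bool, dataset_gb: float, task: str) -> str:
    """Pick a processing/analysis framework from simple job metadata."""
    if task == "machine_learning":
        # MLlib runs on Spark; Mahout historically targets MapReduce batches.
        return "Spark MLlib" if streaming or dataset_gb < 500 else "Mahout"
    # Plain processing: Spark for streaming/in-memory work, MapReduce for huge batches.
    return "Spark" if streaming or dataset_gb < 500 else "Hadoop MapReduce"

print(recommend_framework(streaming=False, dataset_gb=1200, task="machine_learning"))
```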

28 pages, 4528 KiB  
Article
A Framework for Attribute-Based Access Control in Processing Big Data with Multiple Sensitivities
by Anne M. Tall and Cliff C. Zou
Appl. Sci. 2023, 13(2), 1183; https://doi.org/10.3390/app13021183 - 16 Jan 2023
Cited by 11 | Viewed by 5621
Abstract
There is an increasing demand for processing large volumes of unstructured data for a wide variety of applications. However, protection measures for these big data sets are still in their infancy, which could lead to significant security and privacy issues. Attribute-based access control (ABAC) provides a dynamic and flexible solution that is effective for mediating access. We analyzed and implemented a prototype application of ABAC to large dataset processing in Amazon Web Services, using open-source versions of Apache Hadoop, Ranger, and Atlas. The Hadoop ecosystem is one of the most popular frameworks for large dataset processing and storage and is adopted by major cloud service providers. We conducted a rigorous analysis of cybersecurity in implementing ABAC policies in Hadoop, including developing a synthetic dataset of information at multiple sensitivity levels that realistically represents healthcare and connected social media data. We then developed Apache Spark programs that extract, connect, and transform data in a manner representative of a realistic use case. Our result is a framework for securing big data. Applying this framework ensures that serious cybersecurity concerns are addressed. We provide details of our analysis and experimentation code in a GitHub repository for further research by the community. Full article
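
For flavor, a minimal sketch of an attribute-based access decision of the kind such a framework mediates; in the paper this is enforced through Apache Ranger policies rather than hand-written code, and the attribute names below are illustrative:

```python
def abac_permit(user: dict, resource: dict, action: str) -> bool:
    """Permit if the user's clearance covers the data's sensitivity and
    the purpose-of-use attributes match (hypothetical attribute scheme)."""
    levels = ["public", "internal", "phi"]  # illustrative sensitivity ordering
    cleared = levels.index(user["clearance"]) >= levels.index(resource["sensitivity"])
    purpose_ok = user["purpose"] == resource["allowed_purpose"]
    return cleared and purpose_ok and action in user["allowed_actions"]

print(abac_permit(
    {"clearance": "phi", "purpose": "treatment", "allowed_actions": {"read"}},
    {"sensitivity": "phi", "allowed_purpose": "treatment"},
    "read",
))  # True
```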

20 pages, 616 KiB  
Article
The Time Machine in Columnar NoSQL Databases: The Case of Apache HBase
by Chia-Ping Tsai, Che-Wei Chang, Hung-Chang Hsiao and Haiying Shen
Future Internet 2022, 14(3), 92; https://doi.org/10.3390/fi14030092 - 15 Mar 2022
Cited by 3 | Viewed by 3270
Abstract
Not Only SQL (NoSQL) is a critical technology that is scalable and provides flexible schemas, thereby complementing existing relational database technologies. Although NoSQL is flourishing, present solutions lack the features required by enterprises for mission-critical work. In this paper, we explore solutions to the data recovery issue in NoSQL. Data recovery for any database table entails restoring the table to a prior state or replaying (insert/update) operations over the table for a given time period in the past. Recovery of NoSQL database tables enables applications such as failure recovery, analysis of historical data, debugging, and auditing. In particular, our study focuses on columnar NoSQL databases. We propose and evaluate two solutions to the data recovery problem in columnar NoSQL and implement them based on Apache HBase, a popular NoSQL database in the Hadoop ecosystem that is widely adopted across industries. Our implementations are extensively benchmarked with an industrial NoSQL benchmark under real environments. Full article
(This article belongs to the Special Issue Advances in High Performance Cloud Computing)
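
The recovery semantics can be illustrated with plain Python: restore a row to time T by taking, per column, the newest timestamped cell version not newer than T. This mimics the behavior only; the paper's solutions are implemented inside HBase itself:

```python
def state_at(versions: dict, t: int) -> dict:
    """versions maps column -> [(timestamp, value), ...]; return the row state at time t."""
    snapshot = {}
    for column, cells in versions.items():
        eligible = [(ts, v) for ts, v in cells if ts <= t]
        if eligible:
            snapshot[column] = max(eligible)[1]  # newest version at or before t
    return snapshot

row = {"cf:status": [(100, "open"), (200, "closed")],
       "cf:owner":  [(150, "alice")]}
print(state_at(row, 160))  # {'cf:status': 'open', 'cf:owner': 'alice'}
```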

24 pages, 1008 KiB  
Article
SPARQL2Flink: Evaluation of SPARQL Queries on Apache Flink
by Oscar Ceballos, Carlos Alberto Ramírez Restrepo, María Constanza Pabón, Andres M. Castillo and Oscar Corcho
Appl. Sci. 2021, 11(15), 7033; https://doi.org/10.3390/app11157033 - 30 Jul 2021
Cited by 5 | Viewed by 2956
Abstract
Existing SPARQL query engines and triple stores are continuously improved to handle ever more massive datasets. Several approaches have been developed in this context, proposing the storage and querying of RDF data in a distributed fashion, mainly using the MapReduce programming model and Hadoop-based ecosystems. New trends in Big Data technologies have also emerged (e.g., Apache Spark, Apache Flink); they use distributed in-memory processing and promise to deliver higher data processing performance. In this paper, we present a formal interpretation of some PACT transformations implemented in the Apache Flink DataSet API. We use this formalization to provide a mapping to translate a SPARQL query to a Flink program. The mapping was implemented in a prototype used to determine the correctness and performance of the solution. The source code of the project is available on GitHub under the MIT license. Full article
(This article belongs to the Special Issue Big Data Management and Analysis with Distributed or Cloud Computing)
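
The core of the mapping can be illustrated in miniature: each SPARQL triple pattern becomes a filter over the triple set, and shared variables become a join. The sketch below uses plain Python over an in-memory triple list for clarity; the actual system emits Flink DataSet transformations in Java:

```python
triples = [("ex:alice", "foaf:knows", "ex:bob"),
           ("ex:bob", "foaf:name", '"Bob"')]

def match(pattern, triple):
    """Bind ?variables of a triple pattern against a triple, or return None."""
    bindings = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            bindings[p] = t
        elif p != t:
            return None
    return bindings

# SELECT ?n WHERE { ex:alice foaf:knows ?p . ?p foaf:name ?n }
knows = [b for t in triples if (b := match(("ex:alice", "foaf:knows", "?p"), t))]
names = [b for t in triples if (b := match(("?p", "foaf:name", "?n"), t))]
print([n["?n"] for k in knows for n in names if k["?p"] == n["?p"]])  # ['"Bob"']
```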

15 pages, 7168 KiB  
Article
Design and Implementation of Edge-Fog-Cloud System through HD Map Generation from LiDAR Data of Autonomous Vehicles
by Junwon Lee, Kieun Lee, Aelee Yoo and Changjoo Moon
Electronics 2020, 9(12), 2084; https://doi.org/10.3390/electronics9122084 - 7 Dec 2020
Cited by 25 | Viewed by 3986
Abstract
Self-driving cars, autonomous vehicles (AVs), and connected cars combine the Internet of Things (IoT) and automobile technologies, thus contributing to the development of society. However, processing the big data generated by AVs is a challenge due to overloading issues. Additionally, near real-time/real-time IoT services play a significant role in vehicle safety. Therefore, the architecture of an IoT system that collects and processes data, and provides services for vehicle driving, is an important consideration. In this study, we propose a fog computing server model that generates a high-definition (HD) map using light detection and ranging (LiDAR) data generated from an AV. The driving vehicle edge node transmits the LiDAR point cloud information to the fog server through a wireless network. The fog server generates an HD map by applying the Normal Distribution Transform-Simultaneous Localization and Mapping (NDT-SLAM) algorithm to the point clouds transmitted from the multiple edge nodes. Subsequently, the coordinate information of the HD map generated in the sensor frame is converted to the coordinate information of the global frame and transmitted to the cloud server. Then, the cloud server creates an HD map by integrating the collected point clouds using coordinate information. Full article
(This article belongs to the Special Issue IoT Sensor Network Application)
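
The frame-conversion step reduces to a rigid-body transform: rotate each sensor-frame point by the vehicle pose and translate it into global coordinates. A minimal numpy sketch, with placeholder pose values (in the system, the pose comes from NDT-SLAM localization):

```python
import numpy as np

def sensor_to_global(points: np.ndarray, R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """points: (N, 3) in the sensor frame -> (N, 3) in the global frame."""
    return points @ R.T + t

yaw = np.pi / 4                       # hypothetical vehicle heading
R = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
              [np.sin(yaw),  np.cos(yaw), 0.0],
              [0.0,          0.0,         1.0]])
t = np.array([1000.0, 2000.0, 30.0])  # hypothetical global position

print(sensor_to_global(np.array([[1.0, 0.0, 0.0]]), R, t))
```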

16 pages, 5166 KiB  
Article
Implementation of a Sensor Big Data Processing System for Autonomous Vehicles in the C-ITS Environment
by Aelee Yoo, Sooyeon Shin, Junwon Lee and Changjoo Moon
Appl. Sci. 2020, 10(21), 7858; https://doi.org/10.3390/app10217858 - 5 Nov 2020
Cited by 13 | Viewed by 10776
Abstract
To provide a service that guarantees driver comfort and safety, a platform utilizing connected car big data is required. This study first aims to design and develop such a platform to improve how the previously defined central Local Dynamic Map (LDM) provides vehicle and road condition information. Our platform extends the range of connected car big data collection from OBU (On Board Unit) and CAN to camera, LiDAR, and GPS sensors. By using data from vehicles being driven, the range of roads available for analysis can be expanded, and the road condition determination method can be diversified. Herein, the system was designed and implemented based on the Hadoop ecosystem, i.e., Hadoop, Spark, and Kafka, to collect and store connected car big data. We propose a direction for cooperative intelligent transport system (C-ITS) development by showing a plan to utilize the platform in the C-ITS environment. Full article
(This article belongs to the Special Issue Internet of Things (IoT))
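
A minimal sketch of the Kafka-to-Spark collection path the abstract names; the broker address, topic, and HDFS paths are assumptions, and the real platform's sensor schemas (camera, LiDAR, GPS, CAN) are far richer:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("connected-car-ingest").getOrCreate()

# Read the vehicle sensor stream from Kafka...
stream = (spark.readStream.format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")
          .option("subscribe", "vehicle-sensors")
          .load())

# ...and persist the raw records to HDFS for later road-condition analysis.
query = (stream.selectExpr("CAST(value AS STRING) AS record")
         .writeStream.format("parquet")
         .option("path", "hdfs:///ldm/raw")
         .option("checkpointLocation", "hdfs:///ldm/checkpoints")
         .start())
query.awaitTermination()
```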

20 pages, 893 KiB  
Article
A Hadoop-Based Platform for Patient Classification and Disease Diagnosis in Healthcare Applications
by Hassan Harb, Hussein Mroue, Ali Mansour, Abbass Nasser and Eduardo Motta Cruz
Sensors 2020, 20(7), 1931; https://doi.org/10.3390/s20071931 - 30 Mar 2020
Cited by 29 | Viewed by 7102
Abstract
Nowadays, the increasing number of patients, accompanied by the emergence of new symptoms and diseases, makes health monitoring and assessment a complicated task for medical staff and hospitals. Indeed, the processing of big and heterogeneous data collected by biomedical sensors, along with the need for patient classification and disease diagnosis, has become a major challenge for several health-based sensing applications. Thus, the combination of remote sensing devices and big data technologies has proven to be an efficient and low-cost solution for healthcare applications. In this paper, we propose a robust big data analytics platform for real-time patient monitoring and decision making to help both hospitals and medical staff. The proposed platform relies on big data technologies and data analysis techniques and consists of four layers: real-time patient monitoring, real-time decision and data storage, patient classification and disease diagnosis, and data retrieval and visualization. To evaluate its performance, we implemented the platform on the Hadoop ecosystem and applied the proposed algorithms to real health data. The obtained results show the effectiveness of our platform in efficiently performing patient classification and disease diagnosis in healthcare applications. Full article
(This article belongs to the Special Issue Sensor and Systems Evaluation for Telemedicine and eHealth)
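
As a toy illustration of the patient-classification layer's idea (the normal ranges, labels, and triage rule below are invented for the sketch, not the paper's algorithm):

```python
NORMAL_RANGES = {"heart_rate": (60, 100), "spo2": (95, 100), "temp_c": (36.1, 37.5)}

def classify_patient(vitals: dict) -> str:
    """Triage a patient by counting biomedical-sensor readings outside normal ranges."""
    abnormal = sum(1 for key, (lo, hi) in NORMAL_RANGES.items()
                   if not lo <= vitals[key] <= hi)
    if abnormal == 0:
        return "normal"
    return "critical" if abnormal >= 2 else "needs review"

print(classify_patient({"heart_rate": 118, "spo2": 91, "temp_c": 38.2}))  # critical
```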

24 pages, 552 KiB  
Article
Two-Step Classification with SVD Preprocessing of Distributed Massive Datasets in Apache Spark
by Athanasios Alexopoulos, Georgios Drakopoulos, Andreas Kanavos, Phivos Mylonas and Gerasimos Vonitsanos
Algorithms 2020, 13(3), 71; https://doi.org/10.3390/a13030071 - 24 Mar 2020
Cited by 15 | Viewed by 4996
Abstract
At the dawn of the 10V, or big data, era, there are a considerable number of sources, such as smart phones, IoT devices, social media, smart city sensors, and the health care system, all of which constitute but a small portion of the data lakes feeding the entire big data ecosystem. This 10V data growth poses two primary challenges, namely storing and processing. Concerning the latter, new frameworks have been developed, including distributed platforms such as the Hadoop ecosystem. Classification is a major machine learning task typically executed on distributed platforms, and as a consequence many algorithmic techniques have been developed tailored for these platforms. This article relies extensively, in two ways, on classifiers implemented in MLlib, the main machine learning library for the Hadoop ecosystem. First, a broad set of classifiers is applied to two datasets, namely Higgs and PAMAP. Second, a two-step classification is performed ab ovo on the same datasets. Specifically, the singular value decomposition of the data matrix first determines a set of transformed attributes, which in turn drive the classifiers of MLlib. The twofold purpose of the proposed architecture is to reduce complexity while maintaining a similar, if not better, level of accuracy, recall, and F1. The intuition behind this approach stems from the engineering principle of breaking down complex problems into simpler and more manageable tasks. The experiments, based on the same Spark cluster, indicate that the proposed architecture outperforms the individual classifiers with respect to both complexity and the abovementioned metrics. Full article
(This article belongs to the Special Issue Mining Humanistic Data 2019)
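
A minimal PySpark sketch of the two-step idea, using the RDD-based MLlib API: compute a truncated SVD of the data matrix, then drive a classifier with the transformed attributes. The toy matrix, labels, and k are placeholders, and the classifier choice is illustrative:

```python
from pyspark.sql import SparkSession
from pyspark.mllib.classification import LogisticRegressionWithLBFGS
from pyspark.mllib.linalg.distributed import RowMatrix
from pyspark.mllib.regression import LabeledPoint

spark = SparkSession.builder.appName("svd-two-step").getOrCreate()
sc = spark.sparkContext

rows = sc.parallelize([[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 0.0, 3.0],
                       [4.0, 3.0, 2.0, 1.0], [0.0, 1.0, 4.0, 2.0]])
labels = [0.0, 1.0, 1.0, 0.0]

# Step 1: truncated SVD of the data matrix; U * S are the transformed attributes.
svd = RowMatrix(rows).computeSVD(k=2, computeU=True)
s = svd.s  # singular values (local vector, safe to close over)
features = svd.U.rows.map(lambda u: [x * si for x, si in zip(u, s)]).collect()

# Step 2: the transformed attributes drive an MLlib classifier.
data = sc.parallelize([LabeledPoint(l, f) for l, f in zip(labels, features)])
model = LogisticRegressionWithLBFGS.train(data)
print(model.predict(features[0]))
```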

30 pages, 2154 KiB  
Review
Big Data and Business Analytics: Trends, Platforms, Success Factors and Applications
by Ifeyinwa Angela Ajah and Henry Friday Nweke
Big Data Cogn. Comput. 2019, 3(2), 32; https://doi.org/10.3390/bdcc3020032 - 10 Jun 2019
Cited by 116 | Viewed by 53620
Abstract
Big data and business analytics are trends that are positively impacting the business world. Past research shows that data generated in the modern world is huge and growing exponentially. This includes the structured and unstructured data that flood organizations daily. Unstructured data constitute the majority of the world’s digital data and include text files, web and social media posts, emails, images, audio, movies, etc. Unstructured data cannot be managed in a traditional relational database management system (RDBMS). Therefore, data proliferation requires a rethinking of techniques for capturing, storing, and processing the data. This is the role big data has come to play. This paper, therefore, aims to increase the attention of organizations and researchers to the various applications and benefits of big data technology. The paper reviews and discusses the recent trends, opportunities, and pitfalls of big data and how it has enabled organizations to create successful business strategies and remain competitive, based on the available literature. Furthermore, the review presents the various applications of big data and business analytics, the data sources generated in these applications, and their key characteristics. Finally, the review not only outlines the challenges for successful implementation of big data projects but also highlights the current open research directions of big data analytics that require further consideration. The reviewed areas of big data suggest that good management and manipulation of large data sets using big data techniques and tools can deliver actionable insights that create business value. Full article

18 pages, 7491 KiB  
Article
Improvement of Kafka Streaming Using Partition and Multi-Threading in Big Data Environment
by Bunrong Leang, Sokchomrern Ean, Ga-Ae Ryu and Kwan-Hee Yoo
Sensors 2019, 19(1), 134; https://doi.org/10.3390/s19010134 - 2 Jan 2019
Cited by 16 | Viewed by 8682
Abstract
The large amount of programmable logic controller (PLC) sensing data has rapidly increased in the manufacturing environment. Therefore, a large data store is necessary for Big Data platforms. In this paper, we propose a Hadoop ecosystem for the support of many features in the manufacturing industry. In this ecosystem, Apache Hadoop and HBase are used as Big Data storage and handle large-scale data. In addition, Apache Kafka is used as a data streaming pipeline; it offers many configurations and properties, such as Kafka offsets and partitions, that can be used to build a well-designed, reliable system and to scale the programs. Moreover, Apache Spark works closely with Kafka consumers to create real-time processing and analysis of the data. Meanwhile, data security is applied in the data transmission phase between the Kafka producers and consumers. Public-key cryptography is performed as a security method involving public and private keys: the public key is held by the Kafka producer, and the private key is stored in the Kafka consumer. The integration of these technologies enhances the performance and accuracy of data storing, processing, and securing in the manufacturing environment. Full article
(This article belongs to the Special Issue Selected papers from Smart Data 2018 & Big Data Service 2018)
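
A minimal sketch of the producer-side security step described above: encrypt each PLC record with the consumer's public key before sending it to an explicit partition. The broker address, topic, and key handling are assumptions (the key pair is generated locally only so the demo is self-contained; in the described design the private key never leaves the consumer):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from kafka import KafkaProducer

private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()  # the producer would hold only this

producer = KafkaProducer(bootstrap_servers="broker:9092")
record = b"plc-42,temperature=71.3"
ciphertext = public_key.encrypt(
    record,
    padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                 algorithm=hashes.SHA256(), label=None),
)
# Route to an explicit partition; partitioning is what the paper scales with.
producer.send("plc-sensing", value=ciphertext, partition=0)
producer.flush()
```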

34 pages, 19994 KiB  
Article
A Magnetoencephalographic/Encephalographic (MEG/EEG) Brain-Computer Interface Driver for Interactive iOS Mobile Videogame Applications Utilizing the Hadoop Ecosystem, MongoDB, and Cassandra NoSQL Databases
by Wilbert McClay
Cited by 6 | Viewed by 7764
Abstract
In Phase I, we collected data on five subjects, yielding over 90% positive performance in magnetoencephalographic (MEG) mid- and post-movement activity. In addition, a driver was developed that substituted the actions of the brain-computer interface (BCI) for mouse button presses for real-time use in visual simulations. The process was interfaced to a flight visualization demonstration using left or right brainwave thought movement: the user experiences the aircraft turning in the chosen direction, either in the demonstration or in the iOS Mobile Warfighter videogame application. The BCI’s analytics of a subject’s MEG brain waves and flight visualization videogame performance were stored and analyzed using the Hadoop ecosystem as a quick-retrieval data warehouse. The Phase II portion of the project involves Emotiv electroencephalographic (EEG) wireless brain-computer interfaces (BCIs), which allow people to establish a novel communication channel between the human brain and a machine, in this case an iOS mobile application. The EEG BCI utilizes advanced and novel machine learning algorithms, as well as the Spark directed acyclic graph (DAG), the Cassandra NoSQL database environment, and the competing NoSQL MongoDB database, to house BCI analytics of subjects’ responses and users’ intent for both MEG and EEG brainwave signal acquisition. The wireless EEG signals acquired from OpenVibe and the Emotiv EPOC headset can be connected via Bluetooth to an iPhone using a thin-client architecture. NoSQL databases were chosen because of their schema-less architecture and MapReduce computational paradigm for housing a user’s brain signals from each referencing sensor. Thus, in the near future, if multiple users are playing over an online network connection and an MEG/EEG sensor fails, or if the connection between the smartphone and the web server is lost due to low battery power or failed data transmission, this will not nullify the NoSQL document-oriented (MongoDB) or column-oriented (Cassandra) databases. Additionally, NoSQL databases have fast querying and indexing methodologies, which are well suited to online game analytics and technology. In Phase II, we collected data on five MEG subjects, yielding over 90% positive performance in iOS mobile applications written in Objective-C and C++; however, on EEG signals from three subjects using the Emotiv wireless headsets and (n < 10) subjects from the OpenVibe EEG database, the variational Bayesian factor analysis (VBFA) algorithm yielded below 60% performance, and we are currently extending VBFA to the time-frequency domain (VBFA-TF) to enhance EEG performance in the near future. The novel usage of the NoSQL databases Cassandra and MongoDB was the primary enhancement of the Phase II BCI MEG/EEG brain signal data acquisition, queries, and rapid analytics, with MapReduce and Spark DAG demonstrating future implications for next-generation biometric MEG/EEG NoSQL databases. Full article
(This article belongs to the Section Neuro-psychiatric Disorders)
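
To illustrate the schema-less storage idea, a minimal pymongo sketch that stores one brainwave sample batch per document, keyed by subject and referencing sensor; the field names and connection string are assumptions, not the paper's schema:

```python
from datetime import datetime, timezone
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
samples = client.bci.eeg_samples

samples.insert_one({
    "subject": "S01",
    "sensor": "AF3",                     # one referencing sensor per document
    "captured_at": datetime.now(timezone.utc),
    "signal": [4211.5, 4209.8, 4214.2],  # illustrative microvolt readings
    "classifier": "VBFA",
})
# A failed sensor only stops new documents for that sensor; existing game
# analytics remain queryable, which is the fault tolerance the abstract credits.
print(samples.count_documents({"subject": "S01"}))
```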

20 pages, 966 KiB  
Article
Hadoop Oriented Smart Cities Architecture
by Vlad Diaconita, Ana-Ramona Bologa and Razvan Bologa
Sensors 2018, 18(4), 1181; https://doi.org/10.3390/s18041181 - 12 Apr 2018
Cited by 19 | Viewed by 8032
Abstract
A smart city implies a consistent use of technology for the benefit of the community. As the city develops over time, components and subsystems such as smart grids, smart water management, smart traffic and transportation systems, smart waste management systems, smart security systems, or e-governance are added. These components ingest and generate a multitude of structured, semi-structured or unstructured data that may be processed using a variety of algorithms in batches, micro batches or in real-time. The ICT architecture must be able to handle the increased storage and processing needs. When vertical scaling is no longer a viable solution, Hadoop can offer efficient linear horizontal scaling, solving storage, processing, and data analysis problems in many ways. This enables architects and developers to choose a stack according to their needs and skill levels. In this paper, we propose a Hadoop-based architectural stack that can provide the ICT backbone for efficiently managing a smart city. On the one hand, Hadoop, together with Spark and the plethora of NoSQL databases and accompanying Apache projects, is a mature ecosystem. This is one of the reasons why it is an attractive option for a smart city architecture. On the other hand, it is also very dynamic; things can change very quickly, and many new frameworks, products and options continue to emerge as others decline. To construct an optimized, modern architecture, we discuss and compare various products and engines based on a process that takes into consideration how the products perform and scale, as well as the reusability of the code, innovations, features, and support and interest in online communities. Full article
(This article belongs to the Section Sensor Networks)
