Apache Kafka is an open-source stream-processing platform developed by LinkedIn and donated to the Apache Software Foundation. It is written in Scala and Java. The project aims to provide a unified, high-throughput, low-latency platform for handling real-time data feeds.

Artificial intelligence (AI) is the ability of a computer program or a machine to think and learn. It is also a field of study that tries to make computers “smart”. As machines become increasingly capable, mental faculties once thought to require intelligence are removed from the definition.

Amazon Web Services (AWS) is a comprehensive, evolving cloud computing platform provided by Amazon. It provides a mix of infrastructure as a service (IaaS), platform as a service (PaaS) and packaged software as a service (SaaS) offerings.

Azure is Microsoft's public cloud computing platform. It offers Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) that can be used for services such as analytics, virtual computing, storage, and networking. It can be used to replace or supplement your on-premises servers.

Big Data is a phrase used to mean a massive volume of both structured and unstructured data that is so large it is difficult to process using traditional database and software techniques. In most enterprise scenarios the volume of data is too big, or it moves too fast, or it exceeds current processing capacity.

Blockchain is a chain of blocks that contain information. The data stored inside a block depends on the type of blockchain. For example, a Bitcoin block contains information about the sender, the receiver, and the number of bitcoins to be transferred.
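
To make the "chain of blocks" idea concrete, here is a minimal sketch in Python. It assumes, as is typical for blockchains, that each block stores a hash of the block before it; the field names (index, data, previous_hash) and the helper functions are hypothetical, and this is a simplified illustration, not Bitcoin's actual block format.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Return the SHA-256 hex digest of a block's contents."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_block(chain: list, data: dict) -> dict:
    """Append a block that records the hash of the block before it."""
    # The first block has no predecessor, so it uses a placeholder hash.
    previous_hash = block_hash(chain[-1]) if chain else "0" * 64
    block = {"index": len(chain), "data": data, "previous_hash": previous_hash}
    chain.append(block)
    return block


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
add_block(chain, {"sender": "alice", "receiver": "bob", "amount": 2})
add_block(chain, {"sender": "bob", "receiver": "carol", "amount": 1})
print(is_valid(chain))  # True

# Changing an earlier block breaks the link stored in the next block.
chain[0]["data"]["amount"] = 200
print(is_valid(chain))  # False
```

Because each block embeds the hash of its predecessor, editing any earlier block invalidates every link after it, which is what makes tampering detectable in this sketch.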

Data science is the study of data. It involves developing methods of recording, storing, and analysing data to effectively extract useful information. The goal of data science is to gain insights and knowledge from any type of data, both structured and unstructured.

DevOps (development and operations) is an enterprise software development phrase used to mean a type of agile relationship between development and IT operations. The goal of DevOps is to change and improve the relationship by advocating better communication and collaboration between these two business units.


Hadoop is a 100% open-source, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. It uses inexpensive, industry-standard servers to process and store the data. Its key features are cost effectiveness, scalability, parallel processing of distributed data, data locality optimization, automatic failover management, and support for large clusters of nodes.

Hadoop is an Apache open-source framework written in Java that allows distributed processing of large data sets across clusters of computers using simple programming models. A Hadoop application runs in an environment that provides distributed storage and computation across clusters of computers. Hadoop is designed to scale up from a single server to thousands of machines, each offering local computation and storage.
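
The best-known of these simple programming models is MapReduce. The sketch below imitates its three phases (map, shuffle, reduce) in plain Python on a single machine to count words. It does not use Hadoop itself, and the function names are hypothetical; on a real cluster the map and reduce steps would run in parallel on the nodes that hold the data.

```python
from collections import defaultdict


def map_phase(line: str):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


lines = ["big data needs big clusters", "clusters process data"]
print(word_count(lines))
# {'big': 2, 'data': 2, 'needs': 1, 'clusters': 2, 'process': 1}
```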

Informatica PowerCenter is a widely used extraction, transformation and loading (ETL) tool for building enterprise data warehouses. The components within Informatica PowerCenter aid in extracting data from its source, transforming it as per business requirements and loading it into a target data warehouse.
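
The extract, transform, load sequence can be sketched in a few lines of Python. This is a generic illustration using only the standard library, not Informatica PowerCenter or its API. The sample CSV data, the sales table, and the function names are all assumptions made for the example, with an in-memory SQLite database standing in for the target warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real pipeline would read a file or a database.
SOURCE_CSV = """customer,amount
alice, 10.50
bob, 4.25
alice, 3.00
"""


def extract(text):
    """Extract: read raw rows from the CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean up names and convert amounts to numbers."""
    return [(row["customer"].strip().title(), float(row["amount"])) for row in rows]


def load(records, connection):
    """Load: write the cleaned records into the target table."""
    connection.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", records)
    connection.commit()


connection = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), connection)

totals = connection.execute(
    "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('Alice', 13.5), ('Bob', 4.25)]
```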

The Internet of Things is simply a network of Internet-connected objects able to collect and exchange data. It is commonly abbreviated as IoT. The term Internet of Things has two main parts: Internet, the backbone of connectivity, and Things, meaning objects or devices.

