Why Apache Ignite?
Apache Ignite is an open source In-Memory Data Grid (IMDG), distributed database, caching, and high performance computing platform. It offers a bucketload of features and integrates well with other Apache frameworks such as Hadoop, Spark, and Cassandra.
So why do we need Apache Ignite? We need it for its high performance and scalability.
Of course, the phrase high performance might be very popular in our industry, but it's equally ambiguous. There's no established numerical threshold for when regular performance becomes high performance, just as there's no clear threshold for when data becomes Big Data, or when services become Microservices.
Fortunately, culture tends to generate its own barometers, and in computer science, the term high performance generally refers to the prowess possessed by supercomputers. Supercomputers are used to achieve high throughput using distributed parallel processing. They are mainly used for processing compute-intensive tasks such as weather forecasting, gene model analysis, big-bang simulations, and so on. High performance computing enables us to process huge chunks of data as quickly as possible.
Following the supercomputer analogy, we can stack up many virtual machines or workstations to form a grid and process a computationally intensive task. In traditional database-centric applications, however, parallel processing doesn't scale linearly, because every machine in the grid still waits on the same shared database. If we add 10 more machines to the grid, it will not process 10 times faster; the gain may be as little as 2-4%.
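One way to reason about this limit is Amdahl's law, a standard model of parallel speedup. It is brought in here only as an illustration and is not the source of the figures quoted above. The sketch below assumes that a fixed fraction of each task can run in parallel while the rest is serialized on the shared database. The class name, method name, and the sample fractions are all hypothetical.

```java
// A minimal sketch of Amdahl's law, assuming a fixed parallelizable fraction.
// ScalingSketch, speedup, and the sample fractions are illustrative only.
class ScalingSketch {

    /**
     * Estimated speedup over a single machine when `parallelFraction`
     * of the work (0.0 to 1.0) can be spread across `machines` machines
     * and the remainder must run serially.
     */
    public static double speedup(double parallelFraction, int machines) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0) {
            throw new IllegalArgumentException("parallelFraction must be in [0, 1]");
        }
        if (machines < 1) {
            throw new IllegalArgumentException("machines must be at least 1");
        }
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / machines);
    }

    public static void main(String[] args) {
        // Hypothetical workloads: mostly serialized on the database,
        // half parallel, and fully parallel.
        double[] fractions = {0.05, 0.50, 1.00};
        int machines = 11; // one original machine plus 10 added to the grid

        for (double p : fractions) {
            System.out.printf("parallel fraction %.2f -> %.2fx on %d machines%n",
                    p, speedup(p, machines), machines);
        }
    }
}
```

Under these assumptions, a workload that is only 5% parallelizable gains roughly 5% from ten extra machines, while a fully parallel workload runs 11 times faster. The closer an application gets to the second case, the closer it gets to linear scaling.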
Apache Ignite plays a key role here, offering a 20-30% performance improvement along with close to linear scaling. It keeps data in RAM for fast processing, so as you add more workstations to the grid, capacity and performance grow with them.
If you need to process records in a transactional manner and still want a 20-30% performance gain over a traditional database, Apache Ignite can offer you that improvement together with linear scalability and ACID-compliant transactions, with high availability and resiliency.
Apache Ignite can be used with various types of data sources, from high-volume financial services transaction data to streams of IoT sensor data. Ignite stores data in RAM for fast processing, but for resiliency you can also persist the data in a third-party data store or in the native Ignite persistence store. We will explore each of them later.
Ignite's APIs range from ANSI SQL queries, CRUD operations on caches, and ACID transactions to a compute and service grid, streaming, complex event processing, and machine learning.
NoSQL databases came into the picture to solve the RDBMS scalability bottleneck. They are typically eventually consistent, a trade-off described by the CAP theorem for distributed systems. They generally don't offer transactional consistency or relational SQL joins, but they scale out far more readily than an RDBMS. NewSQL is a newer class of databases that offers ACID-compliant distributed transactions and can still scale. Apache Ignite can be termed a NewSQL database.