Java offers a wide range of ready-to-use data structure implementations. The Collections framework is a great and powerful example.
These standard data structures are confined to a single JVM. They depend on the memory available on one server, so they don't scale under high load.
In-memory data grids (IMDGs) can help to solve this problem. They offer distributed versions of Java data structures, with data spread across multiple servers. Data grids provide failover features that prevent data loss when a server crashes, and you can simply scale them up and down. Let's go through the most popular Java-native IMDG implementations and compare the distributed data structures they provide.
Hands up, who never used the HashMap for caching?
Me too. It's OK when you know about its limitations and you accept them.
    Map<Long, User> cache = new HashMap<>();

    public User getUserById(Long id) {
        User user = cache.get(id);
        if (user == null) {
            user = dbTool.loadUser(id);
            cache.put(id, user);
        }
        return user;
    }
HashMap is not thread-safe. Within a single JVM, the thread-safe alternatives are Collections.synchronizedMap() and ConcurrentHashMap, which implements ConcurrentMap.
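The check-then-put sequence in getUserById is not atomic, so two threads asking for the same id can both miss and both hit the database. A minimal sketch of one way around that with ConcurrentHashMap follows. The ReadThroughCache class and its loader function are hypothetical names introduced for illustration; the loader stands in for dbTool.loadUser(id).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: a read-through cache backed by ConcurrentHashMap.
final class ReadThroughCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for dbTool.loadUser(id)

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent is atomic per key: the loader is applied at most once
        // for a missing key, even if several threads request it concurrently.
        // If the loader returns null, no mapping is recorded.
        return cache.computeIfAbsent(key, loader);
    }

    int size() {
        return cache.size();
    }
}

final class ReadThroughCacheDemo {
    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        ReadThroughCache<Long, String> users = new ReadThroughCache<>(id -> {
            loads.incrementAndGet();   // count how often the "database" is hit
            return "user-" + id;
        });
        users.get(42L);
        users.get(42L);                // served from the cache
        System.out.println("loads = " + loads.get()); // prints: loads = 1
    }
}
```

This still lives in one JVM and is bounded by its heap, which is exactly the limitation the data grids below address.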
In-memory data grid: a distributed system which holds data structures in RAM across multiple servers.
Distributed partitioned hash map with every cluster node owning a portion of the overall data -- Ignite
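To make the idea of "every node owning a portion of the data" concrete, here is a toy, single-process sketch in which each inner map plays the role of one cluster node. PartitionedMapSketch and its hash-modulo ownership rule are assumptions made for illustration only; this is not how Ignite or any other grid assigns keys, and real grids typically add a partition table, backups and rebalancing on top.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of a partitioned map; every inner map stands for one "node".
final class PartitionedMapSketch<K, V> {
    private final List<Map<K, V>> nodes = new ArrayList<>();

    PartitionedMapSketch(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Picks the owning node from the key's hash; floorMod keeps the index non-negative.
    int ownerOf(K key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    void put(K key, V value) {
        nodes.get(ownerOf(key)).put(key, value);
    }

    V get(K key) {
        return nodes.get(ownerOf(key)).get(key);
    }

    // Number of entries held by a single node.
    int entriesOn(int node) {
        return nodes.get(node).size();
    }
}

final class PartitionedMapDemo {
    public static void main(String[] args) {
        PartitionedMapSketch<Integer, String> map = new PartitionedMapSketch<>(3);
        for (int i = 0; i < 9; i++) {
            map.put(i, "value-" + i);
        }
        // Integer.hashCode() is the value itself, so key 4 lands on node 4 mod 3.
        System.out.println("owner of 4 = " + map.ownerOf(4)); // prints: owner of 4 = 1
    }
}
```

Both put and get route through the same ownerOf call, which is the essential property: any member can work out where a key lives without asking the others.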
There are 3 main reasons ....
The scale of latencies:
Why IMDG:
geographical backup
Massive heap
What are the topologies (deployment types) of IMDGs?
There are 2 basic deployment types.
We can also group the usage by the location where the data is actually stored.
    Map<String, Integer> cityInhabitants = new ConcurrentHashMap<>();
    cityInhabitants.put("Istanbul", 15_067_724);
    cityInhabitants.put("London", 9_126_366);
    cityInhabitants.put("Prague", 1_308_632);
    //...
    System.out.println("London population: " + cityInhabitants.get("London"));
    DefaultCacheManager manager = new DefaultCacheManager(
            GlobalConfigurationBuilder.defaultClusteredBuilder().build());
    Configuration configuration = new ConfigurationBuilder()
            .clustering()
            .cacheMode(CacheMode.DIST_SYNC)
            .build();
    manager.defineConfiguration("cityInhabitants", configuration);

    Map<String, Integer> cityInhabitants = manager.getCache("cityInhabitants");
    cityInhabitants.put("Istanbul", 15_067_724);
    cityInhabitants.put("London", 9_126_366);
    cityInhabitants.put("Prague", 1_308_632);
    //...
    System.out.println("London population: " + cityInhabitants.get("London"));
The configuration is enterprise-like, but I still remember configuring EJBs on JBoss AS in version 3, so this one is easy-peasy.
A CacheManager is the primary mechanism for retrieving a Cache instance and is often used as a starting point to using the Cache.
CacheManagers are heavyweight objects, and we foresee no more than one CacheManager being used per JVM (unless specific configuration requirements require more than one; but either way, this would be a minimal and finite number of instances).
    HazelcastInstance hz = Hazelcast.newHazelcastInstance();

    Map<String, Integer> cityInhabitants = hz.getMap("cityInhabitants");
    cityInhabitants.put("Istanbul", 15_067_724);
    cityInhabitants.put("London", 9_126_366);
    cityInhabitants.put("Prague", 1_308_632);
    //...
    System.out.println("London population: " + cityInhabitants.get("London"));
The configuration is more straightforward here.
Each vendor provides its own extension of the Map API. Check the specific return types of the methods used to retrieve the Map.
    Ignite ignite = Ignition.start();

    IgniteCache<String, Integer> cityInhabitants = ignite.getOrCreateCache("cityInhabitants");
    cityInhabitants.put("Istanbul", 15_067_724);
    cityInhabitants.put("London", 9_126_366);
    cityInhabitants.put("Prague", 1_308_632);
    //...
    System.out.println("London population: " + cityInhabitants.get("London"));
You may have noticed that I don't use java.util.Map as the cache type here.
IgniteCache doesn't implement java.util.Map!
IgniteCache implements javax.cache.Cache!
JCache uses the top-level package name javax.cache and defines five core interfaces: CachingProvider, CacheManager, Cache, Entry, and ExpiryPolicy.
    <dependency>
        <groupId>javax.cache</groupId>
        <artifactId>cache-api</artifactId>
        <version>${version.jcache}</version>
    </dependency>
There are two defined mechanisms by which an entry can be stored in a cache. The default mechanism is called store-by-value: the key-value pairs are copied when they are stored in the cache, and new copies of the entries are made and returned when they are accessed. The other (optional) mechanism is store-by-reference, in which the cache stores and returns references to the application-provided key-value pairs. This allows updates to the application-provided key-value pairs to be seen in subsequent accesses without having to update the cache entries themselves.
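The difference between the two mechanisms is easiest to see with a mutable value. The sketch below is a toy written against the JDK only; ToyCache, byValue and byReference are hypothetical names and not part of the javax.cache API. It also assumes the caller supplies a copy function, whereas a real JCache provider decides how copies are made (commonly through serialization).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Toy cache illustrating the two storage semantics; not the javax.cache API.
final class ToyCache<K, V> {
    private final Map<K, V> entries = new HashMap<>();
    private final UnaryOperator<V> copier; // identity function => store-by-reference

    private ToyCache(UnaryOperator<V> copier) {
        this.copier = copier;
    }

    static <K, V> ToyCache<K, V> byReference() {
        return new ToyCache<K, V>(v -> v);
    }

    static <K, V> ToyCache<K, V> byValue(UnaryOperator<V> copier) {
        return new ToyCache<K, V>(copier);
    }

    void put(K key, V value) {
        entries.put(key, copier.apply(value)); // copy on the way in
    }

    V get(K key) {
        V stored = entries.get(key);
        return stored == null ? null : copier.apply(stored); // copy on the way out
    }
}

final class ToyCacheDemo {
    public static void main(String[] args) {
        List<String> tags = new ArrayList<>();
        tags.add("a");

        ToyCache<String, List<String>> byRef = ToyCache.byReference();
        ToyCache<String, List<String>> byVal = ToyCache.byValue(ArrayList::new);
        byRef.put("k", tags);
        byVal.put("k", tags);

        tags.add("b"); // mutate the application's object after caching it

        System.out.println(byRef.get("k").size()); // prints: 2 (the cache sees the update)
        System.out.println(byVal.get("k").size()); // prints: 1 (the cache kept its own copy)
    }
}
```

In a distributed grid the entry usually has to leave the JVM anyway, which is one reason store-by-value is the default and store-by-reference is only optional.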
Just as you can create or get a distributed Map, other standard data structures are available in the distributed world.
In this table, the colors have the following meaning:
    docker run -it --rm apacheignite/ignite:2.7.6
    docker run -it --rm hazelcast/hazelcast:3.12.4
    # docker run -it --rm jboss/infinispan-server:10.0.1.Final

    # Can you find a difference? (wink, wink)
    docker images
    REPOSITORY                TAG            IMAGE ID       CREATED        SIZE
    apacheignite/ignite       2.7.6          ce9ff5b69430   2 months ago   527MB
    hazelcast/hazelcast       3.12.4         32c507fed571   5 weeks ago    116MB
    jboss/infinispan-server   10.0.1.Final   596b3626a09b   5 weeks ago    366MB
github.com/kwart
twitter.com/jckwart
javlog.cacek.cz