Use Cases for JBoss Data Grid & JBoss Data Virtualization

By Vijay Chintalapati
Use Cases
JDV as Data Source
JDG as Data Source
JDG as JDV Materialization Target

JDV as Data Source

Use Case

Enterprise data is spread over multiple data sources. The application needs rapid yet real-time access to this data, without having to worry about querying each source or transforming the results into a format it can consume (Java objects).

Solution

  • Use JBoss Data Virtualization (JDV) for solving the multi-datasource problem
    • JDV provides real-time access to the underlying data sources
    • JDV can expose a view that takes care of all the data transformation complexity that one would otherwise have to handle in application code
  • Use JBoss Data Grid (JDG) for Enterprise Caching needs
    • Using the JDG JPA Cache Store in embedded mode, one can use an existing relational table to load data into a cache and to store data from the cache back into the table
    • The JPA Cache Store hides the complexity of transforming the relational result set (rows) into Java objects and of the store/retrieve operations on the cache
  • Once a reasonable limit on staleness is identified, the cache can be configured to expire entries after a defined lifespan
  • One can benefit further by enabling Internal Materialization on the view table. The lifespan for expiring entries can then be set to a duration longer than the refresh interval of the internal materialization, by at least the time it takes to refresh the entire materialized table
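
As a rough illustration of the lifespan rule in the last bullet, the Java sketch below computes the smallest lifespan that satisfies it. The class and method names are hypothetical, and the refresh interval and full refresh time are values you would have to measure for your own materialized view.

```java
import java.time.Duration;

/** Hypothetical helper: derives a cache entry lifespan from materialization timings. */
final class LifespanPlanner {
    private LifespanPlanner() {}

    /**
     * Returns the refresh interval plus the time a full refresh of the
     * materialized table takes, i.e. the lower bound described above.
     */
    static Duration minimumLifespan(Duration refreshInterval, Duration fullRefreshTime) {
        if (refreshInterval.isNegative() || fullRefreshTime.isNegative()) {
            throw new IllegalArgumentException("durations must not be negative");
        }
        return refreshInterval.plus(fullRefreshTime);
    }
}
```

With a 10-minute refresh interval and a 2-minute full refresh, the lifespan should be at least 12 minutes. The result's toMillis() value is what one would typically pass to the cache's expiration lifespan setting.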

Caveats

  • The JPA Cache Store is available only in embedded mode; it is not yet available in client-server mode
  • The JPA Cache Store does not proactively detect or load changed data in the underlying table. Setting expiration on the cache entries and turning on internal materialization on the view table is therefore the best way to keep the data reasonably fresh
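
To make the second caveat concrete, here is a minimal, library-free Java sketch of why expiration bounds staleness. It is not the JDG API: the class is hypothetical, the loader function stands in for the JPA Cache Store reading from the view table, and the clock is injected so the behaviour can be exercised without waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Illustration only: a read-through cache whose entries expire after a fixed lifespan. */
final class ExpiringReadThroughCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // stand-in for the cache store
    private final long lifespanMillis;
    private final LongSupplier clock;      // supplies the current time in milliseconds

    ExpiringReadThroughCache(Function<K, V> loader, long lifespanMillis, LongSupplier clock) {
        this.loader = loader;
        this.lifespanMillis = lifespanMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it from the source once the lifespan has elapsed. */
    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> current = entries.get(key);
        if (current == null || now - current.loadedAtMillis >= lifespanMillis) {
            current = new Entry<>(loader.apply(key), now);
            entries.put(key, current);
        }
        return current.value;
    }
}
```

A change in the underlying source stays invisible until the entry's lifespan runs out, so the lifespan is effectively the upper bound on how stale a read can be.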

Sample Code

The following projects can be used to experiment with the use case above. The first is a prerequisite for the second. For simplicity's sake, only reads are tested in this setting; writes are an advanced topic and beyond the scope of a simple demo.

JDG as Data Source

Use Case

A cluster of JDG servers has been stood up, and one or more cache instances have been serving well-known entity information. There is now a need to analyze one or more attributes of the cached data. Knowing the keys up front for data retrieval is out of the question, and running such queries through the Java API is deemed prohibitive.

and/or

JDV is already in use to alleviate the disparate data problem, and there is now a need to work with the data cached in JDG.

Solution

Use the translator(s) built into JDV to connect to, query, and add to the data cached in JDG. There are translators for both embedded and remote caches. An embedded cache requires the DV runtime to host the cache within its own JVM, so such a solution is rarely feasible or scalable. Using JDV with remote caches, however, is quite practical.

Steps to connect JDV to remote caches

  1. Download the JDG EAP modules for remote clients
  2. Unzip the JDG EAP modules for remote clients into the root (EAP) folder of the JBoss Data Virtualization runtime
  3. Build and deploy, as EAP modules, the POJO artifacts that represent the objects stored in the data grid
  4. Add the Resource Adapter for remote JDG caches to the DV runtime configuration. This Resource Adapter holds the configuration needed to look up the right POJO library, the cache name, the IP address and port of at least one running JDG server, the Protobuf (.proto) file describing the stored entity, and so on.
  5. Add the infinispan-cache-dsl translator to the DV runtime configuration
  6. Register and initialize the .proto files and corresponding marshallers. This has to be done outside of JDV.
  7. Deploy a dynamic VDB that contains a model which references the Resource Adapter in Step 4
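
Once the VDB is deployed, an application can query the cached data with plain SQL over JDBC instead of looking entries up by key. The Java sketch below shows the general shape under stated assumptions: the VDB name, host, table name (Stock), and column names (symbol, price) are hypothetical, and the URL follows the Teiid JDBC form jdbc:teiid:<vdb>@mm://<host>:<port>.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example of querying JDG-backed data through a JDV virtual database. */
final class CachedEntityQuery {
    private CachedEntityQuery() {}

    /** Builds a Teiid JDBC URL of the form jdbc:teiid:<vdb>@mm://<host>:<port>. */
    static String teiidUrl(String vdbName, String host, int port) {
        return "jdbc:teiid:" + vdbName + "@mm://" + host + ":" + port;
    }

    /** Selects by attribute rather than by cache key; table and columns are assumptions. */
    static List<String> symbolsPricedAbove(Connection connection, double minPrice) throws SQLException {
        String sql = "SELECT symbol FROM Stock WHERE price > ?";
        List<String> symbols = new ArrayList<>();
        try (PreparedStatement statement = connection.prepareStatement(sql)) {
            statement.setDouble(1, minPrice);
            try (ResultSet rows = statement.executeQuery()) {
                while (rows.next()) {
                    symbols.add(rows.getString(1));
                }
            }
        }
        return symbols;
    }
}
```

With the Teiid JDBC driver on the classpath, the connection would be obtained through java.sql.DriverManager.getConnection(url, user, password), using a URL such as the one teiidUrl("StocksVDB", "localhost", 31000) produces.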

Work is currently underway to simplify the whole process, so that Teiid Designer automates the generation of the POJO artifacts and simplifies adding the Resource Adapter to the runtime server.

Sample Code

Following is the link to the JDV (Teiid) Quickstart one can use to test the use case.

JDG as JDV Materialization Target

Use Case

The enterprise architecture uses JDV to solve the multi-datasource problem.

  • There is a vast amount of data that could potentially be retrieved from a view table.
  • The data is mostly static.
  • Rapid access to the data is critical to the application

Solution

Use a JDG clustered (preferably distributed) cache as a materialization target. When using a cluster of JDG nodes in client-server mode:

  • The storage of materialized data is scalable as additional JDG server nodes can be added on demand
  • Querying and access to the data are fast

Support for JDG as a materialization target will be introduced in the JBoss Data Virtualization 6.3 release. More details will be provided when it is publicly available. In terms of setup, it is not too different from using JDG as a data source.

Important Considerations When Using JDG as a Source or Materialization Target

  1. Always enable indexing on the cache. The Infinispan Query DSL does not mandate indexing, but turning it on substantially improves search/query performance
  2. Enable batching when possible to improve write performance on the cache. Batching is turned on by default when JDG is used as a materialization target
  3. There is currently no transaction support for JDG in client-server mode, so JDG is treated as a non-XA resource
  4. A replicated cache in JDG would perform better than a distributed cache, but at the expense of scalability
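
The scalability trade-off in the last point can be sketched with some simple arithmetic. In replicated mode every node holds every entry, whereas in distributed mode each entry is kept on a fixed number of owner nodes. The Java estimator below is a hypothetical, back-of-the-envelope helper: it ignores per-entry overhead and uneven key distribution, and the parameter names are assumptions for illustration.

```java
/** Hypothetical estimator of how much data a cluster can hold in each clustering mode. */
final class CacheCapacity {
    private CacheCapacity() {}

    /** Replicated mode: every node holds every entry, so adding nodes adds no capacity. */
    static long replicated(int nodes, long bytesPerNode) {
        requirePositive(nodes, "nodes");
        return bytesPerNode;
    }

    /** Distributed mode: each entry lives on numOwners nodes, so capacity grows with the cluster. */
    static long distributed(int nodes, long bytesPerNode, int numOwners) {
        requirePositive(nodes, "nodes");
        requirePositive(numOwners, "numOwners");
        if (numOwners > nodes) {
            throw new IllegalArgumentException("numOwners must not exceed nodes");
        }
        return Math.multiplyExact((long) nodes, bytesPerNode) / numOwners;
    }

    private static void requirePositive(int value, String name) {
        if (value < 1) {
            throw new IllegalArgumentException(name + " must be at least 1");
        }
    }
}
```

For four nodes with 8 units of storage each, the replicated estimate stays at 8 units, while a distributed cache with two owners per entry holds roughly 16 units and keeps growing as nodes are added.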