Posts Tagged 'doradus'

O’Reilly Webcast – Extending Cassandra for OLAP

Colleague Randy Guck, who leads our open source Doradus project, recently gave an O’Reilly Webcast on the project and on using Doradus to extend Cassandra for high-performance analytics.

The discussion of how Doradus leverages Cassandra, its data model and query language, its internal architecture, and the concept of storage services gave the in-depth background needed to understand the Doradus OLAP service and how it provides near real-time data warehousing.

Randy’s slides and webcast can be found here. Registration is required, but it is well worth the effort. The webcast was sponsored by Dell, which was entirely coincidental, since it was for a Hadoop services offering. Doradus offers some interesting ways to extend and use Cassandra, and Randy covers most of them in the webcast. The key point is that Doradus is an open source project: both its use and its source code are free. Details on Doradus are in this blog entry.

Open Source @ Dell – Doradus

I’m delighted to announce that last week Dell Software Group made available its first major open source project, Doradus.

Doradus is the next and biggest release so far from the software group at Dell and it joins Blockade, discussed in this blog. Through 2014, I hope to be in a position to announce at least a couple more big projects, and numerous smaller ones. We are pulling together a coherent approach to this, as well as a number of smaller tools.

What is Doradus?

Doradus is a set of tooling that started out ~2.5 years ago and has been, and still is, used by a number of our Dell software products. It has not been available as a product itself. Doradus provides a REST API on top of the Cassandra NoSQL database, adding a number of high-level features. As a pure Java service, it simplifies and extends NoSQL database functionality with a graph-based data model that has bi-directional relationships and full referential integrity.
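To make "bi-directional relationships with full referential integrity" a little more concrete, here is a minimal in-memory sketch of the idea in plain Java. It is not Doradus code and does not use the Doradus API: the class LinkSketch and all of its method names are hypothetical, invented purely for illustration. It assumes a single link type with one inverse, and it shows only the two properties named above: every link is recorded in both directions, and deleting an object removes all links that point at it.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: none of these names come from Doradus.
class LinkSketch {
    // Forward direction of the link: object ID -> IDs it points to.
    private final Map<String, Set<String>> forward = new HashMap<>();
    // Inverse direction of the same link: object ID -> IDs pointing at it.
    private final Map<String, Set<String>> inverse = new HashMap<>();

    public void addObject(String id) {
        forward.putIfAbsent(id, new HashSet<>());
        inverse.putIfAbsent(id, new HashSet<>());
    }

    // Adding a link always updates both directions together.
    public void link(String from, String to) {
        if (!forward.containsKey(from) || !forward.containsKey(to)) {
            throw new IllegalArgumentException("both objects must exist");
        }
        forward.get(from).add(to);
        inverse.get(to).add(from);
    }

    // Deleting an object also removes every link that refers to it,
    // so no dangling references are left behind.
    public void deleteObject(String id) {
        Set<String> outgoing = forward.remove(id);
        Set<String> incoming = inverse.remove(id);
        if (outgoing == null || incoming == null) {
            return; // unknown object, nothing to clean up
        }
        for (String target : outgoing) {
            Set<String> back = inverse.get(target);
            if (back != null) back.remove(id);
        }
        for (String source : incoming) {
            Set<String> out = forward.get(source);
            if (out != null) out.remove(id);
        }
    }

    public Set<String> forwardOf(String id) {
        Set<String> s = forward.get(id);
        return s == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(s);
    }

    public Set<String> inverseOf(String id) {
        Set<String> s = inverse.get(id);
        return s == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(s);
    }

    public static void main(String[] args) {
        LinkSketch g = new LinkSketch();
        g.addObject("alice");
        g.addObject("bob");
        g.link("bob", "alice");                   // e.g. bob's manager is alice
        System.out.println(g.inverseOf("alice")); // the inverse side sees bob
        g.deleteObject("alice");
        System.out.println(g.forwardOf("bob"));   // the link from bob is gone too
    }
}
```

A real service would have to keep both directions consistent across persistent storage rather than in two in-memory maps; the sketch only shows the invariant being maintained.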

Included are a powerful query language supporting full-text and statistical queries; automatic data aging; and two storage services that target specific application types. An OLAP service provides ultra-dense storage and fast analytic queries. There is a client library that allows Java clients to use POJOs to access Doradus databases. Doradus scales horizontally with Cassandra to provide the NoSQL benefits of elasticity, replication, fault tolerance, low cost, etc.

What was open-sourced?

The Doradus components included in the OSS offering are:

  • doradus-server: Source code and config files for the server.
  • doradus-client: Source code and config files for the Java client library.
  • doradus-common: Source code for doradus-common.jar, used by both the client and server modules.
  • docs: PDF versions of the main Doradus documentation. The build scripts in the root directory also build Java docs for the client library in the folder ./doradus-client/docs.

These components are released under the Apache License 2.0. We are currently working through the legal issues around enhancements and contributions, and will add an Apache-based CLA to encourage larger contributions. In the interim, we are happy to accept bug fixes for inclusion in the next revision of the code base. We are also looking to add the regression test suite that we use to check build integrity in continuous integration.

Where can I get it?

Doradus source code, documentation, and build scripts are available here: https://github.com/dell-oss/Doradus . You can use any Git client to download the files, or click the Download ZIP button to get everything as one .zip file. The root directory has both Ant and Maven build scripts, which download dependent jar files and build the binaries. In the near future, we will post pre-built source code, doc, and binary bundles on Maven Central to simplify downloading and installing.

What is dell-oss?

One of the things we’ll be doing this year is pulling together our open source projects and contributions, to make them easier to find and to simplify things for the Dell teams that will be contributing OSS projects. Personally, I’d also like to include a section where we store copies of our incoming and outgoing licenses, templates, and completed licenses. At least for now, we’ll be doing that through dell-oss, with Ant and Maven as needed. More detail on this when we make our next project announcement.

Congratulations to Randy Guck and James Bumgardner, who made the OSS effort happen, and also to the other Doradus developers.


About & Contact

I'm Mark Cathcart, formerly a Senior Distinguished Engineer in Dell's Software Group; before that, Director of Systems Engineering in the Enterprise Solutions Group at Dell. Prior to that, I was an IBM Distinguished Engineer and a member of the IBM Academy of Technology. I'm an information technology optimist.


I was a member of the Linux Foundation Core Infrastructure Initiative Steering committee. Read more about it here.
