What is DataStax Enterprise 5.0? (Database Built on Apache Cassandra™)


Oh, huh! It’s my Cassandra 3. It’s a great day. Here at DataStax, we’ve just released
DataStax Enterprise 5, our most important software release yet. It’s packed with a lot of great things,
including Cassandra 3. Cassandra 3 gives you materialized views which promise to revolutionize
the way you model data. It’s also got a completely
revamped storage engine, which has been rebuilt to take account
of the changes in data modeling that have happened
over the past few years. There’s lots more in the box too.

A wise man once said there are
no solutions, only trade-offs. Well, it’s the same thing with storage
in your DataStax Enterprise cluster. Sure, you want everything to be
on the fastest disks that money can buy, but the problem with that
is money has to buy them. And you’ve probably got
a colder set of data that isn’t accessed quite as often, that you might want to put
on lower cost, slower storage. And then you’ve got your hot data. It’s worth it to spend money
to put it on disks as fast as possible. With tiered storage, DataStax Enterprise
will automatically allocate your hot data
to faster storage, like this, and your colder data
to slower storage, like that. A trade-off that will
get you the most bang for your infrastructure buck.

Apache Cassandra has always
had support for multiple data centers. You write data to one data center, it’s automatically replicated
to the others; it’s a killer feature. But what if parts of your system
are in out-of-the-way places, maybe with intermittent
network access? You don’t need multiple data centers,
you need multiple databases. With advanced replication, you can create
entirely separate Cassandra clusters, which act as spokes on a central hub. Writes to tables in those spokes
are replicated to the hub, just as you’d expect. And sure, no network is perfect, but with advanced replication,
your remote clusters don’t have to have
continuous connectivity to the hub. It’s designed from the ground up to support intermittent connections
with limited bandwidth. Writes in selected tables
on remote clusters are replicated into the hub cluster, where you can read them with normal
Cassandra transactional latency, giving you, in effect, a federation
of autonomous databases located wherever you need
to manage your data, with fine-grained control of the flow
of that data from the edge into the hub.

DataStax Enterprise famously runs well
on commodity hardware, but the definition of commodity
hardware is a moving target. Memory is getting cheaper,
storage is getting denser, cores are getting
more plentiful by the year. And maybe there’s a particular kind of machine
that IT wants you to provision to, but a single instance
of DataStax Enterprise might be a little lonely
on that overpowered hardware. You want to utilize
your resources efficiently, but you also want to operate
in the price/performance sweet spot of the contemporary hardware market. With the new multi-instance feature,
you can provision several nodes of DataStax Enterprise on the same server. This lets you get the deal
on hardware you want and keep that growing
hardware capacity fully utilized, without paying the performance
overhead of virtualization.

Because DataStax Enterprise
is built on Apache Cassandra, it uses Cassandra’s tabular data model. It’s really easy for developers to learn
and it scales like crazy, but it can have trouble representing
arbitrary relationships between things. This is where Graph comes in. Now, graph databases are nothing new, but they are notoriously
difficult to scale. By building Graph on top of Cassandra, we give you a rich,
expressive way to model your data that succeeds where Cassandra does: where your data is huge and you’ve got a really high-volume
transactional workload. And Graph integrates automatically with all of the features
of DataStax Enterprise. You’ve got analytics, search, security,
management through OpsCenter; it’s all there. Even better, Graph is built
on open source foundations, like the Gremlin graph query language
and the TinkerPop API. Graph is a game changer.

Monitoring production infrastructure, whether it’s your own hardware
or in the cloud, isn’t easy. There are plenty
of standard tools for the basics, but for a complex database
like DataStax Enterprise, you want an integrated purpose-built tool. You don’t want to cobble together
a solution out of Perl and baling wire. OpsCenter 6 is a single,
integrated management tool, specifically designed
for DataStax Enterprise. It’s fully integrated
with the latest release, including monitoring
and configuration support for Graph. It’s got scheduled backups and repairs,
token rebalancing, alerts, and it still has
great cluster visualization. And it’s certified to run on clusters
of up to a thousand nodes. Plus, its new lifecycle management
features provide pretty robust automation for things like deploying new clusters, expanding existing clusters,
and centralizing configuration data. Altogether, it really does make administrative life
with DataStax Enterprise simpler.

Cassandra 3, tiered storage,
advanced replication, multi-instance deployments,
a new OpsCenter, and the game changer of Graph, all make it clear that this is a release
of DataStax Enterprise you should get to know. DataStax Academy will be here to help
when you want to learn more.
