Eric Evans | 25 Aug 02:22 2011

CQL Drivers

There's been discussion happening in #2761
(https://issues.apache.org/jira/browse/CASSANDRA-2761) on and off now
for more than 3 months, and I think it could benefit from some wider
exposure.

The issue was created in the wake of the driver move from
asf/cassandra/trunk/drivers to asf/cassandra/drivers and the original
scope was to create a working build for the JDBC driver post-move (at
the time it had no build of its own).  That work has since been
completed, but it was left open to include some related items; in
hindsight it should have been closed and other issues opened as
needed.

The remainder of the discussion that's taken place in CASSANDRA-2761
revolves around moving the driver code back under Cassandra's tree.

I don't want this to happen because, as I've mentioned elsewhere,
drivers are supposed to be coded to a specification, not a Cassandra
version; any given driver release is expected to work with any version
of Cassandra that uses a CQL version >= that of the driver.  As
such there is a need to release them independently, with their own
versions, on a more or less frequent basis than Cassandra does.  To
this point I think there is agreement.
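
To make the compatibility rule concrete, here is a minimal sketch in
Python.  It assumes CQL versions are plain dotted integers such as
"2.0.0"; the function names are made up for illustration and are not
part of any driver.

```python
def parse_version(text):
    """Turn a dotted version string such as '2.0.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def driver_works_with(driver_cql_version, server_cql_version):
    """Return True when the server's CQL version is >= the driver's.

    Both names are hypothetical; the rule itself is the one stated above.
    """
    return parse_version(server_cql_version) >= parse_version(driver_cql_version)
```

Comparing tuples of ints rather than raw strings keeps "1.10.0" ordered
after "1.9.0", which a plain string comparison would get wrong.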

Where we disagree I guess is in how to accomplish this.

Moving the driver has made things less convenient in part because in
its current location it isn't mirrored by git.apache.org, and in part
because it's quite obviously less convenient than having everything
all in one monolithic tree.  Most often cited for the latter is #3010

Gary Dusbabek | 25 Aug 05:09 2011

Re: CQL Drivers

On Wed, Aug 24, 2011 at 17:22, Eric Evans <eevans <at> acunu.com> wrote:
> There's been discussion happening in #2761
> (https://issues.apache.org/jira/browse/CASSANDRA-2761) on and off now
> for more than 3 months, and I think it could benefit from some wider
> exposure.
>
> The issue was created in the wake of the driver move from
> asf/cassandra/trunk/drivers to asf/cassandra/drivers and the original
> scope was to create a working build for the JDBC driver post-move (at
> the time it had no build of its own).  That work has since been
> completed, but it was left open to include some related items; in
> hindsight it should have been closed and other issues opened as
> needed.
>
> The remainder of the discussion that's taken place in CASSANDRA-2761
> revolves around moving the driver code back under Cassandra's tree.
>
> I don't want this to happen because, as I've mentioned elsewhere,
> drivers are supposed to be coded to a specification, not a Cassandra
> version; any given driver release is expected to work with any version
> of Cassandra that uses a CQL version >= that of the driver.  As
> such there is a need to release them independently, with their own
> versions, on a more or less frequent basis than Cassandra does.  To
> this point I think there is agreement.
>
> Where we disagree I guess is in how to accomplish this.
>
> Moving the driver has made things less convenient in part because in
> its current location it isn't mirrored by git.apache.org, and in part
> because it's quite obviously less convenient than having everything

Eric Evans | 25 Aug 05:27 2011

Re: CQL Drivers

On Wed, Aug 24, 2011 at 10:09 PM, Gary Dusbabek <gdusbabek <at> gmail.com> wrote:
> On Wed, Aug 24, 2011 at 17:22, Eric Evans <eevans <at> acunu.com> wrote:
>> There are some workarounds that have been proposed for moving the
>> drivers back under Cassandra's source tree while creating independent
>> releases from there.  For example, keeping them only in trunk/ and
>> deleting drivers/ in new branches (which doesn't solve the case for
>> #3010).   In my opinion, these are half-measures that fail to create
>> the needed separation while making the release process brittle, or
>> complicated, or both, and generate confusion (which incidentally is
>> exactly why they were moved in the first place).
>
> Some apache projects organize their code into multiple trunks, and
> that sounds like it would help us out here.

If you mean what Stephen describes here,
http://article.gmane.org/gmane.comp.db.cassandra.devel/3974, then I'd
be down for it.  Seems like it would be pretty disruptive though.

-- 
Eric Evans
Acunu | http://www.acunu.com |  <at> acunu

Jonathan Ellis | 25 Aug 06:57 2011

Re: CQL Drivers

So to summarize the problems with the original move,

1) the git problem

2) the JDBC build problems

3) the cqlsh problem

In reverse order:

The cqlsh problem is not the same as the JDBC problem.  The cqlsh
problem is simply, "if we have a cql shell that we ship, it would be
convenient to have the driver there in the tree already."  In other
words, we have the same problem whether we have a Python cqlsh or a
Java one.  3010 is a red herring.

The JDBC problems can be subdivided into two categories: too-tight
coupling that having it in-tree masks (but is really a problem either
way), and java build systems being a PITA.  By the second part I mean,
yes, we had JDBC building after the first month or so, but it still
required that the main Cassandra source tree be checked out and built
locally, with a cumbersome manual process to point the driver build to
it.

The build headache is actually a symptom of the coupling.  None of the
other drivers have this problem; they were forced to do things
"right"* instead of re-using things that shouldn't be re-used.  If we
were going to take another stab at it we should fix this first.  (More
accurately, we should fix it anyway whether we move drivers or not.)


Vivek Mishra | 25 Aug 14:07 2011

Connection Pooling

Is there any plan for connection pooling? Is it already in place?

-Vivek

Nick Telford | 25 Aug 15:53 2011

Re: Connection Pooling

Connection pooling is a client concern. Are you referring to a particular
CQL driver or high-level client? If so, which one?

On 25 August 2011 13:07, Vivek Mishra <vivek.mishra <at> yahoo.com> wrote:

> Is there any plan for connection pooling? Is it already in place?
>
> -Vivek
>

Eric Evans | 25 Aug 20:49 2011

Re: CQL Drivers

On Wed, Aug 24, 2011 at 11:57 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
> The JDBC problems can be subdivided into two categories: too-tight
> coupling that having it in-tree masks (but is really a problem either
> way), and java build systems being a PITA.  By the second part I mean,
> yes, we had JDBC building after the first month or so, but it still
> required that the main Cassandra source tree be checked out and built
> locally, with a cumbersome manual process to point the driver build to
> it.
>
> The build headache is actually a symptom of the coupling.  None of the
> other drivers have this problem; they were forced to do things
> "right"* instead of re-using things that shouldn't be re-used.  If we
> were going to take another stab at it we should fix this first.  (More
> accurately, we should fix it anyway whether we move drivers or not.)

The other drivers never had any choice but to create their own
implementations of term encoding/decoding, which you could probably
use as an argument for us inflicting this whole mess on ourselves with
the JDBC driver.  That said, I agree that the Right solution is still
to fix the tight coupling and reuse.

> The git mirror is also a symptom of a deeper problem.  Managing the
> drivers from the same Jira system as core is awkward too.  Nor does
> three-day release voting or patch-oriented development feel like a
> good fit for CQL drivers.

I emphatically agree.

> If we're going to move the drivers out-of-tree, why not move them all
> the way to github?  We'll still be able to link "official" drivers

Eric Evans | 30 Aug 05:16 2011

Re: CQL Drivers

On Thu, Aug 25, 2011 at 1:49 PM, Eric Evans <eevans <at> acunu.com> wrote:
> On Wed, Aug 24, 2011 at 11:57 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
>> The git mirror is also a symptom of a deeper problem.  Managing the
>> drivers from the same Jira system as core is awkward too.  Nor does
>> three-day release voting or patch-oriented development feel like a
>> good fit for CQL drivers.
>
> I emphatically agree.
>
>> If we're going to move the drivers out-of-tree, why not move them all
>> the way to github?  We'll still be able to link "official" drivers
>> from cassandra.apache.org, so I'm not worried about the kind of
>> fragmentation we have with Thrift clients today.  But if we want a
>> little more official-ness, we could use Apache Extras on google code
>> instead.  Which IMO has better bug tracker and code review systems,
>> but I don't really have strong feelings either way.
>>
>> So, of the problems with the original move, the cqlsh "problem" is the
>> only one that by definition can't be solved if we move the drivers out
>> of tree.  I'm not enthusiastic about inflicting that on ourselves in
>> exchange for problems with the git mirror.  But in exchange for a
>> clean separation as a separate project?  That makes more sense to
>> me.**
>
> Setting aside my shock from a suggestion that is 1000 miles in the
> opposite direction, I love it.

No one else has sounded off on this; does that mean it's safe to
assume there is consensus?


Robert Jackson | 30 Aug 12:34 2011

Re: CQL Drivers


On Aug 29, 2011, at 11:17 PM, Eric Evans <eevans <at> acunu.com> wrote:
> No one else has sounded off on this, does that mean it's safe to
> assume there is consensus on this?
> 

I definitely think this is the right move. 

> If so, is it Apache Extras or Github (either would be fine by me).
> 
Either would be good, but I have a preference for GitHub (easier workflow). 

Robert Jackson

Eric Evans | 30 Aug 16:04 2011

Re: CQL Drivers

On Tue, Aug 30, 2011 at 5:34 AM, Robert Jackson
<robertj <at> promedicalinc.com> wrote:
> On Aug 29, 2011, at 11:17 PM, Eric Evans <eevans <at> acunu.com> wrote:
>> If so, is it Apache Extras or Github (either would be fine by me).
>>
> Either would be good, but I have a preference for GitHub (easier workflow).

Generally I prefer Github too, but the branding is better on Apache
Extras.  I almost want to say that Google's issue tracker is better
too, but maybe not.

I guess if we're fair about it, the ability to choose between Git,
Mercurial, and Subversion is probably a "feature" as well. :)

-- 
Eric Evans
Acunu | http://www.acunu.com |  <at> acunu

Jake Luciani | 30 Aug 16:10 2011

Re: CQL Drivers

I agree that Apache Extras makes better sense since it's Branded (tm) and
has git.

On Tue, Aug 30, 2011 at 10:04 AM, Eric Evans <eevans <at> acunu.com> wrote:

> On Tue, Aug 30, 2011 at 5:34 AM, Robert Jackson
> <robertj <at> promedicalinc.com> wrote:
> > On Aug 29, 2011, at 11:17 PM, Eric Evans <eevans <at> acunu.com> wrote:
> >> If so, is it Apache Extras or Github (either would be fine by me).
> >>
> > Either would be good, but I have a preference for GitHub (easier
> workflow).
>
> Generally I prefer Github too, but the branding is better on Apache
> Extras.  I almost want to say that Google's issue tracker is better
> too, but maybe not.
>
> I guess if we're fair about it, the ability to choose between Git,
> Mercurial, and Subversion is probably a "feature" as well. :)
>
> --
> Eric Evans
> Acunu | http://www.acunu.com |  <at> acunu
>

-- 
http://twitter.com/tjake

Jonathan Ellis | 30 Aug 15:26 2011

Re: CQL Drivers

On Mon, Aug 29, 2011 at 10:16 PM, Eric Evans <eevans <at> acunu.com> wrote:
> No one else has sounded off on this, does that mean it's safe to
> assume there is consensus on this?

Looks like it.  The opinions on irc were positive, too.

> If so, is it Apache Extras or Github (either would be fine by me).

No strong feelings here.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Eric Evans | 31 Aug 23:24 2011

Re: CQL Drivers

On Tue, Aug 30, 2011 at 8:26 AM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
> On Mon, Aug 29, 2011 at 10:16 PM, Eric Evans <eevans <at> acunu.com> wrote:
>> No one else has sounded off on this, does that mean it's safe to
>> assume there is consensus on this?
>
> Looks like it.  The opinions on irc were positive, too.
>
>> If so, is it Apache Extras or Github (either would be fine by me).
>
> No strong feelings here.

CASSANDRA-2936 is in progress (patches attached), but is there any
reason not to get started with the Python driver now?

It'd be nice to have this all in place for 1.0.  We also seem to be
getting somewhat close (?) to having drivers for PHP and Ruby, and
it'd be nice if we didn't have to double-handle them.

-- 
Eric Evans
Acunu | http://www.acunu.com |  <at> acunu

Jonathan Ellis | 1 Sep 05:58 2011

Re: CQL Drivers

On Wed, Aug 31, 2011 at 4:24 PM, Eric Evans <eevans <at> acunu.com> wrote:
> CASSANDRA-2936 is in progress (patches attached), but is there any
> reason not to get started with the Python driver now?

Heads up that test/system/test_cql.py depends on the Python driver.
It should probably be moved to the Python driver's test suite.  (Which
then needs the same kind of "start a Cassandra server" ability that
the Cassandra system tests have.)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Eric Evans | 1 Sep 16:08 2011

Re: CQL Drivers

On Wed, Aug 31, 2011 at 10:58 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
> On Wed, Aug 31, 2011 at 4:24 PM, Eric Evans <eevans <at> acunu.com> wrote:
>> CASSANDRA-2936 is in progress (patches attached), but is there any
>> reason not to get started with the Python driver now?
>
> Heads up that test/system/test_cql.py depends on the Python driver.
> It should probably be moved to the Python driver's test suite.  (Which
> then needs the same kind of "start a Cassandra server" ability that
> the Cassandra system tests have.)

I posed a similar question about the JDBC driver in CASSANDRA-2936.

Should these tests be considered functional tests of Cassandra, and
be left where they are?  I know that was my intention WRT
test_cql.py (the driver itself has a few tests of its own that do not
require a server).

I guess I don't have a strong opinion either way, though it seems
easier to say that Cassandra's tests will require you to have a driver
installed, versus having to configure the build/test of a driver to
point to a Cassandra tree.

-- 
Eric Evans
Acunu | http://www.acunu.com |  <at> acunu

Jonathan Ellis | 2 Sep 22:49 2011

Re: CQL Drivers

On Thu, Sep 1, 2011 at 9:08 AM, Eric Evans <eevans <at> acunu.com> wrote:
> I posed a similar question about the JDBC driver in CASSANDRA-2936.
>
> Should these tests be considered functional tests of Cassandra, and
> be left where they are?  I know that was my intention WRT
> test_cql.py (the driver itself has a few tests of its own that do not
> require a server).
>
> I guess I don't have a strong opinion either way, though it seems
> easier to say that Cassandra's tests will require you to have a driver
> installed, versus having to configure the build/test of a driver to
> point to a Cassandra tree.

I'd lean towards "the tests should be in the client trees."  It feels
odd to move the drivers out, but leave their test suites in core.
(Keeping in mind that, as you pointed out, we have two more drivers
almost done.)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Eric Evans | 2 Sep 22:59 2011

Re: CQL Drivers

On Fri, Sep 2, 2011 at 3:49 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
> On Thu, Sep 1, 2011 at 9:08 AM, Eric Evans <eevans <at> acunu.com> wrote:
>> I posed a similar question about the JDBC driver in CASSANDRA-2936.
>>
>> Should these tests be considered functional tests of Cassandra, and
>> be left where they are?  I know that was my intention WRT
>> test_cql.py (the driver itself has a few tests of its own that do not
>> require a server).
>>
>> I guess I don't have a strong opinion either way, though it seems
>> easier to say that Cassandra's tests will require you to have a driver
>> installed, versus having to configure the build/test of a driver to
>> point to a Cassandra tree.
>
> I'd lean towards "the tests should be in the client trees."  It feels
> odd to move the drivers out, but leave their test suites in core.
> (Keeping in mind that, as you pointed out, we have two more drivers
> almost done.)

Referring to the mail I sent before seeing this... does that mean
option #1 or #3: require a Cassandra tree, or point to a running
instance?

-- 
Eric Evans
Acunu | http://www.acunu.com |  <at> acunu

Eric Evans | 2 Sep 22:52 2011

Re: CQL Drivers

On Thu, Sep 1, 2011 at 9:08 AM, Eric Evans <eevans <at> acunu.com> wrote:
> On Wed, Aug 31, 2011 at 10:58 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
>> On Wed, Aug 31, 2011 at 4:24 PM, Eric Evans <eevans <at> acunu.com> wrote:
>>> CASSANDRA-2936 is in progress (patches attached), but is there any
>>> reason not to get started with the Python driver now?
>>
>> Heads up that test/system/test_cql.py depends on the Python driver.
>> It should probably be moved to the Python driver's test suite.  (Which
>> then needs the same kind of "start a Cassandra server" ability that
>> the Cassandra system tests have.)
>
> I posed a similar question about the JDBC driver in CASSANDRA-2936.
>
> Should these tests be considered functional tests of Cassandra, and
> be left where they are?  I know that was my intention WRT
> test_cql.py (the driver itself has a few tests of its own that do not
> require a server).
>
> I guess I don't have a strong opinion either way, though it seems
> easier to say that Cassandra's tests will require you to have a driver
> installed, versus having to configure the build/test of a driver to
> point to a Cassandra tree.

No opinions on this?  To summarize, the options as I see them are:

1. Keep the tests that query Cassandra, as-is, with the drivers they use.
2. Leave the tests that query Cassandra with Cassandra, (i.e. consider
them Cassandra tests).
3. Keep the tests that query Cassandra, but alter them to be
idempotent and to connect to an existent node, instead of spinning

Rick Shaw | 4 Sep 04:36 2011

Re: CQL Drivers

For what it is worth, my preference would be to have unit tests that
would form a regression testing package in the tree with the client
sources. Ideally the build package (whether dedicated or mixed in with
the server) would have specific tasks to build, test and install/deploy
devoted to the individual client in question.

I suggest that the client should not need to rely on anything but JAR
files and local configuration to do the build and test. If they needed
to be coordinated with a server build then perhaps some utility tasks to
copy current jars from the server build site to the client build site
would be useful. Config files local to the client build/test would point
to test C* configs as appropriate. I think that makes me favor option #3.

On Sep 2, 2011, at 4:52 PM, Eric Evans wrote:

> On Thu, Sep 1, 2011 at 9:08 AM, Eric Evans <eevans <at> acunu.com> wrote:
>> On Wed, Aug 31, 2011 at 10:58 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
>>> On Wed, Aug 31, 2011 at 4:24 PM, Eric Evans <eevans <at> acunu.com> wrote:
>>>> CASSANDRA-2936 is in progress (patches attached), but is there any
>>>> reason not to get started with the Python driver now?
>>> 
>>> Heads up that test/system/test_cql.py depends on the Python driver.
>>> It should probably be moved to the Python driver's test suite.  (Which
>>> then needs the same kind of "start a Cassandra server" ability that
>>> the Cassandra system tests have.)
>> 
>> I posed a similar question about the JDBC driver in CASSANDRA-2936.
>> 
>> Should these tests be considered functional tests of Cassandra, and
>> be left where they are?  I know that was my intention WRT
>> test_cql.py (the driver itself has a few tests of its own that do not
>> require a server).
>> 
>> I guess I don't have a strong opinion either way, though it seems

Jonathan Ellis | 4 Sep 05:21 2011

Re: CQL Drivers

On Sat, Sep 3, 2011 at 9:36 PM, Rick Shaw <wfshaw <at> gmail.com> wrote:
> For what it is worth, my preference would be to have unit tests that
> would form a regression testing package in the tree with the client
> sources.

Ditto.

> I think that makes me favor option #3.

I'm rather fond of how user-friendly the Python suite is (taking care
of server setup/teardown transparently) but realistically, now that we
have robust truncate, it's probably fine to require an existing server
and just use that.
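
As a rough sketch of what "require an existing server" could look like
in a test's setup, assuming only a DB-API-style cursor with an
execute() method: the helper name and the column family names below
are hypothetical, and nothing here is taken from the current suite.

```python
def reset_column_families(cursor, column_families):
    """Empty each column family so a test run starts from a known state.

    `cursor` is assumed to be any object with an execute(query) method.
    The names are interpolated directly into the statement, so they must
    be trusted constants from the test suite, never user input.
    """
    for name in column_families:
        cursor.execute("TRUNCATE %s" % name)
```

A test suite would call this before each test, e.g.
reset_column_families(cursor, ["Standard1", "Indexed1"]), rather than
starting and stopping a server of its own.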

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Eric Evans | 4 Sep 17:12 2011

Re: CQL Drivers

On Sat, Sep 3, 2011 at 10:21 PM, Jonathan Ellis <jbellis <at> gmail.com> wrote:
> On Sat, Sep 3, 2011 at 9:36 PM, Rick Shaw <wfshaw <at> gmail.com> wrote:
>> For what it is worth, my preference would be to have unit tests that
>> would form a regression testing package in the tree with the client
>> sources.
>
> Ditto.
>
>> I think that makes me favor option #3.
>
> I'm rather fond of how user-friendly the Python suite is (taking care
> of server setup/teardown transparently) but realistically, now that we
> have robust truncate, it's probably fine to require an existing server
> and just use that.

I like it too, but it pretty much requires that you have a complete
Cassandra tree around (you need config, scripts, and both run and
build deps), and obviously it needs to have been built.

This is how the JDBC driver was set up when it was moved out of tree,
and at least a couple of people (yourself included, I believe) found
it to be less-than-friendly.  If it looks like I'm trying to steer the
discussion away from this option, that's why.

-- 
Eric Evans
Acunu | http://www.acunu.com |  <at> acunu

Jonathan Ellis | 5 Sep 02:16 2011

Re: CQL Drivers

On Sun, Sep 4, 2011 at 10:12 AM, Eric Evans <eevans <at> acunu.com> wrote:
>> I'm rather fond of how user-friendly the Python suite is (taking care
>> of server setup/teardown transparently) but realistically, now that we
>> have robust truncate, it's probably fine to require an existing server
>> and just use that.
>
> I like it too, but it pretty much requires that you have a complete
> Cassandra tree around (you need config, scripts, and both run and
> build deps), and obviously it needs to have been built.
>
> This is how the JDBC driver was set up when it was moved out of tree,
> and at least a couple of people (yourself included, I believe) found
> it to be less-than-friendly.  If it looks like I'm trying to steer the
> discussion away from this option, that's why.

Well, IMO there's a qualitative difference between "you need to have
the source tree around and edit a file to point the JDBC build to it"
and "you need to have the server listening on 9160, a packaged release
is fine."
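
For illustration, the second kind of requirement can be checked with
nothing but the standard library.  This is only a sketch: the function
name is made up, and the defaults assume a server on localhost at port
9160.

```python
import socket


def server_is_listening(host="localhost", port=9160, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A driver's test suite could call this once up front and skip the
server-dependent tests with a clear message when it returns False.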

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

