rubbish me | 10 Jul 00:39 2012

Fwd: BulkLoading sstables from v1.0.3 to v1.1.1



Hi 

As part of the continuous development of a system migration, we have a test build that takes a snapshot of a keyspace from Cassandra v1.0.3 and bulk loads it into a v1.1.1 cluster using sstableloader.  Not sure if it is relevant, but one of the column families contains a secondary index. 

The build basically does: 
Drop the destination keyspace if it exists 
Add the destination keyspace and wait for the schema to agree 
Run sstableloader 
Do some validation of the streamed data 
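Roughly, in shell, the steps look like the sketch below. The keyspace name, seed host and snapshot path are placeholders rather than our real values, and the cassandra-cli / sstableloader invocations are only an approximation of what the build runs (it prints the commands by default instead of executing them):

```shell
#!/bin/sh
# Sketch of the build steps above; all names and paths are placeholders.
# With DRY_RUN=1 (the default here) each command is printed, not executed.
KEYSPACE="dev_load_test2"
SEED_HOST="10.0.0.1"
SNAPSHOT_DIR="/opt/cassandra/snapshots/${KEYSPACE}"
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else sh -c "$*"; fi
}

# 1. Drop the destination keyspace if it exists (ignore the error when absent)
run "echo 'drop keyspace ${KEYSPACE};' | cassandra-cli -h ${SEED_HOST}" || true
# 2. Recreate the keyspace; schema agreement must settle before loading
run "echo 'create keyspace ${KEYSPACE};' | cassandra-cli -h ${SEED_HOST}"
# 3. Stream the snapshotted sstables into the destination cluster
run "sstableloader -d ${SEED_HOST} ${SNAPSHOT_DIR}"
```

Validation of the streamed data is a separate read-back step and is omitted here.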

The keyspace / column family schemas are basically the same, except that on the v1.1.1 side we have compression and the key cache switched on. 

On a clean cluster (empty data, commit log and saved-cache directories), the sstables loaded beautifully. 

But a subsequent build failed with: 
-- 
[21:02:02][exec] progress: [<snip ip_addresses>]... [total: 0 - 0MB/s (avg: 0MB/s)]ERROR 21:02:02,811 Error in ThreadPoolExecutorjava.lang.RuntimeException: java.net.SocketException: Connection reset 
[21:02:02][exec] at org.apache.cassandra.utils.FBUtilities.unchecked(FBUtilities.java:628) 
[21:02:02][exec] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34) 
[21:02:02][exec] at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) 
[21:02:02][exec] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) 
[21:02:02][exec] at java.lang.Thread.run(Thread.java:662) 
[21:02:02][exec] Caused by: java.net.SocketException: Connection reset 
[21:02:02][exec] at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96) 
[21:02:02][exec] at java.net.SocketOutputStream.write(SocketOutputStream.java:136) 
[21:02:02][exec] at com.ning.compress.lzf.ChunkEncoder.encodeAndWriteChunk(ChunkEncoder.java:133) 
[21:02:02][exec] at com.ning.compress.lzf.LZFOutputStream.writeCompressedBlock(LZFOutputStream.java:203) 
[21:02:02][exec] at com.ning.compress.lzf.LZFOutputStream.write(LZFOutputStream.java:97) 
[21:02:02][exec] at org.apache.cassandra.streaming.FileStreamTask.write(FileStreamTask.java:227) 
[21:02:02][exec] at org.apache.cassandra.streaming.FileStreamTask.stream(FileStreamTask.java:168) 
[21:02:02][exec] at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:94) 
[21:02:02][exec] at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30) 
[21:02:02][exec] ... 3 more 
-- 

Looking at the server log from this time, we see: 
--- 
ERROR [Thread-30] 2012-07-07 21:02:44,484 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[Thread-30,5,main] 
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: DecoratedKey(106448592537980973961347479329396275945, 6e6669677c323031322d30362<snip..., v long key)) != DecoratedKey(155376897532138582317079439091276375324, 444956334f6666696369616c2d5969656c644375727665737c323031322d30352d3331) in /opt/cassandra/data/dev_load_test2/journal/dev_load_test2-journal-hd-5-Data.db 
        at org.apache.cassandra.db.index.SecondaryIndexManager.maybeBuildSecondaryIndexes(SecondaryIndexManager.java:136) 
        at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:202) 
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:103) 
        at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182) 
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78) 
-- 

From then on, we got different errors for the rest of the bulk load: 
-- 
ERROR [Thread-54] 2012-07-07 21:04:33,589 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[Thread-54,5,main] 
java.lang.AssertionError: We shouldn't fail acquiring a reference on a sstable that has just been transferred 
        at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:188) 
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:103) 
        at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:182) 
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78) 
-- 

We would like to understand this error.  It is our belief that one should be able to reload the same set of sstables without problems. 

Have we done anything wrong?  Many thanks for your help. 

- A 

Ivo Meißner | 10 Jul 08:20 2012

Re: BulkLoading sstables from v1.0.3 to v1.1.1

Hi,

There are some problems in version 1.1.1 with secondary indexes and key caches that are fixed in 1.1.2. 
I would try upgrading to 1.1.2 and see if the error still occurs. 

Ivo






rubbish me | 10 Jul 09:43 2012

Re: BulkLoading sstables from v1.0.3 to v1.1.1

Thanks Ivo. 

We are quite close to releasing, so we would like to understand what is causing the error and try to avoid it where possible. As said, it seems to work OK the first time round. 

Regarding the problem you referred to in your last mail: was it restricted to bulk loading, or does it occur elsewhere?

Thanks

-A

aaron morton | 12 Jul 06:11 2012

Re: BulkLoading sstables from v1.0.3 to v1.1.1

Do you have the full error logs? There should be a couple of "caused by" errors that will help track down where the original AssertionError is thrown.

The second error is probably a result of the first. Something has upset the SSTable tracking. 

If you can get the full error stack and some steps to reproduce, could you raise a ticket at https://issues.apache.org/jira/browse/CASSANDRA ? 

Thanks


-----------------
Aaron Morton
Freelance Developer
<at> aaronmorton


Edward Capriolo | 13 Jul 02:23 2012

Re: BulkLoading sstables from v1.0.3 to v1.1.1

Historically you have not been able to stream sstables between different file formats. Cassandra 1.0 creates files with the "hc" format version, while 1.1 uses "hd". Since bulk loading streams the files, I am not sure this will work.
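A quick way to check which on-disk format a set of sstables was written in is to read the two-letter version code out of the filenames. This is only a sketch; the filename is taken from the error log earlier in the thread, and the version mapping in the comment follows the note above:

```shell
# Extract the sstable format version (hc, hd, ...) from a -Data.db filename.
name="dev_load_test2-journal-hd-5-Data.db"
version=$(echo "$name" | sed -n 's/.*-\(h[a-z]\)-[0-9]*-Data\.db/\1/p')
echo "$version"   # "hd" -> written by 1.1; "hc" -> written by 1.0
```

Running the same check over the snapshot directory before loading would show whether the files being streamed match the destination cluster's format.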


