Zardosht Kasheff | 14 Apr 2012 17:25

crash we've seen in TC_LOG_MMAP::unlog

Hello all,

When we run our mysql-test suite with a large number of parallel
threads, sometimes we see a crash in TC_LOG_MMAP::unlog:

int TC_LOG_MMAP::unlog(ulong cookie, my_xid xid)
{
  PAGE *p=pages+(cookie/tc_log_page_size);

The problem is that for some mysterious reason, tc_log_page_size is
set to 0, causing a divide by 0 error. We never see this if we run the
test suite serially.
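
To illustrate the failure mode described above: integer division by zero is undefined behavior in C++ and typically shows up as SIGFPE on x86. The following is only a minimal sketch of a defensive page lookup, not the actual server code and not the MariaDB fix. The Page struct, the page_for_cookie helper, and the npages parameter are hypothetical names made up for this example.

```cpp
#include <cstddef>

// Hypothetical stand-in for the real PAGE structure (assumption).
struct Page { int free_slots; };

// Map a cookie to the page it falls on, mirroring the expression
//   pages + (cookie / tc_log_page_size)
// but refusing to divide when the page size is 0 and refusing to
// return a pointer past the end of the page array.
Page *page_for_cookie(Page *pages, std::size_t npages,
                      unsigned long cookie, unsigned long page_size)
{
  if (page_size == 0)
    return nullptr;            // would otherwise be a divide by zero
  unsigned long idx = cookie / page_size;
  if (idx >= npages)
    return nullptr;            // cookie does not belong to this log
  return pages + idx;
}
```

A guard like this only turns the crash into a detectable error. It does not explain why the page size is 0 in the first place.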

The test simply creates a transaction, does inserts into two engines,
each of which supports XA, and commits.

Has anyone seen this before? I am hoping this is a known issue, as I
understand very little about this code. Looking at it, I cannot see
how tc_log_page_size should be 0 in that function, leading me to
believe that there is some memory corruption somewhere.

Any thoughts?

Thanks
-Zardosht

-- 
MySQL Internals Mailing List
For list archives: http://lists.mysql.com/internals
To unsubscribe:    http://lists.mysql.com/internals

Sergei Golubchik | 14 Apr 2012 18:36

Re: crash we've seen in TC_LOG_MMAP::unlog

Hi, Zardosht!

At the risk of repeating myself, I'll answer "it's fixed in MariaDB" :)
Here's the fix:

http://bazaar.launchpad.net/~maria-captains/maria/5.5/revision/sergii@pisem.net-20120229205553-2axfeiua968w7llz

Somehow it's only in 5.5, but the fix is trivial; we can backport it if
necessary.

On Apr 14, Zardosht Kasheff wrote:
> When we run our mysql-test suite with a large number of parallel
> threads, sometimes we see a crash in TC_LOG_MMAP::unlog:
> 
> int TC_LOG_MMAP::unlog(ulong cookie, my_xid xid)
> {
>   PAGE *p=pages+(cookie/tc_log_page_size);
> 
> The problem is that for some mysterious reason, tc_log_page_size is
> set to 0, causing a divide by 0 error. We never see this if we run the
> test suite serially.
> 
> The test simply creates a transaction, does inserts into two engines,
> each of which supports XA, and commits.
> 
> Has anyone seen this before? I am hoping this is a known issue, as I
> understand very little about this code. Looking at it, I cannot see
> how tc_log_page_size should be 0 in that function, leading me to
> believe that there is some memory corruption somewhere.
