|
From the Cygwin mailing list.
"Filipek, Stefan R." <sfilipek () mitre ! org> Wrote:
Hello,
I've been trying to track down a segmentation fault that a rather large
application I'm working on has been experiencing. I seem to have narrowed it
down to two threads that are opening, appending to, and closing files quite
often. I've created a very simple example program that suffers from the same
condition.
Two threads are created, and each opens its own unique file and then closes it
(no writing, in this example). The threads are both joined and the process is
then repeated. After some indeterminate amount of time - millions of
iterations - the application segfaults with no really useful information that
I can see. Running in gdb doesn't seem to help (after compiling with debug
flags, of course), as the backtrace is either corrupted or can't be followed
because I'm not using a debug version of Cygwin.
Note that this problem seems to be accelerated by having the target directory
open in explorer and/or having the files highlighted. I've come across
situations where even ofstream.open() will throw an exception when doing the
above. The exception has been seen in both C++ and python, which makes me think
it's something fundamental in Cygwin, and possibly related to this as well.
Is there something inherently wrong with having different threads access
different files at once? I have reproduced this issue across multiple machines.
Compile: g++ FileTest.cpp -lpthread -oFileTest
FileTest.cpp:
#include <fstream>
#include <string>
#include <iostream>
#include <pthread.h>
using namespace std;

struct ThreadData {
    string fileName;
};

// Thread body: open the file for append, then close it again.
void *
FileThread(void *arg) {
    try {
        ofstream outfile;
        ThreadData *td = (ThreadData*)arg;
        string fileName = td->fileName;
        try {
            outfile.open(fileName.c_str(), ios_base::app);
        } catch(...) {
            cerr << "Exception during open()" << endl;
            return NULL;
        }
        try {
            outfile.close();
        } catch(...) {
            cerr << "Exception during close()" << endl;
            return NULL;
        }
    } catch(...) {
        cerr << "Exception while creating objects" << endl;
        return NULL;
    }
    return NULL;
}

int
main(void) {
    unsigned long long count = 0;
    ThreadData td1;
    td1.fileName = "temp1.txt";
    ThreadData td2;
    td2.fileName = "temp2.txt";
    // Create and join two threads per iteration, forever.
    while(1) {
        count++;
        if(count % 5000 == 0) cout << "Iteration " << count << endl;
        pthread_t thread1;
        pthread_t thread2;
        pthread_create(&thread1, NULL, FileThread, &td1);
        pthread_create(&thread2, NULL, FileThread, &td2);
        void *res = NULL;
        pthread_join(thread1, &res);
        pthread_join(thread2, &res);
    }
    // Not reached
    return 0;
}
Stackdump:
Exception: STATUS_ACCESS_VIOLATION at eip=610B5FF2
eax=0D89466C ebx=006A02F0 ecx=61149C88 edx=0D89466C esi=61149C88 edi=006C05C8
ebp=0022CAC8 esp=0022CAB0 program=c:\Documents and Settings\sfilipek\test\FileTest.exe, pid 4344, thread main
cs=001B ds=0023 es=0023 fs=003B gs=0000 ss=0023
Stack trace:
Frame Function Args
0022CAC8 610B5FF2 (006A02F0, 00000000, 0022CAE8, 006A02F0)
0022CAE8 610B8B0D (006A0298, FFFFFFFF, 0022CC98, 006B0508)
0022CC08 610B1E4B (0022CC20, 0022CC94, 0022CCE8, 610935A8)
0022CC18 610779F8 (006A0298, 0022CC94, 00401150, 0022CCA0)
0022CCE8 610935A8 (00000001, 6116B798, 006A0090, 0022CC70)
0022CD98 610060D8 (00000000, 0022CDD0, 61005450, 0022CDD0)
61005450 61004416 (0000009C, A02404C7, E8611021, FFFFFF48)
34 [main] FileTest 4344 _cygtls::handle_exceptions: Error while dumping state (probably corrupted stack)
Nothing was written to stderr in the end... just the segfault.
Any advice, workaround, etc. would be extremely helpful.
Regards,
Stefan Filipek
uname -a: CYGWIN_NT-5.1 [computer name] 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin
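A note on the test program above: a std::ofstream does not throw on a failed
open() unless exceptions have been enabled with exceptions(); by default it
only sets failbit. The following is a minimal sketch, not part of the original
post, of how the open/close step could be made to report failures. The helper
name try_open is hypothetical.

```cpp
#include <exception>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper (not from the original post): returns true if the
// file could be opened for append and closed again, false otherwise.
bool try_open(const std::string &fileName) {
    std::ofstream outfile;
    // Ask the stream to throw when failbit or badbit gets set.
    outfile.exceptions(std::ios_base::failbit | std::ios_base::badbit);
    try {
        outfile.open(fileName.c_str(), std::ios_base::app);
        outfile.close();
    } catch (const std::exception &e) {
        // std::ios_base::failure derives from std::exception.
        std::cerr << "open/close failed for " << fileName
                  << ": " << e.what() << std::endl;
        return false;
    }
    return true;
}
```

With this in place, a call such as try_open("temp1.txt") would be expected to
return true in a writable directory, while a path inside a directory that does
not exist would be expected to return false and print a message to stderr.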
Dave Korn <dave.korn.cygwin () googlemail ! com> Reply:
Filipek, Stefan R. wrote:
> Note that this problem seems to be accelerated by having the target
> directory open in explorer and/or having the files highlighted. I've come
> across situations where even ofstream.open() will throw an exception when
> doing the above. The exception has been seen in both C++ and python, which
> makes me think it's something fundamental in Cygwin, and possibly related
> to this as well.
Actually, that makes me think you have BLODA-style interference.
> Potential app conflicts:
>
> ZoneAlarm Personal Firewall
> Detected: HKLM Registry Key, Named file.
Which version is this? Is it the full version with antispyware and all
sorts of extra tricks built in? What anti-virus do you have?
The stack trace you posted suggests BLODA too:
[ minor munging of paths to simplify ]
~ $ addr2line --exe /bin/cygwin1.dbg
610B5FF2
/usr/src/cygwin-1.5.25-15/winsup/cygwin/thread.cc:1593
610B8B0D
/usr/src/cygwin-1.5.25-15/winsup/cygwin/thread.h:147
610B1E4B
/usr/src/cygwin-1.5.25-15/winsup/cygwin/thread.h:301
610779F8
/usr/src/cygwin-1.5.25-15/winsup/cygwin/pthread.cc:71
610935A8
??:0
610060D8
/usr/src/cygwin-1.5.25-15/winsup/cygwin/dcrt0.cc:956
61004416
/usr/src/cygwin-1.5.25-15/winsup/cygwin/cygtls.cc:73
Looking at the top of the stack there:
1590 pthread_mutex::~pthread_mutex ()
1591 {
1592   if (win32_obj_id)
1593     CloseHandle (win32_obj_id);
1594
1595   mutexes.remove (this);
1596 }
suggests that a plain call to win32.CloseHandle blew up in our faces. That
really does smack of a bad AV/PFW hook that's messing with handles.
> Is there something inherently wrong with having different threads access
> different files at once?
No, of course not; any failures are real bugs. However, nobody's going to
be very likely to fix it in 1.5, so you should definitely see if it reproduces
under 1.7. If you can't narrow it down to a BLODA, that is.
cheers,
DaveK
"Filipek, Stefan R." <sfilipek () mitre ! org> Reply:
> Which version is this? Is it the full version with antispyware and all
> sorts of extra tricks built in? What anti-virus do you have?
It's using Symantec Endpoint Protection 11.0.2000.1567 - all the bells and whistles.
I blamed AVS at first, but the problem also persists on machines that do not
have any AVS installed. I double checked and ran another test today on a
different machine that also ended the same way (exact same spot too), but only
at 100k iterations.
But, it should be noted that the stackdump has occurred at different points, in
different threads (as opposed to main). I can try to provide those if anyone is
interested.
> > Is there something inherently wrong with having different threads access
> > different files at once?
> No, of course not; any failures are real bugs.
Thank you for the sanity check.
I'll test with 1.7 but I just wanted to say that it's not AVS/PFW to blame
here. Could it possibly be Windows Explorer itself? Again, it does seem to
occur more often when an explorer window is open to the output directory, in
which it refreshes the status of the files being opened/closed every so often.
It would also be nice to know if someone else can reproduce the issue; perhaps
just leave it running on a computer overnight.
Regards,
Stefan
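For anyone trying to reproduce this, a bounded variant of the loop that checks
the pthread return codes may make an unattended run easier to interpret. This
is only a sketch under stated assumptions: run_iterations and its iteration
limit are hypothetical and not part of the original test program, and the
thread body is simplified to the same open-for-append and close.

```cpp
#include <cstring>
#include <fstream>
#include <iostream>
#include <pthread.h>
#include <string>

struct ThreadData {
    std::string fileName;
};

// Same work as the original thread body: open for append, then close.
static void *FileThread(void *arg) {
    ThreadData *td = static_cast<ThreadData *>(arg);
    std::ofstream outfile(td->fileName.c_str(), std::ios_base::app);
    outfile.close();
    return NULL;
}

// Hypothetical helper: runs up to `limit` create/join rounds and returns the
// number of rounds completed, stopping early if a pthread call reports an error.
unsigned long long run_iterations(unsigned long long limit) {
    ThreadData td1, td2;
    td1.fileName = "temp1.txt";
    td2.fileName = "temp2.txt";
    unsigned long long done = 0;
    while (done < limit) {
        pthread_t t1, t2;
        int rc = pthread_create(&t1, NULL, FileThread, &td1);
        if (rc != 0) {
            std::cerr << "pthread_create: " << std::strerror(rc) << std::endl;
            break;
        }
        rc = pthread_create(&t2, NULL, FileThread, &td2);
        if (rc != 0) {
            std::cerr << "pthread_create: " << std::strerror(rc) << std::endl;
            pthread_join(t1, NULL);  // do not leak the first thread
            break;
        }
        int rc1 = pthread_join(t1, NULL);
        int rc2 = pthread_join(t2, NULL);
        if (rc1 != 0 || rc2 != 0) {
            std::cerr << "pthread_join failed in round " << done << std::endl;
            break;
        }
        ++done;
    }
    return done;
}
```

A caller could compare the return value against the requested limit, for
example run_iterations(1000000), to tell a clean finish apart from an early
stop caused by a failing pthread call. A crash or hang inside the runtime, as
described in this thread, would of course still not be caught by this.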
"Filipek, Stefan R." <sfilipek () mitre ! org> Reply:
Dave Korn wrote:
> you should definitely see if it reproduces under 1.7
I tried the latest 1.7 and it hangs instead of segfaulting (less than 500k
iterations).
This seems like a pretty major problem for any intensive multithreaded
application. Though infrequent, it has produced a rather large roadblock.
I may not have much time to devote to this issue, but please let me know if I
can assist further, as I would like to have this resolved.
Regards,
Stefan
|