Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Report status code of child process in term.status property #37

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

niallo
Copy link
Contributor

@niallo niallo commented Jul 29, 2013

Use waitpid(3) in a SIGCHLD handler to map the exit status to PID. This is then made available to Node via a getter.

For example:

var pty = require('./pty.js');

var term = pty.spawn('bash', [], {
  name: 'xterm-color',
  cols: 80,
  rows: 30,
  cwd: process.env.HOME,
  env: process.env
});

term.on('data', function(data) {
  console.log(data);
});

term.on('close', function() {
  console.log('exit status: %d', term.status);
});

term.write('ls\r');
term.resize(100, 40);
term.write('ls /\r');
term.write('exit 2\r');

console.log(term.process);

@chjj
Copy link
Owner

chjj commented Jul 30, 2013

You should probably pass WNOHANG to waitpid in case that exit code isn't available to read or you might hang the process.

I wrote something similar to this a while back, but I was hesitant as how to this may interfere with libuv's sigchld handler. I eventually dropped the code and started to wade through the libuv code, looking for an easy way to "register" a child pid with the libuv sigchld handler and read it from node. I eventually got distracted from it and forgot about it.

Has this been working well enough for you in practice? Say you start a terminal+shell with pty.js and then start a process with node's child_process module... do they interfere with one another once they exit?

@chjj
Copy link
Owner

chjj commented Jul 30, 2013

To expand upon my concern. This is the line in question that I believe could cause interference.

$ grep waitpid ~/node/ -r
~/node/deps/uv/src/unix/process.c:    pid = waitpid(-1, &status, WNOHANG);
pid = waitpid(-1, &status, WNOHANG);

Passing -1 to waitpid just means it will return any child exit status available. That means child processes not explicitly registered through libuv could pop up here and cause problems.

@niallo
Copy link
Contributor Author

niallo commented Jul 31, 2013

I originally tried it with WNOHANG but that didn't work, I think there is some strangeness with how that works inside the SIGCHLD handler.

You make a fair point about libuv itself running waitpid. But surely that race already exists with PTY.js, whether or not you attempt to harvest status at exit in the SIGCHLD handler? I am only calling waitpid for the PID of the process I'm interested in.

@chjj
Copy link
Owner

chjj commented Jul 31, 2013

But surely that race already exists with PTY.js, whether or not you attempt to harvest status at exit in the SIGCHLD handler?

I suppose it could cause problems right now anyway. Regardless, I still think a better approach might be to use the UV api to somehow make node aware of our processes and then using the same interface node uses (uv's sigchld handler) to grab the exit status.

I originally tried it with WNOHANG but that didn't work, I think there is some strangeness with how that works inside the SIGCHLD handler.

With the current code, if you attempt to access term.status before the process has exited, does node hang? If so, that's a serious problem and I wouldn't feel comfortable merging this code. edit: nevermind, I didn't pay close enough attention to the code. That was a stupid concern.
.
edit: Thinking about it even more, I suppose one situation that could break the process is: our child process exits, the uv sigchld handler gets called, reads the exit status with waitpid(-1, ...) so it's no longer available. The OS proceeds to execute the next sigchld handler, but now there is no exit status to read because it was already "eaten" by uv's handler. I wonder what would happen in that situation. Would waitpid simply return, or would it hang?

@niallo
Copy link
Contributor Author

niallo commented Jul 31, 2013

If you access term.status before the process has exited, node doesn't hang. It returns a 0 value. All that term.status does is look up the status code from a C++ map, it can't hang.

It would be great if we could use libuv for this, libuv doesn't support forkpty though - and other than using uv_spawn I don't see how we could make node aware of our processes:

https://github.com/joyent/libuv/blob/master/include/uv.h

@niallo
Copy link
Contributor Author

niallo commented Jul 31, 2013

edit: Thinking about it even more, I suppose one situation that could break the process is: our child process exits, > the uv sigchld handler gets called, reads the exit status with waitpid(-1, ...) so it's no longer available. The OS
proceeds to execute the next sigchld handler, but now there is no exit status to read because it was already
"eaten" by uv's handler. I wonder what would happen in that situation. Would waitpid simply return, or would it hang?

Well if that were to happen, waitpid would return -1 since the process no longer exists and the PID is therefore invalid.

Furthermore, that handler is only installed as a result of a uv_spawn. Which again can still happen today.

@chjj
Copy link
Owner

chjj commented Jul 31, 2013

Well if that were to happen, waitpid would return -1 since the process no longer exists and the PID is therefore invalid.

Alright, that's what I hoped. I'll review this and merge soon.

If you access term.status before the process has exited, node doesn't hang. It returns a 0 value. All that term.status does is look up the status code from a C++ map, it can't hang.

Right, which is why I immediately edited my post and removed that from my list of concerns.

Furthermore, that handler is only installed as a result of a uv_spawn. Which again can still happen today.

Assuming waitpid would hang on a child pid which existed previously (now that I know it doesn't do this, it's not a problem), uv catching all pids with a waitpid is only a problem if pty.js calls waitpid after, which does not happen currently.

@niallo
Copy link
Contributor Author

niallo commented Jul 31, 2013

Awesome, thanks for reviewing, look forward to this being in a release!

@niallo
Copy link
Contributor Author

niallo commented Aug 2, 2013

found a silly issue in my original PR. you can only have a single SIGCHLD handler at a time! i somehow missed this.

i added a commit to retrieve node/libuv's existing SIGCHLD handler before installing pty.js'. then at the end of pty.js' SIGCHLD handler we call node/libuv's.

also - do the sigaction calls in init() rather than each fork.

@paulgration
Copy link

Just been trying this out, as I trying to resolve this in a slightly different way with waitpid on Friday last week (while this was being worked on by you as well) although I hadn't quite managed to get it working 100% in my attempts. This looks like it's working nicely though, 0 and 1 exit statuses when expected, npm doesn't have the update yet but checked it out from here. Just curious, the status is -302 while the process hasn't exited and confirmed that in pty.cc but wondered why -302?
Thanks!

@niallo
Copy link
Contributor Author

niallo commented Aug 5, 2013

@pmgration this method kinda works but as as we've found in Strider-CD/strider#162 we will need to make changes to node/libuv to eliminate all race conditions and have this 100% reliable.

@paulgration
Copy link

@niallo thanks for the info.. is there an ETA for the libuv modifications? This will be great to use when confirmed as 100% reliable

@Aldaviva
Copy link

Very helpful, thanks @niallo.

@jamesrwhite
Copy link

Any update on this?

krishnasrinivas added a commit to butlerx/wetty that referenced this pull request May 3, 2014
Not sure if there is a better way to handle this but this works.
Related PR on pty.js : chjj/pty.js#37
@ioquatix
Copy link

+1

1 similar comment
@moul
Copy link

moul commented Oct 6, 2014

👍

heavyk added a commit to heavyk/pty.js that referenced this pull request Jan 12, 2015
Merge pull request chjj#37 from niallo/waitpid

Report status code of child process in `term.status` property

* niallo/waitpid:
	missed test file
	adding travis
	node 0.10 compatibility
	set SA_NOCLDSTOP flag for SIGCHLD handler
	must set exit status correctly
	fixing interoperability with native child_process. also added tests
	call node/libuv's SIGCHLD handler.
	zap status property which krept in.
	demonstrate fetching exit status
	use a PID -> exit code map to avoid race conditions.
	save exit status of child via SIGCHLD handler and waitpid(3) may be race-y due to always storing latest in global. might want to use a map of some sort.

Conflicts:
	lib/pty.js
	src/unix/pty.cc
@amshali
Copy link

amshali commented May 1, 2015

+1

@chjj
Copy link
Owner

chjj commented Jul 23, 2015

Will merge either this or #102. Just need to clean it up a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants