Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change polling to events #397

Open
hongaar opened this issue May 30, 2017 · 33 comments
Open

Change polling to events #397

hongaar opened this issue May 30, 2017 · 33 comments

Comments

@hongaar
Copy link
Member

hongaar commented May 30, 2017

Should we eventually move away from polling and instead use webhooks to respond to various events?

@amoffat
Copy link
Contributor

amoffat commented May 30, 2017

The main downside I see to this is if chaosbot is at the mercy of people DoSing the webhook endpoints it to stop it from doing anything useful.

@hongaar
Copy link
Member Author

hongaar commented May 30, 2017

@amoffat yeah that can be a problem, although we might already be vulnerable on port 80/8081?

@PlasmaPower
Copy link
Contributor

That's currently run by ngixnx, which will be a lot more sturdy. Maybe write the events handler in a lower level language like Rust?

@hongaar
Copy link
Member Author

hongaar commented May 30, 2017

Hmmm... Looks like the most important event we need is not supported (yet): reactions

@reddraggone9
Copy link
Contributor

like Rust

@eukaryote31 😉

@PlasmaPower
Copy link
Contributor

I've also done a lot with Rust. I'd be happy to add it to the project, if it fits the requirements. 😄

@davidak
Copy link
Contributor

davidak commented May 30, 2017

Implement some DOS protection with fail2ban?

Like IPs with more than x http requests per second get banned in the firewall.

@PlasmaPower
Copy link
Contributor

We could also just block non-GitHub IPs on that port.

@mark-i-m
Copy link
Contributor

mark-i-m commented Jun 2, 2017

I've also done a lot with Rust

Same here 👍

@andrewda
Copy link
Member

andrewda commented Jun 2, 2017

@PlasmaPower that makes the most sense IMO

@chaosbot
Copy link
Collaborator

chaosbot commented Jun 5, 2017

/vote close

This issue hasn't been active for a while. To keep it open, react with 👎

@chaosbot
Copy link
Collaborator

chaosbot commented Jun 5, 2017

/vote close This issue hasn't been active for a while. To keep it open, react with 👎

Time remaining: 2:10 - Vote status: failing ⛔

@chaosbot
Copy link
Collaborator

chaosbot commented Jun 5, 2017

/vote close This issue hasn't been active for a while. To keep it open, react with 👎

Vote Failed

@chaosbot
Copy link
Collaborator

chaosbot commented Jun 8, 2017

/vote close

This issue hasn't been active for a while. To keep it open, react with 👎

@chaosbot
Copy link
Collaborator

chaosbot commented Jun 8, 2017

/vote close This issue hasn't been active for a while. To keep it open, react with 👎

Vote Failed

@dbpokorny
Copy link

I want to keep the issue open, but somebody has to write the code to make use of webhooks. Does anyone want to volunteer?

@PlasmaPower
Copy link
Contributor

@dbpokorny I might try. Current plan:

  1. Switch DB to MySQL so multiple processes can use it (separate PR)
  2. Webhook handler in Rust
  3. Handler updates DB
  4. Current polling mechanisms in code can be reduced (although not totally eliminated).
  5. Some polling mechanisms will be changed from polling GitHub to polling the local DB
  6. In case we miss some events, GitHub polling will still run once on startup, and once before an action is taken (e.g. merging a PR or closing an issue).

@dbpokorny
Copy link

@PlasmaPower this sounds like a catch-all project for installing rust, exercising mysql, and working with github webhooks intstead (or in addition to) of polling. The webhooks approach can work as a pure python implementation that does not use mysql; in other words, I don't see the rationale for introducing either (A) a new language, rust or (B) some additional mysql config into this project at this time; adding a new language and/or additional db config is fine, but neither blocks progress on webhooks.

The way you've described the webhook interface defeats the point of driving some activity on the bot via an event on github; the way you've described it, the bot will not take action in response to a webhook event.

What you want is a python function to run in response to a webhook; this is accomplished using a web server. The bot can determine the correct action to take depending on how the event impacts its internal model of the voting, stored in memory, for example voting on a PR may close or merge the PR (but will not impact the voting on other PRs).

I believe the webhooks were working on botwillacceptanything, but I'm not 100% sure, you'd have to read the code.

@PlasmaPower
Copy link
Contributor

@dbpokorny Rust was chosen partially in account for the possibility of a DDoS attack. With a weak VPS like we currently have, it might be easy to bring down a Python HTTP server.

the bot will not take action in response to a webhook event.

Most events don't have a direct trigger though. PR created? Nothing should immediately happen, instead, everything is time based. The exception to this is issue commands. I'll think about how to handle these.

The way you've described the webhook interface defeats the point of driving some activity on the bot via an event on github

The goal here is to reduce GitHub API requests, not to speed things up.

@dbpokorny
Copy link

The goal here is to reduce GitHub API requests, not to speed things up

I'm surprised that giving the bot the ability to take action in response to a webhook call is not in the scope of this issue; as far as I can tell, the plan is to have the webhooks update a database and for the bot to poll the database. Is this correct?

DDoS attack

No DDoS attacker worth their salt would say "oh, Python HTTP servers are easy to take down, but a RUST SERVER?!?!? Oh my gosh, those things are built like tanks!" We are talking about a huge botnet vs. 1 server. The scale of computing power and financial resources is not the same. There is no way we are going to withstand a DDoS attack either way, so changing from Python to Rust does not change this fundamental economic equation.

@PlasmaPower
Copy link
Contributor

@dbpokorny That's the current plan. I have a couple of options in mind for signaling the Python code (e.g. unix sockets), I'll think about them. Either way we'll have to redesign the "cron" loops.

As for DDoS, I guess I was thinking more DoS not DDoS (not a professional attack, just something small).

@PlasmaPower
Copy link
Contributor

I have an idea for a compromise. The Rust server listens for incoming messages, and then validates them (I've already written the code for this). If they succeed validation, they are then forwarded through a unix socket to the handler on the Python end. This would allow for a system where the Python code polls for new events, eliminating concurrency problems, or a more traditional system where a Python thread waits for new events in the background. We could also have a different unix socket for each event, so that some events could be polled and others responded to in real time.

The one question I'm not sure about: who creates the web hooks and the GitHub signing key? Currently my Rust code creates the signing key, but doesn't setup the web hooks.

@dbpokorny
Copy link

@PlasmaPower the python web server code can be adapted to handle webhooks. I think adding rust is fine, but frankly I'd just fork the chaosbot to rustbot/rustbot and use the code there. A pure-webhook (no polling, a.k.a. "event driven" implementation) of a democracy bot is possible this way. In other words, it may be easier to port the voting code from python to rust than to graft the rust code you already have onto the chaosbot. Your solution will work, but at a cost to cohesion. Splitting the democracy projects on github by language (botwillacceptanything: javascript; chaosbot: python; rustbot: rust) seems reasonable since a lot of work on these projects consists of "oh look at this cute library I WANT IT NOW" and the mob goes "OH YES WE NEED IT NOW" and one thing leads to another and the library is now in the bot.

Frankly I think you're going to have a hard time finding actual functionality you need that is present in the rust world and lacking in the python world and vice versa; there is simply too much active development on both for there to be any serious obstacles to porting a utility for one to a utility for the other. I'm not talking about exact duplicates, but rather close substitutes.

I believe you have code that can be used as the blueprints for a rust democracy bot that operates using a pure event-driven architecture (without polling). The problem with introducing it into the chaosbot is that it puts an extra burden on chaosbot contributors who must now understand two languages (python and rust) instead of one (python alone).

If you really want a bilingual bot, rustychaos, that is another trajectory that may yield positive results; it all depends on which langauges are used. I think there may be an opportunity for that, but the risk of introducing bilingual status for the chaosbot is that it will alienate those (such as myself) who do not know rust and are not interested in learning more about the rust ecosystem.

To answer your question, the API for creating the webhooks is here: https://developer.github.com/v3/repos/hooks/#create-a-hook

Either @amoffat will have to create the webhooks by hand or the bot will create them with the API.

@PlasmaPower
Copy link
Contributor

I already knew how to create the webhooks, and I think they should be created automatically (so we can add new events without manual intervention). I was wondering if the Python code or the Rust code should set them up, assuming we go that route of course.

You'll only need to know Rust if you're working on the very low levels of webhooks, which I doubt we'll be doing often. With my proposed solution, stuff like adding a new event to webhooks would be completely in Python land. I guess that's another argument for creating the webhooks from Python.

@dbpokorny
Copy link

@PlasmaPower so it sounds like you want to make the webhook PR the PR that also turns the chaosbot into a bilingual democracy bot.

I'm not going to vote against turning the chaosbot into a bilingual chaosbot. If you do that, be sure to notify /r/rust since chaosbot got a few thousand points on reddit in /r/programming in the recent past. In fact, you could probably get the webhooks code passed right now (if you can get it working; it is quite a lot of work to set up a testing rig if you go the test-driven development route since you have to set up a development repository and a fork of a development repository) just by going to /r/rust and asking for votes on the PR; as far as I know, the reddit TOS does not prohibit asking for votes on democracy bot pull requests. The only question is: would such content go in /r/programming /r/github /r/rust or /r/python? The thinking goes like this

if you want votes, you will rally communities that are likely to support your cause. So, a monolingual python programmer probably wouldn't support adding a language he doesn't know to the project, but a rust programmer would like the "free advertising" the chaosbot provides, because the chaosbot community is going to be spending some of its time thinking in rust and making the chaosbot available to the rust community.

So, depending on how your "community-oriented thinking" or "advertising-oriented thinking" or "network-oriented" or whatever you want to call it turns out (in other words, thinking along the lines I've outlined in the quote above) you can choose none, one, or more than one place to advertise your work and ask for votes.

I think /r/rust would be delighted, and would be delighted to be a part of the process that turns the chaosbot into the bilingual chaosbot, so that's the only reason I bring it up.

@PlasmaPower
Copy link
Contributor

I'd still see it as a Python project, with Rust's influence being minimal.

@mark-i-m
Copy link
Contributor

I tend to side more with @dbpokorny on this one. Multi-language projects tend to get a bit hairy, though rust is a great language.

I also disagree that rust's influence would be minimal, since a large portion of all PRs made so far have required using some new github API. All of these would also touch the webhook code, I guess.

That said, if you make a PR, and it gets accepted, I wouldn't have any problems, but I probably wouldn't vote in favor myself.

@PlasmaPower
Copy link
Contributor

All of these would also touch the webhook code, I guess.

With my proposed compromise, none of them would. The webhooks get verified by Rust, and then passed on to Python to be parsed and used. The Rust code would not need to be modified even if a new webhook is added.

@mark-i-m
Copy link
Contributor

The webhooks get verified by Rust

What exactly does "verified" mean here? Is it worth not just doing it in python?

@PlasmaPower
Copy link
Contributor

PlasmaPower commented Jun 10, 2017

What exactly does "verified" mean here?

The HMAC gets verified (a hash + a signature). It's passed as a header: X-Hub-Signature. We send GitHub the key for the HMAC when we create the webhook.

Is it worth not just doing it in python?

The Rust version will be significantly faster. The reason I looked into Rust is that @amoffat said:

The main downside I see to this is if chaosbot is at the mercy of people DoSing the webhook endpoints it to stop it from doing anything useful.

I think the Rust version will be at least somewhat DoS resistant due to its speed.

@mdcfe
Copy link
Contributor

mdcfe commented Jun 10, 2017

@PlasmaPower The existing webserver runs in a separate process, so if it crashes, it should allow the main ChaosBot process to continue.

@PlasmaPower
Copy link
Contributor

@md678685 If functionality depended on it that could be very bad though.

@PlasmaPower
Copy link
Contributor

PlasmaPower commented Jun 11, 2017

In addition, it seems the current web server with nginx has some problems. #551

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants