:PROPERTIES:
:ID: 83d61eef-0781-46e0-b959-1a739cff5ea3
:END:
#+title: Stripe Poller

Identify webhook events tracked by Stripe that have not yet been processed by
our service, and replay them against it.

* Investigation
** Fetching events on behalf of each connected account
On the afternoon of [[id:8c85f055-9d0c-4b7b-991e-1e32905d38ba][2021-07-02]] I listed the 860 connected accounts and, for each,
fetched the webhook events from the previous two hours. With one request per
account and an average of 0.38 seconds per request, the run took 323.77 seconds,
or nearly five and a half minutes. No degradation or failure of Stripe
functionality was noted. Altogether, these requests retrieved a total of 19
events.
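
As a rough check on these figures, the sketch below recomputes the average
latency from the measured total and projects the sequential run time for larger
account counts. The function name and the projected account counts are
illustrative assumptions, and the projection assumes the latency per request
stays constant.

#+begin_src python
def sequential_poll_seconds(accounts: int, seconds_per_request: float) -> float:
    """Estimate total time for one request per account, made sequentially."""
    return accounts * seconds_per_request


# Figures measured on 2021-07-02.
measured_accounts = 860
measured_total = 323.77

average = measured_total / measured_accounts
print(f'average per request: {average:.2f}s')

# Hypothetical growth scenarios, assuming the average latency holds.
for accounts in (860, 2000, 5000):
    minutes = sequential_poll_seconds(accounts, average) / 60
    print(f'{accounts} accounts: {minutes:.1f} minutes')
#+end_src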
** Alternatives
*** The Stripe dashboard
Viewing a particular webhook within the Stripe dashboard allows us to view the
events sent specifically to that webhook, independent of the account to which it
relates. Regrettably, this is fed from a dashboard-specific API which does not
appear to be exposed elsewhere for consumption.
*** Email notification
We are emailed when a webhook event fails to process. It may make sense to alert
when this occurs and correct the issue manually, or find a way to automate its
replay.
** Conclusion
Barring the release of an API to fetch all events sent to a single webhook,
ideally filtered by delivery status, fetching these events from Stripe is
terribly inefficient: the trial above took 860 requests to retrieve 19 events.
Though we can process recent events in a timely manner, we will likely be
further slowed as more customers connect with Stripe, and there is no good
solution at this time for dealing with that problem. Given the infrequency with
which Stripe fails to send us an event via their retry mechanisms, I expect we
will be better served by reacting to notifications of those failures than by
routinely slamming their API with inefficient requests.
* Tasks
** Build Poller
*** Prepare a Rundeck job for the Stripe poller
- Create the project
- Build the Rundeck job
*** Add a stripe-payments endpoint to fetch logged events
- Need only return event IDs
- Events are partitioned by date and sorted by time in the time-index GSI, which projects keys only

#+begin_src yaml
paths:
  /stripe/events:
    get:
      summary: Fetch webhook events
      tags:
        - Webhooks
      parameters:
        - name: last_id
          in: query
          schema:
            $ref: '#/components/schemas/EventId'
        - name: limit
          in: query
          description: Number of events to return per page of results
          schema:
            type: integer
            minimum: 1
            maximum: 1000
            default: 100
      responses:
        '200':
          description: Event list
          headers:
            Link:
              description: Pagination links
              schema:
                type: string
              example: >-
                <{{ base_url }}/stripe/events?last_id=evt_1234abcdef>; rel="next"
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/EventId'
components:
  schemas:
    EventId:
      type: string
      description: Unique identifier for the webhook event
      example: "evt_1IHAsBIoFf3wvXpR7VLvGfae"
#+end_src
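
A client of this endpoint would presumably follow the =Link= header until no
=rel="next"= link is returned. The following is a minimal sketch of extracting
that link, assuming the header follows the format of the example in the
specification; =next_link= is a hypothetical helper name, not part of the
service.

#+begin_src python
import re
from typing import Optional

# Matches one `<url>; rel="name"` entry of a Link header.
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def next_link(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next", or None on the last page."""
    if not link_header:
        return None
    for url, rel in LINK_PATTERN.findall(link_header):
        if rel == 'next':
            return url
    return None


header = '<https://example.com/stripe/events?last_id=evt_1234abcdef>; rel="next"'
print(next_link(header))
#+end_src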
*** Add integrations endpoint enumerating connected Stripe accounts
*** Reprocess unlogged Stripe events
- Load the processed event IDs into memory
- Iterate over published events, sending them to the stripe-payments webhook
  endpoint if they are not in the processed set

#+begin_src plantuml :file stripe-poller.svg
loop until last page of results
  Poller -> StripePayments : GET /stripe/events?since=<timestamp>[&last_id=<last_id>]
  StripePayments --> Poller : Return list of event IDs
end
loop until last page of results
  Poller -> Integrations : Get list of connected accounts
  Integrations --> Poller : Return list of Stripe account IDs
  loop for each account
    loop until last page of results
      Poller -> Stripe : GET /v1/events?created[gte]=<timestamp>&types[]=<...>[&starting_after=<last_id>]
      Stripe --> Poller : Return list of events
      alt is an AWeber Ecommerce event and event not in processed
        Poller -> StripePayments : POST /stripe/webhooks
      end
    end
  end
end
#+end_src

#+RESULTS:
[[file:stripe-poller.svg]]
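
The core of this flow is a set difference: collect the processed event IDs,
then resend any published event whose ID is missing. A minimal sketch follows,
assuming events are dictionaries with an =id= key as in Stripe's API.
=find_unprocessed=, =replay=, and the =resend= callback are hypothetical names;
=resend= stands in for the POST to =/stripe/webhooks=, and the check for
whether an event is an AWeber Ecommerce event is omitted.

#+begin_src python
from typing import Callable, Iterable, Iterator


def find_unprocessed(published: Iterable[dict],
                     processed_ids: Iterable[str]) -> Iterator[dict]:
    """Yield published events whose id is not in the processed set."""
    processed = set(processed_ids)  # held in memory for constant-time lookups
    for event in published:
        if event['id'] not in processed:
            yield event


def replay(published: Iterable[dict],
           processed_ids: Iterable[str],
           resend: Callable[[dict], None]) -> int:
    """Resend each unprocessed event and return how many were resent."""
    count = 0
    for event in find_unprocessed(published, processed_ids):
        resend(event)
        count += 1
    return count
#+end_src

The returned count could feed the resent-versus-already-processed metric
mentioned in the dashboard task.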

**** Determining how far back to look for events
Stripe events are retained for a maximum of thirty days, on their end and also
in our event database.

The job could use Consul to store the time of the latest fetched event, to be
referenced in subsequent runs. If that time is set, the job should collect all
events newer than one hour before it (this window could also be made
configurable). If it is not set, the job should collect all available events.
The value should be updated in Consul only when the job completes successfully.

Alternatively, the job could query the Rundeck API for the time of its last
successful execution.
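
The Consul-based rule reduces to a small calculation, sketched here under the
assumption that the stored value has already been parsed into a =datetime=.
=fetch_since= is a hypothetical name, and reading from or writing to Consul is
left out.

#+begin_src python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Overlap applied to the stored time: one hour, as proposed in the notes.
DEFAULT_OVERLAP = timedelta(hours=1)


def fetch_since(last_fetched: Optional[datetime],
                overlap: timedelta = DEFAULT_OVERLAP) -> Optional[datetime]:
    """Return the lower bound for event creation time.

    None means no stored time exists, so all available events are fetched.
    """
    if last_fetched is None:
        return None
    return last_fetched - overlap


latest = datetime(2021, 7, 2, 15, 0, tzinfo=timezone.utc)
print(fetch_since(latest))
print(fetch_since(None))
#+end_src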

**** Retrieving webhook events
https://stripe.com/docs/api/events/list

+ Events can be filtered by type, creation time, and whether they were
  successfully delivered (the webhook endpoint returned a =200 OK= response)
  - Wait, how does it know it was successfully delivered to /our/ webhook
    endpoint?
+ Should we set up another "client" with its own rate limiting with Stripe for
  the poller's requests?
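
Stripe's list endpoints paginate with a cursor: each response carries a =data=
array and a =has_more= flag, and the next page is requested with
=starting_after= set to the last ID seen. The sketch below shows that loop with
the HTTP call injected as a function, so it runs without network access.
=iter_events= and =fetch_page= are hypothetical names; in the poller,
=fetch_page= would wrap =GET /v1/events= with whatever filters are chosen.

#+begin_src python
from typing import Callable, Dict, Iterator, Optional

# fetch_page takes the starting_after cursor (or None for the first page) and
# returns a Stripe-style list object: {'data': [...], 'has_more': bool}.
FetchPage = Callable[[Optional[str]], Dict]


def iter_events(fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every event across all pages of a cursor-paginated list."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        events = page['data']
        yield from events
        if not page['has_more'] or not events:
            return
        cursor = events[-1]['id']  # becomes starting_after on the next call


# Canned pages standing in for real API responses.
fake_pages = {
    None: {'data': [{'id': 'evt_1'}, {'id': 'evt_2'}], 'has_more': True},
    'evt_2': {'data': [{'id': 'evt_3'}], 'has_more': False},
}
print([event['id'] for event in iter_events(fake_pages.__getitem__)])
#+end_src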

#+CAPTION: Support chat on [2021-07-01 Thu]
#+begin_quote
- My team is hoping to build an automated process to check via the Stripe API
  for any events that weren't successfully delivered to our webhook endpoint and
  see that they're handled appropriately. I've been trying out the events list
  endpoint, and it does not seem to return all of the events that were sent to
  our webhook, nor does there appear to be any way to identify a webhook to that
  endpoint to find and filter events for it. We assume this is because the
  events are triggered from connected accounts. We noticed that the Stripe
  dashboard is able to show events sent to a webhook and their status, is there
  some way to achieve this through the API?
+ /Hector Otero has joined./
+ Hi, there. I am glad to help.
- Hello. Are you able to see my previous message?
+ Yes, I am just reading through it.
+ Ok, so your main question is if there is a way to show events sent to a webhook and their status using the API, similar to how it is done on the Stripe dashboard.
+ Is that correct?
- Correct
+ Ok, my brief look into the documentation has not turned up anything regarding how to implement this. I am going to consult with my team members regarding whether any of them knows how to implement this functionality. Once I have more info I can reach out to you via email, is that alright?
- Yes, thank you.
+ Sure, no worries. Please keep an eye on your email inbox awapi@aweber.com for further correspondence regarding this issue.
+ Have a nice day! we will be in contact soon.
- Thanks!
+ /Hector Otero has left./
#+end_quote

**** Signing the webhook request
Webhook events [[https://stripe.com/docs/webhooks/signatures][must be signed using a secret key]], which we store [[http://consul.service.production.consul/ui/production/kv/services/cp/services/stripe-payments/stripe_webhook_secret/edit][in consul]]. The
following /should/ result in a valid signature header:

#+CAPTION: Stripe signature generation example
#+begin_src python :results output :exports code
import hashlib
import hmac
import time

unix_timestamp = int(time.time())
json_payload = '{ ... }'  # placeholder for the exact request body being sent
secret_key = 'secret'  # placeholder for the webhook signing secret

# The signed message is the timestamp and the payload joined by a period.
message = f'{unix_timestamp}.{json_payload}'
signature = hmac.new(bytes(secret_key, 'utf-8'),
                     msg=bytes(message, 'utf-8'),
                     digestmod=hashlib.sha256).hexdigest()

print(f'Stripe-Signature: t={unix_timestamp},v1={signature}')
#+end_src

#+RESULTS:
: Stripe-Signature: t=1625084385,v1=dce1ef0332969bce98fd76b5fd08d1b07af0d0fd5f9788d9f8435537e5c3cd12
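
To gain some confidence in the header before sending it, the signature can be
recomputed the way a receiver would and compared in constant time. The sketch
below assumes the scheme in the linked Stripe documentation (HMAC-SHA256 over
=timestamp.payload=). The function names and the five-minute tolerance are
assumptions rather than anything taken from our service, and a header carrying
several =v1= entries is not handled.

#+begin_src python
import hashlib
import hmac
import time
from typing import Optional


def compute_v1(payload: str, secret: str, timestamp: int) -> str:
    """Return the hex HMAC-SHA256 digest of 'timestamp.payload'."""
    message = f'{timestamp}.{payload}'.encode('utf-8')
    return hmac.new(secret.encode('utf-8'), message,
                    hashlib.sha256).hexdigest()


def sign(payload: str, secret: str, timestamp: int) -> str:
    """Build a Stripe-Signature header value for the given payload."""
    return f't={timestamp},v1={compute_v1(payload, secret, timestamp)}'


def verify(header: str, payload: str, secret: str,
           tolerance: int = 300, now: Optional[int] = None) -> bool:
    """Check a header built by sign(), rejecting stale timestamps."""
    fields = dict(part.split('=', 1) for part in header.split(','))
    timestamp = int(fields['t'])
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > tolerance:
        return False
    expected = compute_v1(payload, secret, timestamp)
    return hmac.compare_digest(expected, fields['v1'])
#+end_src

Passing =now= explicitly keeps the check deterministic in tests; in normal use
it defaults to the current time.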
*** Create playbook and dashboard for the Stripe poller
- Track how many events are resent for processing vs how many are already processed
** KILL Create an event processing endpoint
Create an endpoint in the Stripe Payments service for internal use that will,
given an event ID, fetch that event from Stripe and process it as though it had
been sent to the webhook endpoint.
** TODO Document how to find and replay a Stripe event
- Use the Stripe UI to find failed events within the past 15 days
- Include steps for fetching an event more than 15 days old from the API and
  sending it to the webhook endpoint