RFC1: Org Batch Markdown Exporter Job

Abstract #

Allow org admins to export one Markdown file for each RFC in their organization as part of a batch job started with a single click.

Background #

RFC Hub currently lives in a monorepo and runs as a single scalable, homogeneous process that handles both web rendering and API calls. Deployments are kicked off via GitHub Actions.

Backing up RFC content by hand would mean visiting every RFC and copying and pasting its content, which is tedious and error-prone. Overall, it would not be a good user experience.

Problem #

Users of RFC Hub expect the ability to export RFCs in an industry-standard format to avoid vendor lock-in. Supporting this will boost confidence and hopefully increase adoption. Since the underlying representation for RFCs is already Markdown, this RFC proposes that the export process generate an individual Markdown file for each RFC.

Proposal #

Running exports synchronously as part of an HTTP request could degrade the overall performance of the rfchub.app service. Therefore, the export process, including file generation and compression of RFCs, will run in a separate process. This separate job will be an hourly GitHub Action. The job will read from a database table, perform the export operation, and email the RFCs. Access to database connection details will be provided via environment variables.
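The job described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the environment variable names, the `exports` and `rfcs` table layouts, and the `send_email` callback are all hypothetical, and `sqlite3` stands in for whatever database driver the job would really use.

```python
import io
import sqlite3
import zipfile


def load_db_settings(env):
    """Read connection details from environment variables (names are hypothetical)."""
    required = ("EXPORT_DB_HOST", "EXPORT_DB_USER", "EXPORT_DB_PASSWORD", "EXPORT_DB_NAME")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in required}


def build_archive(files):
    """Compress {filename: markdown_text} into an in-memory ZIP and return its bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for filename, text in sorted(files.items()):
            archive.writestr(filename, text)
    return buffer.getvalue()


def run_pending_exports(conn, send_email):
    """Process every row in the (hypothetical) exports table, then remove it."""
    pending = conn.execute(
        "SELECT id, org_id, admin_email FROM exports ORDER BY id"
    ).fetchall()
    for export_id, org_id, admin_email in pending:
        rows = conn.execute(
            "SELECT filename, body FROM rfcs WHERE org_id = ?", (org_id,)
        ).fetchall()
        archive = build_archive(dict(rows))
        send_email(admin_email, archive)  # hand-off to the email provider
        conn.execute("DELETE FROM exports WHERE id = ?", (export_id,))
    conn.commit()
    return len(pending)
```

In an hourly run, the job would call `load_db_settings(os.environ)` once, open a connection, and then call `run_pending_exports`. Deleting the row only after the email hand-off means a failed run leaves the request in place for the next hour.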

Note

We still need to figure out how to let the GitHub Action service bypass the database firewall restrictions. I think this can be done via an API call made to the cloud provider.

architecture-beta
    group api[Architecture]

    service db(database)[Exports Table] in api
    service github(server)[GitHub Action] in api
    service mailgun(disk)[Email] in api
    service server(server)[Monolith Server] in api

    db:L <-- R:server
    github:T --> B:db
    mailgun:L <-- R:github

Metadata will be attached by way of frontmatter, which is a YAML preface to the Markdown document. This convention is widely used, and most good Markdown parsers will ignore the preface. Overall, it should be a convenient format for users to have, and it can be easily transformed into other formats.
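For consumers whose tooling does not skip the preface on its own, separating it from the body takes only a few lines. The sketch below assumes the preface is delimited by lines consisting solely of `---`; the function name is made up for this example.

```python
def split_frontmatter(document):
    """Return (frontmatter_text, body_text); frontmatter is "" when absent."""
    lines = document.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", document
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "".join(lines[1:index])
            body = "".join(lines[index + 1:])
            return frontmatter, body.lstrip("\n")
    # No closing delimiter was found: treat the whole document as body.
    return "", document
```

The returned frontmatter text can then be handed to any YAML parser, while the body remains plain Markdown.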

Frontmatter Content #

Filename: RFC123: Overall title of RFC.md

---
title: "Overall title of RFC"
created: "2025-09-13T13:37:00Z"
label: "rfc123"
author: "Thomas Hunter II"
status: review
visibility: public
tags:
  - database
  - performance
watchers:
  - "Thomas Hunter II"
  - "Rupert Styx"
reviewers:
  - user: "Thomas Hunter II"
    status: approved
links:
  - rfc: "rfc123"
    type: obsoletes
---

# Synopsis

Markdown content is everything after the set of `---` characters.
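As a rough illustration of how the job might produce a file like the one above, the sketch below renders the frontmatter by hand. The shape of the `rfc` dictionary and the helper names are assumptions made for this example. It also assumes that `status`, `visibility`, `type`, and tag values are simple words that need no quoting, as in the example, and that the filename is the uppercased label followed by the title.

```python
import json


def quote(value):
    """Double-quote a string using JSON syntax, which YAML parsers also accept."""
    return json.dumps(value, ensure_ascii=False)


def yaml_list(key, items):
    """Emit a block list, or an explicit empty list when there are no items."""
    if not items:
        return [f"{key}: []"]
    return [f"{key}:"] + items


def render_export(rfc):
    """Return (filename, file_text) for one RFC dictionary."""
    lines = ["---"]
    lines += [f"{key}: {quote(rfc[key])}" for key in ("title", "created", "label", "author")]
    lines += [f"{key}: {rfc[key]}" for key in ("status", "visibility")]
    lines += yaml_list("tags", [f"  - {tag}" for tag in rfc["tags"]])
    lines += yaml_list("watchers", [f"  - {quote(name)}" for name in rfc["watchers"]])
    reviewers = []
    for review in rfc["reviewers"]:
        reviewers += [f"  - user: {quote(review['user'])}", f"    status: {review['status']}"]
    lines += yaml_list("reviewers", reviewers)
    links = []
    for link in rfc["links"]:
        links += [f"  - rfc: {quote(link['rfc'])}", f"    type: {link['type']}"]
    lines += yaml_list("links", links)
    lines.append("---")
    # Titles containing path separators would need sanitizing before use as a filename.
    filename = f"{rfc['label'].upper()}: {rfc['title']}.md"
    return filename, "\n".join(lines) + "\n\n" + rfc["body"]
```

Quoting free-text fields such as titles and names keeps characters like `:` from being misread as YAML syntax.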

User Interface #

Display a button on the org interface if the currently logged-in user is an org admin. Clicking the button will add a row to the database table signaling that an export needs to happen. Create a unique constraint in the table so that only a single export row can exist at once, to avoid abuse. Once the export is complete, the user receives an email with the export attached.
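The button handler and the unique constraint could look something like the sketch below. The table layout and function names are hypothetical. It assumes the constraint is scoped per organization, which the text above does not spell out, and it uses `sqlite3` purely as a stand-in database.

```python
import sqlite3

# Hypothetical schema: UNIQUE on org_id allows at most one pending export per org.
SCHEMA = """
CREATE TABLE exports (
    id INTEGER PRIMARY KEY,
    org_id INTEGER NOT NULL UNIQUE,
    admin_email TEXT NOT NULL
)
"""


def request_export(conn, org_id, admin_email):
    """Queue an export; return False if one is already pending for the org."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO exports (org_id, admin_email) VALUES (?, ?)",
                (org_id, admin_email),
            )
    except sqlite3.IntegrityError:
        return False
    return True


def export_pending(conn, org_id):
    """Tell the UI whether an export is already underway for the org."""
    row = conn.execute("SELECT 1 FROM exports WHERE org_id = ?", (org_id,)).fetchone()
    return row is not None
```

Letting the database reject the duplicate avoids a race between two admins clicking at the same moment, and `export_pending` gives the page what it needs to disable the button while a request is outstanding.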

Note

We need to convey to admins visiting the page that an export is underway so that they won't attempt a second one.

Definition of success #

A developer will deploy the feature to prod and test it to make sure it works.

Alternatives Considered #

Overly Complicated Export Architecture #

We also considered this horizontally scaling, auto-sharded, worker-queue-based approach:

Figure: Considered horizontally scaling auto sharded worker queue architecture

With this approach, when the user clicks the mass-export button, the Node.js-based monolith server adds a new entry to a MySQL database with details on the export job. It also looks up all of the RFC IDs and inserts them into a Kafka queue. Node.js worker jobs run in parallel and are triggered upon consuming messages from the queue. Data about the export operation is hydrated from the MySQL database. Once the last task is complete, the final worker kicks off the job to send an email via Mailgun.

This approach was abandoned because customer load isn't high enough to justify the expensive architecture.

Do Nothing #

Nobody likes a walled garden.

Export HTML #

While this format is more universal, it isn't necessarily easier to convert to other formats. Converting from Markdown to HTML would also incur additional CPU overhead.

Export .doc / .docx #

These formats are proprietary to Microsoft but are also compatible with tools like Google Docs and OpenOffice. Again, the CPU cost of conversion is a concern.

Export to Google Drive #

This is only useful to organizations that already have a Google account. It would also require an integration with Google APIs and require maintenance.

Perform Export in Main Server Process #

Presumably the export could be fairly fast. A user could theoretically click a link and have their download a few seconds later. That said, exports for orgs with thousands of RFCs could bog down the system, and the endpoint could be abused for denial-of-service attacks.

Future Improvements #

Include linked materials such as images.

History Log

| Operation | Instigator | Revision | When |
| --- | --- | --- | --- |
| Morticia Addams had their approved review reverted | Thomas Hunter II | r13 | |
| New RFC revision created: 13 | Thomas Hunter II | r13 | |
| Linked as "relate" to RFC7 | Thomas Hunter II | r12 | |
| Active version of RFC changed from 9 to 12 | Thomas Hunter II | r12 | |
| Active version of RFC changed from 12 to 9 | Thomas Hunter II | r9 | |
| New RFC revision created: 12 | Thomas Hunter II | r12 | |
| File "fake-export-diagram-1BOQ27I0.png" added to RFC | Thomas Hunter II | r11 | |
| New RFC version created: 11 | Thomas Hunter II | r11 | |
| New RFC version created: 10 | Thomas Hunter II | r10 | |
| An important comment by Gomez Addams was restored | Thomas Hunter II | r7 | |