Building employee onboarding that runs in one click
A few months after I shipped the offboarding MCP server, I sat down to write the mirror — onboarding. "Same systems, opposite direction, different ordering rules." I had said in the offboarding post that this was the obvious next thing.
It was not the obvious next thing. It was a different problem entirely, and the answer was a different shape of tool.
The offboarding work happens at IT's keyboard. I trigger it from Claude Code. The MCP server is the right shape because the operator is me, and I'm fluent in the orchestration tool.
Onboarding is run by HR. Sometimes by an office manager. Occasionally by a branch manager who got handed the new hire on Tuesday and needs them in the ERP by Wednesday. None of those people are going to open Claude Code. What they need is a single page with a form on it. Type the new hire's name, role, branch, email-or-not. Hit submit. Walk away. Come back, pick up a printed welcome sheet, hand it to the new hire on day one.
Same systems. Same complexity underneath. Completely different surface.
This is how I built it.
#What "create a user" actually means
The cheap version of this problem looks like "create the directory account, send the welcome email, done." That's what the click-through admin portals make it look like.
The honest version is six systems and seventeen steps:
- Validate that the username isn't already taken in the directory
- Validate that it isn't already taken in the ERP
- Validate that the chosen taker password (a separate field, used for ERP-internal authentication) is unique
- Create the directory user via LDAPS, set the password using the right encoding, enable the account by setting the right userAccountControl bits, place it in the right organizational unit
- If email is enabled, wait for the on-prem-to-cloud directory sync to propagate the new user to the cloud — typically 30 to 90 seconds, sometimes longer, occasionally never
- Assign a cloud mail license via the directory API
- Add to the right cloud groups (company-wide groups, location-specific groups, role-specific groups)
- Create the ERP user — which is not one INSERT. It's four INSERTs across four tables, all of which need to succeed or all of which need to roll back.
- Copy the relevant security and configuration rows from a template user matching the new hire's role
- Generate the printed welcome sheet listing every credential the new hire will need on day one — their workstation login, their cloud email, their ERP username, and a handful of internal app accounts that all read from the same source of truth
- Print it
- Hand it over
There is no version of this problem that is one button on one screen. There is only a version of the user experience that is one button on one screen. The work behind it is what makes the experience real.
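The directory step in the list above mentions "the right userAccountControl bits." A minimal sketch of what that means, assuming an Active Directory-style schema where the documented ACCOUNTDISABLE flag is 0x0002 and NORMAL_ACCOUNT is 0x0200. The helper names are mine, not the tool's.

```go
package main

// Flag values from the documented userAccountControl bit field.
// Assumption: the directory follows the Active Directory schema.
const (
	uacAccountDisable uint32 = 0x0002 // ACCOUNTDISABLE
	uacNormalAccount  uint32 = 0x0200 // NORMAL_ACCOUNT
)

// enableAccount clears the disable bit and leaves every other flag untouched.
func enableAccount(uac uint32) uint32 {
	return uac &^ uacAccountDisable
}

// isEnabled reports whether the disable bit is clear.
func isEnabled(uac uint32) bool {
	return uac&uacAccountDisable == 0
}
```

A disabled normal account reads as 514 (0x0202). Clearing the disable bit gives 512 (0x0200). Clearing the bit, rather than writing a fixed 512, keeps any other flags the directory has already set.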
#The shape of the tool
Go, HTMX, Bootstrap, Alpine.js, no build step. Single binary deployed as a systemd service. A SQL connection (the ERP), an LDAP connection (the directory), and direct HTTP calls to the cloud directory API. One web app, three external systems, no intermediate brokers.
The form has eight fields. First name. Last name. Username (auto-suggested). Role (dropdown — fed from the ERP's role definitions). Branch location (dropdown). Email enabled (checkbox). Manager (dropdown — typeahead from existing directory users). Notes (free text, optional, lands in the welcome sheet).
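The auto-suggested username could work roughly like this. It is a sketch under an assumed convention (first initial plus last name, with a numeric suffix on collision), which is not necessarily the scheme the tool uses. The `taken` callback stands in for the directory and ERP lookups.

```go
package main

import (
	"strconv"
	"strings"
	"unicode"
)

// suggestUsername builds first-initial + last-name (an assumed convention)
// and appends 2, 3, ... until taken reports the candidate as free.
// It loops forever if taken never returns false, so callers should
// back taken with a real lookup.
func suggestUsername(first, last string, taken func(string) bool) string {
	base := normalize(first)
	if len(base) > 1 {
		base = base[:1] // normalize returns ASCII only, so byte slicing is safe
	}
	base += normalize(last)
	candidate := base
	for n := 2; taken(candidate); n++ {
		candidate = base + strconv.Itoa(n)
	}
	return candidate
}

// normalize lowercases s and drops everything except ASCII letters and digits.
func normalize(s string) string {
	var b strings.Builder
	for _, r := range strings.ToLower(s) {
		if r < unicode.MaxASCII && (unicode.IsLetter(r) || unicode.IsDigit(r)) {
			b.WriteRune(r)
		}
	}
	return b.String()
}
```

With `jsmith` and `jsmith2` already taken, Jane Smith gets `jsmith3`. The suggestion is only a default for the form field. The uniqueness validation at submit time still has to run.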
The submit button kicks off a Go function that runs the seventeen-step provisioning pipeline. Each step's outcome streams back to the page over HTMX as it happens — so the operator watches the workflow run in real time:
✓ Validated directory username
✓ Validated ERP username
✓ Created directory user
… Waiting for cloud directory sync (3 of 12 polls)
✓ Synced to cloud mail
✓ Assigned standard mail license
✓ Added to All Company group
✓ Added to [Location] group
✓ Created ERP user (4 tables)
✓ Generated welcome sheet
[ Print Welcome Sheet ]
When everything's green, a print button appears. The PDF opens in a new tab, generated from a template using maroto (a Go PDF library), with the new hire's name on the header and every credential they'll need below.
The whole flow takes about ninety seconds when sync is fast and three minutes when it isn't. Either way, it's one screen of work for the human, and the human is doing nothing the whole time.
#The four-table insert that took two weeks
The single hardest piece of this whole project was the ERP user creation.
In the directory, "create a user" is a single LDAPS operation. In the cloud directory API, "create a user" is a single POST. In the ERP, "create a user" is four tables, each with their own constraints, foreign keys, and required column sets.
The four tables, roughly:
- users master: the user master row. ~70 columns. Most have defaults. Some don't. The primary key is generated server-side and returned via OUTPUT INSERTED.users_uid.
- user preferences: the user's display preferences and role binding. Five required columns.
- user-company mapping: the cross-reference between user and company. In a single-tenant deployment, one row.
- application security: the security model. Multiple rows, one per ERP module the user has access to. The shape of these rows is determined by the template user whose role the new hire is matching.
Two of those tables can fail their constraints in ways that won't be visible until you query the ERP afterward and find a half-created user — present in the master row but missing the security rows that make the user actually usable. The ERP's UI will show that user as "invalid configuration" and refuse to let them log in. Cleaning up takes manual SQL, intuition about which template the user was supposed to clone from, and the kind of working memory that gets exhausted at 4 PM on a Friday.
The fix was to wrap all four inserts in a SQL transaction:
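    // createERPUser wraps the four inserts in a single transaction.
    // The named result err is what the deferred func inspects to choose
    // between commit and rollback.
    // NewHire is a placeholder for whatever struct carries the form data.
    func createERPUser(ctx context.Context, db *sql.DB, hire NewHire) (err error) {
        tx, err := db.BeginTx(ctx, nil)
        if err != nil {
            return fmt.Errorf("begin tx: %w", err)
        }
        defer func() {
            if p := recover(); p != nil {
                tx.Rollback() // undo partial work, then re-raise the panic
                panic(p)
            } else if err != nil {
                tx.Rollback() // any failed insert discards all four
            } else {
                err = tx.Commit()
            }
        }()

        usersUID, err := insertUsersRow(ctx, tx, hire)
        if err != nil {
            return fmt.Errorf("insert users: %w", err)
        }
        err = insertUserPrefs(ctx, tx, usersUID, hire)
        if err != nil {
            return fmt.Errorf("insert user prefs: %w", err)
        }
        err = insertUserCompany(ctx, tx, usersUID, hire.CompanyID)
        if err != nil {
            return fmt.Errorf("insert user-company: %w", err)
        }
        err = copyAppSecurityFromTemplate(ctx, tx, usersUID, hire.RoleTemplate)
        if err != nil {
            return fmt.Errorf("copy app security: %w", err)
        }
        return nil
    }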
The shape is unremarkable. "Begin a transaction, do the work, commit or roll back." What took two weeks wasn't the Go code. It was figuring out what each row was supposed to contain — which fields the ERP requires that aren't documented anywhere outside of internal vendor support tickets, which template-user assumptions break when a new hire's role doesn't have a clean template, which application security rows are even safe to copy and which ones encode the template user's specific identity in a way that needs translation.
You can't write that code from documentation. You can only write it from observation. Create a user through the ERP's official UI, query the database afterward, see what changed in each table, repeat for every role you intend to support, and build up the picture from there. The process is identical to reverse-engineering a closed protocol — except instead of a protocol, you're reverse-engineering what a vendor's stored procedure does across four tables.
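The observe-and-compare loop can be partly mechanized. As a sketch: snapshot a table before and after creating a user through the official UI, then list what changed. The types below are illustrative and assume each row has already been loaded into a map keyed by primary key. How the snapshots get loaded is left out.

```go
package main

import "sort"

// Row maps column name to a printable value.
type Row map[string]string

// Snapshot maps primary key to row for one table at one point in time.
type Snapshot map[string]Row

// Change describes one difference between two snapshots of the same table.
type Change struct {
	Key, Column, Before, After string
}

// diffSnapshots lists every column value that was added or changed between
// before and after, in stable key/column order. Deleted rows are ignored
// because creating a user should only add or modify rows.
func diffSnapshots(before, after Snapshot) []Change {
	var out []Change
	for key, newRow := range after {
		oldRow := before[key] // nil for brand-new rows; reading a nil map is safe
		for col, val := range newRow {
			if old, ok := oldRow[col]; !ok || old != val {
				out = append(out, Change{key, col, old, val})
			}
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Key != out[j].Key {
			return out[i].Key < out[j].Key
		}
		return out[i].Column < out[j].Column
	})
	return out
}
```

Run once per table and per role, the output is a list of exactly which columns the vendor's own tooling touched, which is the raw material for each insert.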
The result is a createERPUser() function that has now created several dozen users with zero half-broken states. The ERP's official tooling has produced more half-broken states than my code has, in the same period.
#The directory password gotcha
The single best war story in this project is also the most embarrassing.
To set a password on a directory user via LDAPS, you set the unicodePwd attribute. Easy. Send the password as a string. Get back a success result. Move on.
Except: you don't send the password as a string. You send it as UTF-16LE-encoded bytes, surrounded by double-quote characters, with no BOM.
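    // unicode here is golang.org/x/text/encoding/unicode, not the standard library package.
    func encodeDirPassword(password string) []byte {
        // The directory expects the value wrapped in literal double quotes.
        quoted := `"` + password + `"`
        // UTF-16, little-endian, no byte-order mark.
        encoder := unicode.UTF16(unicode.LittleEndian, unicode.IgnoreBOM).NewEncoder()
        // The encoding error is discarded here.
        encoded, _ := encoder.Bytes([]byte(quoted))
        return encoded
    }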
Send a regular UTF-8 string and the directory rejects it with 0x00002077, ERROR_DS_ILLEGAL_MOD_OPERATION, which sounds like "you are not allowed to modify this attribute" but here means "the encoding is wrong." Send UTF-16LE without the surrounding quotes and you get a different error. Send it with a BOM and you get a third error.
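To see exactly what goes on the wire, here is a standard-library-only sketch of the same encoding. It is an alternative formulation for inspection and testing, not a replacement for the function above, and `encodeUnicodePwd` is a name I made up for it.

```go
package main

import (
	"encoding/binary"
	"unicode/utf16"
)

// encodeUnicodePwd wraps the password in double quotes and encodes it as
// UTF-16LE with no byte-order mark, the form described above.
func encodeUnicodePwd(password string) []byte {
	units := utf16.Encode([]rune(`"` + password + `"`))
	out := make([]byte, 2*len(units))
	for i, u := range units {
		binary.LittleEndian.PutUint16(out[2*i:], u)
	}
	return out
}
```

For the two-character password `Ab`, the result is the eight bytes `22 00 41 00 62 00 22 00`: a quote, the two letters, and a closing quote, each as a little-endian 16-bit unit. If the first two bytes are ever `FF FE`, a BOM has slipped in.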
I spent four hours on a Saturday on this one. The fix is fifteen lines of Go. The discovery process was reading three vendor KB articles, two Stack Overflow threads from 2009, and one helpful comment in an obscure ldap.h header file before I figured out the actual encoding requirement. The directory protocol was designed in the 1990s and the wire format still expects 1990s text encoding. Once you know, you know forever. Until then, the error messages all say the same misleading thing.
The lesson here is one I keep relearning: when an API rejects your input with a vague error, try a different encoding before you assume your data is wrong. Half the "invalid password" / "invalid name" / "invalid GUID" errors I've ever debugged have been encoding issues, not data issues. The data was correct. The wire format was wrong.
#The async sync nobody warns you about
Once the directory user is created, you'd think the next step is "call the cloud directory API and assign the mail license."
You can't. Not yet.
The on-prem directory user lives in one place. The cloud user lives in another. The bridge between them is a sync agent — a vendor-supplied process running on a server somewhere in your environment that synchronizes on-prem to cloud on a thirty-minute default cadence.
That's the default. The cadence is configurable, and a sync can be triggered immediately if you have access to run the right cmdlet on the sync server. Most onboarding code I've seen just waits.
The right shape is to poll the cloud directory API for the user's existence, with backoff:
    func waitForCloudSync(ctx context.Context, dir *DirClient, upn string) error {
        // Twelve polls with growing waits; the waits add up to 472 seconds.
        backoffs := []time.Duration{
            2 * time.Second, 5 * time.Second, 10 * time.Second, 15 * time.Second,
            20 * time.Second, 30 * time.Second, 30 * time.Second, 60 * time.Second,
            60 * time.Second, 60 * time.Second, 90 * time.Second, 90 * time.Second,
        }
        for i, wait := range backoffs {
            // Sleep, but give up immediately if the request is cancelled.
            select {
            case <-ctx.Done():
                return ctx.Err()
            case <-time.After(wait):
            }
            // Any lookup error is treated as "not synced yet" and retried.
            user, err := dir.GetUserByUPN(ctx, upn)
            if err == nil && user != nil {
                return nil
            }
            // log progress for the UI
            notifyProgress(ctx, fmt.Sprintf("Waiting for sync (%d/%d)", i+1, len(backoffs)))
        }
        return errors.New("cloud sync timed out after ~8 minutes")
    }
The user appears, on average, after about ninety seconds. Sometimes thirty. Sometimes four minutes. Once, during a failed sync run, never — and the operator got a clear timeout message, the on-prem account was already provisioned, the welcome sheet was generated with everything except the cloud password, and the operator could re-run only the cloud step the next morning.
That last part — "clear timeout message, partial state preserved, operator can re-run only the failing step" — is the actual feature. Multi-system provisioning is mostly not the happy path. It's mostly handling the seventeen ways things can fail and recover.
#The welcome sheet
The closing flourish is the printed welcome sheet. The new hire arrives on Monday and is handed a single piece of paper with everything they need:
- Their workstation username and temporary password (set to expire on first login)
- Their cloud email and temporary password
- Their ERP username and taker password
- The internal-only credentials for two other systems that share the same source-of-truth user record
- The web URLs they'll use day one
- The IT contact for "please reset my password because I just forgot it"
PDF generation is maroto v2. Go-native, declarative, prints clean on a regular printer. The welcome sheet is one of the parts of this project I'm most quietly proud of, because it didn't exist before this tool. Every previous version of new-hire onboarding involved an IT person typing those credentials into an email by hand, then a user printing that email and inevitably losing the printout. The printed sheet, generated programmatically from the same record that provisioned the user, eliminates the typing step and the lost-email step at the same time.
Small thing. Adds up.
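The "generated from the same record" idea is independent of the PDF library. To keep the example dependency-free, this sketch swaps in the standard library's text/template in place of maroto and renders plain text. The struct and field names are illustrative, not the tool's real record.

```go
package main

import (
	"strings"
	"text/template"
)

// Credential is one line on the sheet. Field names are illustrative.
type Credential struct {
	System, Username, Password string
}

// WelcomeSheet is the subset of the provisioning record that gets printed.
type WelcomeSheet struct {
	FullName    string
	Credentials []Credential
}

var sheetTmpl = template.Must(template.New("sheet").Parse(
	"Welcome, {{.FullName}}\n" +
		"{{range .Credentials}}{{.System}}: {{.Username}} / {{.Password}}\n{{end}}"))

// renderSheet produces the plain-text body straight from the record,
// so nothing on the page is ever retyped by hand.
func renderSheet(s WelcomeSheet) (string, error) {
	var b strings.Builder
	if err := sheetTmpl.Execute(&b, s); err != nil {
		return "", err
	}
	return b.String(), nil
}
```

Because the sheet is a pure function of the record, a credential that was provisioned cannot differ from the one that is printed.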
#What I learned
The shape of the tool is the shape of the operator. I built the offboarding workflow as an MCP server because the operator is me. I built the onboarding workflow as a single-page web app because the operator is HR. Same underlying systems, completely different surface. When you're tempted to build one tool for two audiences, ask whether the audiences should be using two tools.
Atomicity at the boundary is your job, not the system's. The directory commits its changes when LDAPS returns success. The cloud directory API commits when its API returns 200. SQL commits when the transaction commits. None of those commits are aware of each other. If the directory account succeeds and the mail license fails and the ERP user is half-created, you have an atomicity problem the underlying systems will not solve for you. You have to define the boundary, run the work inside the boundary, and roll back coherently when something fails. This is true of every multi-system workflow I have ever shipped.
The 1990s are still in the wire format. The directory password encoding gotcha — UTF-16LE-quoted-with-no-BOM — is the kind of thing that exists only because the directory protocol was specified before the internet decided UTF-8 was the default. There are five or six of these landmines in the legacy enterprise stack, and you find them one Saturday at a time. Document them when you find them. Future you will thank present you.
Async dependencies are easier to expose than to hide. The cloud sync wait could have been hidden — fire-and-forget, hope it works, deal with failures by hand. Exposing it as a polled progress indicator in the UI made every operator immediately understand why onboarding takes 90 seconds, what the system is doing, and how to know when something has gone wrong. Hidden async work is the source of half the "is it broken?" questions in any production system.
The printed welcome sheet is the part the user remembers. I spent two weeks on the four-table insert. The new hire experiences ninety seconds of waiting and then a piece of paper. The piece of paper is the part of the project they will tell their friends about. The technical sophistication is invisible by design.
#What's next
The mirror of what's next in the offboarding post: this pattern keeps generalizing. Mid-employment changes — name changes after marriage, role changes when someone gets promoted, branch transfers — are the next obvious extensions. Same systems. Same atomicity. Different verbs.
For now, every new hire gets onboarded through this tool. The IT team's calendar reminders to "don't forget to remove from the All Company group" and "don't forget to set the taker password" and "don't forget to print the welcome sheet" — those reminders are gone. The reminder was the symptom. The tool is the cure.
That, plus the offboarding MCP, plus a pile of smaller automations, are the parts of my job I am most quietly proud of. They are not glamorous. They are not visible to anyone outside IT. They run thousands of times per year now and they will keep running. The reduction in operator load compounds forever. Most platform engineering looks like this when nobody is watching.