Revenue Cycle SQL Interview Questions
Hospital revenue cycle interviews are not generic SQL screens. They test whether you can read a charge / claim / remit chain, calculate days in A/R from open claims, isolate a denial rate by payer, and reason about the gap between what was billed, what was allowed, and what was paid. This guide walks through the SQL skills RCM analysts get asked about, with worked examples on a realistic hospital schema and a 25-mission path that drills each pattern end to end.
If you are interviewing for a healthcare revenue cycle data analyst role, an RCM analyst SQL test, or a payer-side claims analytics seat, the patterns below are the ones that show up in take-home assignments and live whiteboards.
What RCM data actually looks like
New to revenue-cycle data? The charge–claim–remit primer walks through the chain in plain English before you dive into the schema.
The whole hospital revenue cycle is a three-step chain: a clinician documents a service and a charge is created, billing submits a claim to the payer, and the payer eventually sends back a remit. Every interview question on this page reduces to a join across those three events. The caseSQL Hospital RCM database models that chain in three fact tables — fact_charges (12,000 rows), fact_claims (10,969 rows), and fact_remits (9,693 rows). The row counts shrink at each step on purpose: not every charge becomes a claim, and not every claim has been paid or denied yet. That gap is where the analytics live.
Around the facts sit four dimensions: dim_patient, dim_payer (with a plan_type and an expected_pct_of_billed contract rate), dim_provider (rendering provider with a department), and dim_cpt (procedure codes — see the AAPC primer on CPT codes if the vocabulary is new). fact_charges joins to all four dims; claims and remits hang off the charges through charge_id and claim_id. The mental model that interviewers want you to have is straightforward: charges are the clinical truth, claims are the financial ask, remits are the answer. Most hospital billing SQL questions ask you to compare two of those three.
Question patterns you’ll be asked
Across hospital and payer-side interviews, six patterns dominate. Each one has a distinct SQL shape, and once you can recognize the shape you can almost always write the query. The worked examples below are deliberately adjacent to the exercises in the mission path — same skill, different angle — so you can read the example, then go practice the variant.
Days in A/R by payer
“How long does it take you to collect from each payer?” is the most common opener in a hospital billing SQL interview. The answer is days in AR SQL: average days outstanding for claims that have not yet been resolved. The skill being tested is filtering open A/R correctly and computing a date difference — anti-joining against remits, or filtering on claim_status for claims that have not yet been paid.
A worked example, grouped by department instead of payer:
SELECT pr.department,
COUNT(*) AS open_claims,
ROUND(AVG(julianday('2026-04-30') - julianday(c.submission_date)), 1) AS avg_days_in_ar
FROM fact_claims c
JOIN fact_charges fc ON c.charge_id = fc.charge_id
JOIN dim_provider pr ON fc.provider_id = pr.provider_id
WHERE c.claim_status IN ('submitted', 'appealed')
GROUP BY pr.department
ORDER BY avg_days_in_ar DESC;Practice this in Mission #11 — same date-difference skill on payer-level grouping with the dollar value of the open balance.
Denial rate by payer or CPT
A denial rate SQL query is a conditional-aggregation drill. The interviewer wants to see SUM(CASE WHEN ... ) divided by COUNT(*), and they want you to remember that denial reasons live on remits — the claim status is just the summary flag.
Worked example, denial rate by plan_type with a high-volume floor:
SELECT p.plan_type,
COUNT(*) AS total_claims,
ROUND(100.0 * SUM(CASE WHEN c.claim_status = 'denied' THEN 1 ELSE 0 END)
/ COUNT(*), 1) AS denial_pct
FROM fact_claims c
JOIN dim_payer p ON c.payer_id = p.payer_id
GROUP BY p.plan_type
HAVING COUNT(*) >= 500
ORDER BY denial_pct DESC;Try this on Mission #9 — same conditional-aggregation pattern, but cut by CPT code with a 200-claim volume cutoff.
Claim-to-remit lag
The claim to remit lag SQL question is where window functions earn their keep. The event stream has two row types — submission and remittance — and you need the gap from one to the next on the same claim. LAG() partitioned by claim, ordered by date, gives you that adjacency.
Worked example using a simple two-table date subtraction (no UNION) to find the payer’s 90th-percentile lag:
SELECT p.payer_name,
MAX(julianday(r.posted_date) - julianday(c.submission_date)) AS slowest_lag_days,
ROUND(AVG(julianday(r.posted_date) - julianday(c.submission_date)), 1) AS avg_lag_days
FROM fact_remits r
JOIN fact_claims c ON r.claim_id = c.claim_id
JOIN dim_payer p ON c.payer_id = p.payer_id
WHERE c.claim_status = 'paid'
GROUP BY p.payer_id, p.payer_name
ORDER BY avg_lag_days DESC;See Mission #15 for the full setup — that one builds the same lag using a UNION ALL event stream and a LAG window function, which is the version you’ll see on advanced screens.
Clean claim rate per payer
Clean claim rate SQL is the operational mirror of denial rate: percentage of claims that paid on the first submission with no rework. The skill is the same conditional aggregation, but the framing is positive instead of negative — interviewers use the distinction to see whether you understand that paid and denied are not the only two outcomes.
Worked example by service line:
SELECT cpt.service_line,
COUNT(*) AS total_claims,
ROUND(100.0 * SUM(CASE WHEN c.claim_status = 'paid' THEN 1 ELSE 0 END)
/ COUNT(*), 1) AS clean_claim_rate_pct
FROM fact_claims c
JOIN fact_charges fc ON c.charge_id = fc.charge_id
JOIN dim_cpt cpt ON fc.cpt_code = cpt.cpt_code
GROUP BY cpt.service_line
ORDER BY clean_claim_rate_pct ASC;Practice on Mission #7 — same skill at the payer grain, ordered to surface the worst performer first.
Accounts receivable aging buckets
An accounts receivable aging SQL question wants you to bucket open claims into the classic 0–30 / 31–60 / 61–90 / 90+ ranges. The skill is a CASE expression on a date difference plus a careful filter for what counts as “open.” Resolved claims (paid or written-off) should be excluded.
Worked example with the aging cut by plan_type so you can see whether commercial or government balances are aging fastest:
SELECT p.plan_type,
SUM(CASE WHEN julianday('2026-04-30') - julianday(c.submission_date) <= 30 THEN 1 ELSE 0 END) AS bucket_0_30,
SUM(CASE WHEN julianday('2026-04-30') - julianday(c.submission_date) BETWEEN 31 AND 60 THEN 1 ELSE 0 END) AS bucket_31_60,
SUM(CASE WHEN julianday('2026-04-30') - julianday(c.submission_date) BETWEEN 61 AND 90 THEN 1 ELSE 0 END) AS bucket_61_90,
SUM(CASE WHEN julianday('2026-04-30') - julianday(c.submission_date) > 90 THEN 1 ELSE 0 END) AS bucket_90_plus
FROM fact_claims c
JOIN dim_payer p ON c.payer_id = p.payer_id
WHERE c.claim_status IN ('submitted', 'appealed')
GROUP BY p.plan_type
ORDER BY p.plan_type;Practice on Mission #10 — same bucketing logic delivered in long format with the open-balance dollar amount per bucket.
First-pass yield with NTILE
First pass yield SQL questions test ranking and bucketing at the payer level. After you compute FPY for each payer, the follow-up is almost always “put them into quartiles so we can route the bottom quartile to a recovery team.” That is NTILE(4) in a CTE.
Worked example computing FPY by department and partitioning into quintiles:
WITH dept_fpy AS (
SELECT pr.department,
COUNT(*) AS total_claims,
ROUND(100.0 * SUM(CASE WHEN c.claim_status = 'paid' THEN 1 ELSE 0 END)
/ COUNT(*), 1) AS fpy_pct
FROM fact_claims c
JOIN fact_charges fc ON c.charge_id = fc.charge_id
JOIN dim_provider pr ON fc.provider_id = pr.provider_id
GROUP BY pr.department
)
SELECT department,
total_claims,
fpy_pct,
NTILE(5) OVER (ORDER BY fpy_pct DESC) AS fpy_quintile
FROM dept_fpy
ORDER BY fpy_pct DESC;Drill this on Mission #12 — same NTILE shape at the payer grain with quartiles instead of quintiles.
Underpayment detection
Every payer contract specifies an expected percentage of billed charges. Underpayment detection is a join of fact_remits against dim_payer.expected_pct_of_billed with a filter for paid amount under the contracted threshold. Interviewers use this question to see whether you understand that allowed amount is not paid amount and that patient responsibility is its own column.
Worked example aggregated by payer with a 10% shortfall threshold and a count of offending claims:
WITH expectations AS (
SELECT c.payer_id,
c.submitted_amount * p.expected_pct_of_billed AS expected_amount,
r.paid_amount
FROM fact_remits r
JOIN fact_claims c ON r.claim_id = c.claim_id
JOIN dim_payer p ON c.payer_id = p.payer_id
WHERE r.paid_amount > 0
)
SELECT p.payer_name,
COUNT(*) AS underpaid_claims,
ROUND(SUM(expected_amount - paid_amount), 2) AS shortfall_dollars
FROM expectations e
JOIN dim_payer p ON e.payer_id = p.payer_id
WHERE e.paid_amount < 0.90 * e.expected_amount
GROUP BY p.payer_id, p.payer_name
ORDER BY shortfall_dollars DESC;Practice on Mission #14 — same expected-vs-paid skill at the individual claim grain with a per-claim shortfall percentage.
Payer concentration with HHI
Senior-track interviews bring in the Herfindahl–Hirschman Index to score how concentrated revenue is across payers — a 10,000 means one payer owns the segment, a low score means it is fragmented. The skill being tested is two-stage aggregation: shares first, then sum-of-squares.
Worked example computing HHI per department instead of service line, with a column flagging where one payer alone makes up more than 40 percent of claims:
WITH dept_payer AS (
SELECT pr.department, c.payer_id, COUNT(*) AS claim_count
FROM fact_claims c
JOIN fact_charges fc ON c.charge_id = fc.charge_id
JOIN dim_provider pr ON fc.provider_id = pr.provider_id
GROUP BY pr.department, c.payer_id
),
totals AS (
SELECT department, SUM(claim_count) AS total FROM dept_payer GROUP BY department
),
shares AS (
SELECT dp.department,
100.0 * dp.claim_count / t.total AS share_pct
FROM dept_payer dp JOIN totals t ON dp.department = t.department
)
SELECT department,
ROUND(SUM(share_pct * share_pct), 1) AS hhi,
MAX(share_pct) > 40 AS dominant_payer
FROM shares
GROUP BY department
ORDER BY hhi DESC;Mission #18 has the canonical setup — same HHI calculation grouped by service line with a distinct-payer count.
Common gotchas
Four traps catch candidates who are otherwise solid SQL writers. Each one comes from assuming RCM data behaves like a generic transactional system; it does not. Most interviewers do not state these as rules — they watch which assumption you make and then ask follow-up questions to see whether you noticed. Knowing the traps in advance is most of the battle.
Denial reasons live on remits, not claims
A common reflex is to filter fact_claims for claim_status = ‘denied’ and call it done. That gives you the count, but not the why. The denial code and denial reason are on fact_remits (denial_code, denial_reason) — the same vocabulary as the CMS Claim Adjustment Reason Codes standard. If your interviewer asks for top denial reasons by department, you must join through to the remit.
A claim with no remit is in A/R, not denied
If you LEFT JOIN fact_claims to fact_remits and treat NULL on the remit side as “denied,” you have just miscategorized your entire open A/R. Claims without a matching remit are still pending — they belong in the aging report, not the denial report. Check claim_status before you decide what the absence of a remit means.
Allowed, paid, and patient-responsibility are three different columns
On a real remit, the payer says “we allow $X, we are paying $Y, the patient owes $Z, and we wrote off $W.” Those are four different numbers and they do not equal each other. If a question asks for collections, you want paid_amount. If it asks for the contracted reimbursement, you want allowed. The fastest way to fail a payer-side interview is to add paid and patient responsibility together and call it revenue.
CPT modifiers can hide five-figure swings
Two charges with the same CPT code can pay very differently if one carries a modifier (a global-period flag, an assistant-surgeon indicator, a bilateral procedure marker). Aggregating on cpt_code alone hides the variance. When a question is about “why are we collecting less than expected on procedure 99213,” the modifier column is usually the answer. The same logic applies to write-off categorization: a single denial reason code can map to several distinct operational fix categories, and rolling everything up to the CPT level erases the signal a billing director needs to staff the right team.
Charge capture lag is silent revenue loss
The gap between the date a service was rendered and the date it was charged is called charge capture lag. Long lags push you up against timely-filing limits — once you miss a payer’s window the claim is auto-denied with no appeal path. When an interviewer asks for a charge-lag distribution by department, what they are actually asking is “which clinical area is at risk of writing off revenue we already earned?” Compute the lag from the service date to the charge-creation date, bucket it the same way as AR aging, and flag anything past the payer’s shortest timely-filing limit.
Schema diagram
The three-fact chain at a glance. Charges feed claims, claims feed remits; each fact joins out to dimensions for patient, payer, provider, and CPT.
Solid arrows show the event flow (charge → claim → remit). Dashed lines show dimension joins; dim_payer is referenced from fact_claims.payer_id and dim_patient / dim_provider / dim_cpt from fact_charges.
Full mission list
All 25 Hospital RCM missions are listed above ↑ on this page, ordered from starter SQL through the capstone. Each mission opens an interactive editor wired to a real SQLite copy of the schema described in this guide — same fact tables, same row counts, same dimension columns. Pick the pattern you want to drill, click in, and the briefing tells you exactly what KPI you are computing.
FAQ
What SQL skills do RCM analyst interviews actually test?
The recurring set is conditional aggregation (denial rate, clean claim rate), date-difference math (days in A/R, claim-to-remit lag), bucketing with CASE (AR aging), and ranking with window functions like NTILE, RANK, and LAG. Senior screens add HHI for payer concentration and contract-vs-paid underpayment logic.
How do you calculate days in A/R in SQL?
Filter fact_claims to open statuses (typically submitted and appealed), then average the date difference between today and submission_date grouped by payer or department. The trap is including paid claims; once a claim is paid it is no longer in receivables. Mission #11 walks the full version.
What is the difference between denial rate and first-pass yield?
Denial rate is the share of claims that came back denied. First-pass yield is the share that paid on the first submission with no rework. They are not strict opposites: a claim can be pending, partially paid, or appealed and not be in either bucket. Operationally, denial rate is a quality alarm; FPY is a cash-flow forecast.
How do you handle claim-to-remit lag with window functions?
Build a UNION ALL event stream that emits one row per submission and one per remittance, all stamped with the claim id and event date. Order by event date inside a PARTITION BY claim_id, then use LAG() to pull the prior event onto each row. The lag in days is the difference between the current and prior event. Mission #15 has the canonical setup.
What is accounts receivable aging in SQL?
AR aging buckets open balances by how long they have been outstanding — usually 0–30, 31–60, 61–90, and 90+ days. In SQL it is a CASE expression on a date difference, summed inside a single GROUP BY. The reporting value is in the 90+ bucket: that is the dollar amount most at risk of write-off.
How do hospital billing SQL questions differ from generic SQL interviews?
Generic SQL screens test your mechanics on a synthetic schema. Hospital billing SQL questions test whether you understand a specific data flow — charge to claim to remit — and the operational meaning of the numbers you produce. The same SUM(CASE WHEN...) that gets full marks on LeetCode will lose points here if you compute denial rate from claim status alone and miss that the reasons live on remits.
What does an RCM analyst SQL test look like?
Most are 60–90-minute take-homes against a star-schema mock of a hospital billing system. Expect three or four tasks: a daily KPI roll-up, an aging or lag report, a denial breakdown by payer or reason, and one stretch question that involves a window function or a subquery against expected reimbursement.
Where do healthcare revenue cycle data analyst interviews typically focus?
For provider-side roles, the emphasis is days in A/R, denial rate, and clean claim rate — the operational metrics that the CFO sees on Monday. For payer-side roles, the emphasis shifts to claim adjudication timeliness, network leakage, and medical-loss ratios. The SQL skills overlap heavily; the framing changes. Either way, fluency with the charge / claim / remit chain is the price of admission.
Ready to drill the patterns?
Twenty-five missions, real SQLite schema, instant feedback. Walk from active payers by plan type through claim-to-remit window functions and the capstone patient-segmentation query.