How to Become a Data Analyst

Typical comp: $65,000–$180,000 (median $95,000)

The Data Analyst role is one of the most well-established positions in modern data work. It pre-dates the “data scientist” framing by a decade, has survived the rise and fall of multiple tooling generations (Excel → SQL → BI tools → notebooks → BI-and-notebooks-and-AI), and remains the modal entry point into the data career path for most practitioners. The 2026 version is shaped by two forces. First, AI-assisted authorship has compressed the boilerplate parts of analytical work (routine SQL, descriptive statistics, dashboard authoring), shifting the value of the role toward judgment about what to analyze and how to communicate findings. Second, the boundary with Data Scientist has moved in opposite directions depending on the org: in some, Data Analysts now own predictive modeling work that previously sat with DS; in others, DS work is consolidating into ML Engineering, leaving Data Analysts as the dominant non-engineering data role.

This guide covers what Data Analysts actually do day-to-day, how the role differs from Data Scientist and Business Intelligence positions, the skills that actually predict performance, what compensation looks like in 2026, and how AIEH’s calibrated assessments map onto role-readiness for the position.

What a Data Analyst actually does

A Data Analyst’s job is to translate business questions into data queries, produce defensible answers, and communicate those answers to stakeholders who will act on them. The work spans the full analytical cycle: scoping the question with the requester, pulling the relevant data, validating its quality, performing the descriptive or comparative analysis, building visualizations or dashboards if the answer is recurring, writing up findings, and following up when the stakeholder’s next question surfaces.

Day-to-day work breaks roughly into five recurring activities. The first is question scoping. Stakeholders rarely ask precisely the question their actual decision needs answered — “is our conversion rate falling?” might really mean “should we keep this ad campaign running?” The Data Analyst who can re-state the underlying decision question and find the analysis that supports it is more valuable than one who answers the literal question asked.

The second is SQL authoring and data-pulling. Most Data Analyst work happens through SQL — joins across multiple tables, window functions, careful indexing intuition for performance, and the validation step where the analyst confirms the data they’ve pulled actually represents what they think it does. Tukey’s foundational work on exploratory data analysis (Tukey, 1977) named the discipline of looking at data before reasoning from it; modern data work still rewards practitioners who do this systematically, especially given the modern data environment’s tendency to surface plausible-but-wrong tables across the warehouse.
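The validation step can be made concrete with a row-count check before and after a join. The following is a minimal sketch using Python's built-in sqlite3 module; the orders and shipments tables, their columns, and their contents are hypothetical and exist only to show how a one-to-many join can silently inflate a sum.

```python
import sqlite3

# Hypothetical toy schema: one row per order, possibly several shipments per order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE shipments (shipment_id INTEGER PRIMARY KEY, order_id INTEGER);
    INSERT INTO orders VALUES (1, 100.0), (2, 50.0), (3, 25.0);
    INSERT INTO shipments VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


JOINED = "FROM orders AS o JOIN shipments AS s ON s.order_id = o.order_id"

true_revenue = scalar("SELECT SUM(amount) FROM orders")
joined_revenue = scalar(f"SELECT SUM(o.amount) {JOINED}")

# Validation: the join should not return more rows than there are orders.
order_rows = scalar("SELECT COUNT(*) FROM orders")
joined_rows = scalar(f"SELECT COUNT(*) {JOINED}")
fanned_out = joined_rows > order_rows

print(true_revenue, joined_revenue, fanned_out)
```

Order 1 has two shipments, so its amount is counted twice after the join. Comparing row counts (or checking that the join key is unique on the side you expect) catches this before the inflated total reaches a stakeholder.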

The third is descriptive and comparative analysis: computing means, distributions, year-over-year changes, and segment cuts, plus the basic statistical machinery (confidence intervals, hypothesis tests) that lets the analyst defend “this difference is real, not noise.” Most Data Analyst work is descriptive; the small fraction that’s predictive uses simpler methods (linear/logistic regression, basic time-series) than ML Engineering work. Wickham’s tidy-data principles (Wickham, 2014) and the broader R/Python data-tooling literature give the field a vocabulary for shaping data into analyzable form.
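As an illustration of what defending “this difference is real, not noise” can look like, here is a minimal sketch of a two-proportion z-test with a normal-approximation confidence interval, using only the Python standard library. The function name compare_rates and the visitor and conversion counts are hypothetical, and the normal approximation assumes reasonably large samples.

```python
from math import sqrt
from statistics import NormalDist


def compare_rates(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """Two-proportion z-test plus a normal-approximation CI for (rate_b - rate_a)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    diff = p_b - p_a

    # Pooled standard error for the test (null hypothesis: the rates are equal).
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    # Unpooled standard error for the confidence interval.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff, (diff - z_crit * se, diff + z_crit * se), p_value


# Hypothetical counts: 200 of 5,000 visitors convert in A, 260 of 5,000 in B.
diff, (low, high), p_value = compare_rates(200, 5000, 260, 5000)
print(f"difference={diff:.4f}, CI=({low:.4f}, {high:.4f}), p={p_value:.4f}")
```

Reporting the interval alongside the point estimate keeps the write-up from overclaiming: an interval that excludes zero supports “real”, while its width shows how precisely the size of the difference is known.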

The fourth is visualization and dashboard authoring. The analyst’s output is often a chart, a dashboard, or a slide rather than a number. Choosing the right visualization for the question (line chart for trend, bar chart for category comparison, scatter for correlation, distribution for spread) and the right visualization tool (Tableau, Looker, Metabase, Mode, AI-augmented notebook outputs) is part of the craft. Senior Data Analysts make visualization choices that anticipate the follow-up question the stakeholder will ask.

The fifth is stakeholder communication. Most analyses need to be written up — a one-page summary, a deck, a Slack thread. The analyst who buries the headline finding in paragraph 4 and walks through methodology first loses the room. The analyst who leads with the answer, supports it with two or three numbers, and closes with a clear “what to do next” gets promoted faster.

How this role differs from Data Scientist and Business Intelligence

Data Analysts sit between Data Scientists and Business Intelligence specialists, and the role’s shape is mostly defined by what it owns differently from each:

  • vs. Data Scientist. Data Scientists go deeper on predictive modeling, experimental design (causal inference, A/B test design), and the boundary between exploratory analysis and applied research. Data Analysts focus more on descriptive and comparative work, with predictive modeling appearing as occasional rather than core. The org-chart distinction varies substantially — at smaller employers, “Data Analyst” and “Data Scientist” titles are nearly interchangeable; at larger employers, Data Scientists own model development pipelines while Data Analysts own dashboard, reporting, and ad-hoc analytical work. The compensation gap reflects the org-chart distinction; the skill-set gap is narrower than the title difference suggests.
  • vs. Business Intelligence (BI) Developer. BI Developers go deeper on the data-warehouse layer — building dimensional models, owning the ETL/ELT pipeline, defining metric layers (LookML, dbt models, semantic layers in Cube or AtScale). Data Analysts consume BI infrastructure rather than build it, spending their time on analyses that the warehouse already supports. At smaller orgs the BI and Analyst roles merge.
  • vs. Analytics Engineer. Analytics Engineering is a newer specialization (popularized by dbt’s growth post-2018) that sits between BI Developer and Data Analyst — owning the semantic layer, the metric definitions, and the warehouse-side models that make analytical work fast and consistent. Data Analysts consume Analytics Engineering output; the boundary is fluid at smaller employers and increasingly distinct at larger ones.

There’s a quieter difference in cadence. Data Scientists work in weeks-to-months cycles on model development; Data Analysts work in days-to-weeks cycles on stakeholder questions, with constant incoming requests interrupting deeper work. Senior Data Analysts develop time-allocation patterns (deep-work blocks, ticket triage, scheduled “office hours”) that protect the analytical work from the incoming-request load.

Skills the role demands

Data Analyst is a breadth-plus-depth role: you need real working competence across at least four of the five skill areas below, and real depth in at least one or two. They are listed in order of leverage for most product-team Data Analyst hires:

  • SQL fluency. This is the highest-leverage skill. Most Data Analyst work happens through SQL, and the difference between competent and strong SQL fluency translates directly into analytical throughput. Strong analysts can read a 200-line query and spot the join that’s silently fanning out the row count, design a window-function query without reaching for documentation, and trade off readability against performance with intent. The AIEH AI-Augmented SQL family targets exactly this skill (along with the AI-assistance overlay that’s increasingly necessary in 2026 work).
  • Descriptive statistics and basic experimentation. Means, distributions, sampling intuition, hypothesis testing, confidence intervals, basic A/B test reads, the difference between observational and experimental data. You don’t need causal-inference depth (that’s Data Scientist territory), but you need enough statistical literacy to defend “this difference is real” without overclaiming.
  • Visualization craft. Choosing the right chart for the question, executing it cleanly (no chartjunk, no misleading axes, no unnecessary 3D), and anticipating the follow-up question. Senior Data Analysts know when a table is more honest than a chart and when an animation surfaces a finding a static chart hides.
  • Stakeholder communication. Written and verbal — the one-page summary, the slide deck, the Slack thread, the five-minute exec brief. Most Data Analyst work fails at the communication step rather than the analysis step; analyses are usually correct, but the finding doesn’t reach the decision-maker in actionable form.
  • Python or R for analytical work. SQL handles most data pulling, but analytical work past simple aggregation (mixed modeling, time-series with seasonality, exploratory ML) benefits from a notebook environment. Python with pandas / statsmodels / scikit-learn is the modal 2026 choice; R with tidyverse remains strong in academia and some industries. Junior Data Analysts can ship without notebook fluency; senior Data Analysts almost always have it.
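To make the window-function fluency described under SQL fluency concrete, here is a small sketch run through Python's built-in sqlite3 module. The monthly_revenue table and its values are hypothetical, and the query assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical table: one revenue figure per month.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monthly_revenue (month TEXT PRIMARY KEY, revenue REAL);
    INSERT INTO monthly_revenue VALUES
        ('2025-01', 100.0), ('2025-02', 120.0), ('2025-03', 90.0);
""")

# LAG() looks at the previous row in month order; SUM() OVER gives a running total.
rows = conn.execute("""
    SELECT month,
           revenue,
           revenue - LAG(revenue) OVER (ORDER BY month) AS change_vs_prior,
           SUM(revenue) OVER (ORDER BY month) AS running_total
    FROM monthly_revenue
    ORDER BY month
""").fetchall()

for row in rows:
    print(row)
```

The first month has no prior row, so its change is NULL (None in Python) rather than zero. Deciding how to present that gap is the kind of small judgment call that separates a correct query from a correct report.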

A sixth skill that doesn’t tier with the above but matters disproportionately at senior levels: business-domain understanding. The senior Data Analyst who can run an analysis of churn drivers without first asking “what does churn even mean in our product?” is the analyst who anticipates the stakeholder’s real question and produces a useful answer. Domain understanding compounds slowly across a career and is most of what separates senior Data Analysts from mid-level ones with similar technical skill.

Typical compensation

US-based Data Analyst compensation as of early 2026 ranges from roughly $65,000 to $180,000 in total annual compensation, with a median around $95,000. The distribution is wide because the title spans substantially different jobs across employer tier, seniority, and industry: finance and tech employers pay significantly above retail and non-profit employers for nominally similar Data Analyst roles.

Data Notice: Compensation, role descriptions, and skill weightings reflect the most recent available data at time of writing and may shift as the labor market evolves. Verify compensation with current sources before negotiating.

Three reference points:

  • levels.fyi publishes the most-detailed publicly available compensation distributions for “Data Analyst” and adjacent titles. As of early 2026, US-based base compensation for non-management IC Data Analyst roles at established tech employers clusters in the upper-five-figure to low-six-figure range, with significant equity at public-tech employers pushing senior IC total comp meaningfully higher. Senior Data Analyst roles at tech employers can reach total comp in the range typical of mid-level Software Engineers at the same company. Verify against the live levels.fyi distributions before negotiating: the numbers shift quarter-to-quarter, and Data Analyst titles at non-tech employers (finance, consulting, retail, non-profit) follow substantially different distributions.
  • The US Bureau of Labor Statistics does not publish a dedicated Standard Occupational Classification code for “Data Analyst” specifically. The closest existing match is SOC 15-2041 (Statisticians); Data Analyst work also overlaps with SOC 13-1161 (Market Research Analysts) and SOC 15-2051 (Data Scientists) depending on the specific role surface. BLS Occupational Outlook projects substantially above-average growth across all three adjacent categories, well outpacing the all-occupation baseline. The role’s lack of a single dedicated SOC code reflects how cross-industry the function is.
  • Geographic and industry adjustment. Built In and levels.fyi geographic breakdowns show meaningfully lower total comp — typically a quarter to a third less — for Data Analysts in non-coastal US markets versus the SF/NYC/Seattle cluster. Industry adjustment is even larger: a Data Analyst at a frontier-tech employer can earn double-or-more what an equivalently-skilled analyst at a retail or non-profit employer earns, even in the same metro. Industry choice often matters more for Data Analyst compensation than geographic choice does.

Equity composition is more variable for Data Analyst than for engineering roles because non-tech employers (finance, consulting, healthcare) typically don’t offer equity packages, while tech employers do. Treat any single comp number as a midpoint; actual offers cluster within roughly ±25% of the published medians at comparable employers, with wider variance across industries.
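The ±25% rule of thumb is easy to turn into a concrete range. A minimal sketch, applying it to the $95,000 median quoted above; the helper name offer_band is hypothetical, and the symmetric band is only the rough heuristic described in the text, not a fitted distribution.

```python
def offer_band(median, spread=0.25):
    """Return the (low, high) range implied by a symmetric +/- spread around a median."""
    return median * (1 - spread), median * (1 + spread)


low, high = offer_band(95_000)
print(f"${low:,.0f} to ${high:,.0f}")
```

Swap in the published median for a comparable employer and industry rather than the overall figure, since the text notes that variance across industries is wider than this band.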

How candidates demonstrate readiness on AIEH

AIEH’s role-readiness model for Data Analyst weights five assessment families, ordered here by predictive relevance for the role:

AI-Augmented SQL (relevance 0.95). This is the highest-leverage signal. Data Analyst work happens through SQL, and in 2026 SQL fluency augmented by AI assistance is a more useful skill to measure than pure-SQL fluency. The analyst who knows when to author a query directly, when to use AI assistance well, and how to recognize when AI-generated SQL is subtly wrong on schema-specific edge cases is the analyst who ships fast without shipping wrong answers. The family is on the launch roadmap (see tests catalog for current availability) and will be takeable shortly.

Data Analysis (relevance 0.90). The Data Analysis family targets descriptive statistics, distributions, hypothesis tests, A/B test reading, and the common pitfalls (Simpson’s paradox, selection effects, multiple-comparisons inflation). It complements the SQL family by measuring what the analyst does with the data they’ve pulled. Like AI-Augmented SQL, the family is on the roadmap and launches shortly.
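Of the pitfalls listed, Simpson’s paradox is the easiest to show with a few numbers. The sketch below uses made-up conversion counts for two hypothetical variants, A and B, split by device. A converts better within every segment, yet B looks better in the aggregate because the two variants received very different traffic mixes.

```python
# (conversions, visitors) per variant and segment; all counts are made up.
data = {
    "A": {"desktop": (90, 100), "mobile": (300, 500)},
    "B": {"desktop": (425, 500), "mobile": (50, 100)},
}


def rate(conversions, visitors):
    """Conversion rate for a single segment."""
    return conversions / visitors


def overall_rate(segments):
    """Conversion rate after pooling all segments together."""
    conversions = sum(c for c, _ in segments.values())
    visitors = sum(v for _, v in segments.values())
    return conversions / visitors


a_wins_every_segment = all(
    rate(*data["A"][seg]) > rate(*data["B"][seg]) for seg in data["A"]
)
b_wins_overall = overall_rate(data["B"]) > overall_rate(data["A"])

print(a_wins_every_segment, b_wins_overall)
```

Both flags come out true at once, which is why a segment cut is worth running before reporting a pooled comparison.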

Communication (relevance 0.80). Data Analysts communicate across product, marketing, finance, operations, and executive audiences more than the title implies, and the analyst who can write a clear one-page finding or deliver a five-minute exec brief gets promoted faster. The free 5-scenario Communication sample is takeable today and provides a fast calibration check.

Python Fundamentals (relevance 0.55). Most Data Analysts use Python (or R) for analytical work past basic SQL aggregation — notebook-based exploratory analysis, statistical modeling, visualization beyond what BI tools natively support. The free 5-question Python Fundamentals sample is takeable today; the full 50-question assessment probes the language depth that distinguishes senior Data Analyst from junior.

Big Five Personality (relevance 0.45). Personality contributes a small secondary signal. Conscientiousness predicted job performance across all of the occupational groups examined in Barrick and Mount’s meta-analysis (Barrick & Mount, 1991), and openness to experience predicts adaptability to the constant tooling shifts the role faces. For an extended treatment, see the Big Five in hiring overview.

The full lineup is browsable on the tests catalog, and the underlying calibration that maps each test family score to the common 300–850 Skills Passport scale is documented on the scoring methodology page.

A candidate aiming for a Data Analyst role should target AI-Augmented SQL and Data Analysis first when those families launch — these are the role-defining assessments, and the bundle is heavily weighted toward them. In the interim, build a Skills Passport baseline with the Communication and Python samples that are takeable today; both contribute meaningfully and demonstrate momentum on the Passport before the SQL/Data-Analysis families ship.

Where Data Analysts come from

Data Analyst is the most well-established data role and has the most-defined entry paths. Three are most visible in 2026 hiring:

  • Quantitative academic background — common, often the largest cohort. Statistics, economics, mathematics, operations research, or related quantitative degrees. The fastest path: build a portfolio of analyses on public datasets, gain SQL fluency past a tutorial baseline, and apply directly to junior Data Analyst roles where domain-specific business knowledge isn’t yet a barrier.
  • Business-side analyst transitioning into technical analysis — a substantial minority. Marketing analysts, financial analysts, operations analysts, or business intelligence practitioners who absorbed SQL and Python over time and shifted into a technical Data Analyst seat. The transition is well-supported because the business-domain knowledge is genuinely valuable, and the technical skills can be built incrementally on the job.
  • Bootcamp or self-taught from the start — a growing minority. Data analytics bootcamps (separate from data science bootcamps) have matured substantially since 2020, especially via project-based learning paths. Strongest at the junior tier; the senior tier still skews toward analysts with one of the first two origins and lateral expansion.

The specific entry path matters less than the demonstrated ability to translate stakeholder questions into defensible analyses and communicate the findings — which is exactly what the AIEH Data Analyst bundle (AI-Augmented SQL + Data Analysis + Communication, with Python and Big Five complements) measures.

What you do next

If you’re moving toward this role, start by building a Skills Passport baseline with the assessments that are takeable today. The free Communication sample is a 5-scenario, 1-minute calibration that contributes meaningfully to the role bundle (relevance 0.80). The free Python Fundamentals sample contributes a baseline-competence signal at relevance 0.55 — take the full 50-question assessment when you’re ready to commit a real Skills Passport contribution on Python.

Track the tests catalog for the AI-Augmented SQL and Data Analysis family launches — those are the role-defining assessments and will dominate role-readiness once they ship.

For hiring managers building a Data Analyst bundle, the five assessments above with the published relevance weights are a defensible starting baseline. Adjust the weights for your specific loop based on the role’s surface composition (BI-dashboard-heavy vs. ad-hoc-analysis-heavy vs. predictive-modeling-leaning), seniority target (junior weights Communication and SQL higher; senior weights Communication and domain judgment higher), and industry. The published defaults reflect a balanced product-analytics Data Analyst hire — a useful starting point, not a universal answer.
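For a hiring manager experimenting with adjusted weights, one simple way to combine family scores is a relevance-weighted mean. This is only a sketch under stated assumptions: the guide does not say how AIEH itself aggregates scores, so the weighted-mean formula, the function name weighted_readiness, and the candidate scores are all hypothetical. Only the five relevance weights come from the text above.

```python
# Relevance weights as published in this guide.
RELEVANCE = {
    "AI-Augmented SQL": 0.95,
    "Data Analysis": 0.90,
    "Communication": 0.80,
    "Python Fundamentals": 0.55,
    "Big Five Personality": 0.45,
}


def weighted_readiness(scores, weights=RELEVANCE):
    """Relevance-weighted mean over the families a candidate has actually taken.

    Assumption: missing families are skipped and the remaining weights are
    renormalized. This is an illustrative choice, not AIEH's documented method.
    """
    taken = {name: score for name, score in scores.items() if name in weights}
    if not taken:
        raise ValueError("no scored families match the weight table")
    total_weight = sum(weights[name] for name in taken)
    return sum(weights[name] * score for name, score in taken.items()) / total_weight


# Hypothetical candidate with only the two currently takeable samples scored.
scores = {"Communication": 700, "Python Fundamentals": 600}
readiness = weighted_readiness(scores)
print(round(readiness, 1))
```

To tune for a specific loop, pass a modified copy of the weight table, for example raising the Python weight for a predictive-modeling-leaning role, and compare how candidate rankings change.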


Sources

  • Barrick, M. R., & Mount, M. K. (1991). The Big Five personality dimensions and job performance: A meta-analysis. Personnel Psychology, 44(1), 1–26.
  • Built In. (2026). Salary data for Data Analyst titles, US employers, retrieved 2026-Q1. https://builtin.com/salaries/
  • levels.fyi. (2026). Data Analyst compensation distributions, US sample, retrieved 2026-Q1. https://www.levels.fyi/
  • Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley.
  • US Bureau of Labor Statistics. (2026). Occupational Outlook Handbook, SOC 15-2041 (Statisticians) and SOC 13-1161 (Market Research Analysts). https://www.bls.gov/ooh/
  • Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23.