Inspired by the SAE J3016 standard used for self-driving cars, this paper suggests a similar set of autonomy levels for data agents.
The levels are:
L0 — No Autonomy. All data work is entirely human-driven. The agent doesn’t exist yet.
L1 — Assistance. Stateless, prompt-response helpers. Think of an LLM that answers a single SQL question or suggests a database configuration — it responds to queries but can’t perceive or interact with the environment. Humans still orchestrate and execute everything.
L2 — Partial Autonomy. The agent gains environmental perception — it can connect to databases, run code, use APIs, and refine its outputs through feedback loops. But it still operates within human-designed pipelines and workflows. Most current research systems live here.
L3 — Conditional Autonomy. The critical leap: the agent independently orchestrates entire data pipelines (from management to preparation to analysis) rather than just executing pre-defined steps. Humans shift from operator to supervisor. The paper notes no system has fully achieved this yet, though “Proto-L3” efforts like JoyAgent, AgenticData, and industry tools from Snowflake, Google BigQuery, and Databricks are emerging.
L4 — High Autonomy. The agent proactively identifies problems worth investigating — monitoring data lakes, detecting anomalies, and initiating analyses without being asked. Humans become passive recipients of insights.
L5 — Full Autonomy. A visionary level where agents invent novel methods, theories, and paradigms for data science itself, pushing beyond existing techniques. It will be critical to decide if an agent should be given full autonomy.
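One way to read the list above is as a ladder of cumulative capabilities, where each level adds one new ability on top of the previous ones. The following is a minimal Python sketch of that reading. It assumes each level is gated by a single capability flag, which is a simplification made here for illustration and not something the paper specifies. The names `AutonomyLevel`, `Capabilities`, and `classify` are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """The six levels from the list above, ordered so they can be compared."""
    L0_NO_AUTONOMY = 0
    L1_ASSISTANCE = 1
    L2_PARTIAL = 2
    L3_CONDITIONAL = 3
    L4_HIGH = 4
    L5_FULL = 5


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical flags, one per leap described in the level list."""
    answers_prompts: bool = False          # L1: stateless prompt-response help
    perceives_environment: bool = False    # L2: connects to databases, runs code, uses APIs
    orchestrates_pipelines: bool = False   # L3: designs whole pipelines on its own
    self_initiates_analyses: bool = False  # L4: finds problems without being asked
    invents_new_methods: bool = False      # L5: creates novel methods and paradigms


def classify(caps: Capabilities) -> AutonomyLevel:
    """Return the highest level reached without skipping a lower capability."""
    ladder = [
        (caps.answers_prompts, AutonomyLevel.L1_ASSISTANCE),
        (caps.perceives_environment, AutonomyLevel.L2_PARTIAL),
        (caps.orchestrates_pipelines, AutonomyLevel.L3_CONDITIONAL),
        (caps.self_initiates_analyses, AutonomyLevel.L4_HIGH),
        (caps.invents_new_methods, AutonomyLevel.L5_FULL),
    ]
    level = AutonomyLevel.L0_NO_AUTONOMY
    for has_capability, candidate in ladder:
        if not has_capability:
            break  # a missing rung caps the level, even if higher flags are set
        level = candidate
    return level


# An agent that answers prompts and can run code against a database,
# but still follows a human-designed pipeline, lands at L2 under this sketch.
tool_using_agent = Capabilities(answers_prompts=True, perceives_environment=True)
print(classify(tool_using_agent).name)
```

Using an `IntEnum` keeps the levels comparable (for example, `classify(caps) >= AutonomyLevel.L3_CONDITIONAL`), which matches the ordered nature of the taxonomy. Real systems rarely fit a single flag per level, so treat this as a reading aid and not a rubric.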
The paper argues the field is currently navigating the L2-to-L3 transition, which is the most consequential shift — moving from procedural execution to autonomous orchestration. Four major gaps block progress toward true L3: limited autonomy in pipeline orchestration (most systems still rely on predefined operators), incomplete coverage of the full data lifecycle, deficiencies in advanced reasoning (tactical fixes rather than root-cause analysis), and inability to adapt to dynamic, evolving data environments.