Java engineering teams can now leverage Google’s Agent Development Kit to streamline complex tooling and architecture.
AI is transitioning from Python experiments into mature, production-grade deployments. This maturation demands frameworks that respect existing system architectures rather than forcing entirely new paradigms.
Recognising this requirement, Google has officially released version 1.0 of the Agent Development Kit for Java. The framework gives software engineers a structured approach to building AI agents, moving from scattered scripts to a mature multi-language ecosystem that now fully supports Python, Java, Go, and TypeScript.
Historically, integrating large language models into highly structured backend services caused immense friction. Engineering teams often hit dependency-management and continuous-integration bottlenecks when bridging Python-heavy data science workflows with established Java backend services. The 1.0.0 release provides a native pathway for those teams to embed intelligence directly into their existing services.
Centralising architectural control with App and plugins
Managing the lifecycle of AI interactions within complex environments typically results in sprawling, difficult-to-maintain codebases. Previous approaches often relied on applying callbacks at the level of each individual agent and sub-agent, creating technical debt as applications scaled. The new release addresses this friction through a centralised plugin architecture anchored by a new App container.
The App class acts as the top-level container for an application, holding global configurations and managing application-wide plugins for global execution control. This aspect-oriented design lets developers intercept and modify the behaviours of large language models, agents, and tools across the entire system without duplicating logic.
Google provides several out-of-the-box solutions, including a LoggingPlugin for structured, detailed logging of errors, tool calls, and model requests. For standardising behaviour, the GlobalInstructionPlugin applies consistent instructions (e.g. safety rules or identity parameters) dynamically to all agents. Teams can also extend the BasePlugin abstract class to enforce their own architectural rules and custom guardrails.
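To make the pattern concrete, here is a framework-agnostic sketch of a centralised plugin pipeline. The class and method names (`Plugin`, `beforeModelRequest`, `register`) are simplified stand-ins chosen for illustration, not the ADK's actual API; only the overall shape mirrors the App-plus-plugins idea described above.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of a centralised plugin pipeline; names here are
// simplified stand-ins, not the ADK's actual classes or signatures.
public class PluginSketch {

    // A plugin can inspect or rewrite every model request system-wide.
    interface Plugin {
        String beforeModelRequest(String prompt);
    }

    // Prepends a global instruction to every prompt, in the spirit of
    // what a GlobalInstructionPlugin does for all agents.
    static class GlobalInstruction implements Plugin {
        private final String instruction;
        GlobalInstruction(String instruction) { this.instruction = instruction; }
        public String beforeModelRequest(String prompt) {
            return instruction + "\n" + prompt;
        }
    }

    // Top-level container holding application-wide plugins.
    static class App {
        private final List<Plugin> plugins = new ArrayList<>();
        App register(Plugin p) { plugins.add(p); return this; }
        String runModelRequest(String prompt) {
            for (Plugin p : plugins) {
                prompt = p.beforeModelRequest(prompt); // every agent shares this path
            }
            return prompt;
        }
    }

    public static void main(String[] args) {
        App app = new App().register(new GlobalInstruction("Always answer politely."));
        System.out.println(app.runModelRequest("What is the capital of France?"));
    }
}
```

Because every request passes through one pipeline, a safety rule or logging hook is written once and applied everywhere, which is the aspect-oriented benefit the release notes emphasise.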
Centralising telemetry via the LoggingPlugin ensures that observability platforms can ingest uniform data, simplifying the debugging of complex chains of thought. Dealing with technical debt requires this exact kind of standardisation, moving away from fragmented implementation details towards coherent, system-wide governance.
Context engineering and optimising token constraints
In conversational architecture, managing the context window is a persistent challenge. Exhausting token limits degrades performance, inflates operational costs, and increases response latency. The 1.0 release tackles this head-on through enhanced context engineering capabilities, specifically introducing event compaction.
Configured via the eventsCompactionConfig method on the App container, event compaction allows developers to control the size of an agent’s history. Rather than abruptly truncating conversations, the system maintains a sliding window of recent events and summarises older data. This mechanism effectively prevents context windows from exceeding token limits, directly reducing latency and lowering compute costs during long-running sessions.
Platform engineers have precise control over this process, tuning parameters like the compaction interval, overlap size, token thresholds, and event retention size. When the default LlmEventSummarizer is insufficient for highly specific domains, teams can implement the BaseEventSummarizer and EventCompactor interfaces to completely customise how events are evaluated and discarded.
Implemented well, summarisation lets models retain the context required for accurate reasoning without carrying the heavy token payload of verbatim historical chat logs. It does, however, demand that engineering teams carefully map out which pieces of session history dictate future model accuracy.
Tooling integration and safe code execution
Agents require connection to external data sources and execution environments to deliver value beyond their intrinsic knowledge.
The updated toolkit introduces powerful new tools to facilitate this grounding. For example, the UrlContextTool allows a model to fetch web content directly from URLs provided in a prompt, eliminating the need for developers to build and maintain separate web fetching pipelines.
Additionally, the GoogleMapsTool integrates location-based data directly into responses using Gemini 2.5. A user querying a grounded agent about dining options near the Eiffel Tower would receive detailed information about the Jules Verne restaurant, including its 4.5-star rating.
Furthermore, prompting the model to summarise a specific blog post URL about Nano Banana 2 yields a detailed overview of the advanced image generation model, highlighting its subject consistency capabilities.
Executing code safely presents another challenge for security-conscious teams. The framework features dedicated code execution tools, specifically the ContainerCodeExecutor and VertexAiCodeExecutor. These abstractions allow developers to run generated code locally in Docker containers or cloud-natively within Google Cloud’s Vertex AI.
By isolating execution, these executors integrate cleanly with existing continuous integration and deployment pipelines while maintaining strict security boundaries. Developers looking to automate interface interactions can also utilise the ComputerUseTool abstraction to drive a real computer or web browser, which requires implementing a BaseComputer integration via tools like Playwright.
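To give a feel for what container isolation buys, here is a sketch of how an executor might shell out to Docker. The ADK's ContainerCodeExecutor handles this internally; the image name and flags below are illustrative assumptions about sensible sandboxing, not ADK defaults.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of shelling out to Docker to run model-generated code in
// isolation. Image name and flags are illustrative, not ADK defaults.
public class ExecutorSketch {

    // Builds a docker command that runs a Python snippet with no network
    // access and a memory cap, discarding the container afterwards.
    static List<String> buildDockerCommand(String image, String code) {
        List<String> cmd = new ArrayList<>();
        cmd.add("docker");
        cmd.add("run");
        cmd.add("--rm");            // discard the container afterwards
        cmd.add("--network=none");  // no outbound network from generated code
        cmd.add("--memory=256m");   // cap resources
        cmd.add(image);
        cmd.add("python");
        cmd.add("-c");
        cmd.add(code);
        return cmd;
    }

    public static void main(String[] args) {
        List<String> cmd = buildDockerCommand("python:3.12-slim", "print(40 + 2)");
        // new ProcessBuilder(cmd).inheritIO().start().waitFor(); // requires Docker locally
        System.out.println(String.join(" ", cmd));
    }
}
```

The key design point is that the generated code never touches the host process: it runs in a throwaway container with no network and bounded memory, so a misbehaving snippet fails inside the sandbox rather than inside the service.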
State persistence and cloud integration
Stateless agents are of limited use in complex corporate ecosystems. The framework now defines clear contracts for state management, history, and file handling across multiple conversations. Rather than forcing a specific database paradigm, the toolkit offers multiple session services that can be configured directly in the runner loop.
For local development and testing, engineers can use the lightweight InMemorySessionService. For production environments, the framework provides a VertexAiSessionService backed by the managed Vertex AI Session API, alongside a scalable FirestoreSessionService backed by Google Cloud Firestore, which was contributed by the community.
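The value of this arrangement is the shared contract: the same interface can be backed by memory for tests or by Firestore/Vertex AI in production. The sketch below shows that contract in miniature; the interface and method names are simplified for illustration and do not reproduce the ADK's actual session API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a session-service contract with one in-memory backing,
// analogous in spirit to InMemorySessionService. Names are simplified
// stand-ins, not the ADK's actual API.
public class SessionSketch {

    // Production code would depend only on this contract and swap the
    // backing store (memory, Firestore, Vertex AI) via configuration.
    interface SessionService {
        void put(String sessionId, String key, String value);
        String get(String sessionId, String key);
    }

    static class InMemorySessions implements SessionService {
        private final Map<String, Map<String, String>> store = new ConcurrentHashMap<>();
        public void put(String sessionId, String key, String value) {
            store.computeIfAbsent(sessionId, id -> new ConcurrentHashMap<>()).put(key, value);
        }
        public String get(String sessionId, String key) {
            return store.getOrDefault(sessionId, Map.of()).get(key);
        }
    }

    public static void main(String[] args) {
        SessionService sessions = new InMemorySessions();
        sessions.put("s1", "userName", "Ada");
        System.out.println(sessions.get("s1", "userName")); // prints Ada
    }
}
```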
Long-term conversational memory operates on similar contracts. By attaching a LoadMemoryTool to an agent, the system automatically queries the configured Memory Service – such as the persistent FirestoreMemoryService or the local InMemoryMemoryService – for historical context.
Handling large data payloads, like PDFs or images exchanged during a session, is managed via artifact services. Developers can utilise the GcsArtifactService for persistent, versioned management using Google Cloud Storage, ensuring that teams never miss details from past sessions or lose track of exchanged files.
Keeping a human in the loop
Models frequently require human approval to validate certain actions, comply with company processes, or avoid executing dangerous operations. To support these governance requirements, the framework implements human-in-the-loop workflows built around the ToolConfirmation concept.
When a registered tool requires manual intervention, it accesses its ToolContext and calls the requestConfirmation method. This action automatically intercepts the run, pausing the execution flow until human input is received.
Once a user provides approval, along with any optional payload data via a ToolConfirmation, the framework automatically resumes the flow. The system cleans up intermediate events and explicitly injects the confirmed function call back into the subsequent request context.
This context management ensures the model understands the action was approved without falling into an infinite loop. For example, a report assistant equipped with a search tool can be explicitly programmed to request user confirmation before initiating the search agent to compile a report.
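The pause-then-resume flow can be modelled as a small state machine. This sketch captures the shape of the behaviour only; the ADK's ToolConfirmation and requestConfirmation API are modelled loosely here, not reproduced, and a real implementation would also handle the event clean-up and re-injection described above.

```java
// Sketch of a tool-confirmation flow: a tool call that needs approval is
// parked until a human decides, then resumed. Loosely modelled on the
// ToolConfirmation idea; not the ADK's actual API.
public class ConfirmationSketch {

    enum Status { PENDING, APPROVED, REJECTED }

    static class PendingCall {
        final String toolName;
        Status status = Status.PENDING;
        PendingCall(String toolName) { this.toolName = toolName; }
    }

    // Pauses the run by recording the call instead of executing it.
    static PendingCall requestConfirmation(String toolName) {
        return new PendingCall(toolName);
    }

    // Resumes: only an approved call is actually executed.
    static String resume(PendingCall call) {
        if (call.status != Status.APPROVED) return "blocked: " + call.toolName;
        return "executed: " + call.toolName;
    }

    public static void main(String[] args) {
        PendingCall call = requestConfirmation("searchAgent");
        System.out.println(resume(call));  // prints blocked: searchAgent
        call.status = Status.APPROVED;     // human signs off
        System.out.println(resume(call));  // prints executed: searchAgent
    }
}
```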
Interoperability via Agent2Agent collaboration
Modern systems are rarely monolithic, and the AI ecosystem is evolving to reflect this. The new Java release natively supports the official Agent2Agent (A2A) protocol, enabling seamless collaboration between remote agents across entirely different frameworks and languages.
The framework leverages the official A2A Java SDK Client. Engineers can resolve an AgentCard, which represents the identity, communication preferences, and abilities of a remote agent via a specific endpoint. After constructing the client, it can be wrapped in a RemoteA2AAgent and placed directly into the local agent hierarchy. This remote agent acts exactly like a local entity, natively streaming events back to the runner.
Teams can also expose their internal creations to the wider ecosystem by wrapping them in an AgentExecutor. This exposes the agents via a JSON-RPC REST endpoint, instantly making them accessible to other services. By adopting these standard protocols, engineering teams can build sprawling ecosystems of interoperable components, discovering new integrations along the way.
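To illustrate the moving parts, the sketch below pairs a minimal agent card with a JSON-RPC 2.0 request envelope addressed to it. The card fields, the params shape, and the method name are simplified illustrations in the general spirit of A2A, not the SDK's AgentCard type or wire format.

```java
// Sketch of resolving a remote agent's identity and addressing it over
// JSON-RPC. Field names, params shape, and method name are illustrative,
// not the A2A Java SDK's actual types or wire format.
public class A2ASketch {

    // A minimal agent card: identity plus endpoint, as advertised remotely.
    record AgentCard(String name, String description, String url) {}

    // Builds a JSON-RPC 2.0 request body addressed to the remote agent.
    static String jsonRpcRequest(AgentCard card, int id, String method, String text) {
        return "{\"jsonrpc\":\"2.0\",\"id\":" + id
                + ",\"method\":\"" + method + "\""
                + ",\"params\":{\"agent\":\"" + card.name() + "\",\"text\":\"" + text + "\"}}";
    }

    public static void main(String[] args) {
        AgentCard card = new AgentCard("pricing-agent", "Quotes prices", "https://example.com/a2a");
        System.out.println(jsonRpcRequest(card, 1, "message/send", "quote for 3 widgets"));
    }
}
```

In the real framework this plumbing is hidden: the client resolves the card, the RemoteA2AAgent wraps it, and the local runner treats it like any other agent in the hierarchy.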
Google encourages developers to consult Agent Development Kit’s official documentation to explore these capabilities further, file bug reports on GitHub, and submit pull requests according to the contribution guidelines.
See also: Google tests internal AI agent for coding tasks and workflows



