I'm watching artificial intelligence order my groceries. Armed with my shopping list, it enters each item into the search bar of the supermarket website, then clicks around with the cursor. Watching what looks like a digital ghost carry out this usually mundane task is oddly mesmerising. "Isn't that just being lazy?" my husband asks, peering over my shoulder.
I'm trying Operator, the new AI "agent" from OpenAI, the maker of ChatGPT. Made available to UK users last month, it has a similar text interface and conversational tone to ChatGPT, but rather than just answering questions, it actually does things – provided they involve navigating a web browser.
Hot on the heels of large language models, AI agents are being trumpeted as the next big thing, and you can see the appeal. Alongside OpenAI's offering, Anthropic introduced a "computer use" feature for its Claude chatbot towards the end of last year. Perplexity and Google have also released "agent" features for their AI assistants, and more companies are developing agents aimed at specific tasks such as coding and research.
While there is debate about what exactly counts as an AI agent, the general idea is that it should be able to take actions with some degree of autonomy. "As soon as you start performing actions outside the chat window, you're going from a chatbot to an agent," says Margaret Mitchell, chief ethics scientist at the AI company Hugging Face.
It's early days. Most commercially available agents still come with experimental disclaimers – OpenAI describes Operator as a "research preview" – and there have been reports of agents paying $31 for a dozen eggs or trying to return groceries to the store they were bought from. Depending on who you ask, agents are either the next overhyped tech fad or the dawn of an AI future that could shake up labour, rebuild the internet and change our lives.
"In principle, they're amazing because they could automate a lot of drudgery," says Gary Marcus, a cognitive scientist and prominent sceptic of large language models. "But I don't think they're going to work well any time soon, and part of it is hype to drive investment."
I sign up to Operator to see for myself. Grocery shopping seems like a good first job, as there's no food in the house. After I enter my request, it asks whether I have a preferred shop or brands; I tell it to go with whatever is cheapest. A window pops up showing its own web browser, where it searches for "UK online grocery delivery". The mouse cursor selects the first result: Ocado. It starts searching for the items I've requested, filtering the results by price, selecting products and clicking "add to trolley".
I'm impressed by Operator's initiative. It doesn't ask me any questions when given only a simple description such as "salmon" or "chicken". When searching for eggs, it scrolls past several non-egg items that appear as special offers. My list asks for "several different vegetables"; it picks a head of broccoli and asks whether I want anything else in particular. I tell it to choose two more, and it goes for carrots and leeks – probably what I would have picked myself. Encouraged, I ask it to add "something sweet" and watch as it literally types "sweet snacks" into the search bar. Why it chooses 70% dark chocolate – certainly not the cheapest option – I don't know, but I don't like dark chocolate, so I swap it for a Galaxy bar.
We hit a snag when Operator realises there is a minimum spend on Ocado, so I add a few more items to the list. Then it needs me to log in, and the agent prompts me to intervene. Users can take over the browser at any point, but OpenAI says Operator is designed to require this "when entering sensitive information into the browser, such as login credentials or payment information". Operator normally takes regular screenshots to "see" what it is doing, but OpenAI says it doesn't do this while the user has taken control.
At checkout, it asks me to complete the payment. I test the water by asking it to do so instead, but when it responds by requesting my card details, I take back the reins. I have already given OpenAI payment information (Operator requires a ChatGPT Pro account, which costs $200 a month), but sharing it directly with the AI feels a step too far. My order is placed, with next-day delivery – which doesn't solve dinner. I give Operator a new task: can it order me a cheeseburger and chips from a highly rated local restaurant? It asks for my postcode, then loads the Deliveroo website and searches for "cheeseburger". Again, it pauses for me to log in, but Deliveroo already has my card details stored, so Operator can proceed to payment directly.
The restaurant it chooses is indeed local and highly rated – as a fish and chip shop. I end up with a decent cheeseburger and a big bag of chippy-style chips. It's not quite what I'd imagined, but it's not wrong either. I do feel guilty, though, when I realise Operator skipped the option to tip the delivery rider. I sheepishly collect my food and add a generous tip after the fact.
Of course, watching Operator carry out its actions rather defeats the time-saving point of using an AI agent for online tasks. Instead, I leave it working in the background and focus on other tabs. While drafting this piece, I make another request: can it book me a gel manicure at a local salon?
Operator struggles more with this task. It goes to Fresha, a beauty booking platform, but when I'm prompted to log in, I find it has chosen a salon more than an hour's drive from my home in east London, on a date a week later than I asked for. I point out these issues and it finds a slot on the right date, but still far away, near Leicester Square. Only then does it ask for my location, and I realise it must not retain this knowledge between tasks. By this point I could easily have made the booking myself. Operator does eventually propose a suitable appointment, but I abandon the task and chalk it up as a victory for team human.
It's clear that this first generation of AI agents has limitations. It requires a fair amount of human supervision, with pauses to log in. (Operator does store cookies, though, so users can stay logged in to a website on subsequent visits; OpenAI says it requires closer supervision on "particularly sensitive" sites, such as email clients and financial services.) The results are usually accurate, but not necessarily what I would have chosen myself. When my groceries arrive, I see that Operator ordered smoked salmon rather than fillets, and twice as much yoghurt because it was on special offer. It interpreted "some fishcakes" as three packs (I intended only one), and it bought chocolate milk instead of plain because the product I wanted was out of stock. To be fair to the bot, I had the chance to review the order, and I would get better results by being more specific in my prompts ("a pack of two raw salmon fillets") – but those extra steps also chip away at the effort saved.
Despite its current flaws, my experience with Operator feels like a glimpse of what's coming. As such systems improve and costs fall, I can easily see them becoming embedded in everyday life. You may already write your shopping list in an app – why shouldn't it place the order too? Agents are also set to permeate workflows beyond the realm of the personal assistant: OpenAI's CEO, Sam Altman, has predicted that AI agents will "join the workforce" this year.
Software developers are among the early adopters. The coding platform GitHub recently added agent features to its AI Copilot tool. GitHub's CEO, Thomas Dohmke, says developers are already used to a degree of automated assistance; the difference with AI agents is the level of autonomy. "It's not just asking a question and getting an answer – you give it a problem and it iterates on it with the code it has access to," he says.
GitHub is already working on a more autonomous agent, Project Padawan (the Star Wars term for a Jedi apprentice). It would allow AI agents to work asynchronously rather than requiring constant monitoring: developers could have a team of agents reporting to them, writing code for them to review. Dohmke doesn't think developers' jobs are at risk. "I would argue the amount of work that AI is adding to most developers' backlogs is higher than the amount of work it is taking over," he says. Agents could also make coding tasks, such as building apps, more accessible to non-technical people.
Beyond software development, Dohmke envisions a future where everyone has their own personal Jarvis, like Iron Man's. Your agent would learn your habits and be tailored to your tastes, becoming ever more useful. He would use his to book family holidays.
But the more autonomous agents become, the greater the risks they pose. Mitchell, at Hugging Face, co-authored a paper warning against the development of fully autonomous agents. "Fully autonomous means that human control has been fully ceded," she says. Rather than working within set boundaries, a fully autonomous agent could access things it shouldn't, or act in unexpected ways – especially if it can write its own code. If your AI agent makes a mistake ordering a takeaway, that's not a big deal, but what if it starts sharing your personal information with scam websites, or posting alarming content on social media in your name? Higher-stakes settings present particularly dangerous scenarios: what if it had access to missile command systems?
Mitchell hopes engineers, legislators and policymakers will be encouraged to put guardrails in place to mitigate such cases. For now, she foresees agents' abilities becoming more refined for specific tasks. Soon, we may even see agents interacting with one another – your agent could liaise with my agent to set up a meeting, for example.
A surge in agents could also reshape the internet. Currently, much of the information online is designed for human consumption, but that could change if AIs increasingly become the ones interacting with websites. "Across the internet, you're going to see more and more of the information that agents need to act on, although not necessarily in human language," says Mitchell.
Dohmke echoes this idea. He believes the concept of the homepage will lose importance as interfaces are designed with AI agents in mind; brands may begin competing for AI attention rather than human eyeballs.
One day, agents may even escape the confines of the computer. We could see AI agents embodied in robots, opening up a world of physical tasks they could help with. "My prediction is we'll see agents that can do our laundry and cook for us," says Mitchell. "Just don't give them access to weapons."