Oct. 4, 2023
We have raised $20M in funding led by Khosla Ventures, with additional participation from our existing investors, including Synergis Capital, and strategic partners like Kakao Investment. This new round of financing will help us expand the team, further develop our operating system and hardware, and prepare for the launch of our product.
Since their inception, computers have been designed as non-intuitive tools. From punch cards and command lines to graphical user interfaces, many attempts have been made to make them more accessible and user-friendly. The ways in which we perceive and react to the outputs of computers are governed by bundled software called "operating systems," or, as we would like to describe them, "a way to interact with computers."
However, even with the recent boom in advanced artificial intelligence, our relationship with the computer has remained unchanged: a passive one. Computers still understand your commands rather than your intentions, do not understand the interfaces they are presenting, and cannot trigger actions on your behalf. rabbit was launched as an effort to reverse this relationship and to envision a world where technology enhances the human experience. We believe that the best way to realize our vision is through a personalized operating system, where the primary interface is driven by natural language.
The cornerstone of what we are building is the ‘Large Action Model’ (LAM), a new class of foundation models designed to make AI systems see and act on the Web and in the physical world in the same ways we humans do. Just as a large language model (LLM) aims to understand language, a LAM aims to understand actions, from the perspectives of why, through which, and how. It attempts to solve three challenges that current systems face:
Human intentions are deeply personal, have layers, may be incomplete, and can change on a whim. Intention empowers LAM to translate natural language requests into actionable steps and responses that the operating system can leverage in real time. Thanks to recent developments in more powerful and faster LLMs, we are able to achieve this functionality with relative ease.
Rather than relying on application programming interfaces (APIs), LAM understands and operates human-oriented interfaces across all kinds of desktop and mobile environments. The approach we have taken here is rooted in imitation: LAM observes a human using the interface and aims to reliably replicate the process, even if the interface is presented differently or has changed slightly. Over time, LAM accumulates knowledge from the aggregate demonstrations to grasp every element of an arbitrary interface and form a “conceptual blueprint” of the service behind it.
Not only does LAM know how to interact with applications to achieve an objective, it also learns to achieve the objective in a human-like way. Through Interaction, LAM is kept within guardrails so that it produces behaviors that are safe, efficient, and indistinguishable from human behavior, making it a comfortable choice to delegate your interactions to.
LAM does not live in a vacuum. The eventual mature form of the models that we will be developing will work with data that we have yet to collect and will be evaluated on benchmarks that we have yet to design. This is why we are building not just the model, but the full stack of necessary apparatus in the operating system to support it. We believe that even if a generalized LAM or another powerful AI capable of controlling any software existed today, delivering this AI to the end user would remain just as challenging as developing the model from scratch. We are focused on several key elements, in addition to our core model development, to support our products:
We believe that by being vertically integrated, we can empower every user of our operating system, regardless of their professional skill set, to become a contributor towards a safer, more intuitive, friendly, and powerful LAM that enhances their own experience.
To truly make this happen, we have adopted a consumer- and product-first approach, rather than a demo- or ideation-first one. We hold ourselves responsible for all aspects of a delightful user experience, starting with LAM, and expanding to all software in an operating system, the product design, the cloud infrastructure, and the dedicated hardware to host the operating system. We believe that the best way to realize the value of cutting-edge research is by focusing on the end users and deploying hardened and safeguarded systems into production quickly.
This has been the goal of our team members since long before the creation of rabbit. Jesse founded Raven Tech in 2013 and built out the first natural language-powered operating system in the world. After its acquisition, his operating system was deployed into one of the best-selling smart speakers. Its form factors and software designs are still used as industry standards today. Our researchers and engineers were among the first to train large models in a business context. We have been operating HPC clusters to train large transformers, among other architectures, since as early as 2021, alongside some of the more well-known startups in the industry. Now, we are united in our vision and working to make it a reality.
We have come a long way since our founding. Our team currently consists of nine people and is steadily growing, working in person in the beautiful city of Los Angeles.
We have finished the development of major components of our operating system and hope to release our first device to the public as early as next year. Our existing artifacts can already speed up multimedia language model applications by an order of magnitude and enable real-time, API-free agency on any platform.
We hope that this is the beginning of an incredible journey, where lifelong and meaningful connections can be made as a result of the experiences we create. If you identify with our vision and would like to help create the future with us, we are incredibly excited about the opportunity to collaborate.