Computer-aided design (CAD) is the go-to method for designing most physical products today. Engineers use CAD to convert 2D sketches into 3D models, which they can test and refine before sending the final version down the production line. But the software is extremely complex to learn, with thousands of commands to choose from. Becoming truly proficient in the software requires a lot of time and practice.
MIT engineers want to ease the process of learning CAD with an AI model that uses CAD software much as a human would. Given a 2D sketch of an object, the model quickly creates a 3D version by clicking buttons and file options, just as an engineer would when using the software.
The MIT team has created a new dataset called VideoCAD, which contains more than 41,000 video examples of how 3D models are created in CAD software. By learning from these videos, which show how to create different shapes and objects step by step, the new AI system can operate the CAD software much like a human user.
With VideoCAD, the team is moving toward an AI-enabled “CAD co-pilot.” They envision that such a tool could not only create 3D versions of designs, but also work with a human user to suggest next steps, or automatically complete building sequences that would otherwise be tedious and time-consuming to click through manually.
“AI has the opportunity to increase the productivity of engineers as well as make CAD more accessible to more people,” says Gadi Nehme, a graduate student in MIT’s Department of Mechanical Engineering.
“This is important because it lowers the barrier to entry for design, helping people without years of CAD training create 3D models and use their creativity,” says Faiz Ahmed, associate professor of mechanical engineering at MIT.
Ahmed and Nehme, along with graduate student Brandon Mann and postdoc Firdous Alam, will present their work at the Conference on Neural Information Processing Systems (NeurIPS) in December.
Click by click
The team’s new work expands on recent developments in AI-powered user interface (UI) agents, tools that are trained to use software programs to complete tasks, such as automatically gathering information online and organizing it into an Excel spreadsheet. Ahmed’s group wondered whether such UI agents could be designed to use CAD, which includes many more features and functions, and involves far more complex tasks, than the average UI agent can handle.
In their new work, the team set out to design an AI-powered UI agent that takes over the CAD program to create a 3D version of a 2D sketch, click by click. To do this, the team first looked at an existing dataset of objects that were designed in CAD by humans. Each object in the dataset includes the sequence of high-level design commands, such as “Sketch Line,” “Circle,” and “Extrude,” that were used to create the final object.
However, the team realized that these high-level commands alone were not enough to train the AI agent to actually use the CAD software. A real agent must also understand the details behind each action. For example: Which sketch region should it select? When should it zoom in? And which part of the sketch should it extrude? To bridge this gap, the researchers developed a system to translate high-level commands into user-interface interactions.
“For example, let’s say we made a sketch by drawing a line from point 1 to point 2,” says Nehme. “We translated those high-level actions into user-interface actions, meaning we say, go to this pixel location, click, and then go to another pixel location, and click, with the ‘line’ operation selected.”
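The translation Nehme describes can be illustrated with a minimal Python sketch. This is not the team’s code: the toolbar position, the sketch-to-pixel mapping, and every name below (UIAction, TOOLBAR, to_pixels, translate_line) are hypothetical, chosen only to show how one high-level “line” command might expand into a sequence of pixel-level clicks.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # a point in sketch coordinates


@dataclass(frozen=True)
class UIAction:
    """One low-level user-interface action at a screen pixel."""
    kind: str  # only "click" is used in this sketch
    x: int
    y: int


# Hypothetical pixel location of the "line" tool button (an assumption).
TOOLBAR = {"line": (40, 20)}


def to_pixels(point: Point, scale: int = 100,
              origin: Tuple[int, int] = (400, 300)) -> Tuple[int, int]:
    """Map sketch coordinates to screen pixels.

    Assumes a fixed zoom (scale) and a fixed canvas origin; screen y
    grows downward, so the sketch y-axis is flipped.
    """
    px = origin[0] + round(point[0] * scale)
    py = origin[1] - round(point[1] * scale)
    return px, py


def translate_line(start: Point, end: Point) -> List[UIAction]:
    """Expand a high-level 'draw line' command into UI clicks."""
    return [
        UIAction("click", *TOOLBAR["line"]),   # select the line tool
        UIAction("click", *to_pixels(start)),  # click at point 1
        UIAction("click", *to_pixels(end)),    # click at point 2
    ]


if __name__ == "__main__":
    for action in translate_line((0.0, 0.0), (1.0, 0.5)):
        print(action)
```

A real CAD interface would also need zooming, panning, and selection steps, which this sketch leaves out.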
In the end, the team produced more than 41,000 videos of human-designed CAD objects, each described in real time in terms of the specific clicks, mouse drags, and keyboard actions that the human designer originally performed. They then fed all this data into a model they developed to learn the connections between UI actions and CAD object generation.
Once trained on this dataset, which they call VideoCAD, the new AI model can take a 2D sketch as input and directly control the CAD software by clicking, dragging, and selecting tools to create a full 3D shape. The objects range in complexity from simple brackets to more complex house designs. The team is training the model on more complex shapes and envisions that both the model and the dataset could one day enable CAD co-pilots for designers in a variety of fields.
“VideoCAD is a valuable first step toward AI assistants that help engage new users and automate repetitive modeling work that follows familiar patterns,” says Mehdi Atai, a senior research scientist at Autodesk Research, which develops new design software tools, who was not involved in the study. “It’s an initial foundation, and I’d be excited to see successors that extend across multiple CAD systems, richer operations like assembly and constraints, and more realistic, messier human workflows.”