What Nvidia’s new text-to-3D means for engineering and product design


tl;dr: Generative AI is developing at an encouraging pace. Nvidia’s latest algorithm converts text to 3D meshes twice as fast as projects published two months ago. This means that the technical possibilities now outpace our ability to work with them.

Last week’s paper by Nvidia scientists demonstrated the exponential rate at which the generative AI space is evolving. This explosion of activity, especially visible over the past 9 months, will affect every area of life, not least product design, engineering and manufacturing. The changes will free the industry from structural constraints in how ideas are communicated, enable faster innovation cycles and ultimately allow it to deliver on its sustainability promises.

Sample meshes from Nvidia Research’s Magic3D algorithm, along with the prompts used to generate them.

Nvidia Deep Imagination Research

After years of hearing that AI would fundamentally change the way we work, few expected the creative industry to be the first casualty. The arrival of GPT-3, a human-like text generator, in 2020 brought the possibilities into greater focus. It’s been a wild ride since then: DALL-E (text-to-image), Whisper (speech recognition), and more recently Stable Diffusion (text-to-image) have not only increased the capabilities of speech and visual AI tools, but also reduced the resources required to use them (from 175 billion parameters for GPT-3 to 900 million for Stable Diffusion).

Stable Diffusion’s size means it needs less than 5 GB of disk space, small enough to run on any laptop. Not only that: unlike the models from OpenAI (which is primarily funded by Microsoft and publishes GPT-3, DALL-E, and Whisper), Stable Diffusion is open source, meaning others can more easily build on its lessons. That means we are only seeing the beginning of the innovation cycle; there is a lot more to come, as Nvidia’s paper now shows.
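The link between parameter count and disk footprint can be checked with back-of-the-envelope arithmetic. The sketch below is an illustration only: it assumes the weights are stored as plain 32-bit or 16-bit floats and ignores any file-format overhead, and the helper name `weights_size_gb` is made up for this example. The parameter counts are the ones quoted in the text.

```python
def weights_size_gb(num_params: int, bytes_per_param: int) -> float:
    """Approximate size of the raw model weights in GB (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts as quoted in the article.
MODELS = {
    "Stable Diffusion": 900_000_000,
    "GPT-3": 175_000_000_000,
}

# Assumed storage precisions (bytes per parameter).
PRECISIONS = {"float32": 4, "float16": 2}

if __name__ == "__main__":
    for model, params in MODELS.items():
        for precision, width in PRECISIONS.items():
            size = weights_size_gb(params, width)
            print(f"{model:>16} @ {precision}: {size:,.1f} GB")
```

Under these assumptions, 900 million 32-bit parameters come to roughly 3.6 GB, which is consistent with the "less than 5 GB" figure, while GPT-3 at the same precision would need hundreds of gigabytes.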


Supporters of Stable Diffusion are advancing this trend by providing technical and financial grants to other teams exploring new directions. In addition, a plethora of projects make the tools available to an ever-widening range of users. Among them are plugins for Blender, an open-source design tool, and for Adobe’s proprietary equivalent, Photoshop. Full API access to the tools is funded with big venture capital dollars, meaning that hundreds of millions of software developers, not just a few hundred data engineers, will now build their own tools on these algorithms.

Speech, images and text are among the first domains to be disrupted by these technologies, but 3D isn’t far behind. Beyond niche generative art, cartoons are an obvious first point of application. There is already a Pokemon generator based on Stable Diffusion. Visual effects and movies are next. But many other industries are likely to be disrupted, including interior design.

In all this excitement, applying these innovations to design and engineering seems like an afterthought. Still, it will probably be the most affected area in the end. Of course there are teething problems. First, Stable Diffusion and its peers are not very precise yet. This isn’t a problem for cartoons, but it’s a major challenge for any attempt to convert text into the full 3D geometry used in industrial contexts. This is an area in which there is some interest (a project called bits101 was launched in Israel in 2015).

This may be the holy grail of the industry, but there are many intermediate challenges that are much easier to solve. These include better object recognition (the YOLO algorithm is already being used with great success), which will lead to better quoting and annotation, improving quality and reducing errors. Plugins should also make it easy to use generative AI to develop basic designs (primitives), which can be further edited in the design tool to improve tolerances as needed. This is the approach already used in Altair’s Inspire, which uses finite element analysis for this. These primitives can also serve as synthetic databases of annotated models, which the 3D CAD industry lacks. Physna’s CEO and founder describes in an article his own efforts to use these new methods to create detailed 3D designs, also highlighting the many pitfalls of using synthetic data to drive these algorithms. Creating 3D designs from 2D images is another possible application field, as is intelligent CAM that draws on a tool-wear library to determine the best machining strategies.


These challenges are important and fascinating to tackle on their own. But their main impact will be to help evolve the idea-to-design journey by reducing reliance on 3D designs to convey ideas. Design, whether 2D or 3D, has served as the primary means of translating customer requirements into final products. This limits the industry because these designs act as a black box that stores all the valuable customer insight, production constraints and business objectives, none of which can be isolated, much less identified. This means that when something changes, it is virtually impossible to adjust the design. This is why manufacturing innovations like 3D printing take so long to be adopted and frustrate short-term investors time and time again. Despite a productive life of more than 20 years, the components that make up an aircraft are “set” from the moment they are designed. There is almost no room for innovation; improvements have to wait for the launch of the next generation.

Being able to change a constraint and let algorithms such as Stable Diffusion reorganize design and manufacturing parameters will greatly accelerate the adoption of new innovations and allow us to build lighter, higher-performing products faster. As in Formula 1 or systems design, engineers of the future will act as managers able to express, in words and in terms of data sources, what the objectives and limitations of a product are.
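To make the idea of "changing a constraint" concrete, here is a deliberately tiny sketch in which requirements live as explicit data next to a set of candidate designs, rather than being baked into one fixed geometry. Everything in it is hypothetical: the class names, the `best_design` helper, the part names and all numbers are invented for illustration, and a simple filter-and-pick step stands in for whatever generative algorithm would do the real work.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """A hypothetical design option with a few illustrative properties."""
    name: str
    mass_kg: float
    peak_stress_mpa: float
    process: str


@dataclass(frozen=True)
class Requirements:
    """Objectives and limits expressed as data instead of geometry."""
    stress_limit_mpa: float
    allowed_processes: frozenset


def best_design(candidates: Iterable[Candidate],
                req: Requirements) -> Optional[Candidate]:
    """Return the lightest candidate that satisfies every requirement."""
    feasible = [
        c for c in candidates
        if c.peak_stress_mpa <= req.stress_limit_mpa
        and c.process in req.allowed_processes
    ]
    return min(feasible, key=lambda c: c.mass_kg) if feasible else None


# Invented example data.
CANDIDATES = [
    Candidate("machined bracket", 1.2, 180.0, "cnc"),
    Candidate("printed lattice bracket", 0.7, 220.0, "3d_print"),
    Candidate("cast bracket", 1.5, 150.0, "casting"),
]

if __name__ == "__main__":
    today = Requirements(250.0, frozenset({"cnc", "casting"}))
    print(best_design(CANDIDATES, today).name)

    # One constraint changes: 3D printing becomes an approved process.
    tomorrow = Requirements(250.0, frozenset({"cnc", "casting", "3d_print"}))
    print(best_design(CANDIDATES, tomorrow).name)
```

The point of the design choice is that the requirement set can be edited on its own: once 3D printing is approved, the lighter option is selected without anyone redrawing a part.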

Without speeding up the engineering process for new and existing products in this way, we have almost no means of achieving the ambitious sustainability goals we have to set ourselves. To do this, we first need to agree on a language we can use to communicate outside of design. This new semantic model is a clear departure from the innovations mentioned above. Several companies are already experimenting with it, such as nTopology with its concept of Fields. And yet the pace of change is slow, unlike that of the algorithms that will feed the semantic model. Nvidia’s new algorithm is reportedly twice as fast as DreamFusion, published less than 2 months ago. Product and engineering companies need to work on capturing their ideas in new, future-proof ways to maximize the potential of this explosion of generative AI. The speed of change in algorithms has once again shown that Moore’s law applies wherever devices are digitized. What remains, despite the urgency of the task, is our human inability to adapt to this change and to deploy new communication methods that can unlock its potential.
