I'm having trouble understanding why you would want to do this. A good interface between what I want and the model I will make is to draw a picture, not write an essay. This is already (more or less) how Solidworks operates. AI might be able to turn my napkin sketch into a model, but I would still need to draw something, and I'm not good at drawing.
The bottleneck continues to be having a good enough description to make what you want. I have serious doubts that even a skilled person will be able to do it efficiently with text alone. Some combo of drawing and point+click would be much better.
This would be useful for short enough tasks like "change all the #6-32 threads to M3" though. To do so without breaking the feature tree would be quite impressive.
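To make the thread-swap idea concrete, here is a minimal Python sketch. It assumes threaded holes are stored as features carrying a callout string rather than modeled geometry. The `Feature` class, its fields, and `swap_threads` are hypothetical names invented for illustration, not any CAD package's API.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature-tree node; not a real CAD package's data model."""
    feature_id: int
    kind: str                         # e.g. "extrude", "hole"
    thread: Optional[str] = None      # thread callout, e.g. "#6-32"
    parent_id: Optional[int] = None   # feature this one is built on


def swap_threads(tree: List[Feature], old: str, new: str) -> List[Feature]:
    """Return a new tree with every `old` thread callout changed to `new`.

    Feature IDs and parent links are copied unchanged, so references
    between features stay valid.
    """
    return [replace(f, thread=new) if f.thread == old else f for f in tree]


tree = [
    Feature(1, "extrude"),
    Feature(2, "hole", thread="#6-32", parent_id=1),
    Feature(3, "hole", thread="1/4-20", parent_id=1),
    Feature(4, "hole", thread="#6-32", parent_id=1),
]
updated = swap_threads(tree, "#6-32", "M3")
```

The hard part in a real model is everything this sketch leaves out: drill diameters, clearances, and any downstream feature that depends on the old hole size would also have to be updated.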
Those are the kind of high level questions that an LLM with a decent understanding of CAD and design might be able to deal with soon and it will help speed up expensive design iterations.
A neat trick with current LLMs is to give them screenshots of web pages and ask some open questions about the design, information flow, etc. It will spot things that expert designers would comment on as well. It will point out things that are unclear, etc. You can go far beyond just micro managing incremental edits to some thing.
Mostly the main limitation with LLMs is the imagination of the person using it. Ask the right questions and they get a lot more useful. Even some of the older models that maybe weren't that smart were actually quite useful.
For giggles, I asked ChatGPT to critique the design of HN. Not bad. https://chatgpt.com/share/6809df2b-fc00-800e-bb33-fe7d8c3611...
Completely agree.
We get waves of comments on HN downplaying model abilities or their value.
Many people don’t seem to explore and experiment with them enough. I have 3 screens. The left one has two models on it. The right one has a model & a web browser for quick searches. I work on the largest middle screen.
Extreme maybe, but I use them constantly resulting in constant discovery of helpful new uses.
I web search maybe 10% of what I did six months ago.
The quirks are real, but the endless upsides models deliver when you try things were unobtainium, from humans or machines, until LLMs.
>I web search maybe 10% of what I did six months ago.
Me too, though this is more driven by the total cliff-fall of web search result quality
> It will point out things that are unclear, etc. You can go far beyond just micro managing incremental edits to some thing.
When prompted, an LLM will also point out problems when the design is perfectly clear. An LLM is just text prediction, not magic.
Yes, indeed.
But:
Why can LLMs generally write code that even compiles?
While I wouldn't trust current setups, there's no obvious reason why even a mere LLM cannot be used to explore the design space when the output can be simulated to test its suitability as a solution — even in physical systems, this is already done with non-verbal genetic algorithms.
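A toy version of "propose, simulate, keep what works" might look like the sketch below. It is a simple mutate-and-select loop (a (1+λ) evolutionary search, which is simpler than a full genetic algorithm with crossover) over the cross-section of a cantilever beam. The load, length, material stiffness, and deflection limit are made-up numbers, and the standard beam formula stands in for the simulator.

```python
import random

FORCE_N = 100.0            # tip load (assumed)
LENGTH_M = 0.2             # beam length (assumed)
E_PA = 70e9                # Young's modulus, roughly aluminium
MAX_DEFLECTION_M = 0.001   # allowed tip deflection (assumed)


def simulate_deflection(width_m, height_m):
    """Tip deflection of a rectangular cantilever: F*L^3 / (3*E*I)."""
    inertia = width_m * height_m ** 3 / 12.0
    return FORCE_N * LENGTH_M ** 3 / (3.0 * E_PA * inertia)


def cost(design):
    """Cross-section area, plus a large penalty if the beam bends too much."""
    width_m, height_m = design
    area = width_m * height_m
    if simulate_deflection(width_m, height_m) > MAX_DEFLECTION_M:
        return area + 1.0  # infeasible designs always lose to feasible ones
    return area


def mutate(design, rng):
    """Perturb each dimension by up to +/-10%, clamped to 2..50 mm."""
    return tuple(min(0.05, max(0.002, d * rng.uniform(0.9, 1.1))) for d in design)


def evolve(generations=200, population=20, seed=0):
    """Keep the best design seen so far and mutate it each generation."""
    rng = random.Random(seed)
    best = (0.02, 0.02)  # starting guess: 20 mm x 20 mm, which is feasible
    for _ in range(generations):
        candidates = [best] + [mutate(best, rng) for _ in range(population)]
        best = min(candidates, key=cost)
    return best


best_width, best_height = evolve()
```

The proposals here are random perturbations. The argument in the comment is that an LLM could take the place of `mutate` while the simulator still decides which candidates survive.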
> LLM is just text prediction, not magic
"Sufficiently advanced technology is indistinguishable from magic".
Saying "just text prediction" understates how big a deal that is.
Having to test every assertion sounds like a not particularly useful application, and the more variables there are, the more it seems to be about throwing completely random things at the wall and hoping something works.
You should use a tool for its purpose. Relying on text prediction to judge clarity is like relying on Teams icons being green to measure actual productivity: a very vague factor that only incidentally coincides.
You could use a text predictor for things that rely on "how would this sentence usually complete" and get right answers. But that is a very narrow field; I can mostly imagine entertainment benefiting a lot.
You could misuse a text predictor for things like "is this <symptom> alarming?" and get a response that is statistically likely in the training material but could be the complete opposite of what is right for the person asking, again with a very high cost for failing at something it was never meant to do. You can often demonstrate the trap by re-rolling the answer to any question a couple of times and seeing how it varies from mildly different to completely reversed depending on whatever seed you land on.
Here on HN we often see posts insisting on the importance of "first principles".
Your embrace of "magic" - an unknown black box that does seemingly wonderful things that usually blow up in one's face and have a hidden cost - is the opposite of that.
LLMs are just text prediction. That's what they are.
>Why can LLMs generally write code that even compiles?
Why can I copy-paste code and it compiles?
Try to use an LLM on code there is little training material about - for example PowerQuery or Excel - and you will see it bullshit and fail - even Microsoft's own LLM.
It reads like a horoscope to me.
That's a mega-yikes for me.
Go ahead and do something stupid like that for CEO or CTO decisions, I don't care.
But keep it out of industrial design, please. Lives are at stake.
I suspect the next step will be such a departure that it won't be Siemens, Dassault, or Autodesk that do it.
In the text to CAD ecosystem we talk about matching our language/framework to “design intent” a lot. The ideal interface is usually higher level than people expect it to be.
Most parts need to fit with something else, usually some set of components. Then there are considerations around draft, moldability, size of core pins, sliders, direction of ejection, wall thickness, coring out, radii, ribs for stiffness, tolerances...
LLMs seem far off from being the right answer here. There is, however, lots to make more efficient. Maybe you could tokenize breps in some useful way and see if transformers could become competent speaking in brep tokens? It's hand-wavy but maybe there's something there.
Mechanical engineers do not try to explain models to each other in English. They gather around Solidworks or send pictures to each other. It is incredibly hard to explain a model in English, and I don't see how a traditional LLM would be any better.
Don't dismiss an AI tool just because the first iterations aren't useful, it'll be iterated on faster than you can believe possible.
What works is asking them to implement a micro feature that you can specify well enough on the first try, not asking them to write the entire piece of software from top to bottom. The tech is clearly not there yet for the latter.
The main difference between code and CAD is that code is already a language you write for the machine to execute, so it's pretty natural to use a more abstract, natural language to ask for it instead of the formal one, whereas CAD is a visual, almost physical task, and it's more pleasant to do the task than to describe it in depth with words.
With vague specifications like these, you'd get garbage from a human too.
What works for software, and I suspect for other technical fields like CAD too, is to treat it like a junior developer who has an extreme breadth of knowledge but not much depth. You will need to take care to clearly specify your requirements.
You'll never have better input than this at the beginning of any project from the person that brings the use-case. That's a full job to help them define the needs more accurately. And if you always work with clear specifications it's just because there's someone in front of you that has helped write the spec starting from the loose business requirement.
> You will need to take care to clearly specify your requirements
Yes, but as I discussed above, for such tasks it's going to be very frustrating and less efficient than doing things yourself. The only reason you'd go through this kind of effort for an intern is that you expect them to learn and become autonomous at some point. With current tech, an LLM will forever remain as clueless as it started.
That's as may be, but again, it's not much different to being a software developer.
Someone might ask you to create a website for their business. It's your job, as the expert, to use the available tools - including AI - to turn their requirements into working code. They might say "put a button for the shopping cart into the top right". It's then your job, as the technical expert, to get that done. Even the laziest of devs wouldn't expect to just copy/paste that request into an AI tool and get a working result.
It takes time to learn to use these tools.
When I'm using AI to help me write code, depending on the complexity of what I'm working on, I generally write something very similar to what I'd write if I was asking other developers for help (although I can be much terser). I must specify the requirements very clearly and in technical language.
Usually I keep a basic prompt for every project that outlines the technical details, the project requirements, the tools and libraries being used, and so on. It's exactly the same information I'd need to document for another human working on the project (or for myself a year later) so there's no wasted work.
For some reason they imagine it as a daunting, complicated, impenetrable task with many insurmountable pitfalls, be it the interface, the general idea of how it operates, or fear of unknown details (tolerances, clearances).
It's easy to underestimate the knowledge required to use CAD productively.
One such piece of anecdata near me: high schools that buy 3D printers and think pupils will naturally want to print models. After the initial days of fascination, the printers stopped being used at all. I've heard from a person close to education that it's a country-wide phenomenon.
Back to the point though - maybe there's a group of users who want to create but just can't do CAD at all, and such text descriptions seem perfect for them.
I miss the TechShop days, from when the CEO of Autodesk liked the maker movement and supplied TechShop with full Autodesk Inventor. I learned to use it and liked it. You can still get Fusion 360, but it's not as good.
The problem with free CAD systems is that they suffer from the classic open source disease - a terrible user interface. Often this is patched by making the interface scriptable or programmable or themeable, which doesn't help. 3D UI is really, really hard. You need to be able to do things such as change the viewpoint and zoom without losing the current selection set, using nothing but a mouse.
(Inventor is overkill for most people. You get warnings such as "The two gears do not have a relatively prime number of teeth, which may cause uneven wear.")
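The reasoning behind that warning is easy to check. When the tooth counts share a common factor, each tooth only ever meets a small, fixed subset of teeth on the mating gear, so any defect wears the same spots repeatedly. A small Python sketch (the function names are my own, not Inventor's):

```python
from math import gcd


def teeth_are_coprime(teeth_a: int, teeth_b: int) -> bool:
    """True if the two tooth counts share no common factor other than 1."""
    return gcd(teeth_a, teeth_b) == 1


def distinct_partners(teeth_a: int, teeth_b: int) -> int:
    """How many different teeth on gear B a single tooth on gear A ever meets."""
    return teeth_b // gcd(teeth_a, teeth_b)


# 20:40 gearing: every tooth on the small gear meets only 2 teeth on the large one.
print(teeth_are_coprime(20, 40), distinct_partners(20, 40))
# 20:41 gearing: every tooth eventually meets all 41 teeth, spreading the wear.
print(teeth_are_coprime(20, 41), distinct_partners(20, 41))
```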
I very much want Solvespace to be the tool for those people. It's very easy to learn and do the basics. But some of the bugs still need to get fixed (failures tend to be big problems for new users because without experience it's hard to explain what's going wrong or find a workaround) and we need a darn chamfer and fillet tool.
Probably not. "Copyright 2008-2022 SolveSpace contributors. Most recent update June 2 2022."
One thing that is interesting here is that you can read faster than TTS can speak, but you can speak much faster than you can type. So is all that typing the problem, or could it just be an interface problem? And in your example, you could also just draw with your hand (wrist sensor) and talk.
I've been using agents to code this way. It's way faster.
Most of the mechanical people I've met are good at talking with their hands. "take this thing like this, turn it like that, mount it like this, drill a hole here, look down there" and so on. We still don't have a good analog for this in computers. VR is the closest we have and it's still leagues behind the Human Hand mk. 1. Video is good too, but you have to put in a bit more attention to camerawork and lighting than taking a selfie.
You would be amazed at how much time CAD users spend using Proprietary CAD Package A to redraw things from PDFs generated by Proprietary CAD Package B.
"An aerodynamically curved plastic enclosure for a form-over-function guitar amp."
Then you get something with the basic shapes and bevels in place, and adjust it in CAD to fit your actual design goals. Then,
"Given this shape, make it easy to injection mold."
Then it would smooth out some things a little too much, and you'd fix it in CAD. Then, finally,
"Making only very small changes and no changes at all to the surfaces I've marked as mounting-related in CAD, unify my additions visually with the overall design of the curved shell."
Then you'd have to fix a couple other things, and you'd be finished.
For the guitar amp, OK. Maybe that prompt will give you a set of surfaces you can scale for the exterior shell of the amp. Because you will need to scale it, or know exactly the dimensions of your speakers, internal chambers, electronics, I/O, baffles, and where those will all be relative to each other. Also... Do you need buttons? Jacks/connectors/other I/O? How and where will the connections be routed to other components? Do you need an internal structure with an external aesthetic shell? Or are you going to somehow mold the whole thing in one piece? Where should the part be split? What kind of fasteners will join the parts and where should they be joined? What material is the shell? Can it be thinner to save weight? Or does it need ribs or thickness for strength? Where does it need to be strong?
These are the issues from 30 seconds of thinking about this. AI (as suggested) could maybe save me from surfacing an exterior cosmetic cover, given precise constraints and dimensions, but at that point, I may as well just do it myself.
If you have a common, easy, already-solved mechanical design problem (e.g. a hinge), then you buy an off-the-shelf component. Everything else is bespoke, and every detail matters. Every problem is a "wine glass full to the brim".
In MCAD it’s less of a problem because all the big vendors like Misumi, McMaster, et al have extensions or downloadable models but anything custom could probably benefit from LLMs (I say this as someone who is generally skeptical of their vision capabilities). I don’t think vibe CADing will work because most parts are too parametrized but giving an AI a bunch of PDFs and a size + thickness is probably going to be really productive.
For instance: My modelling abilities are limited. I can draw what I want, with measurements, but I am not a draftsman. I can also explain the concept, in conversational English, to a person who uses CAD regularly and they can hammer out a model in no time. This is a thing that I've done successfully in the past.
Could I just do it myself? Sure, eventually! But my modelling needs are very few and far between. It isn't something I need to do every day, or even every year. It would take me longer to learn the workflow and toolsets of [insert CAD system here] than to just earn some money doing something that I'm already good at and pay someone else to do the CAD work.
Except maybe in the future, perhaps I will be able to use the bot to help bridge the gap between a napkin sketch of a widget and a digital model of that same widget. (Maybe like Scotty tried to do with the mouse in Star Trek IV.)
(And before anyone says it: I'm not really particularly interested in becoming proficient at CAD. I know I can learn it, but I just don't want to. It has never been my goal to become proficient at every trade under the sun and there are other skills that I'd rather focus on learning and maintaining instead. And that's OK -- there's lots of other things in life that I will probably also never seek to be proficient at, too.)
I don't get your point (and yes I use CAD programs myself).
I said this below, but most of the mechanical people I've met are good at talking with their hands. "take this thing like this, turn it like that, mount it like this, drill a hole here, look down there" and so on. We still don't have a good analog for this in computers. VR is the closest we have and it's still leagues behind the Human Hand mk. 1. Video is good too, but you have to put in a bit more attention to camerawork and lighting than taking a selfie.
Oh wait, that's CAD.
Cynical take aside, I think this could be quite useful for normal people making simple stuff, and could really help consumer 3D printing have a much larger impact.
https://seanmcloughl.in/3d-modeling-with-llms-as-a-cad-luddi...
It gets pretty confused about the rotation of some things and generally needs manual fixing. But it kind of gets the big picture sort of right. It mmmmayybe saved me time the last time I used it but I'm not sure. Fun experiment though.
>I went with my colleague Keith Bradsher to Zeekr, one of China’s new car companies. We went into the design lab and watched the designer doing a 3D model of one of their new cars, putting it in different contexts — desert, rainforest, beach, different weather conditions.
>And we asked him what software he was using. We thought it was just some traditional CAD design. He said: It’s an open-source A.I. 3D design tool. He said what used to take him three months he now does in three hours.
[0] https://www.nytimes.com/2025/04/15/opinion/ezra-klein-podcas...
Not that I don't believe it's possible. I just think the alternative (that it's bullshit) is more likely.
Unfortunately I tried to generate OpenSCAD a few times to make more complex things and it hasn't been a great experience. I just tried o3 with the prompt "create a cool case for a Pixel 6 Pro in openscad" and, even after a few attempts at fixing, still had a bunch of non-working parts with e.g. the USB-C port in the wrong place, missing or incorrect speaker holes, a design motif for the case not connected to the case, etc.
It reminds me of ChatGPT in late 2022 when it could generate code that worked for simple cases but anything mildly subtle it would randomly mess up. Maybe someone needs to finetune one of the more advanced models on some data / screenshots from Thingiverse or MakerWorld?
For mechanical design, 3D modeling is highly integrative: the inputs come from a vast array of poorly specified sources with a large amount of unspecified and fluid contextual knowledge, and the outputs are not well defined either. I'm not convinced that mechanical design is particularly well suited to pairing with an LLM workflow. Certain aspects, sure. But 3D models and drawings that we consider "well-defined" are still usually quite poorly defined, and from necessity rely heavily on implicit assumptions.
The geometry of machine threads, for example. Are you going to have a big computer specify the position of each of the atoms in the machine thread? Even the most detailed CAD/CAM packages have thread geometry extremely loosely defined, to the point of listing the callout, and not modeling any geometry at all in many cases.
It would just be very difficult to feed enough contextual knowledge into an LLM to have the knowledge it needs to do mechanical design. Therein lies the main problem. And I will stress that it's not a training problem, it's a prompt problem, if that makes sense.
They will get you to 80% fast. The last 20%, to match what is in your head, is hard.
If you never walked the long path, you probably won’t manage to go the last few steps.
The call to action at the end is: "Try out Text-to-CAD in our Modeling App" But that's like the last thing I want to do. Even when I'm working with very experienced professionals, it's really hard to tell them what exactly I want to see changed in their 3D CAD design. That's why they usually export lots of 2D drawings and then I will use a pencil to draw on top of it and then they will manually update the 3D shape to match my drawn request. The improvement that I would like to see in affordable CAD software is that they make it easier to generate section views and ideally the software would be able to back-propagate changes from 2D into the 3D shape. Maybe one day that will be possible with multimodal AI models, but even then the true improvement is going to be in the methods that the AI model uses internally to update the data. But trying to use text? That's like bringing a knife to a gunfight. It's obviously the wrong modality for humans to reason about shapes.
Also, as a more general comment, I am not sure that it is possible to introduce a new CAD tool with only subscription pricing. Typically, an enclosure design will go through multiple variations over multiple production runs in multiple years. That means it's obvious to everyone that you need your CAD software to continue working for years into the future. For a behemoth like Autodesk, that is believable. For a startup with a six month burn rate, it's not. That's why people treat startups with subscription pricing like vaporware.
Maybe there could be a mating/assembly eval in the future that would work towards that?
Therefore I'm working on LuaCAD (https://github.com/ad-si/LuaCAD), which is a Lua frontend for OpenSCAD.
Once that is done then ask LLM to create a prompt and compare outputs etc..
I much prefer the direction of sculpting with my hands in VR, pulling the dimensions out with a pinch, snapping things parallel with my fine motor control. Or sketching on an iPad, just dragging a sketch to extrude it along its normal, etc. These UIs could be vastly improved.
I get that LLMs are amazing lately, but perhaps keep them somewhere under the hood where I never need to speak to them. My hands are bored and capable of a very high bandwidth of precise communication.
If the model could plan ahead well, set up good functions, pull from standard libraries, etc., it would be instantly better than most humans.
If it had a sense of real-world applications, physics, etc., well, it would be superhuman.
Is anyone working on this right now? If so I'd love to contribute.
Hard to beat the mindshare of OpenSCAD at the moment though.
Good to hear that newer models are getting better at this. With evals and RL feedback loops, I suspect it's the kind of thing that LLMs will get very good at.
Vision language models can also improve their 3D model generation if you give them renders of the output: "Generating CAD Code with Vision-Language Models for 3D Designs" https://arxiv.org/html/2410.05340v2
OpenSCAD is primitive. There are many libraries that may give LLMs a boost. https://openscad.org/libraries.html
I can see AI being used to generate geometry, but not a text based one, it would have to be able to reason with 3d forms and do differential geometry.
You might be able to get somewhere by training an LLM to make models with a DSL for Open Cascade, or any other sufficiently powerful modelling kernel. Then you could train the AI to make query based commands, such as:
// places a threaded hole at every corner of the top surface (maybe this is an enclosure)
CUT hole(10mm,m3,threaded) LOCATIONS surfaces().parallel(Z).first().inset(10).outside_corners()
This has a better chance of being robust as the LLM would just have to remember common patterns, rather than manually placing holes in 3D space, which is much harder.
The long prompts are primarily an artifact of trying to make an eval where there is a "correct" STL.
I think your broader point, that text input is bad for CAD, is also correct. Some combo of voice/text input + using a cursor to click on geometry makes sense. For example, clicking on the surface in question and then asking for "M6 threaded holes at the corners". I think a drawing input also makes sense as it's quite quick to do.
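The query-style hole placement in the DSL example above (pick the face parallel to Z, inset it, take its corners) can be sketched in plain Python. This is a minimal illustration under strong assumptions: faces are axis-aligned rectangles with their normal along Z, and `RectFace`, `top_face`, and `hole_locations` are hypothetical names, not the Open Cascade API.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass(frozen=True)
class RectFace:
    """Axis-aligned rectangular face with its normal along Z (hypothetical)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    z: float

    def inset(self, distance: float) -> "RectFace":
        """Shrink the rectangle by `distance` on every side."""
        if 2 * distance >= min(self.x_max - self.x_min, self.y_max - self.y_min):
            raise ValueError("inset is too large for this face")
        return RectFace(self.x_min + distance, self.x_max - distance,
                        self.y_min + distance, self.y_max - distance, self.z)

    def corners(self) -> List[Point]:
        """The four corner points of the rectangle."""
        return [(x, y, self.z)
                for x in (self.x_min, self.x_max)
                for y in (self.y_min, self.y_max)]


def top_face(faces: List[RectFace]) -> RectFace:
    """Select the highest Z-parallel face, e.g. the lid of an enclosure."""
    return max(faces, key=lambda f: f.z)


def hole_locations(faces: List[RectFace], inset_mm: float) -> List[Point]:
    """Hole centers at the corners of the top face, inset from the edges."""
    return top_face(faces).inset(inset_mm).corners()


# A 100 x 60 x 30 mm box, described only by its bottom and top faces.
box = [RectFace(0, 100, 0, 60, 0), RectFace(0, 100, 0, 60, 30)]
holes = hole_locations(box, 10)
```

The point of the pattern is that the caller never writes coordinates. If the box dimensions change, the same query still yields sensible hole positions.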
I had the same thought recently and designed a flexible bracelet for Pi Day using OpenSCAD and a mix of some of the major AI providers. It's cool to see other people are doing similar projects. I'm surprised how well I can do basic shapes in OpenSCAD with these AI assistants.
Being just a domestic 3D printer enthusiast, I have no idea what the real-world issues are in manufacturing with CNC mills; I'd personally enjoy an AI telling me which of the 1000 possible combinations of line width, infill %, temperatures, speeds, wall generation params etc. to use for a given print.
I wonder if the models' improved image understanding also leads to better spatial understanding.
This is interesting. As foundational models get better and better, does having proprietary data lose its defensibility more?
I took measurements.
I provided contours.
Still have a long way to go. https://github.com/itomato/EmateWand
Curious if the real unlock long-term will come from hybrid workflows: LLMs proposing parameterized primitives, humans refining them in the UI, then LLMs iterating on feedback. Kind of like pair programming, but for CAD.
Yes to your thought about the hybrid workflows. There's a lot of UI/UX to figure out about how to go back and forth with the LLM to make this useful.
I think it's correct that new workflows will need to be developed, but I also think that codeCAD in general is probably the future. You get better scalability (share libraries for making parts, rather than the data), better version control, more explicit numerical optimization, and the tooling can be split up (i.e. when programming, you can use a full-blown IDE, or you can use a text editor and multiple individual tools to achieve the same effect). The workflow issue, at least to me, is common to all applications of LLMs, and something that will be solved out of necessity. In fact, I suspect that improving workflows by adding multiple input modes will improve model performance on all tasks.
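As a small illustration of the "share libraries for making parts, rather than the data" point, here is a hedged Python sketch of a parametric part that emits OpenSCAD source. The function name `open_box_scad` and its parameters are made up for this example; `difference`, `cube`, and `translate` are standard OpenSCAD primitives.

```python
def open_box_scad(width: float, depth: float, height: float, wall: float = 2.0) -> str:
    """Return OpenSCAD source for an open-topped box with the given wall thickness."""
    if wall * 2 >= min(width, depth) or wall >= height:
        raise ValueError("wall is too thick for the given outer dimensions")
    inner_w = width - 2 * wall
    inner_d = depth - 2 * wall
    # The inner cube starts one wall thickness up and pokes out of the top,
    # so subtracting it leaves a floor and four walls.
    return (
        "difference() {\n"
        f"  cube([{width}, {depth}, {height}]);\n"
        f"  translate([{wall}, {wall}, {wall}])\n"
        f"    cube([{inner_w}, {inner_d}, {height}]);\n"
        "}\n"
    )


scad_source = open_box_scad(40, 30, 20)
print(scad_source)
```

Because the part is a function, a change of wall thickness shows up as a one-line diff in version control, and the same function can be called inside an optimization loop or imported by someone else's project.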