CEO @Outerport (YC S24), ex Research Scientist @NVIDIA, AI for design and engineering

San Fransokyo
Our collaboration with TOPPAN was featured! We're going to build the world's strongest human-in-the-loop data structuring platform 💪
3
5
78
19,798
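"How to spot images that don't look AI-generated at first glance": AI image generators are weak at 3D geometric understanding, so if you draw the lines that should continue behind occluding objects like this, the inconsistencies are obvious at a glance. In fact, undergrad-level computer vision techniques like RANSAC and the Hough transform can catch this automatically to some extent!
I wondered if I could get that "写ルンです" (disposable film camera) look out of StableDiffusion, tried a bunch of things, and it came out far more convincing than I expected, to the point that it's scary... Nobody would think this is an AI image at first sight... #StableDiffusion #AIImage #写ルンです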
81
6,593
24,307
4,217,229
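A rough sketch of the RANSAC half of the check described above, in plain Python. It assumes edge points have already been extracted on both sides of an occluder, fits a line to the visible points on one side, and tests whether the points on the other side lie on that same line. The function names, the tolerance, and the 80% threshold are illustrative assumptions, not taken from the post.

```python
import math
import random


def fit_line_ransac(points, n_iters=200, tol=0.5, seed=0):
    """Fit a 2D line a*x + b*y + c = 0 (unit normal) to points with RANSAC."""
    rng = random.Random(seed)
    best_line, best_count = None, -1
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2  # normal of the line through the two samples
        norm = math.hypot(a, b)
        if norm == 0:  # identical points define no line
            continue
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Count points within `tol` of the candidate line.
        count = sum(abs(a * x + b * y + c) <= tol for x, y in points)
        if count > best_count:
            best_line, best_count = (a, b, c), count
    return best_line


def continues_behind_occluder(left_pts, right_pts, tol=0.5, min_ratio=0.8):
    """True if the line fitted on the left side also explains the right side."""
    line = fit_line_ransac(left_pts, tol=tol)
    if line is None:
        return False
    a, b, c = line
    hits = sum(abs(a * x + b * y + c) <= tol for x, y in right_pts)
    return hits / len(right_pts) >= min_ratio


# Hypothetical edge points: a rail visible on both sides of an occluder.
left = [(x, 0.5 * x + 1) for x in range(10)]
right_consistent = [(x, 0.5 * x + 1) for x in range(20, 30)]
right_shifted = [(x, 0.5 * x + 4) for x in range(20, 30)]  # jumps by 3 px
```

A real pipeline would first get candidate segments from an edge detector and a Hough transform; this only illustrates the consistency test itself.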
Using a vision transformer and a speaker to vibrate a bottle to see inside of it
55
323
3,180
176,927
To celebrate the opening of Anthropic's new Tokyo office, they're apparently holding an event, so get in touch if you're interested~
353
104
1,312
241,797
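I'm not sure about everyone piling on like this. If the results are wrong, I think it's better to tell the authors directly than to spread it around laced with needless insults like "this is why Japan is hopeless" or "idiot research". It's a good thing that researchers from other fields are working hard to actively adopt deep learning.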
Replying to @yutakashino
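Kyoto University press release: kyoto-u.ac.jp/ja/research/re… Original paper: frontiersin.org/articles/10.… It's a shocking paper with no description of the physical process at all, the use of LeNet makes my head spin, and fiddling with the heatmap color scheme is just laughable... Honestly, I feel nothing but anger that tax money is being spent on things like this.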
5
319
1,186
NVIDIA's "world foundation AI" announced today at CES is already out as open source (Apache-2)!
3
178
976
130,168
CAD-Llama 🦙⚙️ Converts the DeepCAD dataset into an LLM-friendly text format and instruction fine-tunes a Llama-8B model for CAD generation and editing. Accuracy increases 4x relative to vanilla GPT4, 2x relative to instruction fine-tuning on a more 'raw' CAD representation
18
186
941
81,425
📢 Our new @NVIDIAAI paper is up on arXiv! I'm happy to share Neural Geometric Level of Detail, which enables the first ⚡real-time rendering of high quality neural #3d SDFs with a sparse octree feature volume and a tiny MLP! nv-tlabs.github.io/nglod/
25
193
847
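The company I founded has been accepted into Y Combinator!! Life has been full of ups and downs since I left NVIDIA, but I look forward to your continued support. @outerbasis is an "AI model management" company. We speed up loading of in-house LLMs and make security airtight. We're working hard toward building societal infrastructure!!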
21
57
865
139,248
Dataset of 100k STL files and G-code for 3D printing
13
76
793
57,692
I had a day to procrastinate over the holidays, so I made an open source research-oriented portfolio website template in Next.js and React, where you can update everything just by editing TypeScript objects, and it looks good on mobile
20
41
770
86,910
Great slides on verifiers for reasoning LLMs
5
53
685
102,555
Adding horizontal lines to images improves VLM (vision language model) performance on tasks like counting, visual search, spatial understanding, scene understanding, and more
17
72
649
82,263
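To make the preprocessing concrete, here is a minimal sketch that overlays evenly spaced horizontal lines on a grayscale image stored as a list of rows. The spacing, line value, and one-pixel thickness are placeholder assumptions; the post does not say which settings the paper uses.

```python
def add_horizontal_lines(image, spacing=4, value=0):
    """Return a copy of `image` with every `spacing`-th row set to `value`.

    `image` is a list of rows, each a list of grayscale pixel values.
    Row 0 is left untouched; lines are drawn at rows spacing, 2*spacing, ...
    """
    if spacing < 1:
        raise ValueError("spacing must be a positive integer")
    out = [row[:] for row in image]  # copy so the input stays unchanged
    for y in range(spacing, len(out), spacing):
        out[y] = [value] * len(out[y])
    return out


# Hypothetical 8x4 all-white image with a line every 3 rows.
white = [[255] * 4 for _ in range(8)]
lined = add_horizontal_lines(white, spacing=3, value=0)
```

The lined image would then be passed to the VLM in place of the original.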
I was just saying last week that Gaussian Splat + 3D generation would show up any time now, and here it is: dreamgaussian.github.io/ (image-to-mesh generation in 2 minutes!)
1
140
611
97,469
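Carnegie Mellon University's "LLM Systems" course. It starts from the basics of GPUs and transformers and also covers distributed training, long context, speculative decoding, and more. Great that all the slides are public too! (Personally I'd like them to cover more inference papers) llmsystem.github.io/llmsyste…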
1
75
635
40,295
Yeah document parsing is cool, but what about CAD drawing parsing... (but with documents too 🥺)
34
45
604
94,332
APT 🌹: an adaptive quadtree-like patching scheme for more efficient vision transformers. Speeds up inference and training by ~40% (with higher relative accuracy compared to just resizing)
11
56
588
38,026
Drawing2CAD🖌️ Transformer-based AI model that converts *vector* drawings (SVGs) to a DeepCAD sequence. The use of vector input over raster (image) inputs results in ~7% higher accuracy
4
65
579
34,975
Creating CAD directly from images (dimension sketches), point cloud, and text. This is a model from Spectral Labs - I've had the chance to experiment with their model and it's great stuff 😎
22
66
514
84,763
I went to a cafe to study (work) for the first time in a while to emulate being a college student, and observed tons of students working with ChatGPT open
1
117
471
104,014
Jensen revealed the @NVIDIAAI text-to-3D (mesh models!) strategy at the GTC Keynote today. I'm super proud to be a part of this epic effort alongside my amazing coworkers 😊 Find out more here: nvidia.com/en-us/gpu-cloud/p…
11
83
474
91,944
"3D Object Reconstruction and Representation using Neural Networks" published in GRAPHITE 2004 by Peng et al. proposed neural implicit surfaces ages before anyone else did. Visionary work (even had a cat!) which doesn't seem to get the credit it should!
3
96
473
📢 Our new SIGGRAPH 2022 paper is on arXiv!! I'm happy to share Variable Bitrate Neural Fields, which lets you download ultra compact & high fidelity & fast NeRFs and neural fields at ⚡ blazing speeds with progressive LOD! 🌐: nv-tlabs.github.io/vqad/ 🎥: piped.video/watch?v=Lh0CoTRN…
5
88
469
【📢NEWS】We've released NVIDIA Kaolin Wisp, a PyTorch research library for building fast, high-performance neural fields #neuralfields, useful for things like 3D scanning!! You can interactively train and spin around models like Instant-NGP and NGLOD in PyTorch. github.com/NVIDIAGameWorks/k…
2
116
474
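A paper showing that further training an LLM on viral Twitter posts makes it dumber (reasoning ability drops) and worse at reading comprehension (long-context understanding drops)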
6
114
481
43,248
Detecting 3D lines from gaussian splats
8
48
439
25,406
A paper on generating textures for 3D models from text has come out. I figured it would show up eventually, but the pace of research is as fast as ever! texturepaper.github.io/TEXTu…
1
105
406
46,584
Robots that navigate grocery stores with semantic 3D maps created by language-augmented gaussian splats
13
48
403
48,825
For some reason, the popular cannabis shop in Toronto is called "Tokyo Smoke"
3
45
372
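There's no Japanese edition yet, but lately I feel like reading Hands-on Large Language Models is the quickest way oreilly.com/library/view/han…
I figured it wasn't great to toss out a line like that and walk away, so I'll also write down specifically what I think you should do, and how much. First, read volumes 1 and 2 of ゼロから作るDeep Learning (Deep Learning from Scratch). This teaches you the basic mechanics of neural network training and inference. It's probably the most accessible book on how neural networks work, and the math is kept to a minimum. If you even vaguely remember high school math you should be able to follow it... Volume 2 covers RNNs rather than Transformers, but it helps with understanding the history of embeddings and language models. oreilly.co.jp/books/97848731… oreilly.co.jp/books/97848731… Once you understand the above, watch the three videos below. They're among the clearest explanations of the Transformer, and the visuals help too. The especially important parts are self-attention and the flow of inference. piped.video/KlZ-QmPteqM?si=8lis… piped.video/j3_VgCt18fA?si=0tVm… piped.video/mmWuqh7XDx4?si=vBIA… If you can picture the machinery underneath, then when new features or techniques come out you'll have a rough idea of how they're implemented. Once you have that picture, you can roughly guess the limits of what the feature can do, so you check your answers faster when you get hands-on. You avoid getting played and wasting your time. Even on the business side, just grasping this much should greatly reduce misunderstandings with engineers, so I think it'll stay useful for quite a long time. Give it a try.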
47
386
47,001
Predicting manufacturing costs directly from 2D CAD drawings using XGBoost on geometric features
12
62
379
30,917
📢 Our new library is on GitHub!! I'm happy to share NVIDIA Kaolin Wisp, a PyTorch library which provides building blocks and interactive apps to build your own complex neural fields. Presented with a nerf snippet of wisp development. 💻: github.com/NVIDIAGameWorks/k…
Are you interested in #NeRFs? NVIDIA Kaolin Wisp provides a framework and the building blocks for neural field research. nvda.ws/3Qkg7WT
5
77
363
📢Our new @NVIDIAAI paper is out! I'm happy to share Compact NGP with Learned Hash Probing, which offers NeRFs that are 3.5x smaller than Instant NGP at a ~1.26x training cost and equivalent (or faster) inference speeds! 🌐: nv-tlabs.github.io/compact-n… 🎥: piped.video/watch?v=3TEry8zL…
5
69
359
39,276
I got into Preferred Networks! So it looks like this year I'll be living in Japan for the first time.
12
26
336
📢Here's something cool we've been working on: efficient sparse octree primitives for 3D deep learning with PyTorch. This enables training NGLOD 3x faster with 30x less memory, which also means it can fit 30x more complex models! Code available here: github.com/nv-tlabs/nglod
Just released: 3D DL researchers can build on the latest algorithms to simplify and accelerate workflows using Kaolin PyTorch Library. Learn more: nvda.ws/3zKvgsD
2
54
318
I left my job at NVIDIA last year... and now I co-founded @outerbasis and started a position as ceo sararīman!! (also we're part of the current Y Combinator S24 batch 😎) With new tech come new formats, and with new formats come new distribution tech. We've seen this with cameras ➡️ JPEGs ➡️ social media and cgroups ➡️ containers ➡️ orchestration. Just like how cameras capture a slice of the photon stream around us, generative AI is a 'data camera' that allows us to snapshot the society that we live in. The new format from this capture device is model weights, which require tailored distribution solutions. @outerbasis is a registry for model weights that lets you store model weights and orchestrate their distribution onto GPU servers. The system uses tech optimized for streaming weights into the servers as fast as possible- so researchers can deploy updates without downtime and ops admins can orchestrate hot swapping between models. I've been super excited about streaming, compression, and neural networks for the last N years... but now I'm excited to build a system to support this ever more exciting ecosystem of foundation model weights!! Please reach out if this sounds interesting / useful to you!! (and you can find more details on our YC launch: ycombinator.com/launches/La9…) (The video is a small demo of our model hotswapping)
14
28
321
51,277
Something cool that doesn't get talked about enough is that we can do open vocabulary segmentation of 3D space
8
32
321
24,033
CVPR deadline life pro tip: you can directly generate LaTeX tables from Pandas pandas.pydata.org/pandas-doc…
2
52
303
Seek-CAD 🐳 DeepSeek R1-32B model for CAD generation _without_ fine-tuning through in-context learning and self-refinement via VLM (Gemini 2.0) feedback on the R1 CoT. Also a dataset with more ops than DeepCAD- chamfer, revolve, fillets, etc (but unfortunately closed source?)
1
52
306
19,866
Giving vision-language models the ability to reference the input image with segmentation masks
7
29
293
22,997
You Can't Manufacture a NeRF - an ICML position paper from CMU that discusses what it means to generate a 3D object that can be manufactured. Nothing too controversial and feels a bit more like a survey- but good reference to have for explaining this concept 🙂
2
30
281
23,432
Diffusion model to generate CAD sketches
3
32
267
18,740
5 years ago, we collected a small dataset of 3D SDFs. SDFs encode 3D surfaces via code that generates scalar fields- back then we ignored the "code" part and converted them into vector blobs. Today it feels painfully obvious that the real value was actually in the code.
3
18
266
15,384
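For readers unfamiliar with the "code" side of SDFs, here is a small illustrative sketch: a signed distance function returns the distance to a surface (negative inside), and shapes compose through ordinary function composition. The helper names are made up for this example and are not from the dataset mentioned above.

```python
import math


def sphere(radius):
    """SDF of a sphere centered at the origin."""
    def sdf(x, y, z):
        return math.sqrt(x * x + y * y + z * z) - radius
    return sdf


def translate(sdf, dx, dy, dz):
    """Move a shape by evaluating its SDF at the inverse-shifted point."""
    return lambda x, y, z: sdf(x - dx, y - dy, z - dz)


def union(a, b):
    """Union of two shapes: the nearer surface wins."""
    return lambda x, y, z: min(a(x, y, z), b(x, y, z))


# A unit sphere plus a smaller sphere shifted along x.
scene = union(sphere(1.0), translate(sphere(0.5), 2.0, 0.0, 0.0))
```

Sampling `scene` on a grid gives the scalar-field "blob" form, whereas the program above keeps the structure (primitives, transforms, booleans) that sampling throws away.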
🚨The source code for our #CVPR2021 Oral paper, Neural Geometric Level of Detail is now available on GitHub under the MIT License! GitHub: github.com/nv-tlabs/nglod Project Web: nv-tlabs.github.io/nglod/
📢 Our new @NVIDIAAI paper is up on arXiv! I'm happy to share Neural Geometric Level of Detail, which enables the first ⚡real-time rendering of high quality neural #3d SDFs with a sparse octree feature volume and a tiny MLP! nv-tlabs.github.io/nglod/
1
40
253
Finding objects in 3D space using 3D segmentation masks and VLMs operating on multi-view image sequences.
5
29
257
13,924
End-to-end design and optimization of mechanical components with AI code generation in Rhino Grasshopper (with topology optimization, constraint setting)
5
35
259
18,422
🦖🦖🦖 Rex-Omni: an open-weight foundation model for text-to-bounding-box that can do object detection from prompt, or from examples.
4
33
248
15,798
We have openings for a research engineering internship at NVIDIA to work on open source software (eg github.com/NVIDIAGameWorks/k…) with us. If you want to contrib to cutting edge research for 3D generative modeling, editing and nerfs, send me a DM/email! Undergrads welcome :)
6
52
235
51,219
🗽MiCADangelo🗽: AI-assisted reverse engineering of physical objects (3D scan -> CAD). Employs a pretty unique approach via 1. CNN which selects sketch planes from proposals and 2. CNN for sketch parameterization (pretrained on SketchGraphs) and 3. differentiable extrusions
4
34
245
15,452
There are actually a lot of companies in East Asia that offer LiDAR hardware solutions alongside GS with BIM integration- CAD overlaid with well-calibrated / unwarped Gaussian Splats for aesthetic viewing. I seldom hear about these things in the US though
Palantir is investing in 3D Gaussian Splatting 👀
5
28
230
22,430
Compact NGP was my last project at NVIDIA. It's been a wild 3.5 years (almost 5 incl. internships) and I already miss my truly amazing coworkers- but here's to fun new exciting things in 2024. 😄 (now I need to buy GPUs 😭)
13
4
212
21,954
VideoCAD: dataset of screen recordings of OnShape models being created on UI to train "browser agents" (colloquial term) for CAD. Feels like the next year ish will be a battle of "just use existing UIs for humans" vs. "make LLM-native connectors / scaffolding"
3
26
219
13,289
Cad-MLLM 🗿 Augments the DeepCAD dataset by including chamfered data (chamfers not preserved though 🥺) and adding extrusion subtrees as separate data points. Vicuna-7B with Michelangelo point cloud encoder for multimodal embedding for text + point cloud -> CAD
1
32
213
14,090
Martian World Models
2
15
208
13,998
Sekai: a video dataset for world exploration. A random childhood dream I always had was to make a simulator of the entire earth. During COVID I spent a lot of time doomwatching "Walking in X" videos. Making that into a dataset seems like a fun step towards that...
1
28
209
11,836
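The number of computer science undergrads in the US is growing at a tremendous pace. No matter how much demand there is, at this rate it looks like supply is going to catch up...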
1
61
191
We had a blast at #SIGGRAPH2022 last week demoing our new modular open source #neuralfields NeRF library, Kaolin Wisp. In case you missed the session... the video recording of the overview talk and the Wisp tutorial from @OrPerel is available online! nvidia.com/en-us/on-demand/s…
4
44
198
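This year a lot of really tough things happened with work, like starting a company, getting into YC, and moving to SF, but the funniest story to tell might be getting interviewed by an idol (STU48) in Shibuya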
8
8
203
47,940
Generating 3D meshes of buildings from satellite imagery + coarse geometry.... using Cities: Skylines (the game) as the dataset
1
28
195
10,967
Neural LOD has been accepted to CVPR 2021 as an oral presentation!! 🥳
📢 Our new @NVIDIAAI paper is up on arXiv! I'm happy to share Neural Geometric Level of Detail, which enables the first ⚡real-time rendering of high quality neural #3d SDFs with a sparse octree feature volume and a tiny MLP! nv-tlabs.github.io/nglod/
1
21
192
If you ever visit Japan, don't go to all the generic places in the tour guides. Go to Shimanami Kaido, a bike path where you can bike across the numerous islands of the Setouchi over giant bridges. Marvel at the beauty of the built world.
6
11
184
8,145
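Our survey paper on "methods and applications of neural fields", co-written with researchers from Brown University, MIT, Google, Facebook, TUM, and others, is now on arXiv! I also wrote a blog post in Japanese explaining "neural fields" and the story leading up to the survey, so please give it a read as well! yongyuanxi.medium.com/%E3%83…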
📢Are you looking for a consolidated overview of coordinate-based neural networks in 3D reconstruction, view synthesis, shape/appearance, etc? We ‘plowed through’ 250+ papers to write a review of ‘neural fields’ in visual computing and beyond: neuralfields.cs.brown.edu (1/10)
35
182
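A paper showing that when optimizing neural fields (neural networks that take coordinates as input), simply injecting noise into the coordinates speeds up optimization. Seems like it could be easily demonstrated as-is on things like PINNs for physics simulation too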
Our #Siggraph25 work found a simple, nearly one-line change that greatly eases neural field optimization for a wide variety of existing representations. “Stochastic Preconditioning for Neural Field Optimization” w/ @merlin_ND @_AlecJacobson @nmwsharp
1
26
183
19,215
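A minimal sketch of the coordinate-noise idea described above, in plain Python: before each query of the field during training, Gaussian noise is added to the input coordinates. The linear decay schedule, the `sigma0` default, and the function names are assumptions made for illustration; the paper's actual schedule and settings may differ.

```python
import random


def noise_scale(step, total_steps, sigma0=0.05):
    """Noise standard deviation, decayed linearly from sigma0 to 0 (assumed schedule)."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return sigma0 * (1.0 - t)


def perturb(coords, sigma, rng):
    """Add zero-mean Gaussian noise to every coordinate of every query point."""
    return [[c + rng.gauss(0.0, sigma) for c in point] for point in coords]


rng = random.Random(0)
batch = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]  # hypothetical 3D query points

# In a training loop, the field would be evaluated at the noisy points:
#   prediction = field(perturb(batch, noise_scale(step, total_steps), rng))
early = perturb(batch, noise_scale(0, 1000), rng)
late = perturb(batch, noise_scale(1000, 1000), rng)
```

By the final step the noise scale reaches zero, so the field is queried at the exact coordinates again.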
Parsing CAD drawings is one thing but parsing architectural plans takes the craziness up a notch
6
18
183
15,246
Over the years I've been collecting papers on "Compression for 3D" (broadly defined) on GitHub which helps for making related works. Maybe it would be useful for you too & it's likely missing lots of exciting works so PRs are always welcome 😁 github.com/tovacinni/awesome…
2
27
182
22,661
CAD-Assistant: AutoGen agent with CAD tools The most interesting result IMO is the table comparing different serialization *and* parameterization formats of CAD data- JSON with point-based representation works best. Serialization formats matter a lot for agents!
4
26
178
12,050
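I'm compiling data on the papers accepted to CVPR'19. For now, here are the top 30 institutions by paper count (a paper with multiple authors is counted only once per institution):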
3
66
173
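The counting rule in parentheses (one count per institution per paper, however many of its authors are on it) can be sketched as follows. The input format and the institution names are hypothetical, since the post does not show how the data was stored.

```python
from collections import Counter


def papers_per_institution(papers):
    """Count papers per institution, counting each institution once per paper.

    `papers` is a list where each entry holds the affiliations of one
    paper's authors, e.g. ["Univ A", "Univ A", "Lab B"].
    """
    counts = Counter()
    for affiliations in papers:
        counts.update(set(affiliations))  # de-duplicate within a paper
    return counts


# Hypothetical data: three papers with placeholder institution names.
papers = [
    ["Univ A", "Univ A", "Lab B"],
    ["Univ A"],
    ["Lab B", "Lab C", "Lab C"],
]
counts = papers_per_institution(papers)
top = counts.most_common(30)  # top-30 ranking as in the post
```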
🌿 Thyme: an open source dataset & model on training visual reasoning models that leverage image manipulation tools (like cropping, zooming, image processing)to aid the reasoning process
2
38
176
13,728
Good reminder of the aha moment I got when I read CycleGAN (also scary how long ago CycleGAN is now!)
I can't* fathom why the top picture, and not the bottom picture, is the standard diagram for an autoencoder. The whole idea of an autoencoder is that you complete a round trip and seek cycle consistency—why lay out the network linearly?
3
15
176
17,030
Automatically determining machining (milling) sequences from only the final part geometry, with a transformer AI model
2
25
172
11,603
We had a huge showing (500+ people!) for our SIGGRAPH course on Neural Fields. Thanks to everyone who came despite the early morning slot and also s/o to my awesome co-instructors @jtompkin @psyth91 @alexyu00 😄
6
10
167
13,059
We've put a lot of thought into why the "neural" part of "neural fields" matters at all. It seems to boil down to 3 things: 1. Compression 2. Inductive biases that make fitting ('learning') easy 3. Flexible input space -> creative applications at low cost
Wild guess: in a year or so we will forget about the "neural" part of the not-so-"neural" radiance fields.
4
32
158
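Where on earth does this strange image of American college students studying late into the night without doing internships or job hunting come from? If anything, American students are the ones more actively doing internships and job hunting.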
1
38
159
CAD-Coder Yet another work that demonstrates better LLM-generation of CAD through conversion into a more LLM-readable format (Python scripts to generate DXF). Fine-tunes DeepSeek-R1-Distill-Llama-8B on a dataset of DXF Python scripts
2
29
159
12,774
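A paper I co-authored was accepted to a top international CV conference (ECCV)! It's joint work with SFU, UBC, DeepMind, and others. By using a neural field that combines the strengths of the Eulerian (grid) and Lagrangian (point cloud) approaches used in fluids and elsewhere, 3D models can be represented compactly.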
📢📢📢 𝐋𝐚𝐠𝐫𝐚𝐧𝐠𝐢𝐚𝐧 𝐇𝐚𝐬𝐡𝐢𝐧𝐠 𝐟𝐨𝐫 𝐂𝐨𝐦𝐩𝐫𝐞𝐬𝐬𝐞𝐝 𝐍𝐞𝐮𝐫𝐚𝐥 𝐅𝐢𝐞𝐥𝐝 𝐑𝐞𝐩𝐫𝐞𝐬𝐞𝐧𝐭𝐚𝐭𝐢𝐨𝐧𝐬 at #ECCV2024 Project led by: Shrisudhan Govindarajan (@Shrisudhan001) and Zeno Sambugaro (@zenosambu) theialab.github.io/laghashes arxiv.org/abs/2409.05334
18
152
31,451
Right now in the CV research community, 3D GANs using differentiable volume rendering and neural fields are hot! matthew-a-chan.github.io/EG3… yudeng.github.io/GRAM/
27
155
A platform that analyzes PDF drawings with image processing and VLMs so they can be integrated into other systems (ERP, search, LLMs, etc.)
Yeah document parsing is cool, but what about CAD drawing parsing... (but with documents too 🥺)
4
15
153
19,708
CAD-Coder (no it's not the same one as last week, yet another paper with the same name) VLM trained on a dataset of CadQuery Python code to directly generate CAD from images. What's interesting in this paper is that Qwen2.5 VL actually does pretty well as is.
1
20
155
12,132
PhD application season is coming up, so here's a #LumaFlythroughs of the DGP Lab at University of Toronto - a massive lab space that unites numerous students and faculty from graphics, vision, imaging, HCI, and more. My time here has been spectacular- reach out if you have Qs!
2
14
146
28,722
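Starting today I'm working at NVIDIA again. Self-driving car development, CV research, CG research: my field and group change with every first day, but I'll keep doing my best this time too!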
8
145
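I built a tool that runs Jupyter Notebooks as scripts from the CLI, lets you manage parameters from the CLI without writing config management code, outputs a reproducible config and the converted Jupyter Notebook into a per-experiment folder, and can even do grid search from the CLI!!! github.com/haipera/haipera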
1
32
143
21,379
Training industrial vision models with synthetically generated data (research by Accenture)
4
21
141
9,557
CADReview: Dataset of 1500 OpenSCAD programs collected from the internet, augmented with manually created errors and "feedback" to fix the errors. The actual building blocks for the CAD code editing agent use a couple of different fine-tuned models (based on LLaVA-OV 7B)
2
17
142
8,726
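.@NvidiaAI The paper I worked on there has been published! By learning boundaries as a proxy for shape and imposing a bidirectional loss between segmentation and boundaries, the method improves scale invariance and boundary alignment, and beats DeepLab v3+ on segmentation to reach SOTA. 😆 nv-tlabs.github.io/GSCNN/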
Excited to share our @NvidiaAI work on semantic segmentation called GSCNN, which significantly outperforms DeepLabV3+ on Cityscapes benchmark. @yongyuanxi @davidjesusacu @jampani_varun @NvidiaAI paper: arxiv.org/abs/1907.05740 project: github.com/nv-tlabs/GSCNN Code coming soon
28
137
From: Learning to See Inside Opaque Liquid Containers using Speckle Vibrometry arxiv.org/abs/2507.20757
3
18
136
6,509
NeRF Wars III: Revenge of the Triangle arxiv.org/abs/2311.05607
12
130
13,762
overheard at CVPR: “no one codes in cuda anymore” 🤨
12
10
129
42,468
#つぶやきGLSL void main(){vec2 p=gl_FragCoord.xy;vec4 k=vec4(1,.3,.8,0);vec2 n=5*(p+p-r)/r.y;n*=2/(.5+n.y);n.y-=t*2;n=.05/sqrt(abs(fract(n*.4)-.5));c+=vec4(1,.2,0,0)*-(length(p-r/2)-r.y/4)*.01+p.y/r.y*k*.8;if(p.y>r.y/2.2)c=(n.x+n.y)*k+vec4(0,0,.1,0);c+=sin(p.y/2.+sin(t)*60)*.02;}
1
36
124
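Anyway, if you've started a company or are thinking about starting one, apply to YC's fall batch! Apparently there are overwhelmingly few applications from Japan!!
YCombinator isn't very well known in Japan so I'd like to post more about it, but I don't know what kind of content people want...!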
1
9
128
31,823
you can just manufacture a NeRF (in solid resin)
Radiance Fields
2
12
126
7,681
I think a lot about lines these days (in the form of computer vision on CAD drawings to count and measure things). It's probably a curse since my first ever paper in computer vision was also about detecting lines. He who rules the line rules the world 📐😎📏
6
8
123
7,480
GeoCAD: selective editing of CAD - based on the FlexCAD text format and a selected sketch loop. The instruction fine-tuning process is interesting since it's based on masking out a sketch loop and making it predict the sketch shape based on a natural language description
1
12
121
7,590
Cadmium 🌊 - Ansys paper on synthetic labeling of DeepCAD dataset with GPT 4.1 generated annotations from the ground truth + multi view images to train a better text-to-CAD model with Qwen2.5-Coder. Doesn't actually outperform human labeling (Text2CAD) - but gets close!
1
18
118
6,280
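By the way, there's also cutting-edge tech that directly generates "3D human geometry" rather than 2D images! (It's a paper I helped out with a little at my previous job) research.nvidia.com/labs/nxp…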
27
113
37,442
DPO RL fine-tuning for image-to-CAD generation, diffusion transformer for generation and CAD code compiler as reward model (curious why not fine-tune an LLM- maybe just being able to train on a single RTX6000?)
1
18
114
12,672
CReFT-CAD: GOT-OCR2.0 reinforcement fine-tuned on parameter prediction tasks from orthographically projected images of CAD. So much of "CAD" data is actually just in PDFs- so super important line of work that explores 3D reasoning from orthographic 2D.
2
15
108
6,866
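I thought about applying for the Recruit Foundation scholarship, but the eligibility requirement is "attending a world top-20 university in your field" and yet they say "if the recommendation letter is in a foreign language, translate it into Japanese". What is that supposed to mean... Recommendation letters aren't something students are supposed to see in the first place, and where exactly is the "global" in all this...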
2
18
105
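Not one but two Gaussian Splat + 3D generation papers coming out at the same time adds even more to the Warring States vibe. It's great that the code for both is already open source! gsgen3d.github.io
I was just saying last week that Gaussian Splat + 3D generation would show up any time now, and here it is: dreamgaussian.github.io/ (image-to-mesh generation in 2 minutes!)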
17
103
20,669
I tried turning myself into 3D with AI and it gave me a bald spot on the crown of my head
6
2
102
28,310