here’s how we spent 1T tokens at ramp 200 tokens - receipt processing 350 tokens - invoices 790 tokens - reimbursements 999,999,998,660 tokens - accidentally pushed the api key to github
94
182
7,688
494,899
Anthropic's AI Engineer source code is fully public / there is no server there is no separate backend. they just use the same api.anthropic.com/v1/message… api in a loop with tool use all packaged into a single file: gist.githubusercontent.com/1… tools available: 1. dispatch_agent - Creates a specialized agent for searching files and code with access to GlobTool, GrepTool, LS, View, and ReadNotebook tools. 2. Bash - Executes bash commands in a persistent shell session. Can't use search commands like find/grep or read tools like cat/ls. 3. GlobTool - Fast file pattern matching using glob patterns like "**/*.js". 4. GrepTool - Searches file contents using regular expressions. 5. LS - Lists files and directories in a given absolute path. 6. View - Reads files from the filesystem (up to 2000 lines). 7. Edit - Edits parts of files by replacing text with new text. 8. Replace - Overwrites entire files with new content. 9. ReadNotebook - Reads Jupyter notebook files (.ipynb). 10. NotebookEditCell - Edits specific cells in Jupyter notebooks. 11. StickerRequest - Displays a shipping form for Anthropic/Claude stickers.
Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.
24
129
1,648
368,014
Problem: Getting LLMs to output valid JSON in the format you want is hard Solution: ONLY generate values, feed model with keys and JSON structure. Constrain outputs with custom sampling New project: Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models!
41
130
1,238
378,956
got @anthropicai Claude Code working with OpenAI models lol i set up an proxy server that mimics the anthropic /v1/messages api, forwards requests to OpenAI maps: - Sonnet 3.7 -> 4o - Haiku 3.5 -> 4o-mini Sonnet3.7 is still better than gpt-4o at agentic and coding tasks, able to run longer sessions, follow instructions more closely, and bring tasks to completion. Will be testing 4.5 and o3-mini today source below:
59
46
1,029
147,865
added gemini 2.5 pro support to Claude Code feels faster and smarter than Sonnet 3.7 my go to local coding assistant now link below ⬇️
38
42
892
97,745
openai has to tell codex which codex it is to avoid confusion 😭 spotted in the codex system prompt
how about we fix our model naming by this summer and everyone gets a few more months to make fun of us (which we very much deserve) until then?
17
33
735
71,412
at @tryramp we use LLMs to find the 5 most valuable mins of audio from the 1000+ customer calls we make every day narrated by TTS + compiled into a 5 min podcast sent to the entire team
as a product owner it'd be nice to have an llm summary of everything my users did yesterday calling out cool success stories or troublesome error states i should reach out to debug has anyone tried such a thing? i am thinking about prototyping it with public val town data
22
31
539
237,894
🤔 extracted the full ~5000 token claude3.5sonnet claude.ai system prompt: gist.github.com/1rgs/b31a1de… this is a great template for function calling / tool use notes: artifacts: seem to be a fully in-context abstraction, model not finetuned for it allowed types: markdown, html, svg, react (tailwind, lucide-react, recharts, shadcn/ui), mermaidjs. 8 fewshot examples, all types + example of not using an artifact good artifacts are >15 lines, modifiable, self-contained, external use avoid artifacts: simple, explanatory, conversational, context-dependent one artifact per message unless requested prefer in-line content artifact steps: think in <ant_thinking>, wrap in <ant_artifact> with identifier, title, type artifact types: application/vnd.ant.code (code, specify language), text/markdown, text/html, image/svg+xml, application/vnd.ant.mermaid, application/vnd.ant.react complete content, no truncation err on not creating artifact if unsure claude: created by anthropic, current date, 2024, knowledge updated april 2024 claude: no urls or videos, provides info regardless of views, sensitive topics carefully, helps with analysis, coding, writing, teaching, discussion, uses markdown for code claude: face-blind in images, describes without identifying humans claude 3 family: haiku (fast), opus (writing/complex tasks), 3.5 sonnet (most intelligent) claude: thorough for complex, concise for simple, responds in user's language full prompt: gist.github.com/1rgs/b31a1de…
14
54
412
53,291
introducing 🪄genweb: the first software 2.0 web framework 🪄 genweb is a new way of building web apps: instead of a frontend and backend codebase, an LLM is the backend and the frontend it interprets user actions and dynamically generates UI in real-time welcome to the simulation genweb.rahul.gs/app?appId=e3… jensen said apps will be generated not rendered, this is the start
100% Fully Software 2.0 computer. Just a single neural net and no classical software at all. Device inputs (audio video, touch etc) directly feed into a neural net, the outputs of it directly display as audio/video on speaker/screen, that’s it.
20
23
359
100,523
we're hiring full stack engineers to work on llms at ramp if interested, dm me with examples of real things you've built come work on real deployments and learn how to drive enterprise value we're a small and mighty team what we've worked on in the last year: - multi-step agents for document extraction / ocr (sota accuracy, probably). llm agents + constraint solvers - low-latency next action prediction in our web app (more soon) - ramp tour guide: nitter.app/tryramp/status/1792659… - web agents for solving c*ptchas - codebase import cycle removal with ast parsing/graph cutting algorithms + llms in our python monolith backend - sales outbound automation and lead scoring agents - llm model routing between third party providers (+per feature cost tracking) - llm infra: embedding/reranking/generation finetuning and on-prem deployment/inference - structured extraction (github.com/1rgs/jsonformer) - customer feedback extraction from meeting recordings / routing to marketing - internal tools for: underwriting team/product team/sales/customer support teams - global search / function calling copilot - receipt matching (retrieval) - sms llm interface (function calling) - suggested memos - automated accounting coding - natural language report generation + more
Introducing Ramp Tour Guide: an AI Agent that can show you how to do anything on Ramp! Today, we'd like to share a sneak peek of Ramp's near future. As Ramp grows in functionality, we want to make all of it easily accessible to all of our customers. To do that, we're demoing a first-of-its-kind AI agent that can show and tell people how to accomplish anything with Ramp. The Ramp Tour Guide knows Ramp inside and out. You can ask it how to do something on our platform and it'll walk you through every step of the way.
18
20
326
82,496
🎉 new project: Clarity! A reading app that offers a fresh approach to consuming text. Instead of the traditional linear reading style, Clarity allows you to read depth-first, diving into the details that interest you most.
15
26
304
56,625
10
4
246
26,455
claude.ai generates CoT tokens within <antThinking> tags, hidden from user on the server
16
11
225
53,895
got Devin to fix bugs in OpenDevin github.com/OpenDevin/OpenDev…
8
8
214
24,216
vibe deleting stuff to clear up space with claude code
13
6
171
40,136
Excited to announce that we're joining forces with one of our customers, @tryramp, where we will help build the future of AI + finance
Exclusive: @tryramp makes its 2nd acquisition, scooping up Cohere.io, which has built out an AI-powered customer support tool. techcrunch.com/2023/06/26/as…
12
3
150
37,966
I kept having to debug prompt issues with open models So I built OpenAI's Tokenizer page for all tokenizers on HuggingFace: Llama, Mistral, GPT2, MPT, Persimmon, T5 etc github.com/1rgs/tokenwiz check it out here: tokenwiz.rahul.gs
8
22
141
21,536
jsonformer + openai! openai.com/index/introducing… "deterministic, engineering-based approach to constrain the model’s outputs to achieve 100% reliability"
Problem: Getting LLMs to output valid JSON in the format you want is hard Solution: ONLY generate values, feed model with keys and JSON structure. Constrain outputs with custom sampling New project: Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models!
3
8
143
26,776
Uber runs 16,000 MySQL nodes. Actual scale. uber.com/en-JO/blog/upgradin…
3
139
11,580
at @sequoia ai ascent last week i spoke about why ai agents, even from companies like @microsoft @apple @google, fail + how to solve it with a simple fix
10
3
142
33,924
me: can you pass the water homebrew: updating homebrew
2
1
69
My favorite thing to do on Modal - running massively parallel GPU finetune jobs At Ramp, we’ve trained hundreds of LLMs *at the same time* without the infra hassle - Modal allows us to move insanely fast (1/2)
Modal is generally available today, and we also raised a Series A! modal.com/blog/general-avail…
3
6
123
53,757
literally free productivity, most people one shot themselves by working or sleeping in rooms with high levels of CO2 buy a monitor that tracks levels over time so you can see how high it went when you were asleep, aranet makes a good one
I asked Erik @bernhardsson why high CO2 levels in your office are such a big deal: "I'm not a health nut. But one of the things I've been radicalized on is CO2 levels. There's a real relationship between CO2 levels, productivity, and cognitive performance. And CO2 levels are usually way too high in offices and schools. Normal air CO2 levels are 300-500 PPM. In offices it often hits 1,000 or 2,000. Airplanes can get up to 2,500. You start getting brain damage at 5,000. So I went and bought a bunch of CO2 monitors for the office. And I look at them every day. And I open windows anytime we get too high."
5
12
128
55,722
this is my quant
1
118
5,755
that’s a lot of stars 🤯
3
1
114
11,314
Looking at @anthropic-ai/claude-code source from NPM there's a sticker tool! ask claude code for free anthropic stickers
Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.
2
2
110
11,762
look at this linkedin dm i just got
3
4
103
10,616
Shoutout to Ramp engineer Andrew Gu who was on the coaching staff. Congratulations to Team USA for winning IMO 2024!
Congratulations, welcome to the Ramp engineering team.
4
89
16,449
the netflix documentary is gonna go crazy
1
2
80
8,229
"jobs not finished" - @eglyman
2
1
80
7,132
our computer use agent to eliminate finance busywork, first of many to come
Meet Agent Fill, our agentic form filler. Built for finance & ops teams that don't want to waste time filling out PDFs. Available today in alpha.
5
2
81
13,986
Generate perfect schema-conforming JSON, every time: github.com/1rgs/jsonformer
3
6
79
10,255
thanks to modal sandboxes, everybody gets a free vm on os.rahul.gs each tab is it's own FULL VM running ubuntu dont mine crypto pls
It's true – @modal has raised a $87M Series B at a $1.1B valuation to advance the future of AI infrastructure.  Thank you to @Lux_Capital, @Redpoint, @AmplifyPartners, and others. Now more than ever, AI demands a complete reinvention of traditional compute infrastructure
7
4
81
19,935
2
76
28,618
with the o1 release, reminder that claude.ai has been using thinking tokens for several months now openai.com/index/introducing…
claude.ai generates CoT tokens within <antThinking> tags, hidden from user on the server
2
2
68
9,874
Jsonformer supports a subset of JSON Schema, including number, boolean, string, array, and object types. It's built on top of the HuggingFace transformers library, making it compatible with any model that supports the HuggingFace interface. Try it — github.com/1rgs/jsonformer
4
7
63
7,562
Honored to be part of the 2022 Forbes 30U30 list with my cofounder @yunyu_l for @CohereHQ
2
2
60
new prompting technique - ask chatgpt to lock in
6
1
59
5,706
New complex schema generation example live With just a tiny 3b model (databricks/dolly-v2-3b) github.com/1rgs/jsonformer/b…
2
5
55
6,411
Introducing our Vendor Search Tool! No more generic SEO lists. No more fake reviews. No more wasting time on a bunch of different sites. Just real, detailed information sourced from all across the internet — all in one place, all powered by Ramp Intelligence. Find the right vendors for your business: buy.ramp.com
Finding the best vendors used to be hard. Not anymore. Introducing Ramp’s Vendor Search Tool. See pricing, compliance, and growth trends, all in one place. Powered by Ramp Intelligence. Find the best software for your business: buy.ramp.com
3
1
51
7,015
i achieved 100% accuracy on 0.007% of swe bench
Sweep achieves 15.7% on SWE-bench! Hi everyone, we’re building Sweep, an open-source AI developer that handles the easiest 30% of software tasks. We’re thrilled to announce our results on SWE-Bench! We evaluated Sweep on a random 10% subset of the data. Sweep correctly completed 15.7% of issues (1.9% more than Devin)!
3
48
9,279
Generating JSON is probably a common enough use case that hosted model providers should probably support an JSON only API thoughts? @gdb @aidangomezzz @AnthropicAI
3
1
47
6,896
I finetuned an LLM on all my iMessages, try it on yours! releasing code with sql queries, data processing, finetuning with PEFT and a chat CLI github.com/1rgs/MeGPT
3
1
47
7,331
what really excites me is how many approaches to a problem i can try in parallel if i'm not sure, i just ask devin to try all of them by creating other devins
Introducing Devin 2.0: a new agent-native IDE experience. Generally available today starting at $20. 🧵👇
1
1
46
5,538
what did you get done in the last hour
14
46
6,030
range anxiety
1
3
46
3,763
✅ agent that works ✅ general availability ✅ a real demo
We built a B2B SaaS sales company and here’s what it taught us about B2B SaaS sales 🧵👇 (but actually) Today we’re launching  Rox, the first publicly available AI agent swarm for the top sales teams, and in the private beta it already helped reps grow their books 30%. 2025 is going to be a huge year for growth. Enterprises are doubling next year’s revenue goals, but no one is doubling team sizes. Every rep will have to bring in more, and AI can help them do that. But there’s a wrong way and a right way. Much of today’s AI aims to replace low-value work. But sales follows a power law: 90% of revenue comes from the top 15% of enterprise sales reps. The greatest gains will come from supercharging the highest value work — raising the ceiling, not the floor. Rox equips the very best with a swarm of AI agents, acting as an army of analysts to help them plan, prioritize, research, engage, and keep up with their customers. Over 35 of the best-performing enterprise sales teams have adopted Rox virally. For example, Ramp has rolled it out to their AE and AM teams, and we are now integrating their internal data systems with Rox. The Enterprise AE team alone gains 225+ hours per week to boost pipeline execution activities. Rox is now in public beta. No barriers. No need to request a demo. Try it now for free → rox.com
5
1
43
4,561
jobs not finished
we have work to do
5
43
5,547
openai: with structured mode vs without in my benchmark, structured extraction mode is 13% slower, samples about the same number of tokens code: gist.github.com/1rgs/4790c32…
6
5
39
4,031
💪
4
2
40
5,359
huge
❓How to get models to generate structured output? JSONFormer (by @rahulgs) and RELLM (by @mattrickard) are two novel approaches for this, now with (experimental) integrations to LangChain JSONFormer Integration python.langchain.com/en/late… RELLM Integration python.langchain.com/en/late…
5
40
11,413
Problem: Generating structured JSON from language models is challenging. Current approaches like prompt engineering, fine-tuning, and post-processing often fail to produce syntactically correct JSON.
2
36
8,361
when i was at superhuman it would bother me immensely when people called us SuperHuman we've come full circle
3
38
it’s time to cook
1
2
35
6,249
Replying to @AravSrinivas
finetune it to call tools like search and code interpreter within the thinking process
35
9,187
Worked long and hard on this one - incredibly hard to get this just right!
3
35
Solution: Jsonformer: A wrapper around HuggingFace models that only generates content tokens and fills in fixed tokens during the process. This makes it more efficient and bulletproof than existing methods
2
33
7,618
what the
1
1
34
Jack Ma 5 years ago: “hate that AI is called Artificial Intelligence, I call it Alibaba Intelligence” @elonmusk: “damn might end up being true”
JUST IN: Alibaba, $BABA, has released a new AI that it says is better than $META, OpenAI, and DeepSeek.
1
2
31
9,021
A++ customer support from @will_ye_ sign up @CohereHQ and we'll write you a haiku
3
1
33
Source: github.com/1rgs/clarity-read… Based off of @andy_matuschak's amazing Evergreen notes and @OpenAI's "Recursively Summarizing Books with Human Feedback" (arxiv.org/abs/2109.10862) And huge thanks to @yunyu_l @thesephist for feedback
2
32
1,808
literally everyone under the age of 25 who has invested in Cohere has asked if they can Venmo me the money
4
33
Replying to @shaig

ALT клоун GIF

29
936
when I was making this graphic in Figma @yunyu_l told me to change to the latex font so it looks more academic
Thought this said jensenformer
3
30
3,515
this is just the beginning, excited to be a supporter
Today we're excited to introduce Devin, the first AI software engineer. Devin is the new state-of-the-art on the SWE-Bench coding benchmark, has successfully passed practical engineering interviews from leading AI companies, and has even completed real jobs on Upwork. Devin is an autonomous agent that solves engineering tasks through the use of its own shell, code editor, and web browser. When evaluated on the SWE-Bench benchmark, which asks an AI to resolve GitHub issues found in real-world open-source projects, Devin correctly resolves 13.86% of the issues unassisted, far exceeding the previous state-of-the-art model performance of 1.96% unassisted and 4.80% assisted. Check out what Devin can do in the thread below.
29
5,195
🤯 our most requested feature is out!
Announcing Cohere Voice! The same frictionless experience, now with audio and video. It’s that easy. 🎤📹 cohere.so/voice
28
me: looks like a 30 minute feature, quick n easy also me 5 hours later:
1
28
here’s an example (voices and quotes altered)
2
1
28
5,618
called it
Replying to @AravSrinivas
finetune it to call tools like search and code interpreter within the thinking process
1
27
5,243
was an honor to drop some hot takes on stage @aiDotEngineer thanks for having me @swyx!
THE BITTER LESSON APPLIED TO AGENTS (aka how to not be steamrolled by GPTNext) Ramp just hit a $13b valuation and "every surface of Ramp is infused with AI" TL;DR of @rahulgs' very well constructed @aidotengineer talk as a syllogism 1. systems that scale with compute beat systems that don't 2. you should build systems such that they improve with more compute 3. exponentials are rare: when you find one, -actually- hop on for the ride (instead of subconsciously fighting it out of habit/fear) 4. therefore allow the agent to flex tools and self augment/improve rather than constrain it has everything: single message, real life usecase from a major company, and LIVE DEMO (on conference wifi lol) do not miss
1
1
27
3,783
this is how you talk to your users at scale
26
6,408
didn't get access to copilot x yet so I wrote my own with gpt4 try bropilot, a rust cli that helps you write terminal commands github.com/1rgs/bropilot/tre…
2
27
2,234
“In 15 words: deep learning worked, got predictably better with scale, and we dedicated increasing resources to it.” - Gandhi
one of the hardest of parts of building a good agentic UX is integrating to the user's context. automation only works when we can make intelligent decisions without requiring a user to put in extra work. proud to have led this project alongside many others at @tryramp 🤝
2
27
3,637
agree, 100% a mistake
Klarna using AI to rip out Salesforce and Workday is pretty magical at first glance.... but I've also seen this before: - company sees 7-fig Datadog bill - kicks off internal build to "save millions of dollars!" - staffs up team of eng - 6 months later, realizes their mistake 🧵
2
24
3,829
every visit to a genweb app goes straight to an llm, which renders the initial page in html all “code” is in natural language, which is “interpreted” by an llm real time user interactions are piped back into llm, which "rerenders" the page every user session is a multi-turn LLM conversation. here's an example:
1
24
3,300
Replying to @reallyrawn
actually yeah
1
25
987
Replying to @will__ye
psa: this screenshot is PHOTOSHOPPED
23
thank you @rememberlenny, this is why we do what we do
1
22
next few years are going to be crazy if you're curious how many tokens are in your codebase: github.com/1rgs/token-trekke… A bunch of our repos fit in one context window 🤯
Introducing 100K Context Windows! We’ve expanded Claude’s context window to 100,000 tokens of text, corresponding to around 75K words. Submit hundreds of pages of materials for Claude to digest and analyze. Conversations with Claude can go on for hours or days.
1
1
23
3,657
unlike traditional AI code generation (eg copilot, chatgpt, claude artifacts, devin), which outputs code, genweb is the LLM itself llm -> code -> app ❌ llm -> app ✅ no js, no backend code - just natural language instructions and an LLM that simulates it
1
23
6,395
With Chime, we're bringing the magic of Cohere's seamless customer interaction tools to sales and marketing teams — super excited to get this out
1
23
genweb is a proof of concept for now, but with faster models and cheaper inference, this could soon be how all software is made software 2.0 apps are malleable and squishy, not rigid and rules-based like it is today (1) not every feature needs to be described, and the model fills in the gaps with “common sense” (2) every user gets their own custom ui, tailored to their attributes, even with the same “source code” here’s a playground to build your own genweb apps: genweb.rahul.gs/ github.com/1rgs/genweb
1
23
2,237
was able to get access without getting off the waitlist: copilot-workspace.githubnext… /<owner>/<repo>?task=<description>
What started out as an autocomplete pair programmer is now redefining the developer experience itself. Welcome to @GitHub Copilot Workspace: The Copilot-native developer environment — a place for all to create with code instantly in natural language. github.blog/2024-04-29-githu…
1
21
6,885
from interviewing me for my first ever job at Superhuman to writing our first check at Cohere, Vivek has been a great mentor/supporter 🙏 thank you @vsodera — wouldn't be here w/o you
Proud to be one of the first investors in @CohereHQ. Their pixel-perfect screensharing experience is 🤯! If you're a head of support, customer success, QA, onboardings, or sales, and want to use Cohere at your company, DM me. /cc @yunyu_l @rahulgs @jasonhfwang
21
building a synthetic ramp this weekend with web session replay data
1
22
2,188
🔥🔥🔥 @GhorbaniAmir
1
2
21
teaching llama3 to reason in "grid" with synthetic data 👀
1
21
2,498
when other founders ask about engineers you want to hire ft @hankai1998
2
21
what should we do with this nanochat.modal.ramp.engineer…
7
21
5,338
no mo slo mo
4
20
2,551
vc: what’s ur mrr me: haha we’re not sharing rn vc: haha how many customers do u have and what is the average deal size
4
21
"LLMs are not enough" is going to age extremely poorly from @leopoldasch
No one can beat the 2019 ARC-AGI benchmark. We've stalled. LLMs are not enough. Frontier research has gone closed source. We need new ideas. Maybe from you? Thrilled to announce @arcprize with @fchollet A $1,000,000 competition to beat ARC and re-start open AGI progress
3
2
19
5,869
Replying to @will__ye
bro is a quick learner
1
21
1,322