Ajay Patel
About
Plasticity
Founded in 2016 · Co-Founder & CEO
Acquired in October 2020
PhD at University of Pennsylvania
Advised by Chris Callison-Burch (2025)
Dissertation: Leveraging Synthetic Data from Large Language Models to Steer and Enhance Model Learning
Committee: Dan Roth, Lyle Ungar, Duncan Watts, Hannaneh Hajishirzi, Colin Raffel
Y Combinator S17
Plasticity Inc.
YC AI
Kleiner Perkins Fellow
2016 KP Fellow at Remind Inc.
Google
Google Home / Nest Labs
M&T at University of Pennsylvania
Computer Science at Penn Engineering (B.S.E)
Management / Entrepreneurship & Innovation at Wharton (B.S.E)
Research
Research Interests:
Machine Learning
Natural Language Processing
Large Language Models
Synthetic Data
Unsupervised Learning
Text Style Transfer
AI Applications
FineInstructions: Scaling Synthetic Instructions to Pre-Training Scale
by Ajay Patel, Colin Raffel, Chris Callison-Burch
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation
by Yue Yang*, Ajay Patel*, Matt Deitke, Luca Weihs, Tanmay Gupta, Ranjay Krishna, Andrew Head, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark
(ACL 2025)
Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models
by Matt Deitke et al.
(CVPR 2025)
Quantifying Misattribution Unfairness in Authorship Attribution
by Pegah Alipoormolabashi, Ajay Patel, Niranjan Balasubramanian
(ACL 2025)
mStyleDistance: Multilingual Style Embeddings and their Evaluation
by Justin Qiu*, Jiacheng Zhu*, Ajay Patel, Marianna Apidianaki, Chris Callison-Burch
(Findings of ACL 2025)
StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples
by Ajay Patel*, Jiacheng Zhu*, Justin Qiu, Zachary Horvitz, Marianna Apidianaki, Kathleen McKeown, Chris Callison-Burch
(NAACL 2025)
TinyStyler: Efficient Few-Shot Text Style Transfer with Authorship Embeddings
by Zachary Horvitz, Ajay Patel, Kanishk Singh, Chris Callison-Burch, Kathleen McKeown, Zhou Yu
(Findings of EMNLP 2024)
Large Language Models Can Self-Improve At Web Agent Tasks
by Ajay Patel, Markus Hofmarcher, Claudiu Leoveanu-Condrei, Marius-Constantin Dinu, Chris Callison-Burch, Sepp Hochreiter
(arXiv 2024)
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
by Ajay Patel, Colin Raffel, Chris Callison-Burch
(ACL 2024)
ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer
by Zachary Horvitz, Ajay Patel, Chris Callison-Burch, Zhou Yu, Kathleen McKeown
(AAAI 2024)
Learning Interpretable Style Embeddings via Prompting LLMs
by Ajay Patel, Delip Rao, Chris Callison-Burch
(Findings of EMNLP 2023)
Low-Resource Authorship Style Transfer: Can Non-Famous Authors Be Imitated?
by Ajay Patel, Nicholas Andrews, Chris Callison-Burch
(arXiv 2022)
Bidirectional language models are also few-shot learners
by Ajay Patel, Bryan Li, Mohammad Sadegh Rasooli, Noah Constant, Colin Raffel, Chris Callison-Burch
(ICLR 2023)
Magnitude: A fast, efficient universal vector embedding utility package
by Ajay Patel, Alexander Sands, Chris Callison-Burch, Marianna Apidianaki
(EMNLP 2018)
Other Work
DataDreamer
Open-source Python library for aligning, fine-tuning, instruction-tuning, and distilling language models with major open-source or API-based LLMs.
Research-grade tool with aggressive caching, resumability, and support for techniques like quantization and parameter-efficient training (LoRA).
Create and run multi-step prompting workflows and generate synthetic datasets for novel tasks or augment existing datasets.
Security Exploit on Google Search Results Page
Uncovered a security exploit to run malicious JavaScript code on arbitrary Google search result pages.
Listed in the Google Security Hall of Fame.
GoogolPlex – First App Store for Siri Commands
A man-in-the-middle (MITM) hack that enabled third-party integrations for Siri.
First app store for "voice commands" (2014), predating Alexa Skills and SiriKit, with 25,000+ users and developers.
Featured in Forbes, TIME, Engadget, TechCrunch, Gizmodo, Business Insider, and others.
Press
The Most Capable Open Source AI Model Yet Could Supercharge AI Agents
(WIRED)
A tiny new open-source AI model performs as well as powerful big ones
(MIT Technology Review)
AI vision, reinvented: The power of synthetic data
(Penn Today)
CoSyn: The open-source tool that's making GPT-4V-level vision AI accessible to everyone
(VentureBeat)
How fake accounts pushing inflammatory content went viral – with the help of YouTube's algorithms
(CNN)
Uncovering YouTube disinformation campaign with NLP and AI
(CNN TV Coverage)
Elizabeth Warren’s Exit Spurs Biden, Sanders to Vie for Her Voters
(Wall Street Journal)
Sanders Aims for Positive Campaign, but Allies Don’t Always Follow
(Wall Street Journal)
Plasticity wants to help chatbots seem less robotic
(TechCrunch)
Y Combinator takes machine intelligence startups to school and learns a thing or two
(TechCrunch)
Hack Siri To Control Spotify, A Nest, A Tesla, And Give Directions Through Google Maps
(TechCrunch)
College kids gave Siri new powers and now you can too
(Engadget)
This Simple Siri Hack Lets You Control Anything With Your iPhone
(Gizmodo)
College Kids Hack Siri to Unlock Teslas and Heat Up Your House
(WIRED)
Crazy Siri Hack Brings Voice Commands to Google Maps, Spotify
(TIME Magazine)
Apple Reportedly Plans To Open Siri To Third Parties (Just As Hackers Force It Open)
(Forbes)
Copyright © 2026 Ajay Patel. All rights reserved.