SketchAgent is a drawing system that leverages LLMs to generate sequential sketches. 🖌️
Given a text prompt, it produces a sequence of strokes that are rendered to the canvas, transforming language into visual concepts! 🎨
Additionally, the strong prior of LLMs can be leveraged to perform various types of sketch editing, such as animating sketches with CSS!
\[ B(t) = (1 - t)^3P_0 + 3(1-t)^2tP_1 + 3(1-t)t^2P_2 + t^3P_3, \]
where the set \(P=\{P_0, P_1, P_2, P_3\}\) is often referred to as the curve's control points, and \(t\in[0,1]\) is a parameter that moves the point along the curve from \(P_0\) at \(t=0\) to \(P_3\) at \(t=1\). The control points are then found by solving
\[ P = \arg\min_P \lVert AP - B \rVert. \]
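To make the two formulas above concrete, here is a minimal Python sketch using NumPy. It is not the project's actual code. It assumes that \(A\) is the matrix of cubic Bernstein basis values evaluated at a set of sampled parameters \(t_i\), and that \(B\) in the minimization stacks the corresponding sampled 2D points; the page does not define either symbol, so this reading is an assumption. The function names and the example data are hypothetical.

```python
import numpy as np


def bernstein_matrix(t):
    """Return the (n, 4) matrix of cubic Bernstein basis values at parameters t."""
    t = np.asarray(t, dtype=float)
    return np.stack(
        [(1 - t) ** 3, 3 * (1 - t) ** 2 * t, 3 * (1 - t) * t ** 2, t ** 3],
        axis=1,
    )


def bezier_points(control_points, t):
    """Evaluate B(t) for a (4, 2) array of control points P_0..P_3."""
    return bernstein_matrix(t) @ np.asarray(control_points, dtype=float)


def fit_control_points(points, t):
    """Estimate P by minimizing ||A P - B|| in the least-squares sense.

    Assumption: A holds the Bernstein basis values at t, and B holds the
    sampled points (one row per sample).
    """
    A = bernstein_matrix(t)
    B = np.asarray(points, dtype=float)
    P, *_ = np.linalg.lstsq(A, B, rcond=None)
    return P


# Hypothetical example: sample a known curve, then recover its control points.
true_P = np.array([[0.0, 0.0], [1.0, 2.0], [3.0, 2.0], [4.0, 0.0]])
t = np.linspace(0.0, 1.0, 8)
samples = bezier_points(true_P, t)
recovered_P = fit_control_points(samples, t)
```

With at least four distinct parameter values the matrix has full column rank, so noise-free samples recover the original control points up to floating-point error. With noisy or hand-specified points, the same call returns the best-fitting cubic curve.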
After processing the agent's output into vector graphics, we render the strokes onto the canvas to form the final sketch. The overall process is shown below:
@InProceedings{SketchAgent_Vinker2025,
author = {Vinker, Yael and Shaham, Tamar Rott and Zheng, Kristine and Zhao, Alex and Fan, Judith E. and Torralba, Antonio},
title = {SketchAgent: Language-Driven Sequential Sketch Generation},
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
month = {June},
year = {2025},
pages = {23355-23368}
}